Kimchi 1: Product Search Enters the Post-Training Era
Post-training is reaching retrieval. Here is the formal construction—Plackett–Luce policy gradients under a frozen rubric judge—and why product search is the right place to build it.
A blend of the finest ingredients to always hit the spot.
From machine learning researchers to engineers, we're all fermenting ideas into reality—one dill-icious innovation at a time.
Two timezones, one kitchen. London ships product, Cape Town ships papers.
Open-notebook research. Methods, confidence intervals, and the experiments that didn't work.