Evaluating MCP Shopping Agents: Why Tool Design Beats Model Scale
We ran hundreds of shopping agent conversations across eight MCP routes and four model sizes. The bottleneck was never the model; it was the interface.
Open-notebook research. Methods, confidence intervals, and the experiments that didn't work.