What is Zero-shot Discovery? Let's dive in.
24 August 2024 | 3 min
Zero-shot learning is a field of machine learning that enables AI systems to recognize or classify things they have never seen before. Unlike traditional models, which need many labeled examples of every class, zero-shot systems make educated guesses about new classes based on descriptions or other semantic information. For example, a zero-shot image classifier could potentially classify animals as stripy or spotty despite only being trained on dog breeds [CLIP], and language models trained on sentence summarization across languages may be used for translation between English and French [GPT2].
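To make the idea concrete, here is a minimal sketch of zero-shot classification by comparing embeddings. It assumes an encoder has already mapped both the item and the class descriptions into the same vector space; the three-dimensional vectors and the names `CLASS_DESCRIPTIONS` and `zero_shot_classify` are invented for illustration and do not come from CLIP or any other real model.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

# Hypothetical embeddings of the class *descriptions* (not of labeled examples).
# Toy dimensions: [stripes, spots, fur]
CLASS_DESCRIPTIONS = {
    "stripy": [1.0, 0.0, 0.2],
    "spotty": [0.0, 1.0, 0.2],
}

def zero_shot_classify(item_embedding, class_embeddings):
    """Pick the class whose description embedding is closest to the item."""
    return max(
        class_embeddings,
        key=lambda label: cosine(item_embedding, class_embeddings[label]),
    )

# Hypothetical embeddings of two animals the classifier never saw labels for.
zebra = [0.9, 0.1, 0.3]
leopard = [0.1, 0.8, 0.4]

print(zero_shot_classify(zebra, CLASS_DESCRIPTIONS))
print(zero_shot_classify(leopard, CLASS_DESCRIPTIONS))
```

Adding a new class here only means adding a new description vector; no retraining on labeled examples of that class is needed, which is the property the rest of this post builds on.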
What is Discovery?
In e-commerce, discovery refers to the process by which customers find and learn about products or services through experiences such as:
- Product browsing: Allowing customers to explore different categories, collections, or curated lists of items.
- Search: Enabling users to find specific products using keywords or filters.
- Personalized recommendations: Suggesting items based on a user's browsing history, previous purchases, or preferences.
- Featured items: Highlighting new, popular, or seasonal products on the homepage or in promotional areas.
- Cross-selling and upselling: Presenting related or complementary items to customers based on their current interests.
- Visual search: Allowing users to search for products using images rather than text.
While these experiences have come to define modern e-commerce platforms, they are often disconnected from each other, with separate models, infrastructure and data pipelines required for each. Many of these recommendation-type systems are built using collaborative filtering, content-based filtering, or hybrid approaches. They rely either on pre-defined product and user attributes, which are incomplete and expensive to maintain, or on historical user-item interactions, which do not generalize well to new users or items.
In fast-moving industries like fashion, leading retailers maintain a 94% stock-out rate every 90 days, meaning most products do not have the data necessary for traditional interaction-based recommendation algorithms and are never recommended to users [1, 2]. In high-growth businesses, new users lack this data too, and so cannot be served recommendations. This has caused costly fragmentation in the industry, with different teams and algorithms used for new and existing users, and with users waiting hours or days for batch systems to update their personalised recommendations.
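The cold-start problem described above is easy to reproduce. The following is a minimal sketch of an interaction-based recommender built on basket co-occurrence counts; the catalogue, the baskets, and the function names are all made up for illustration and do not represent any particular production system.

```python
from collections import Counter
from itertools import combinations

def co_occurrence(baskets):
    """Count how often each ordered pair of items appears in the same basket."""
    counts = Counter()
    for basket in baskets:
        for a, b in combinations(sorted(set(basket)), 2):
            counts[(a, b)] += 1
            counts[(b, a)] += 1
    return counts

def recommend(item, counts, catalogue, k=3):
    """Recommend up to k items that have co-occurred with `item` at least once."""
    scored = [(counts[(item, other)], other) for other in catalogue if other != item]
    scored = [(score, other) for score, other in scored if score > 0]
    scored.sort(key=lambda pair: (-pair[0], pair[1]))  # highest count first, ties by name
    return [other for _, other in scored[:k]]

# Toy interaction history; "new_jacket" has just been added and has no history.
baskets = [["jeans", "tee"], ["jeans", "tee", "belt"], ["tee", "belt"]]
catalogue = ["jeans", "tee", "belt", "new_jacket"]
counts = co_occurrence(baskets)

print(recommend("jeans", counts, catalogue))       # items bought alongside jeans
print(recommend("new_jacket", counts, catalogue))  # nothing to go on
```

With no interactions recorded, `new_jacket` gets no recommendations of its own and never appears in anyone else's. The same holds for a new user with an empty history.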
Aims for Zero-shot Approaches
The aim of Zero-shot Recommendations is to address these challenges by allowing models to understand the underlying meaning of the data. Zero-shot Recommendations algorithms are able to generalize not only to new products but to new users as well, allowing recommendations that are instant and dynamic [ZecRec (2021)].
Zero-shot Recommendations algorithms are still at the cutting edge of research, and they have challenges of their own. They capture only a portion of the discovery experience: they do not work for search, browsing or traditional item-item recommendations, and so still result in the duplication and fragmentation that has hampered growth in e-commerce. Even modern semantic and hybrid neural search engines do not handle the full range of discovery experiences, and so fail to personalize search results to the user.
Zero-shot Discovery aims to unify these experiences, bridging Search, Recommendations, Browsing and Featured Items into a single algorithm. By taking a Zero-shot and Few-shot approach, Zero-shot Discovery algorithms can generalize to new users and items while covering every discovery experience with one model. For businesses, this reduces system complexity and makes the experience more consistent and personalized for their users. It also allows them to grow faster: they can personalize for new users from the first visit and take on new products without additional complex model development and data collection.
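One way to picture a unified approach is a single ranking function over a shared embedding space, where a search query, an anchor item, and a user's history are all just context vectors. The sketch below is an illustration under that assumption only, not the Zero-shot Discovery algorithm itself; the item vectors, the `discover` function, and the simple averaging of context are all hypothetical.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def mean(vectors):
    """Element-wise mean of a non-empty list of equal-length vectors."""
    return [sum(column) / len(vectors) for column in zip(*vectors)]

# Hypothetical item embeddings, e.g. produced from product text or images.
ITEMS = {
    "striped_shirt": [0.9, 0.1, 0.0],
    "polka_dress": [0.1, 0.9, 0.0],
    "hiking_boots": [0.0, 0.1, 0.9],
}

def discover(context_vectors, items, exclude=(), k=2):
    """Rank items against the averaged context, whatever that context is."""
    query = mean(context_vectors)
    candidates = [name for name in items if name not in exclude]
    candidates.sort(key=lambda name: cosine(query, items[name]), reverse=True)
    return candidates[:k]

# Hypothetical embedding of the search text "striped top".
search_query = [0.8, 0.2, 0.0]

# Search: the context is the query alone.
print(discover([search_query], ITEMS, k=1))

# Item-item recommendations: the context is the item being viewed.
print(discover([ITEMS["polka_dress"]], ITEMS, exclude={"polka_dress"}, k=1))

# Personalized search: the query plus an item from the user's history.
print(discover([search_query, ITEMS["hiking_boots"]], ITEMS))
```

Because every experience goes through the same function, a new product only needs an embedding to become discoverable everywhere, and a new user can be served from the first query or click.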