Knowledge Graphs for Product Recommendations

Jun 17, 2024

While exploring how knowledge graphs can be applied in practice, we found another real-world use case that we are all very familiar with: e-commerce!

In the vast and ever-evolving world of e-commerce, delivering personalized and relevant product recommendations to customers is a crucial challenge. While traditional recommendation systems often rely on historical purchase data and collaborative filtering, they can struggle to capture the nuanced, commonsense knowledge that underpins many of our everyday purchasing decisions.

At left is a flow chart that begins with the query "Want shoes for pregnant women", with an arrow connecting it to the action "Bought a slip-resistant shoe", which is in turn connected to the commonsense triple <pregnant, require, slip-resistant>. At right is a selection of product pages for slip-resistant shoes.

Amazon COSMO Framework

That's where Amazon's COSMO framework comes into play. COSMO, or the "Common Sense Knowledge Generation and Serving System," is an approach to building commonsense knowledge graphs that is designed to improve the performance of product recommendation engines.

At the heart of COSMO is the recognition that commonsense reasoning is essential for understanding the context and relevance of customer queries. If a customer searches for "shoes for pregnant women," for example, a recommendation system powered by COSMO would understand the implicit need for slip-resistant, comfortable footwear, rather than simply suggesting the most popular or highest-rated shoes.

To build this commonsense knowledge, COSMO leverages a recursive process that combines large language models (LLMs), human annotation, and machine learning. The system starts by mining customer behavior data, including query-purchase pairs and co-purchase patterns, to uncover the underlying relationships between products and the contexts in which they are used.
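To make the mining step concrete, here is a minimal sketch of counting query-purchase pairs and co-purchase pairs. The toy logs, the function names, and the `min_count` cutoff are all assumptions made for illustration; they are not taken from COSMO itself.

```python
from collections import Counter
from itertools import combinations

# Hypothetical toy logs; real behavior data would be far larger and noisier.
query_purchases = [
    ("shoes for pregnant women", "slip-resistant shoe"),
    ("shoes for pregnant women", "slip-resistant shoe"),
    ("winter coat", "long-sleeve puffer coat"),
]
baskets = [
    ["tent", "sleeping bag", "camp stove"],
    ["tent", "sleeping bag"],
]


def mine_query_purchase_pairs(pairs, min_count=2):
    """Keep (query, product) pairs seen at least min_count times."""
    counts = Counter(pairs)
    return {pair: n for pair, n in counts.items() if n >= min_count}


def mine_co_purchases(baskets, min_count=2):
    """Keep product pairs bought together in at least min_count baskets."""
    counts = Counter()
    for basket in baskets:
        # Sort so that (a, b) and (b, a) are counted as the same pair.
        for a, b in combinations(sorted(set(basket)), 2):
            counts[(a, b)] += 1
    return {pair: n for pair, n in counts.items() if n >= min_count}


frequent_qp = mine_query_purchase_pairs(query_purchases)
frequent_cp = mine_co_purchases(baskets)
print(frequent_qp)
print(frequent_cp)
```

Pairs that survive the cutoff are the ones an LLM would later be asked to explain.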

An LLM is then tasked with describing these relationships using a set of predefined categories, such as "used for," "capable of," "is a," and "cause." The resulting candidate relationships are filtered using a series of heuristics to remove low-quality or redundant entries, and a subset is sent to human annotators for evaluation based on plausibility and typicality.
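The filtering step can be sketched roughly as follows. The specific rules (a known relation, at least two words, not a restatement of the query or product) and the 0.8 similarity threshold are assumptions chosen for this example, and `difflib` string similarity stands in for whatever similarity measure the real system uses.

```python
from difflib import SequenceMatcher

# Relation names are written in the camel-case style of "capableOf".
RELATIONS = {"usedFor", "capableOf", "isA", "cause"}


def rule_filter(candidate, query, product):
    """Rule-based check: known relation, non-trivial, not a restatement."""
    relation, text = candidate
    text_norm = text.strip().lower()
    if relation not in RELATIONS:
        return False
    if len(text_norm.split()) < 2:
        return False
    if text_norm in (query.lower(), product.lower()):
        return False
    return True


def similarity_filter(candidates, threshold=0.8):
    """Drop candidates that nearly duplicate an already kept one."""
    kept = []
    for relation, text in candidates:
        is_duplicate = any(
            relation == kept_relation
            and SequenceMatcher(None, text.lower(), kept_text.lower()).ratio() >= threshold
            for kept_relation, kept_text in kept
        )
        if not is_duplicate:
            kept.append((relation, text))
    return kept


query, product = "winter coat", "long-sleeve puffer coat"
candidates = [
    ("capableOf", "provide high-level warmth"),
    ("capableOf", "provides high-level warmth"),  # near duplicate
    ("capableOf", "winter coat"),                 # restates the query
    ("likes", "cold weather"),                    # relation not in the predefined set
    ("usedFor", "warm"),                          # too short to be informative
]
passed_rules = [c for c in candidates if rule_filter(c, query, product)]
filtered = similarity_filter(passed_rules)
print(filtered)
```

Only the candidates that pass both filters would be worth sending to human annotators.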

Armed with the annotated data, COSMO trains a machine learning-based classifier to assign scores to the remaining relationship candidates, keeping only those that meet a certain threshold. These high-quality relationships are then encoded as instructions for the LLM, which is prompted to generate additional explanations and insights.
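As a rough illustration of the thresholding and instruction-encoding steps, the sketch below assumes the classifier has already produced plausibility and typicality scores. The score values, the 0.5 thresholds, and the `ScoredCandidate` and `to_instruction` names are hypothetical; only the instruction wording follows the "winter coat" example from the COSMO flow chart.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ScoredCandidate:
    query: str
    product: str
    relation: str
    explanation: str
    plausibility: float  # hypothetical classifier score in [0, 1]
    typicality: float    # hypothetical classifier score in [0, 1]


def keep_high_quality(candidates, min_plausibility=0.5, min_typicality=0.5):
    """Keep only candidates whose scores meet both (assumed) thresholds."""
    return [
        c for c in candidates
        if c.plausibility >= min_plausibility and c.typicality >= min_typicality
    ]


def to_instruction(c):
    """Encode a kept relationship as an instruction/output pair for the LLM."""
    prompt = (
        f'Use the "{c.relation}" relation to explain the connection '
        f'between the query "{c.query}" and the product "{c.product}".'
    )
    return {"instruction": prompt, "output": c.explanation}


# Made-up scores, for illustration only.
scored = [
    ScoredCandidate("winter coat", "long-sleeve puffer coat",
                    "capableOf", "Provide high-level warmth", 0.93, 0.88),
    ScoredCandidate("winter coat", "long-sleeve puffer coat",
                    "usedFor", "Going to the beach", 0.12, 0.05),
]
instructions = [to_instruction(c) for c in keep_high_quality(scored)]
print(instructions)
```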

A cyclical flow chart that begins in the upper left with "user behavior", featuring icons that represent search, product views, ratings, and purchases. A right arrow labeled "prompt" connects the user behavior to a neural-network icon labeled "LLMs". A right arrow labeled "generate" connects the LLMs to a stacked-papers icon representing "knowledge". A downward arrow labeled "filter" connects "knowledge" to a box containing the words "rule-based filtering" and "similarity filtering". A left arrow labeled "annotate" connects the filtering box to a box labeled "Human feedback". A final left arrow connects "human feedback" to a box labeled "Instructions", which contains an example instructing the LLM to use the "capableOf" relation to explain the connection between the query "winter coat" and the product "long-sleeve puffer coat". The LLM's output is "Provide high-level warmth".

The final result is a knowledge graph that captures the commonsense connections between products, their functions, audiences, and usage contexts. This graph can then be integrated into product recommendation models to improve their performance.
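One simple way to picture such a graph is as a set of (head, relation, tail) triples that can be looked up at query time. The sketch below is an assumption-laden toy: the `KnowledgeGraph` class and `expand_query` helper are hypothetical, and matching single query tokens against graph heads is a deliberate simplification. The triple <pregnant, require, slip-resistant> is the one from the example earlier in this post.

```python
from collections import defaultdict


class KnowledgeGraph:
    """Tiny in-memory triple store keyed by head entity."""

    def __init__(self):
        self._edges = defaultdict(set)

    def add(self, head, relation, tail):
        self._edges[head].add((relation, tail))

    def tails(self, head, relation=None):
        """Return the tails linked to head, optionally for one relation."""
        return sorted(
            t for r, t in self._edges.get(head, ())
            if relation is None or r == relation
        )


def expand_query(query, kg):
    """Collect commonsense attributes implied by the words in a query."""
    attributes = set()
    for token in query.lower().split():
        attributes.update(kg.tails(token))
    return sorted(attributes)


kg = KnowledgeGraph()
kg.add("pregnant", "require", "slip-resistant")
kg.add("winter coat", "capableOf", "provide high-level warmth")

expanded = expand_query("Want shoes for pregnant women", kg)
print(expanded)
```

A recommendation model could then favor products that match the expanded attributes, not only the literal query words.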

Performance

To evaluate the impact of COSMO, the researchers conducted a series of experiments using the Shopping Queries Data Set, a benchmark dataset created for the KDD Cup 2022 competition. They compared the performance of three recommendation models: a bi-encoder, a cross-encoder, and a cross-encoder enhanced with COSMO's commonsense knowledge.
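The three setups differ mainly in what each model sees as input. The sketch below only builds those inputs and does not score anything. The `[SEP]` separator, the way triples are flattened into text, the function names, and the product title "slip-resistant clog" are all assumptions for illustration, not details confirmed by the COSMO work.

```python
def bi_encoder_inputs(query, product_title):
    """A bi-encoder embeds the query and the product separately."""
    return query, product_title


def cross_encoder_input(query, product_title, sep=" [SEP] "):
    """A cross-encoder reads the query and the product as one joint input."""
    return f"{query}{sep}{product_title}"


def knowledge_enhanced_input(query, product_title, triples, sep=" [SEP] "):
    """Joint input with commonsense triples appended as plain text."""
    parts = [query, product_title]
    knowledge = "; ".join(f"{h} {r} {t}" for h, r, t in triples)
    if knowledge:
        parts.append(knowledge)
    return sep.join(parts)


query = "shoes for pregnant women"
product = "slip-resistant clog"  # hypothetical product title
triples = [("pregnant", "require", "slip-resistant")]

separate = bi_encoder_inputs(query, product)
joint = cross_encoder_input(query, product)
enhanced = knowledge_enhanced_input(query, product, triples)
print(separate)
print(joint)
print(enhanced)
```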
