Simple Science

Cutting edge science explained simply


Improving Online Search with EBRM

Introducing a new model to enhance online shopping search results.


Finding the right items when shopping online can be tough. With many choices available, it's essential for search systems to help users discover what they want quickly. A key part of this process is how well a system can match a user's search words with the products in its database.

The search system needs to be fast and precise. However, current systems sometimes struggle. Some models focus on speed but lose accuracy, while others provide good results but are slow. This article presents a new approach called the Entity-Based Relevance Model (EBRM) that aims to be both quick and accurate.

The Need for Improved Search

In today's world, many people shop online using popular platforms like Amazon and eBay. With millions of products available, users often type in short phrases to describe what they want, which can be vague. On the other hand, sellers usually write long and detailed product titles. This difference makes it hard for search systems to link user queries to the right items.

Traditional keyword-based methods, such as BM25 and TF-IDF, look at how often words appear in a query and in product titles to decide how relevant they are. But these methods often fall short because users and sellers use different vocabulary. For instance, a search for "gym weight" shares no words with a title that says only "dumbbells," so a keyword model can miss that item and instead surface an unrelated product that merely happens to contain "gym" or "weight."
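The vocabulary gap can be seen with a toy scorer. The sketch below is a deliberately simplified term-overlap count, not BM25 or TF-IDF themselves, and the two product titles are made up for illustration.

```python
def term_overlap_score(query: str, title: str) -> int:
    """Count distinct query words that also appear in the title.

    A crude stand-in for keyword-based scoring; real BM25 or TF-IDF
    additionally weight terms by frequency and rarity.
    """
    query_tokens = set(query.lower().split())
    title_tokens = set(title.lower().split())
    return len(query_tokens & title_tokens)


# Hypothetical product titles, invented for this example.
titles = [
    "Adjustable Dumbbells Set 20kg",
    "Gym Bag with Weight Lifting Gloves",
]
query = "gym weight"

scores = {title: term_overlap_score(query, title) for title in titles}
for title, score in scores.items():
    print(f"{score}  {title}")
```

Here the dumbbell set scores 0 while the gym bag scores 2, even though the dumbbells are most likely what the shopper wants.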

To tackle these issues, modern search systems have begun using neural networks that represent queries and items as dense vectors in a semantic space. These methods, especially those using Transformer-based models like BERT, have shown promise in retrieving relevant information.

The Shortcomings of Existing Models

However, there are limitations to these advanced models. Bi-encoders, which create separate representations for queries and items, can cache results for speed but may sacrifice accuracy. Cross-encoders offer better accuracy as they consider the full interaction between the query and the item but tend to be slow because they cannot precompute vectors.

Moreover, most existing models merely provide predictions without explaining how they arrived at those results. Humans, by contrast, can easily justify a relevance decision by pointing to what in the item matches the query. For instance, a user who types "gym weight" expects to see items like "dumbbells," because a dumbbell is a specific kind of gym weight. And because model predictions come without such a rationale, correcting an error in a search model can take a lot of time and effort.

Introducing the Entity-Based Relevance Model (EBRM)

To address these shortcomings, we introduce the Entity-Based Relevance Model (EBRM). This new approach focuses on understanding the entities (specific items or categories) mentioned in each product. By identifying these entities, we can break the query-item relevance problem down into several query-entity relevance problems and then aggregate their results into the final prediction.

EBRM consists of two main components: a query-entity relevance module and a prediction module that uses soft logic. The relevance module is trained to assess how well a query relates to specific entities within items. By using this method, we not only increase the accuracy of our predictions but also make the process easier to interpret. Users can understand why certain items are shown based on their matching entities.

How EBRM Works

In our model, we first identify relevant entities within product titles. For instance, if a user searches for "gym weights," the system recognizes that "dumbbell" is a matching entity. With this focus on entities, we decide whether an item is relevant to the query by checking whether any of its product-type entities match the query.

The model uses a soft logic layer to combine the query-entity relevance predictions into a final query-item relevance score. Because the expensive step now operates on query-entity pairs rather than on whole items, its predictions can be cached and reused, which speeds up online searches.
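The aggregation step can be sketched as follows. The article does not give the exact soft logic formula, so this example assumes a noisy-OR (an item is relevant if at least one of its entities is relevant to the query). The function names and the scores are hypothetical.

```python
from typing import Dict, List, Optional, Tuple


def soft_or(scores: List[float]) -> float:
    """Noisy-OR over scores in [0, 1]: 1 - prod(1 - s).

    An assumed relaxation of logical OR, not taken from the paper.
    """
    none_match = 1.0
    for s in scores:
        none_match *= 1.0 - s
    return 1.0 - none_match


def query_item_relevance(
    query: str,
    item_entities: List[str],
    qe_scores: Dict[Tuple[str, str], float],
) -> Tuple[float, Optional[str]]:
    """Return (item score, best-matching entity); the entity is the rationale."""
    scores = [qe_scores.get((query, entity), 0.0) for entity in item_entities]
    if not scores:
        return 0.0, None
    best_entity = item_entities[scores.index(max(scores))]
    return soft_or(scores), best_entity


# Hypothetical query-entity scores, as a relevance module might produce them.
qe_scores = {("gym weights", "dumbbell"): 0.9, ("gym weights", "rack"): 0.5}
relevance, rationale = query_item_relevance(
    "gym weights", ["dumbbell", "rack"], qe_scores
)
print(round(relevance, 2), rationale)
```

Because the item score is built from per-entity scores, an operator could in principle correct a bad result by overriding a single query-entity entry instead of retraining the whole model.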

Importance of Entity Recognition

A significant aspect of EBRM is entity recognition. By recognizing specific entities within product titles, we can create a more meaningful connection between what users are searching for and the items available. This not only helps produce accurate predictions but also provides a justification for each predicted result.

For example, if a title reads "Best Dumbbells for Home Gym," the system should easily recognize "Dumbbells" as the relevant entity for the query "gym weights." This process allows the model to filter through the numerous item titles available in an e-commerce platform more effectively.
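As a rough illustration of this step, the sketch below finds entities by dictionary lookup. The article does not describe the recognizer actually used, so the lookup table and the function name are assumptions standing in for a trained entity recognition model.

```python
import re
from typing import Dict, List

# Hypothetical vocabulary mapping surface forms to canonical product-type entities.
ENTITY_VOCAB: Dict[str, str] = {
    "dumbbell": "dumbbell",
    "dumbbells": "dumbbell",
    "kettlebell": "kettlebell",
    "kettlebells": "kettlebell",
}


def recognize_entities(title: str) -> List[str]:
    """Return canonical entities found in a title, in order, without duplicates."""
    found: List[str] = []
    for token in re.findall(r"[a-z0-9]+", title.lower()):
        entity = ENTITY_VOCAB.get(token)
        if entity is not None and entity not in found:
            found.append(entity)
    return found


entities = recognize_entities("Best Dumbbells for Home Gym")
print(entities)
```

A lookup table like this misses unseen spellings and multi-word product types, which is one reason a production system would more plausibly rely on a learned recognizer.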

Training the Model

Training the EBRM requires a large amount of data, but rather than relying solely on painstakingly labeled data, our model also utilizes patterns from user behavior. By analyzing search logs from the platform, we can create pseudo-labeled data that reflects what users tend to click on. This method significantly reduces the amount of manual labeling needed while still training the model effectively.

We gather data from user interactions, where clicks and purchases provide insights into which items are relevant to specific queries. By analyzing these interactions, we can determine which items users are interested in and use this information to improve the model’s predictions.
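One way such pseudo-labels could be derived is sketched below. The log schema, the click-rate threshold, and the minimum impression count are all assumptions made for illustration; the article does not specify how the logs are actually processed.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

LogRow = Tuple[str, str, bool]  # (query, item_id, clicked): assumed schema


def build_pseudo_labels(
    log: List[LogRow],
    item_entities: Dict[str, List[str]],
    min_impressions: int = 2,
    click_rate_threshold: float = 0.5,
) -> Dict[Tuple[str, str], int]:
    """Turn click logs into (query, entity) -> 0/1 pseudo-labels."""
    impressions: Dict[Tuple[str, str], int] = defaultdict(int)
    clicks: Dict[Tuple[str, str], int] = defaultdict(int)
    for query, item_id, clicked in log:
        for entity in item_entities.get(item_id, []):
            impressions[(query, entity)] += 1
            clicks[(query, entity)] += int(clicked)

    labels: Dict[Tuple[str, str], int] = {}
    for key, shown in impressions.items():
        if shown < min_impressions:
            continue  # too little evidence to label this pair
        labels[key] = 1 if clicks[key] / shown >= click_rate_threshold else 0
    return labels


# Hypothetical log rows and item-to-entity mapping.
log = [
    ("gym weights", "A", True),
    ("gym weights", "B", True),
    ("gym weights", "C", False),
    ("gym weights", "C", False),
]
item_entities = {"A": ["dumbbell"], "B": ["dumbbell"], "C": ["yoga mat"]}
labels = build_pseudo_labels(log, item_entities)
print(labels)
```

Clicks are noisy signals, so labels produced this way are better suited to pretraining than to final evaluation.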

Evaluating EBRM

To ensure EBRM's effectiveness, we conducted several experiments using both private datasets and publicly available datasets. The results showed that EBRM improved accuracy over the existing models it was compared with while keeping inference efficient. Faster, more precise predictions in turn improve the overall shopping experience by reducing irrelevant results.

Through the evaluation process, it was noted that EBRM operates efficiently in real-world applications. The model’s ability to cache and retrieve entity relevance predictions allows it to handle user requests quickly, making it a valuable tool for online shopping platforms.
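The benefit of caching entity relevance predictions can be illustrated with a small sketch. The class, the toy scorer, and the item titles are hypothetical; the scorer merely stands in for the slower query-entity relevance module.

```python
from typing import Callable, Dict, List, Tuple


class QueryEntityCache:
    """Memoize query-entity scores so each pair is scored at most once."""

    def __init__(self, scorer: Callable[[str, str], float]) -> None:
        self._scorer = scorer
        self._cache: Dict[Tuple[str, str], float] = {}
        self.misses = 0  # number of times the expensive scorer was called

    def score(self, query: str, entity: str) -> float:
        key = (query, entity)
        if key not in self._cache:
            self.misses += 1
            self._cache[key] = self._scorer(query, entity)
        return self._cache[key]


def toy_scorer(query: str, entity: str) -> float:
    # Hypothetical stand-in for an expensive model call.
    return 0.9 if (query, entity) == ("gym weights", "dumbbell") else 0.1


# Three hypothetical items, two of which share the same entity.
catalog: Dict[str, List[str]] = {
    "Best Dumbbells for Home Gym": ["dumbbell"],
    "Rubber Hex Dumbbell Pair": ["dumbbell"],
    "Non-Slip Yoga Mat": ["yoga mat"],
}

cache = QueryEntityCache(toy_scorer)
for title, entities in catalog.items():
    for entity in entities:
        cache.score("gym weights", entity)

print(cache.misses)
```

Three items are scored with only two scorer calls, because the two dumbbell listings share one entity. In a catalog where many items share a comparatively small set of entities, the saving grows accordingly.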

Real-world Application and Impact

EBRM was deployed in a real e-commerce environment, where it underwent A/B testing against existing search models. The results confirmed that EBRM improved search relevance by a notable percentage. This demonstrates its potential impact on enhancing the shopping experience for users.

Furthermore, the model’s low storage requirements are a significant advantage for e-commerce systems. With each item having a limited number of recognized entities, EBRM can operate without demanding excessive computational resources.

Conclusion

In summary, the Entity-Based Relevance Model provides a promising solution to the challenges faced by current online search systems. By focusing on entities, EBRM enhances the connection between user queries and product offerings. The model is not only accurate and fast but also interpretable, allowing users and system operators to understand the rationale behind its predictions.

As e-commerce continues to grow, effective search systems will be vital in helping users find the products they want. EBRM serves as a step towards achieving that goal by addressing the gaps present in traditional search methods and offering a robust framework for future enhancements in online shopping experiences.

Original Source

Title: Improving Text Matching in E-Commerce Search with A Rationalizable, Intervenable and Fast Entity-Based Relevance Model

Abstract: Discovering the intended items of user queries from a massive repository of items is one of the main goals of an e-commerce search system. Relevance prediction is essential to the search system since it helps improve performance. When online serving a relevance model, the model is required to perform fast and accurate inference. Currently, the widely used models such as Bi-encoder and Cross-encoder have their limitations in accuracy or inference speed respectively. In this work, we propose a novel model called the Entity-Based Relevance Model (EBRM). We identify the entities contained in an item and decompose the QI (query-item) relevance problem into multiple QE (query-entity) relevance problems; we then aggregate their results to form the QI prediction using a soft logic formulation. The decomposition allows us to use a Cross-encoder QE relevance module for high accuracy as well as cache QE predictions for fast online inference. Utilizing soft logic makes the prediction procedure interpretable and intervenable. We also show that pretraining the QE module with auto-generated QE data from user logs can further improve the overall performance. The proposed method is evaluated on labeled data from e-commerce websites. Empirical results show that it achieves promising improvements with computation efficiency.

Authors: Jiong Cai, Yong Jiang, Yue Zhang, Chengyue Jiang, Ke Yu, Jianhui Ji, Rong Xiao, Haihong Tang, Tao Wang, Zhongqiang Huang, Pengjun Xie, Fei Huang, Kewei Tu

Last Update: 2023-07-19 00:00:00

Language: English

Source URL: https://arxiv.org/abs/2307.00370

Source PDF: https://arxiv.org/pdf/2307.00370

Licence: https://creativecommons.org/licenses/by/4.0/

Changes: This summary was created with assistance from AI and may have inaccuracies. For accurate information, please refer to the original source documents linked here.

Thank you to arXiv for use of its open access interoperability.
