Yandex

Yandex researchers introduce upgraded formula to enhance recommender accuracy by 7%

  • Yandex researchers have corrected the widely used logQ formula, improving how recommender systems learn user preferences.
  • The new formula enhances the quality of recommendations and improves ranking accuracy.
  • Tested on 300B+ interactions, the upgrade delivers an average 7% increase in recommendation accuracy on key ranking metrics.
  • The upgrade can help businesses deliver more personalized recommendations on e-commerce sites and streaming platforms, improving user satisfaction.

Yandex researchers have developed a way to improve the accuracy of product and content recommendations by fixing an error in how recommender systems estimate user preferences during training. The updated formula revises the widely used logQ correction by accounting for the distinction between items a user actually interacted with (for example, through a purchase or a page view) and the randomly sampled alternatives from the catalog that are used for training. The fix improves recommendation accuracy even for massive catalogs and billions of user interactions.

Inaccuracies in recommender systems

Everyday services, from e-commerce to streaming platforms, rely on recommender systems to match users with items they are most likely to enjoy. A common approach is the two-tower neural network, which encodes users and items separately and scores each user-item pair by comparing the two encodings. Recommender systems typically work in two stages: retrieval (selecting a smaller set of candidates from a massive catalog) and ranking (ordering those candidates by relevance).

The main challenge arises during retrieval. As catalogs grow to hundreds of millions of items, scoring every item for every training example becomes computationally infeasible. To make training practical, the industry uses approximation methods that replace these costly calculations with cheaper ones.

One widely used method is sampled softmax, which trains the model by comparing positive user actions (e.g., a purchase) against a small, randomly selected subset of negative items (those not interacted with). This approach saves computational resources but has a limitation: the random negative examples often fail to reflect real-world patterns, which can reduce the accuracy of recommendations.
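The idea of sampled softmax can be sketched in a few lines. This is a minimal illustration, not Yandex's implementation: the function name, the dot-product scoring, and the toy catalog are all assumptions made for the example.

```python
import math
import random


def sampled_softmax_loss(user_vec, item_vecs, positive_id, negative_ids):
    """Cross-entropy over one positive item plus a few sampled negatives.

    Scores are plain dot products (an assumption for this sketch); a real
    two-tower model would produce the vectors with neural networks.
    """
    def score(item_id):
        return sum(u * v for u, v in zip(user_vec, item_vecs[item_id]))

    # The positive item always sits at index 0 of the logits.
    logits = [score(positive_id)] + [score(i) for i in negative_ids]

    # Numerically stable log-sum-exp over the sampled items only,
    # instead of over the full catalog.
    m = max(logits)
    log_norm = m + math.log(sum(math.exp(x - m) for x in logits))
    return log_norm - logits[0]


# Toy data: 1,000 items with random 4-dimensional vectors (hypothetical).
rng = random.Random(0)
catalog = {i: [rng.gauss(0, 1) for _ in range(4)] for i in range(1000)}
user = [rng.gauss(0, 1) for _ in range(4)]
positive = 42
negatives = rng.sample([i for i in catalog if i != positive], 5)

loss = sampled_softmax_loss(user, catalog, positive, negatives)
```

Only six items are scored here instead of all 1,000, which is where the savings come from. The cost is that the loss now depends on how the negatives were drawn.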

Improving the logQ formula

The standard industry solution to address these shortcomings relies on a probability correction factor known as logQ. However, the original formula fails to account for the difference between positive examples and sampled negatives. As a result, models could misinterpret genuine user preferences as noise, undermining training.
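For reference, the logQ correction in its commonly described form subtracts the log of each item's sampling probability from that item's score before the softmax, and it does so for the positive item as well as for the negatives. The sketch below shows only this standard form, not the updated Yandex formula, whose exact expression is not given here; the function name and the example numbers are assumptions.

```python
import math


def logq_corrected_loss(logits, sampling_probs):
    """Sampled softmax loss with the standard logQ correction.

    logits[0] belongs to the positive item, the rest to sampled negatives.
    sampling_probs[i] is the probability Q with which item i is drawn by
    the negative sampler. Note that the positive is corrected too.
    """
    corrected = [s - math.log(q) for s, q in zip(logits, sampling_probs)]
    m = max(corrected)
    log_norm = m + math.log(sum(math.exp(x - m) for x in corrected))
    return log_norm - corrected[0]


logits = [2.0, 1.0, 1.0]

# Uniform sampling: the correction shifts every logit equally and cancels.
uniform = logq_corrected_loss(logits, [0.01, 0.01, 0.01])

# A frequently sampled (e.g., popular) negative is down-weighted.
skewed = logq_corrected_loss(logits, [0.01, 0.30, 0.01])
```

With uniform sampling the loss matches plain sampled softmax; with skewed sampling the over-sampled negative contributes less. Because the positive item is handled by the same rule as the random negatives, this form makes no distinction between the two.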

Yandex researchers proposed an update to the logQ formula that corrects this limitation. During training, the model is shown one item that the user liked (e.g., pizza) alongside several random items (e.g., salad, porridge, broccoli). The improved formula explicitly treats the positive item as fixed and intentional, while still handling the others as random. This adjustment allows the model to properly weight user preferences and learn more accurate distinctions between relevant and irrelevant items.

The method improves both recommendation quality and ranking accuracy without requiring changes to model architecture, making it straightforward for businesses to adopt. In tests on an internal Yandex dataset with more than 300 billion interactions, as well as the widely used MovieLens and Steam datasets, the new approach achieved an average 7% increase in recommendation accuracy across key metrics such as Recall@20, NDCG@20, and Recall@1000.
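The ranking metrics named above can be computed as follows. This sketch uses a common binary-relevance definition of Recall@K and NDCG@K; the exact evaluation protocol used in the research is not described here, and the function names and sample items are assumptions.

```python
import math


def recall_at_k(ranked, relevant, k):
    """Share of the relevant items that appear in the top-k of the ranking."""
    if not relevant:
        return 0.0
    hits = sum(1 for item in ranked[:k] if item in relevant)
    return hits / len(relevant)


def ndcg_at_k(ranked, relevant, k):
    """Binary-relevance NDCG: hits near the top count more than hits lower down."""
    dcg = sum(
        1.0 / math.log2(pos + 2)
        for pos, item in enumerate(ranked[:k])
        if item in relevant
    )
    # Ideal ranking places all relevant items first.
    ideal_hits = min(len(relevant), k)
    idcg = sum(1.0 / math.log2(pos + 2) for pos in range(ideal_hits))
    return dcg / idcg if idcg > 0 else 0.0


ranked = ["pizza", "salad", "pasta", "broccoli"]  # model output, best first
relevant = {"pizza", "pasta"}                     # items the user actually chose

r2 = recall_at_k(ranked, relevant, 2)  # one of two relevant items in the top 2
n2 = ndcg_at_k(ranked, relevant, 2)
```

Recall@K only asks whether relevant items made it into the top K, while NDCG@K also rewards placing them higher, which is why the two are usually reported together.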

This improvement represents a significant advancement for the IT industry and, in the long term, can considerably enhance the performance of video hosting platforms, online marketplaces, and other portals where users frequently rely on recommendations to discover products or media content.

ACM RecSys 2025

The research paper on the logQ improvement is currently being presented at ACM RecSys 2025, a leading international conference on recommender systems taking place from September 22 to 26 in the Czech Republic. The event features innovations from top universities and tech companies such as Amazon and Google.

Another Yandex paper presented at ACM RecSys 2025 highlights Yambda, one of the world's largest open datasets for recommender system development, which was downloaded more than 50,000 times in its first month.

About Yandex

Yandex is a global technology company that builds intelligent products and services powered by machine learning. Its goal is to help consumers and businesses better navigate the online and offline world. Since 1997, Yandex has delivered world-class, locally relevant search and information services and developed market-leading on-demand transportation services, navigation products, and other mobile applications for millions of consumers worldwide.

IPJSC “Yandex”

Head office
16, Leo Tolstoy St., Moscow, Russia 119021