Yandex

Yandex Research has unveiled TabM, a new neural network architecture for tabular data

This technology enables rapid processing of large-scale data and delivers highly accurate predictions, a capability in high demand across business, research, and medicine. Models for tabular data are widely used for tasks such as optimizing supply chains, forecasting energy consumption, classifying patients by disease risk, and much more.

The Yandex Research team's development has already been put to use on Kaggle, Google's international platform for data science and machine learning competitions. Notably, the new architecture was utilized to improve the prediction of transplant survival rates for patients undergoing allogeneic Hematopoietic Cell Transplantation (HCT) in a competition hosted by the CIBMTR. Kaggle winners and prize recipients collectively earned $60,000 in prize money for solving these and other challenges using TabM.

TabM (short for Tabular Deep Learning model that makes Multiple predictions) is an effective implementation of a model ensemble approach. Several models in the ensemble analyze the data, and their predictions are then averaged. The TabM architecture allows for an optimal balance between forecast accuracy and efficient use of computing resources.
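The averaging step described above can be illustrated with a minimal sketch. It is not TabM's actual implementation and does not reproduce how TabM builds or trains its submodels. The linear "submodels", their weights, and the names `make_linear_submodel` and `ensemble_predict` are hypothetical stand-ins chosen only to show how several predictions are combined into one.

```python
from statistics import fmean
from typing import Callable, List, Sequence

# A "submodel" here is any function that maps one feature row to a number.
Submodel = Callable[[Sequence[float]], float]


def make_linear_submodel(weights: Sequence[float], bias: float) -> Submodel:
    """Build a hypothetical linear submodel (a stand-in for a trained network)."""
    def predict(row: Sequence[float]) -> float:
        return sum(w * x for w, x in zip(weights, row)) + bias
    return predict


def ensemble_predict(submodels: List[Submodel], row: Sequence[float]) -> float:
    """Average the predictions of all submodels for a single row."""
    return fmean([model(row) for model in submodels])


# Three toy submodels with made-up weights, for illustration only.
submodels = [
    make_linear_submodel([1.0, 0.0], 0.0),
    make_linear_submodel([0.0, 1.0], 0.0),
    make_linear_submodel([1.0, 1.0], 0.0),
]
row = [2.0, 4.0]
prediction = ensemble_predict(submodels, row)
```

Here each submodel produces its own prediction for the same row, and the ensemble output is their arithmetic mean.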

In tests across 46 datasets, TabM outperformed other methods—not only achieving the best average ranking (1.7 for TabM compared to 2.9 for its nearest competitor), but also demonstrating superior stability, which is crucial for real-world applications. Thanks to its ability to combine the strengths of multiple submodels and efficiently leverage computing resources, TabM can rival classic gradient boosting models such as CatBoost, XGBoost, and LightGBM—long considered the gold standard for tabular data.
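For readers unfamiliar with the metric, an average ranking can be computed roughly as follows: rank the methods on each dataset, then average each method's rank across datasets. The sketch below assumes that higher scores are better and that there are no ties. The scores and the model and dataset names are invented for illustration and are not results from the TabM evaluation, whose exact ranking protocol may differ.

```python
from statistics import fmean
from typing import Dict

# Hypothetical scores (higher is better); NOT results from the TabM paper.
scores: Dict[str, Dict[str, float]] = {
    "dataset_a": {"model_x": 0.91, "model_y": 0.89, "model_z": 0.85},
    "dataset_b": {"model_x": 0.78, "model_y": 0.80, "model_z": 0.75},
    "dataset_c": {"model_x": 0.66, "model_y": 0.64, "model_z": 0.65},
}


def ranks_on_dataset(model_scores: Dict[str, float]) -> Dict[str, int]:
    """Rank models on one dataset; rank 1 is best. Ties are not handled."""
    ordered = sorted(model_scores, key=model_scores.get, reverse=True)
    return {model: position for position, model in enumerate(ordered, start=1)}


def average_ranks(all_scores: Dict[str, Dict[str, float]]) -> Dict[str, float]:
    """Average each model's per-dataset rank over all datasets."""
    per_dataset = [ranks_on_dataset(s) for s in all_scores.values()]
    models = list(per_dataset[0])
    return {m: fmean([r[m] for r in per_dataset]) for m in models}


avg = average_ranks(scores)
best_model = min(avg, key=avg.get)
```

A lower average rank means a method was closer to the top more consistently across datasets, which is how a figure such as 1.7 versus 2.9 should be read.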

The architecture is available to developers and researchers on GitHub, and the research paper is accessible on arXiv.

Since 2019, Yandex Research scientists have published eight research papers on deep learning models for tabular data, collectively receiving over 1,900 citations.

Notably, the TabM paper has been referenced by institutions such as the University of Mannheim (Germany), the National University of Singapore, Korea University, and the University of Illinois Urbana-Champaign.

Over the years, the team's papers on deep learning for tabular data have been accepted at top-tier artificial intelligence conferences, including NeurIPS, ICLR, and ICML.

IPJSC “Yandex”

Head office
16, Leo Tolstoy St., Moscow, Russia 119021