Internet, March 20, 2023. Yandex has published the source code for YTsaurus, a platform for storing and processing big data that is used in most Yandex services.
YTsaurus is suitable for a wide range of tasks, from analytics to training complex models with billions of parameters. For example, Yandex Search builds its search index with it, and self-driving cars use it to process ride data and improve algorithms. YTsaurus manages Yandex's supercomputers, distributing the load so that computing power is used as efficiently as possible.
“Yandex has been developing YTsaurus (or YT, as we call it internally) since 2010. We started building our own big data ecosystem, because no single solution on the market could meet all of our requirements. Now YTsaurus is one of the key elements of Yandex’s internal infrastructure. Dozens of developers are working on the platform, and its capabilities are constantly expanding,” says Maxim Babenko, head of the distributed computing technologies department.
YTsaurus is a fault-tolerant and highly scalable platform. It is deployed on tens of thousands of Yandex servers and processes exabytes of data; one in two of the company's employees works with it. YTsaurus can be used as a classic MapReduce system, but it also supports other popular approaches to data processing. For example, it integrates with ClickHouse and Apache Spark; more information about its capabilities can be found on the Yandex blog on Medium.
“YTsaurus has proven itself at Yandex, and now we’ve made it available to everyone. Large companies that process huge amounts of data on thousands of servers with an ever-increasing load will benefit the most. We are confident that making it open-source will take it to a new stage of development, as has already happened with our other products,” says Alexey Bashkeev, head of Yandex Cloud.
The YTsaurus source code and documentation are available on GitHub. The code is distributed under the Apache 2.0 license, so anyone can use the platform or modify it to suit their needs.
Phone: +7 495 739-70-00