Yandex has launched the latest version of its image-to-text translation technology, powered by advanced neural networks. As one of the first to leverage large language models (LLMs) for visual translation, the Yandex solution now delivers more context-aware results. Whether you’re traveling abroad and need to translate a restaurant menu quickly or working with complex foreign-language technical documentation, this tool adapts to any scenario. Yandex has also improved the appearance of translations, making them more readable and faithful to the original layout. The update is available today in both Yandex Translator and Yandex Browser, with integration into Smart camera coming soon.
The YandexGPT model family understands linguistic nuances and preserves the original tone and style, even capturing wordplay in advertising slogans or newspaper headlines. The new neural network technology achieves more precise phrasing when dealing with expressions that have multiple meanings and avoids overly literal translations. This improves translation accuracy across various texts, from simple product ingredient lists to complex articles, encyclopedias, and technical manuals. For now, the LLM-based translation feature is optimized for English text in images.
Yandex has refined the visual rendering of translated text across dozens of languages. The system erases the original text from the image and replaces it with the translation, matching the font, size, and color to blend naturally with the visual context. Yandex algorithms also eliminate artifacts, ensuring the translated text looks like it belongs on the image. Thanks to better contrast, the result is often clearer and easier to read than the original. The technology can recognize partial words and accurately capture the meaning of fragmented text.
How YandexGPT learned to translate
To power this new photo translation feature, Yandex developed a custom model within the YandexGPT family, fine-tuned specifically for translating from English into Russian. This model was trained on a diverse dataset of original and translated text pairs. The Yandex team provided examples of both high-quality and flawed translations, helping the model refine its output by identifying and avoiding common pitfalls, such as adding unnecessary details.
To process user requests quickly, Yandex used a technique called model distillation. This method transfers knowledge from a larger, more complex "teacher" model to a smaller, more efficient "student" model. The student model replicates the teacher’s performance while operating with significantly reduced computational overhead.
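The release does not say how Yandex implemented distillation. As a minimal sketch of the general idea, the example below uses the classic soft-target formulation, in which the student is trained to match the teacher's temperature-softened output distribution. The function names, the temperature value, and the toy logits are illustrative assumptions, not details of the YandexGPT pipeline.

```python
import math


def softmax(logits, temperature=1.0):
    """Convert raw scores to probabilities, softened by the temperature."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the maximum for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(teacher_logits, student_logits, temperature=2.0):
    """KL divergence from the student's to the teacher's softened distribution.

    Hypothetical helper for illustration. The temperature**2 factor is the
    conventional rescaling that keeps gradient magnitudes comparable across
    temperature settings.
    """
    teacher_probs = softmax(teacher_logits, temperature)
    student_probs = softmax(student_logits, temperature)
    kl = sum(
        p * math.log(p / q)
        for p, q in zip(teacher_probs, student_probs)
        if p > 0
    )
    return kl * temperature ** 2


# Toy scores over three candidate next tokens (made-up numbers).
teacher = [2.0, 1.0, 0.1]
close_student = [1.9, 1.1, 0.2]
far_student = [0.1, 1.0, 2.0]

# A student that ranks tokens like the teacher incurs the smaller loss.
print(distillation_loss(teacher, close_student))
print(distillation_loss(teacher, far_student))
```

Minimizing this loss over many examples pushes the smaller model toward the larger model's behavior. Translation systems may instead distill at the sequence level, by training the student on translations produced by the teacher.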
Contacts
Yandex Press Office
pr@yandex-team.com