Yandex N.V. is an international public company that traces its history to 1997, when the Yandex search engine was launched. In Russia, its primary legal entity is OOO "YANDEX", registered in 2000 and headquartered in Moscow. The company is known worldwide under the Yandex brand, a name short for "Yet Another iNDEXer" that reflects its initial specialization in search technologies.
The company gained fame as the creator of the largest search engine in the Russian-speaking internet and as the developer of a vast ecosystem of services, from a browser and maps to cloud solutions, analytics tools, and e-commerce platforms. In artificial intelligence, Yandex has established itself as a developer of large language models for the Russian language and as one of the leading machine learning research centers in Russia.
In 2017, Yandex launched "Alice," one of the first large-scale Russian-language voice assistants built on proprietary technology. The company then moved on to large language models: in June 2022 it released YaLM-100B, one of the first openly released GPT-like models at the 100-billion-parameter scale, optimized for the Russian language. The model was trained for 65 days on a cluster of 800 A100 GPUs, processing 1.7 TB of text data. Since 2023, Yandex has been developing its Yandex Foundation Models service line on its cloud platform. Among its research contributions, the Yandex Research team is known for methods for the extreme compression of large language models, presented in the paper "Extreme Compression of Large Language Models via Additive Quantization." The technique compresses model weights to 2-3 bits per parameter, reportedly cutting deployment costs by up to 8 times without significant loss of quality. CatBoost also deserves special attention: a gradient boosting library on decision trees that, since 2019, has been a frequent component of top Kaggle solutions and often outperforms neural networks on tabular data.
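The link between "2-3 bits per parameter" and an 8-fold reduction can be checked with simple arithmetic. The sketch below is only an illustration: it assumes the uncompressed weights are stored in 16-bit floating point, uses a 100-billion-parameter model as a round example (the text does not say the method was applied to YaLM-100B), and ignores codebooks and other storage overhead. The helper `model_size_gb` is a hypothetical name introduced here.

```python
def model_size_gb(n_params: float, bits_per_param: float) -> float:
    """Approximate weight storage in gigabytes (10**9 bytes).

    Counts raw weight bits only; quantization codebooks, activations,
    and runtime buffers are deliberately ignored.
    """
    return n_params * bits_per_param / 8 / 1e9


N_PARAMS = 100e9     # illustrative model size, not a measured figure
BASELINE_BITS = 16   # assumption: uncompressed weights are 16-bit floats

baseline = model_size_gb(N_PARAMS, BASELINE_BITS)
print(f"{BASELINE_BITS}-bit baseline: {baseline:.1f} GB")

for bits in (3, 2):
    size = model_size_gb(N_PARAMS, bits)
    ratio = baseline / size
    print(f"{bits}-bit: {size:.1f} GB, {ratio:.1f}x smaller than baseline")
```

Under these assumptions, going from 16 to 2 bits per parameter shrinks weight storage by a factor of 8, and 3 bits gives roughly 5.3. The calculation covers storage only and says nothing about accuracy or inference speed.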
In 2025, the company introduced the YandexGPT 5 family. The lineup includes YandexGPT 5 Pro for complex business tasks and the open-source YandexGPT 5 Lite with 8 billion parameters for broad application. A key feature of the Lite version is its two-stage training process: main pretraining on 15 trillion tokens, followed by a "Powerup" stage on 320 billion tokens of high-quality data, during which the context window was extended to 32,000 tokens. In its size class, it ranks among the strongest models for Russian-language understanding. Furthermore, Yandex's openly available Alchemist collection, a systematically curated dataset for fine-tuning, is aimed at improving the quality of image generation.