A compact yet powerful reasoning MoE model from OpenAI with 20.9 billion total parameters (3.61 billion activated per token) that runs in just 16GB of memory thanks to native MXFP4 quantization of its expert weights, making it ideal for local deployment on widely available consumer hardware. Despite its efficiency, it retains full advanced reasoning and tool-use capabilities, outperforming not only existing open-source models but also OpenAI's popular o3-mini on a range of key benchmarks. This makes gpt-oss-20b a strong choice for diverse research and product applications.
The flagship open reasoning model from OpenAI, built on the research advances behind ChatGPT. It features an MoE (Mixture of Experts) architecture with 116.8 billion total parameters, of which only 5.1 billion are activated per token, plus numerous innovations that balance performance against resource consumption, allowing the model to run on a single 80GB GPU. GPT-OSS-120B supports three selectable reasoning-effort levels (low, medium, high) and, for the first time among open models, introduces an extended role hierarchy and output channels tied to specific roles, which together let users precisely customize and control the model's behavior.
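As a rough illustration of how the reasoning levels are typically selected, here is a minimal sketch of a chat request against an OpenAI-compatible endpoint serving gpt-oss-120b; the base URL, model name, and the convention of passing the effort level through the system prompt are assumptions about a particular deployment, not guaranteed interface details.

```python
# Minimal sketch: selecting a reasoning level for gpt-oss-120b.
# Assumptions: the model is served behind an OpenAI-compatible endpoint
# (base_url and model name are placeholders for your deployment), and the
# serving layer honors a "Reasoning: high" hint in the system prompt.
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")

response = client.chat.completions.create(
    model="gpt-oss-120b",
    messages=[
        {"role": "system", "content": "Reasoning: high"},  # low / medium / high
        {"role": "user", "content": "Prove that the sum of two even numbers is even."},
    ],
)
print(response.choices[0].message.content)
```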
An updated version of Qwen3-30B-A3B with 30.5 billion total parameters (3.3B active) and an extended context length of 262,144 tokens, designed to produce fast, accurate responses without intermediate reasoning steps. An exceptionally efficient dialogue model that handles both technical and creative tasks, making it ideal for use in chatbots.
A high-quality agent-oriented model with 106B parameters (12B active), optimized for fast inference and moderate hardware requirements while retaining key hybrid-reasoning capabilities and overall functionality. At launch, the model ranked 6th globally in average score across 12 key benchmarks, demonstrating exceptional speed and strong performance in real-world development scenarios. Developers particularly highlight its effectiveness in frontend code autocompletion and code-correction tasks.
A hybrid model with 355B parameters (32B active), combining advanced reasoning, programming with artifact generation, and agent capabilities within a unified MoE architecture featuring an increased number of hidden layers. At launch, the model ranked 3rd globally in average score across 12 key benchmarks. Particularly impressive are its abilities in generating complete web applications, interactive presentations, and complex code: users need only describe how the program should function and what outcome they expect.
The new flagship MoE model of the Qwen 3 series, Qwen3-235B-A22B, features enhanced "thinking" capabilities and an extended context length of 262K tokens. Operating exclusively in thinking mode, it achieves state-of-the-art performance among leading open and proprietary thinking models, surpassing many well-known proprietary systems on mathematical computation, programming, and logical-reasoning tasks. An ideal choice for complex research tasks requiring advanced agent and analytical capabilities.
A compact MoE model with 30.5B total parameters, of which only 3.3B are activated per token, designed specifically to assist in writing software code. The model offers agent-style capabilities, supports a context length of 262,144 tokens, and delivers excellent performance at relatively low computational cost. These qualities make it an ideal choice as a programming assistant, as a Q&A system for programming-education platforms, and for integration into code-autocompletion tools.
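To make the programming-assistant use case concrete, here is a minimal sketch of loading the model locally with Hugging Face transformers and asking it to write a function; the repository id is an assumption about the published checkpoint name, and the snippet presumes enough GPU memory for the 30B MoE weights.

```python
# Minimal sketch: asking Qwen3-Coder (30B-A3B) to write a small function.
# Assumptions: the Hugging Face repo id below matches the published checkpoint
# and sufficient GPU memory is available; adjust to your environment.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Qwen/Qwen3-Coder-30B-A3B-Instruct"  # assumed repo id
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype="auto", device_map="auto")

messages = [{"role": "user", "content": "Write a Python function that checks whether a string is a palindrome."}]
inputs = tokenizer.apply_chat_template(messages, add_generation_prompt=True, return_tensors="pt").to(model.device)
output = model.generate(inputs, max_new_tokens=256)
# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(output[0][inputs.shape[-1]:], skip_special_tokens=True))
```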
Alibaba's flagship agentic coding model, featuring a Mixture-of-Experts architecture (480 billion total parameters, 35 billion active) with native support for a 256K-token context. Qwen3-Coder's application scenarios cover the entire spectrum of modern software development, from building interactive web applications to modernizing legacy systems, including autonomous feature development spanning backend APIs, frontend components, and databases.
The updated flagship MoE model Qwen3-235B-A22B, with 235B parameters (22B active), features a native context length of 256K tokens and supports 119 languages. Its developers have abandoned the hybrid mode, so the model runs only in non-thinking mode; however, improved post-training lets it significantly outperform competitors, delivering exceptional results in mathematics, programming, and logical reasoning. Furthermore, the FP8 version enables industrial-scale deployment with roughly 50% memory savings compared to BF16.
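A back-of-the-envelope estimate shows where the roughly 50% figure comes from, assuming weight storage dominates memory use and ignoring activations and KV cache:

```python
# Rough weight-memory estimate for a 235B-parameter model at two precisions.
# Activations, KV cache, and runtime overhead are ignored for simplicity.
params = 235e9
bf16_gb = params * 2 / 1e9   # BF16: 2 bytes per parameter -> ~470 GB
fp8_gb  = params * 1 / 1e9   # FP8:  1 byte per parameter  -> ~235 GB
print(f"BF16 ~{bf16_gb:.0f} GB, FP8 ~{fp8_gb:.0f} GB, saving {1 - fp8_gb / bf16_gb:.0%}")
```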
The first Russian-language model with 32 billion parameters and a hybrid reasoning mode, combining highly efficient processing of Russian with the capacity for deep analytical thinking on tasks of any complexity. The model uses roughly half the compute of comparable foreign models while delivering superior performance, opening new possibilities for autonomous AI agents.
An enormous MoE model with 1 trillion total parameters (32B active), designed specifically for autonomous execution of complex tasks, tool usage, and interaction with external systems. Kimi K2 doesn't simply answer questions; it takes action. It represents a new generation of AI assistants capable of independently planning, executing, and monitoring multi-step processes without constant human involvement, which is precisely why its developers recommend using the model in agent-based systems.
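As an illustration of what tool usage looks like in practice, here is a minimal sketch of a tool-calling request to an OpenAI-compatible endpoint serving Kimi K2; the base URL, model name, and the weather tool are illustrative assumptions rather than details of any actual deployment.

```python
# Minimal sketch of agent-style tool calling with Kimi K2 through an
# OpenAI-compatible endpoint. The base URL, model name, and weather tool
# are illustrative placeholders; substitute your own deployment and tools.
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")

tools = [{
    "type": "function",
    "function": {
        "name": "get_weather",  # hypothetical tool for illustration
        "description": "Return the current weather for a city.",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}]

response = client.chat.completions.create(
    model="kimi-k2",  # assumed model name in the deployment
    messages=[{"role": "user", "content": "What is the weather in Berlin right now?"}],
    tools=tools,
)
# If the model decides to act, it returns a structured tool call instead of plain text.
print(response.choices[0].message.tool_calls)
```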
Powerful reasoning with maximum capability and minimal resource consumption: 456B parameters, a context window of 1,000,000 tokens, Lightning Attention (a novel approach to the attention mechanism), and an increased reasoning budget of 80,000 tokens.
This is ultimate performance for tackling the most complex research and product challenges in mathematics, programming, bioinformatics, law, finance, and beyond.
A large MoE model with 456B parameters, a massive context window of 1,000,000 tokens, and a reasoning budget of 40,000 tokens. Thanks to architectural innovations, the model is more resource-efficient compared to models of similar size, making it highly effective for a wide range of intelligent analysis tasks and agent-based applications.
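Since the two M1 variants differ mainly in how much room is reserved for the model's chain of thought, a deployment typically just picks the checkpoint and caps generation length to match. The sketch below assumes an OpenAI-compatible endpoint; the model names and base URL are placeholders for a concrete setup.

```python
# Minimal sketch: choosing a MiniMax-M1 variant and capping generation length
# to match its reasoning budget. Model names and endpoint are placeholders.
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")

# 40k variant for routine analysis, 80k variant for the longest derivations.
model, budget = ("MiniMax-M1-80k", 80_000)  # or ("MiniMax-M1-40k", 40_000)

response = client.chat.completions.create(
    model=model,
    max_tokens=budget,  # leave the full reasoning budget available to the model
    messages=[{"role": "user", "content": "Outline a proof strategy for the four color theorem."}],
)
print(response.choices[0].message.content)
```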