A major update in DeepSeek-AI's LLM series, marking a significant step toward AI agent-oriented solutions. DeepSeek v3.1 is now a hybrid model supporting two intelligent modes (thinking/non-thinking), leading its class in accuracy and application flexibility. Performance improvements are evident across all benchmarks, with developers placing particular emphasis on enhanced tool usage efficiency. As a result, the model is ideally suited for complex analytical and research tasks, as well as enterprise-level agent systems.
An advanced open language model with 32 billion parameters, optimized for complex instruction following, dialogue, and agent-based scenarios, featuring uniquely flexible control over its "thinking budget" and support for a 512K context window. The model is well suited for customer consultation and support chatbots, for processing long documents such as legal files and scientific and technical reports, and for automating business processes through intelligent assistants.
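The "thinking budget" mechanism can be illustrated with a minimal sketch: cap the number of reasoning tokens, then force the reasoning block closed so the answer begins. The function name, the `</think>` marker, and the token stream below are illustrative assumptions, not the model's actual interface.

```python
def apply_thinking_budget(tokens, budget, end="</think>"):
    """Truncate the reasoning segment of a token stream to `budget`
    tokens, then force the end-of-thinking marker before the answer."""
    if end in tokens:
        cut = tokens.index(end)
        reasoning, answer = tokens[:cut], tokens[cut + 1:]
    else:
        reasoning, answer = tokens, []
    return reasoning[:budget] + [end] + answer

stream = ["Let", "us", "check", "edge", "cases", "</think>", "Answer:", "42"]
print(apply_thinking_budget(stream, budget=2))
# ['Let', 'us', '</think>', 'Answer:', '42']
```

A larger budget lets the model reason longer before answering; a budget of zero turns it into a direct-answer model, which is the trade-off the flexible control exposes.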
Qwen-Image-Edit is an image editing model developed by Qwen, based on the 20B-parameter Qwen-Image architecture (Qwen2.5-VL + VAE Encoder).
A new, ultra-compact (270 million parameters) and high-performance model from the Gemma 3 family by Google DeepMind. This solution is designed for rapid local deployment and can run efficiently on-device, including on embedded systems and in web browsers. It is intended to be fine-tuned for specific tasks, yet it follows instructions and structures text effectively right out of the box. Ideal for fast text classification, data extraction, and other tasks where speed, accuracy, energy efficiency, and privacy are paramount.
A next-generation multimodal model that processes images, video, text, and graphical user interfaces. Its architecture is built upon the flagship MoE-based GLM-4.5 Air and supports Thinking Mode for deep reasoning and No-Thinking Mode for rapid responses. At launch, the model achieves leading performance on 41 out of 42 key benchmarks used to evaluate LLMs capable of processing visual and textual information.
A compact yet high-performance language model with 4B parameters, specialized in rapidly executing instructions without internal reasoning. The model outperforms GPT-4.1-nano across all key metrics and supports a context length of up to 262K tokens. Ideal for classification tasks, knowledge-base-powered response generation, and conversational assistants—essentially any scenario requiring high-speed query processing and precise adherence to instructions.
An upgraded hybrid Qwen3-4B model specialized in complex reasoning, featuring an extended context length of 262K tokens and operating exclusively in reasoning mode. Despite its 4 billion parameters, the model achieves an impressive score of 81.3 on the AIME25 math olympiad benchmark. It is ideal for local deployment, code debugging, analytical tasks, and any scenarios requiring step-by-step, thoughtful problem solving.
The flagship open reasoning model from OpenAI, building on the research advances behind the renowned ChatGPT. The model features a unique MoE (Mixture of Experts) architecture with 116.8 billion parameters, yet activates only 5.1 billion per token, and includes numerous innovations that efficiently balance performance and resource consumption, enabling it to run on a single 80GB GPU. GPT-OSS-120B supports a three-level reasoning system and, for the first time in open models, introduces an extended role hierarchy and output channels aligned with specific roles, together allowing users to precisely customize and control the model's behavior.
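The MoE efficiency described above (116.8B total parameters, only 5.1B active per token) comes from routing each token to a small subset of experts. A toy top-k softmax router, shown here as an illustrative sketch rather than the actual GPT-OSS implementation:

```python
import math

def route(gate_logits, k):
    """Select the top-k experts for one token and renormalize
    their gate weights with a softmax over the selected logits."""
    topk = sorted(range(len(gate_logits)),
                  key=lambda i: gate_logits[i], reverse=True)[:k]
    exps = [math.exp(gate_logits[i]) for i in topk]
    total = sum(exps)
    return {i: e / total for i, e in zip(topk, exps)}

# Toy scale: 8 experts, 2 active per token (the real model routes
# across far more experts, but the mechanism is the same).
weights = route([0.1, 2.0, -1.0, 0.5, 1.5, 0.0, -0.5, 0.3], k=2)
print(weights)  # experts 1 and 4 receive the token; weights sum to 1
```

Because only the selected experts participate in the computation for a given token, per-token compute scales with the active parameters rather than the total, which is what makes the single-GPU deployment feasible.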
A compact yet powerful reasoning MoE model from OpenAI, featuring 20.9 billion total parameters (with 3.61 billion activated per token), capable of running on just 16GB of memory—making it ideal for local deployment using widely accessible consumer hardware. Despite its efficiency, it retains all advanced reasoning and tool-use capabilities, outperforming not only existing open-source models but also OpenAI's popular o3-mini on a range of key benchmarks. This makes gpt-oss-20b a strong choice for diverse research and product applications.
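A rough back-of-the-envelope check of the 16GB claim: the released gpt-oss weights use ~4-bit (MXFP4) quantization, and the 4.5 bits-per-parameter average below is an assumption meant to leave room for quantization scales and unquantized layers.

```python
params = 20.9e9        # total parameters
bits_per_param = 4.5   # assumption: MXFP4 (~4 bits) plus scales and unquantized parts
weight_bytes = params * bits_per_param / 8
print(f"{weight_bytes / 1e9:.1f} GB")  # ≈ 11.8 GB of weights, leaving headroom
                                       # for KV cache and activations within 16GB
```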
A multimodal model for generating and editing images based on text prompts, part of the Qwen series of models. It demonstrates significant improvements in accurately rendering complex text (including Chinese) and performing advanced image-editing operations. The model possesses generalized capabilities in both image creation and editing, with an emphasis on preserving font details, composition, and contextual harmony of the text.
Qwen3-30B-A3B-Thinking-2507 — an upgraded and specialized version of Qwen3-30B-A3B fine-tuned exclusively for reasoning tasks. With 30.5B total parameters (3.3B active), 128 experts (8 activated per token), and an extended context length of 262,144 tokens, this model stands as the ideal open-source solution among mid-sized models for applications requiring high-quality reasoning—whether for tool usage, agent-like capabilities, or generating well-structured, accurate responses to highly complex user queries.
An updated version of Qwen3-30B-A3B with 30.5 billion total parameters (3.3B active) and an extended context length of 262,144 tokens, designed for generating instant and accurate responses without intermediate reasoning steps. An exceptionally efficient dialogue model capable of solving both technical and creative tasks—ideal for use in chatbots.
The model is an image-to-video (I2V) generative diffusion model designed for high-quality video synthesis at 480P and 720P resolutions. It incorporates a Mixture-of-Experts (MoE) architecture with two specialized experts (high-noise and low-noise) to enhance model capacity while maintaining computational efficiency. The model supports both prompt-based and prompt-free video generation.
The T2V-A14B model supports generating 5-second videos at both 480P and 720P resolutions. Built on a Mixture-of-Experts (MoE) architecture, it delivers outstanding video generation quality. On the new Wan-Bench 2.0 benchmark, the model surpasses leading commercial models across most key evaluation dimensions.
A 5-billion-parameter text- and image-to-video (TI2V) generative model designed for high-definition video generation at 720P resolution (1280×704 or 704×1280) at 24 fps. Built on the Wan2.2-VAE, it achieves a compression ratio of 16×16×4, enabling efficient deployment on consumer-grade GPUs such as the RTX 4090. The model supports both text-to-video and image-to-video generation within a unified framework.
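The quoted 16×16×4 compression ratio determines the size of the latent grid the diffusion backbone actually operates on. An illustrative calculation for a 5-second 720P clip (boundary-frame handling is simplified here):

```python
width, height, fps, seconds = 1280, 704, 24, 5
frames = fps * seconds                          # 120 frames
# Wan2.2-VAE: 16x compression along each spatial axis, 4x along time.
latent_w, latent_h = width // 16, height // 16  # 80 x 44
latent_t = frames // 4                          # 30 latent frames
pixels = width * height * frames
latent_cells = latent_w * latent_h * latent_t
print(latent_w, latent_h, latent_t, pixels // latent_cells)  # 80 44 30 1024
```

The resulting 1024x reduction (16·16·4) in the number of cells the diffusion process must denoise is what brings the workload within reach of a consumer GPU.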
A hybrid model with 355B parameters, combining advanced reasoning, programming with artifact generation, and agent capabilities within a unified MoE architecture featuring an increased number of hidden layers. At launch, the model ranks 3rd globally in average score across 12 key benchmarks. Particularly impressive are its abilities in generating complete web applications, interactive presentations, and complex code. Users need only describe to the model how the program should function and what outcome they expect.
A high-quality agent-oriented model with 106B parameters, optimized for fast inference and moderate hardware requirements, while retaining key capabilities in hybrid reasoning and overall functionality. At launch, the model ranks 6th globally across 12 key benchmarks, demonstrating exceptional speed and outstanding performance in real-world development scenarios. Developers particularly highlight its effectiveness in frontend code autocompletion and code correction tasks.
The new flagship MoE model Qwen3-235B-A22B in the Qwen 3 series features enhanced "thinking" capabilities and an extended context length of 262K tokens. Operating exclusively in thinking mode, it achieves state-of-the-art performance among leading open and proprietary reasoning models, surpassing many well-known competitors in mathematical computation, programming, and logical reasoning tasks. An ideal choice for complex research tasks requiring advanced agent and analytical capabilities.
A compact MoE model with an architecture of 30.5B total parameters, of which only 3.3B are activated per token, specifically designed to assist in writing software code. The model features agent-like capabilities, supports a context length of 262,144 tokens, and demonstrates excellent performance at relatively low computational cost. These qualities make it an ideal choice for use as a programming assistant, a QA system within programming education platforms, and for integration into tools featuring code autocompletion.
Alibaba's flagship agent-based programming model featuring a Mixture-of-Experts architecture (480 billion total parameters, 35 billion active parameters) with native support for a 256K-token context. Qwen3-Coder's application scenarios cover the entire spectrum of modern software development—from building interactive web applications to modernizing legacy systems—including autonomous feature development spanning backend APIs, frontend components, and databases.