An open-source model built on a Mixture-of-Experts architecture with 1 trillion parameters, of which 32 billion are activated per token. The developers have implemented a "visual agentic intelligence" paradigm within it—a combination of visual perception, reasoning, and autonomous agents. The model is multimodal, presented in native INT4 quantization, and includes a unique Agent Swarm mechanism that orchestrates and enables the parallel operation of up to 100 sub-agents. This improves quality and reduces the execution time for complex tasks by an average factor of 4.5.
The largest open-source reasoning model from Moonshot AI at the time of its release, featuring a Mixture-of-Experts architecture (1 trillion parameters total, 32 billion active), capable of executing 200–300 consecutive tool calls without quality degradation while seamlessly interleaving function calls with reasoning chains. The model supports a 256K-token context window, incorporates native INT4 quantization for significantly accelerated inference with virtually no loss in accuracy, and employs Multi-Head Latent Attention (MLA) for highly efficient processing of long sequences. Kimi K2 Thinking sets new records among open-source models and outperforms leading commercial systems—including GPT-5 and Claude Sonnet 4.5—on a broad range of benchmarks.
An update to one of the largest MoE-LLMs with 1T parameters. The developers have extended the context length to 256K, focusing on frontend programming tasks, agent capabilities, and improved tool-calling functionality. As a result, the model shows significant gains in accuracy across several public benchmarks and competes strongly with the best proprietary solutions.
An enormous MoE model containing 1 trillion parameters. The model is specifically designed for autonomous execution of complex tasks, tool usage, and interaction with external systems. Kimi K2 doesn't simply answer questions—it takes action. It represents a new generation of AI assistants capable of independently planning, executing, and monitoring multi-step processes without constant human involvement. This is precisely why developers recommend using the model in agent-based systems.