Qwen3-Coder-30B-A3B-Instruct is an outstanding example of a high-quality large language model with advanced specialization in programming. This Mixture-of-Experts model has 30.5 billion total parameters, of which only 3.3 billion are activated per token, and out of 128 experts, only 8 are activated per token. The model comprises 48 hidden layers with grouped query attention (32 heads for Q and 4 for KV), delivering exceptional processing efficiency with minimal computational resource consumption. Native support for a 262,144-token context window—expandable up to 1 million tokens via Yarn—makes the model ideal for working with large code repositories within complex projects.
The key unique feature of Qwen3-Coder-30B-A3B-Instruct lies in its superior agent capabilities. The model does not merely generate code; it autonomously interacts with development tools, executes multi-step programming tasks, and is capable of solving complex problems without human intervention. On the LiveCodeBench v6 benchmark, the model achieves an impressive 66.0%, significantly outperforming the base version Qwen3-30B-A3B (57.4%). In AIME25 tasks (advanced mathematics for programming), it demonstrates 85.0% accuracy, surpassing Gemini-2.5-Flash-Thinking (72.0%) and confidently competing with much larger models. The model outperforms DeepSeek V3 on most coding tasks and delivers agent workflow performance comparable to Claude Sonnet 4, a remarkable achievement for an open-source solution.
Qwen3-Coder-30B-A3B-Instruct unlocks entirely new possibilities in software development. The model is integrated with popular agent-based programming platforms, including Qwen Code, CLINE, Roo Code, and Kilo Code, offering a unified function-calling format for seamless operation within CI/CD pipelines. Support for 358 programming languages makes it a universal reference tool for developers. The model particularly excels in repository-scale understanding scenarios, where it can analyze and modify massive codebases, automatically refactor legacy code, and create complex full-stack applications with minimal developer intervention.
| Model Name | Context | Type | GPU | Status | Link |
|---|
There are no public endpoints for this model yet.
Rent your own physically dedicated instance with hourly or long-term monthly billing.
We recommend deploying private instances in the following scenarios:
| Name | GPU | TPS | Max Concurrency | |||
|---|---|---|---|---|---|---|
262,144.0 tensor |
4 | $0.96 | 1.164 | Launch | ||
262,144.0 tensor |
4 | $1.26 | 1.170 | Launch | ||
262,144.0 pipeline |
3 | $1.34 | 1.623 | Launch | ||
262,144.0 tensor |
4 | $1.57 | 2.381 | Launch | ||
262,144.0 pipeline |
3 | $2.29 | 1.745 | Launch | ||
262,144.0 tensor |
4 | $2.34 | 2.381 | Launch | ||
262,144.0 |
1 | $2.37 | 137.990 | 2.281 | Launch | |
262,144.0 pipeline |
3 | $2.83 | 1.740 | Launch | ||
262,144.0 tensor |
4 | $2.89 | 2.544 | Launch | ||
262,144.0 tensor |
2 | $2.93 | 1.543 | Launch | ||
262,144.0 tensor |
4 | $3.60 | 2.537 | Launch | ||
262,144.0 |
1 | $3.83 | 2.279 | Launch | ||
262,144.0 |
1 | $4.11 | 2.812 | Launch | ||
262,144.0 tensor |
2 | $4.61 | 5.215 | Launch | ||
262,144.0 |
1 | $4.74 | 4.603 | Launch | ||
262,144.0 tensor |
2 | $9.40 | 9.857 | Launch | ||
| Name | GPU | TPS | Max Concurrency | |||
|---|---|---|---|---|---|---|
262,144.0 pipeline |
3 | $1.34 | 1.065 | Launch | ||
262,144.0 tensor |
4 | $1.62 | 1.824 | Launch | ||
262,144.0 pipeline |
6 | $1.65 | 1.015 | Launch | ||
262,144.0 pipeline |
3 | $2.29 | 1.187 | Launch | ||
262,144.0 tensor |
4 | $2.34 | 1.824 | Launch | ||
262,144.0 |
1 | $2.37 | 127.020 | 1.724 | Launch | |
262,144.0 pipeline |
3 | $2.83 | 1.183 | Launch | ||
262,144.0 tensor |
4 | $2.89 | 1.986 | Launch | ||
262,144.0 tensor |
4 | $3.60 | 1.980 | Launch | ||
262,144.0 |
1 | $3.83 | 1.721 | Launch | ||
262,144.0 |
1 | $4.11 | 2.255 | Launch | ||
262,144.0 pipeline |
3 | $4.34 | 2.083 | Launch | ||
262,144.0 tensor |
2 | $4.61 | 4.658 | Launch | ||
262,144.0 |
1 | $4.74 | 4.045 | Launch | ||
262,144.0 tensor |
4 | $5.74 | 3.181 | Launch | ||
262,144.0 tensor |
2 | $9.40 | 9.300 | Launch | ||
| Name | GPU | TPS | Max Concurrency | |||
|---|---|---|---|---|---|---|
262,144.0 tensor |
8 | 2.011 | Launch | |||
262,144.0 pipeline |
6 | $3.50 | 1.454 | Launch | ||
262,144.0 |
1 | $4.11 | 1.095 | Launch | ||
262,144.0 tensor |
2 | $4.61 | 3.498 | Launch | ||
262,144.0 tensor |
8 | $4.61 | 1.848 | Launch | ||
262,144.0 |
1 | $4.74 | 2.885 | Launch | ||
262,144.0 tensor |
2 | $4.93 | 3.498 | Launch | ||
262,144.0 tensor |
4 | $5.74 | 2.021 | Launch | ||
262,144.0 pipeline |
6 | $5.83 | 1.610 | Launch | ||
262,144.0 tensor |
8 | $7.51 | 2.005 | Launch | ||
262,144.0 tensor |
2 | $7.84 | 3.492 | Launch | ||
262,144.0 tensor |
2 | $9.40 | 8.140 | Launch | ||
Contact our dedicated neural networks support team at nn@immers.cloud or send your request to the sales department at sale@immers.cloud.