DeepSeek-R1-0528-Qwen3-8B is a compact 8-billion-parameter model created by distilling the knowledge and reasoning capabilities of the flagship DeepSeek-R1-0528 into the Qwen3-8B base model. Its architecture is identical to Qwen3-8B, but it uses the tokenizer configuration from DeepSeek-R1-0528, keeping its output format aligned with the teacher model's reasoning traces.
It delivers outstanding results, scoring 86.0% on AIME 2024, which exceeds the base Qwen3-8B by 10 percentage points and matches the performance of the much larger Qwen3-235B-Thinking. Together with strong scores on other benchmarks, this places it among the leading open-source models in its class. The model is a convincing example of well-executed distillation: the reasoning chains of DeepSeek-R1-0528 have been transferred into a far more compact architecture, opening new possibilities for academic research and for industrial development of small, specialized models. At 8B parameters, it can be deployed on modest hardware while retaining high-quality reasoning.
DeepSeek-R1-0528-Qwen3-8B is ideally suited to educational applications, small-scale research projects, and any scenario that calls for a capable reasoning model but where deploying a large reasoning model is not feasible.
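For quick local experimentation with the model's reasoning output, a minimal inference sketch using Hugging Face Transformers is shown below. The checkpoint ID `deepseek-ai/DeepSeek-R1-0528-Qwen3-8B`, bf16 precision, and the sampling settings are assumptions about a typical setup, not part of this catalog entry; in bf16 the 8B weights need roughly 16 GB of GPU memory.

```python
# Minimal local inference sketch (assumed setup, not an official recipe).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "deepseek-ai/DeepSeek-R1-0528-Qwen3-8B"  # assumed Hugging Face checkpoint ID

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,   # ~16 GB of weights for an 8B model
    device_map="auto",
)

messages = [{"role": "user", "content": "How many positive divisors does 360 have?"}]
input_ids = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    return_tensors="pt",
).to(model.device)

# Reasoning models emit a long chain of thought before the final answer,
# so leave generous room for new tokens.
output_ids = model.generate(
    input_ids,
    max_new_tokens=4096,
    do_sample=True,
    temperature=0.6,
)
print(tokenizer.decode(output_ids[0][input_ids.shape[-1]:], skip_special_tokens=True))
```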
Model Name | Context | Type | GPU | TPS | Status | Link
---|---|---|---|---|---|---
There are no public endpoints for this model yet.
Rent your own physically dedicated instance with hourly or long-term monthly billing.
We recommend deploying private instances in the following scenarios:
vCPU | RAM, MB | Disk, GB | GPU | Price, $/hr | Link
---|---|---|---|---|---
16 | 32768 | 160 | 2 | $0.57 | Launch
16 | 32768 | 160 | 2 | $0.80 | Launch
16 | 65536 | 160 | 2 | $0.93 | Launch
16 | 65536 | 160 | 3 | $0.95 | Launch
12 | 65536 | 160 | 1 | $1.20 | Launch
16 | 65536 | 160 | 3 | $1.43 | Launch
16 | 65536 | 160 | 1 | $1.59 | Launch
16 | 65536 | 160 | 2 | $1.67 | Launch
16 | 65536 | 160 | 2 | $2.19 | Launch
16 | 65536 | 160 | 1 | $2.58 | Launch
16 | 65536 | 160 | 1 | $5.11 | Launch
vCPU | RAM, MB | Disk, GB | GPU | Price, $/hr | Link
---|---|---|---|---|---
16 | 65536 | 160 | 2 | $0.93 | Launch
16 | 65536 | 160 | 4 | $1.18 | Launch
16 | 65536 | 160 | 4 | $1.48 | Launch
16 | 65536 | 160 | 2 | $1.67 | Launch
16 | 65536 | 160 | 4 | $1.82 | Launch
16 | 65536 | 160 | 2 | $2.19 | Launch
16 | 65536 | 160 | 1 | $2.58 | Launch
16 | 65536 | 160 | 2 | $2.93 | Launch
16 | 65536 | 160 | 1 | $5.11 | Launch
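Once a dedicated instance is running, the model can be exposed through any OpenAI-compatible serving engine and queried over HTTP. The client-side sketch below assumes the instance serves the model with vLLM (for example via `vllm serve deepseek-ai/DeepSeek-R1-0528-Qwen3-8B`) on its default port 8000; the host address and model ID are placeholders for whatever your deployment actually uses.

```python
# Client-side sketch for querying a self-hosted, OpenAI-compatible endpoint.
# The host address below is a placeholder, not a real endpoint.
from openai import OpenAI

client = OpenAI(
    base_url="http://YOUR_INSTANCE_IP:8000/v1",  # placeholder instance address
    api_key="EMPTY",  # vLLM's server does not require a real key by default
)

response = client.chat.completions.create(
    model="deepseek-ai/DeepSeek-R1-0528-Qwen3-8B",
    messages=[{"role": "user", "content": "Prove that the sum of two odd integers is even."}],
    temperature=0.6,
    max_tokens=4096,
)
print(response.choices[0].message.content)
```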
Contact our dedicated neural networks support team at nn@immers.cloud or send your request to the sales department at sale@immers.cloud.