DeepSeek-V3.2-Speciale

reasoning

DeepSeek-V3.2-Speciale shares its architecture with the base DeepSeek-V3.2 model. The key difference is its exclusive focus on reasoning: as the model card and technical documentation state, it does not support agent functions or tool usage. During the reinforcement learning stage it was trained solely on reasoning data, with relaxed constraints on the length of generated reasoning, and the dataset and reward methodology from DeepSeekMath-V2 were integrated into training to strengthen its ability to derive mathematical proofs. As a result, the model produces significantly longer chains of thought, which is critical for solving Olympiad-level problems.

By sacrificing agent capabilities, the model set new standards for LLMs on the most challenging benchmarks. On AIME 2025, Speciale achieved 96.0% accuracy, compared to 95.0% for Gemini-3.0-Pro and 94.6% for GPT-5. On HMMT February 2025 it scored 99.2%, again surpassing all competing models. On Codeforces, the model reached a rating of 2701, almost matching Gemini-3.0-Pro (2708), while other models score significantly lower. On LiveCodeBench it scored 88.7%, second only to Gemini-3.0-Pro (90.7%). On IMOAnswerBench (problems at the level of the International Mathematical Olympiad), the result was

DeepSeek-V3.2-Speciale is recommended for research scenarios where token efficiency is secondary to solution quality, i.e., for tasks that require extremely deep reasoning:

  • solving Olympiad-level mathematical problems (IMO, CMO, AIME, HMMT),
  • competitive programming (Codeforces, IOI, ICPC),
  • complex proofs and verification,
  • combinatorial and algorithmic problems with extended solution search,
  • scientific research requiring multi-step logical inference.

The model is not suitable for agent scenarios, tool use, or production environments.


Announce Date: 01.12.2025
Parameters: 685B
Experts: 256
Activated at inference: 37B
Context: 164K
Layers: 61
Attention Type: DeepSeek Sparse Attention
VRAM requirements: 328.7 GB with 4-bit quantization
Developer: DeepSeek
Transformers Version: 4.44.2
License: MIT
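
As a rough cross-check of the listed VRAM figure: 685B parameters at 4 bits per weight come to roughly 340 GB of raw weight storage, which is in the same range as the quoted 328.7 GB; the exact footprint depends on the quantization scheme, which tensors stay in higher precision, and KV-cache overhead. A minimal back-of-the-envelope sketch in Python, with the simplifying assumptions noted in the comments:

    # Back-of-the-envelope VRAM estimate for 4-bit weights.
    # Assumption (not from the model card): all 685B parameters are stored
    # at 4 bits each; KV cache, activations and runtime overhead are ignored.
    PARAMS = 685e9          # total parameter count
    BITS_PER_WEIGHT = 4     # 4-bit quantization

    weight_bytes = PARAMS * BITS_PER_WEIGHT / 8
    print(f"raw 4-bit weights: {weight_bytes / 1e9:.1f} GB")    # ~342.5 GB
    print(f"raw 4-bit weights: {weight_bytes / 2**30:.1f} GiB") # ~319.0 GiB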

Public endpoint

Use our pre-built public endpoints for free to test inference and explore DeepSeek-V3.2-Speciale capabilities. You can obtain an API access token on the token management page after registration and verification.
There are no public endpoints for this model yet.
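
Once a public endpoint for this model appears here, usage follows the same pattern as for other hosted models: authenticate with your access token and call the endpoint's chat API. The snippet below is a hedged sketch that assumes an OpenAI-compatible API; the base URL and model identifier are placeholders, not values taken from this page.

    # Hypothetical client call to a hosted DeepSeek-V3.2-Speciale endpoint.
    # Assumptions: the endpoint exposes an OpenAI-compatible chat API; the
    # base URL and model name below are placeholders.
    from openai import OpenAI

    client = OpenAI(
        base_url="https://<your-endpoint-host>/v1",  # placeholder endpoint URL
        api_key="<your-access-token>",               # token from the token management page
    )

    response = client.chat.completions.create(
        model="DeepSeek-V3.2-Speciale",              # placeholder model identifier
        messages=[{"role": "user", "content": "Prove that sqrt(2) is irrational."}],
        max_tokens=32768,                            # leave room for long reasoning chains
    )
    print(response.choices[0].message.content)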

Private server

Rent your own physically dedicated instance with hourly or long-term monthly billing.

We recommend deploying private instances in the following scenarios:

  • maximize endpoint performance,
  • enable full context for long sequences,
  • ensure top-tier security for data processing in an isolated, dedicated environment,
  • use custom weights, such as fine-tuned models or LoRA adapters.

Recommended configurations for hosting DeepSeek-V3.2-Speciale

Prices:
Name                              Context, tokens   Parallelism   vCPU   RAM, MB   Disk, GB   GPU   Price, hour
teslaa100-6.44.512.480.nvlink     163,840           pipeline      44     524288    480        6     $14.10
h200-3.32.512.480                 163,840           pipeline      32     524288    480        3     $14.36
teslaa100-8.44.512.480.nvlink     163,840           tensor        44     524288    480        8     $18.35
h200-4.32.768.480                 163,840           tensor        32     786432    480        4     $19.23
Prices:
Name                 Context, tokens   Parallelism   vCPU   RAM, MB    Disk, GB   GPU   Price, hour
h200-6.52.896.960    163,840           pipeline      52     917504     960        6     $28.39
h200-8.52.1024.960   163,840           tensor        52     1048576    960        8     $37.37
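
The "pipeline" and "tensor" labels above describe how the model is split across the GPUs of a configuration. As an illustration only, the sketch below shows how an 8-GPU "tensor" configuration might map onto vLLM launch parameters; the repository path is a placeholder, and you should verify that your vLLM build supports DeepSeek-V3.2's sparse attention before relying on this.

    # Hedged serving sketch mapping a "tensor, 8 GPU" row from the table above
    # to vLLM parameters. Assumptions: a vLLM build with DeepSeek-V3.2 support;
    # the model path is a placeholder for your local or quantized weights.
    from vllm import LLM, SamplingParams

    llm = LLM(
        model="deepseek-ai/DeepSeek-V3.2-Speciale",  # placeholder repo id or local path
        tensor_parallel_size=8,      # "tensor" rows: shard every layer across the GPUs
        # pipeline_parallel_size=6,  # "pipeline" rows: split layers across the GPUs instead
        max_model_len=163840,        # full context length listed for these configurations
        trust_remote_code=True,
    )

    params = SamplingParams(temperature=0.6, max_tokens=32768)  # allow long reasoning output
    outputs = llm.generate(["Prove that there are infinitely many primes."], params)
    print(outputs[0].outputs[0].text)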

Need help?

Contact our dedicated neural networks support team at nn@immers.cloud or send your request to the sales department at sale@immers.cloud.