DeepSeek-R1-0528

reasoning

DeepSeek-R1-0528 is a significant update to the original DeepSeek-R1 model, released on May 28, 2025, as a "minor trial update." In practice, the new version delivers such substantial gains in reasoning quality that it outperforms the original by more than 10 percentage points on nearly all benchmarks and competes confidently with the current leading proprietary models. On AIME 2025, for example, accuracy rose from 70% in the previous version to 87.5% in the current one. The improvement comes from greater depth of thinking during reasoning: where the earlier model used an average of 12K tokens per question, the new version averages 23K.

A key distinction of DeepSeek-R1-0528 from its predecessor is the markedly improved Chain-of-Thought (CoT) behavior, which has become more structured and deeper: the model can now sustain much longer reasoning passes. Beyond stronger reasoning and logical inference, this version also reduces the hallucination rate, improves function-calling support, and offers a better web development experience.
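When served through DeepSeek's own API or common OpenAI-compatible inference servers, the chain-of-thought is typically returned separately from the final answer, in a `reasoning_content` field alongside the usual `content`. A minimal sketch of splitting the two (the response dict below is an illustrative stand-in, not real model output):

```python
# Sketch: separating the chain-of-thought from the final answer.
# The `reasoning_content` field name follows the DeepSeek API convention;
# other servers may name or expose this field differently.
response_message = {
    "role": "assistant",
    "reasoning_content": "Let me work through this step by step...",
    "content": "The final answer is 42.",
}

# The CoT may be absent (e.g. if reasoning output is disabled), so use .get().
thinking = response_message.get("reasoning_content", "")
answer = response_message["content"]

print(len(thinking), "chars of reasoning;", "answer:", answer)
```

Keeping the two streams separate matters in practice: the reasoning trace should generally not be fed back into the conversation history on subsequent turns.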

DeepSeek-R1-0528 represents a major step forward in the development of open reasoning models. The combination of improved reasoning capabilities, strong performance on programming tasks, and availability under an open license makes it a compelling choice as an engine for research and commercial projects that require truly complex AI-driven problem solving.


Announcement Date: 28.05.2025
Parameters: 685B
Experts: 16
Activated: 37B
Context: 164K
Attention Type: Multi-head Latent Attention
VRAM requirements: 329.7 GB with 4-bit quantization
Developer: DeepSeek
Transformers Version: 4.46.3
License: MIT
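The VRAM figure above can be sanity-checked with a back-of-envelope calculation. Note that a naive estimate for the weights alone lands slightly above the listed 329.7 GB; the exact number depends on the quantization format and on which layers (e.g. embeddings, norms) are left unquantized, and it excludes KV cache and activation buffers:

```python
# Rough VRAM estimate for quantized model weights (sketch only; real usage
# also includes KV cache, activations, and format-dependent overhead).
def weight_vram_gb(n_params_billion: float, bits_per_param: float) -> float:
    """Return the weight footprint in decimal gigabytes."""
    total_bytes = n_params_billion * 1e9 * bits_per_param / 8
    return total_bytes / 1e9

# 685B parameters at 4 bits per parameter:
print(round(weight_vram_gb(685, 4), 1))  # -> 342.5
```

For serving with long contexts, budget additional memory for the KV cache on top of this, which is why multi-GPU configurations are recommended below.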

Public endpoint

Use our pre-built public endpoints to test inference and explore DeepSeek-R1-0528 capabilities.
Model Name Context Type GPU TPS Status Link
There are no public endpoints for this model yet.

Private server

Rent your own physically dedicated instance with hourly or long-term monthly billing.

We recommend deploying private instances in the following scenarios:

  • maximize endpoint performance,
  • enable full context for long sequences,
  • ensure top-tier security for data processing in an isolated, dedicated environment,
  • use custom weights, such as fine-tuned models or LoRA adapters.
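Once a private instance is running, it can usually be queried over an OpenAI-compatible chat API, which self-hosted inference servers such as vLLM and SGLang expose. A minimal sketch of building such a request; the URL is a placeholder for your own deployment, and the generous `max_tokens` leaves room for the model's long chain-of-thought:

```python
import json

# Placeholder endpoint for a private instance; substitute your own URL.
API_URL = "https://your-instance.example/v1/chat/completions"

# OpenAI-compatible chat completion request body.
payload = {
    "model": "deepseek-ai/DeepSeek-R1-0528",
    "messages": [
        {"role": "user", "content": "How many prime numbers are below 100?"},
    ],
    "max_tokens": 32768,   # R1-0528 averages ~23K reasoning tokens per question
    "temperature": 0.6,
}
body = json.dumps(payload).encode("utf-8")

# Send with any HTTP client, e.g. urllib.request or the `openai` SDK
# pointed at API_URL via its base_url parameter.
```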

Recommended configurations for hosting DeepSeek-R1-0528

Prices:
Name vCPU RAM, MB Disk, GB GPU Price, hour

Related models

Need help?

Contact our dedicated neural networks support team at nn@immers.cloud or send your request to the sales department at sale@immers.cloud.