Products

Cloud servers

Cloud servers with per-second billing. Isolated resources will give maximum performance for your project.

GPU servers

Cloud servers with modern RTX and Tesla graphics accelerators for games, rendering, streaming, working with 3D graphics, and artificial intelligence.

Tesla H200

Tesla H100

RTX 5090

RTX 4090

RTX 3090

RTX 3080

Tesla A100

RTX A5000

Tesla A10

RTX 2080 Ti

Tesla A2

Tesla T4

Tesla V100

All GPU servers

CPU servers

The cloud servers with high-performance Intel Xeon Gold 2nd and 3rd generation CPU are available for 100% of the processor time.
SSD servers NVMe servers
All CPU servers

Dedicated servers

Rent a physically dedicated server for a long term with a monthly payment. Configure it using modern components: Intel Xeon Gold 2nd and 3rd generation processors, up to 10 of the latest RTX and Tesla video accelerators, and up to 8192 GB of RAM per server, SSD and NVMe disks for data centers.

Select a dedicated server

Marketplace

Use popular and modern applications as effective tools for organizing your project. Save time with pre-configured images that already have all the necessary components installed.

Forget about manually downloading and installing the software — just deploy a virtual server with a ready-made image.
Neural networks 3D CUDA Docker / NGC For games Windows images Linux images
All pre-installed images
Features
Prices
FAQ
Contact
Login

Models

Our catalog features the most popular open-source AI models from developers worldwide, including large language models (LLMs), multimodal, and diffusion models. Try any model in one place — we’ve made it easy for you.
To explore and test a model, you can query it through our public endpoint. For production use, fine-tuning, or custom weights, we recommend renting a virtual or a dedicated GPU server.

GLM-4.5V

A next-generation multimodal model that processes images, video, text, and graphical user interfaces. Its architecture is built upon the flagship MoE-based GLM-4.5 Air and supports Thinking Mode for deep reasoning and No-Thinking Mode for rapid responses. At launch, the model achieves leading performance on 41 out of 42 key benchmarks used to evaluate LLMs capable of processing visual and textual information.

reasoning

multimodal

11.08.2025

Qwen3-4B-Instruct-2507

A compact yet high-performance language model with 4B parameters, specialized in rapidly executing instructions without internal reasoning. The model outperforms GPT-4.1-nano across all key metrics and supports a context length of up to 262K tokens. Ideal for classification tasks, knowledge-base-powered response generation, and conversational assistants—essentially any scenario requiring high-speed query processing and precise adherence to instructions.

07.08.2025

Qwen3-4B-Thinking-2507

An upgraded hybrid Qwen3-4B model specialized in complex reasoning, featuring an extended context length of 262K tokens and operating exclusively in reasoning mode. Despite its 4 billion parameters, the model achieves an impressive score of 81.3 on the AIME25 math olympiad benchmark. It is ideal for local deployment, code debugging, analytical tasks, and any scenarios requiring step-by-step, thoughtful problem solving.

reasoning

07.08.2025

gpt-oss-20b

A compact yet powerful reasoning MoE model from OpenAI, featuring 20.9 billion total parameters (with 3.61 billion activated per token), capable of running on just 16GB of memory—making it ideal for local deployment using widely accessible consumer hardware. Despite its efficiency, it retains all advanced reasoning and tool-use capabilities, outperforming not only existing open-source models but also OpenAI's popular o3-mini on a range of key benchmarks. This makes gpt-oss-20b a strong choice for diverse research and product applications.

reasoning

try online

05.08.2025

gpt-oss-120b

The flagship open reasoning model from OpenAI, carrying forward the company's best scientific advancements and achievements used in the renowned ChatGPT. This model features a unique MoE (Mixture of Experts) architecture with 116.8 billion parameters, yet activates only 5.1 billion parameters per token, and is equipped with numerous innovations that efficiently balance performance and resource consumption—enabling the model to run on a single 80GB GPU. GPT-OSS-120B supports a three-level reasoning system and, for the first time in open models, introduces an extended role hierarchy and output channels aligned with specific roles, collectively allowing users to precisely customize and control the model's behavior.

reasoning

try online

05.08.2025

Qwen-Image

A multimodal model for generating and editing images based on text prompts, part of the Qwen series of models. It demonstrates significant improvements in accurately rendering complex text (including Chinese) and performing advanced image-editing operations. The model possesses generalized capabilities in both image creation and editing, with an emphasis on preserving font details, composition, and contextual harmony of the text.

04.08.2025

Qwen3-30B-A3B-Thinking-2507

Qwen3-30B-A3B-Thinking-2507 — an upgraded and specialized version of Qwen3-30B-A3B fine-tuned exclusively for reasoning tasks. With 30.5B total parameters (3.3B active), 128 experts (8 activated per token), and an extended context length of 262,144, this model stands as the ideal open-source solution among mid-sized models for applications requiring high-quality reasoning—whether for tool usage, agent-like capabilities, or generating well-structured, accurate responses to highly complex user queries.

reasoning

29.07.2025

Qwen3-30B-A3B-Instruct-2507

An updated version of Qwen3-30B-A3B with 30.5 billion total parameters (3.3B active) and an extended context length of 262,144, designed for generating instant and accurate responses without intermediate reasoning steps. An exceptionally efficient dialogue model capable of solving both technical and creative tasks—ideal for use in chatbots.

29.07.2025

Wan2.2-T2V-A14B-Diffusers

T2V-A14B model supports generating 5s videos at both 480P and 720P resolutions. Built with a Mixture-of-Experts (MoE) architecture, it delivers outstanding video generation quality. On new benchmark Wan-Bench 2.0, the model surpasses leading commercial models across most key evaluation dimensions.

28.07.2025

GLM-4.5-Air

A high-quality agent-oriented model with 106B parameters, optimized for fast inference and moderate hardware requirements, while retaining key capabilities in hybrid reasoning and overall functionality. At launch, the model ranks 6th globally across 12 key benchmarks, demonstrating exceptional speed and outstanding performance in real-world development scenarios. Developers particularly highlight its effectiveness in frontend code autocompletion and code correction tasks.

reasoning

28.07.2025

GLM-4.5

A hybrid model with 355B parameters, combining advanced reasoning, programming with artifact generation, and agent capabilities within a unified MoE architecture featuring an increased number of hidden layers. At launch, the model ranks 3rd globally in average score across 12 key benchmarks. Particularly impressive are its abilities in generating complete web applications, interactive presentations, and complex code. Users need only describe to the model how the program should function and what outcome they expect.

reasoning

28.07.2025

Qwen3-235B-A22B-Thinking-2507

The new flagship MoE model Qwen3-235B-A22B in the Qwen 3 series features enhanced "thinking" capabilities and an extended context length of 262K tokens. Operating exclusively in thinking mode, it achieves state-of-the-art performance among leading open and proprietary thinking models, surpassing many well-known brands in mathematical computations, programming, and logical reasoning tasks. An ideal choice for complex research tasks requiring advanced agent and analytical capabilities.

reasoning

25.07.2025

Qwen3-Coder-30B-A3B-Instruct

A compact MoE model with an architecture of 30.5B total parameters, of which only 3.3B are activated per token, specifically designed to assist in writing software code. The model features agent-like capabilities, supports a context length of 262,144 tokens, and demonstrates excellent performance at relatively low computational cost. These qualities make it an ideal choice for use as a programming assistant, a QA system within programming education platforms, and for integration into tools featuring code autocompletion.

22.07.2025

Qwen3-Coder-480B-A35B-Instruct

Alibaba's flagship agent-based programming model featuring a Mixture-of-Experts architecture (480 billion total parameters, 35 billion active parameters) with native support for a 256K-token context. Qwen3-Coder's application scenarios cover the entire spectrum of modern software development—from building interactive web applications to modernizing legacy systems—including autonomous feature development spanning backend APIs, frontend components, and databases.

22.07.2025

Qwen3-235B-A22B-Instruct-2507

The updated flagship MoE model Qwen3, with 235B parameters (22B active), features a native context length of 256K and supports 119 languages. In its implementation, developers have abandoned the hybrid mode, so the model supports only non-thinking mode. However, improved refinement enables the model to significantly outperform competitors, delivering exceptional results in mathematics, programming, and logical reasoning. Furthermore, the FP8 version allows industrial-scale deployment with a 50% memory saving.

try online

21.07.2025

T-pro-2.0

The first Russian language model with 32 billion parameters and a hybrid reasoning mode, combining revolutionary efficiency in processing the Russian language with the ability for deep analytical thinking to solve tasks of any complexity. The model provides twice the computational resource savings compared to foreign counterparts while delivering superior performance, opening new possibilities for autonomous AI agents.

reasoning

18.07.2025

Kimi-K2

An enormous MoE model containing 1 trillion parameters. The model is specifically designed for autonomous execution of complex tasks, tool usage, and interaction with external systems. Kimi K2 doesn't simply answer questions—it takes action. It represents a new generation of AI assistants capable of independently planning, executing, and monitoring multi-step processes without constant human involvement. This is precisely why developers recommend using the model in agent-based systems.

11.07.2025

MiniMax-M1-80k

Powerful reasoning with maximum capabilities and minimal resource consumption. 456B parameters, a context window of 1,000,000 tokens, Lightning Attention — a novel approach to the attention mechanism, and an increased reasoning budget of 80,000 tokens.
This is ultimate performance for tackling the most complex research and product challenges in mathematics, programming, bioinformatics, law, finance, and beyond.

reasoning

16.06.2025

MiniMax-M1-40k

A large MoE model with 456B parameters, a massive context window of 1,000,000 tokens, and a reasoning budget of 40,000 tokens. Thanks to architectural innovations, the model is more resource-efficient compared to models of similar size, making it highly effective for a wide range of intelligent analysis tasks and agent-based applications.

reasoning

16.06.2025

DeepSeek-R1-0528

DeepSeek-R1-0528 is the first major update to the popular DeepSeek R1 series, released on May 28, 2025. The developers revised their approach to depth of thought, and the number of parameters increased to 685 billion, resulting in an improvement of more than 10 percentage points across nearly all significant benchmarks compared to the version released on January 22, 2025.

reasoning

28.05.2025