Gemma 3 12B is a mid-sized multimodal language model developed by Google DeepMind and aimed at specialized professional tasks. With 12 billion parameters, it balances output quality against computational cost and covers a wide range of capabilities, from text analysis to image processing. The model converts visual data into tokens, which lets it reason over images alongside text. Its "Pan&Scan" technique adapts to images of any aspect ratio and preserves detail when they are scaled to the 896×896 input resolution.
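To make the idea of aspect-ratio-aware cropping more concrete, here is a minimal sketch of how an image could be split into near-square windows before each window is resized to 896×896. This is a simplified illustration, not Gemma's actual Pan&Scan implementation: the function name `crop_boxes`, the `max_crops` cap, and the rule "number of crops is roughly the aspect ratio" are all assumptions made for the example.

```python
TARGET = 896  # input resolution mentioned above (896x896)


def crop_boxes(width, height, max_crops=4):
    """Split an image into near-square, non-overlapping crops along its longer side.

    Returns a list of (left, top, right, bottom) boxes. In a real pipeline each
    crop would then be resized to TARGET x TARGET. Hypothetical helper, not the
    algorithm used by Gemma itself.
    """
    if width <= 0 or height <= 0:
        raise ValueError("image dimensions must be positive")

    long_side = max(width, height)
    short_side = min(width, height)

    # Assumption: use about one crop per unit of aspect ratio, capped at max_crops.
    n = max(1, min(max_crops, round(long_side / short_side)))

    boxes = []
    for i in range(n):
        start = i * long_side // n
        end = (i + 1) * long_side // n
        if width >= height:
            boxes.append((start, 0, end, height))   # slice a wide image left to right
        else:
            boxes.append((0, start, width, end))    # slice a tall image top to bottom
    return boxes


if __name__ == "__main__":
    for size in [(896, 896), (1792, 896), (500, 2000)]:
        print(size, crop_boxes(*size))
```

A square image yields a single box, while a 2:1 panorama yields two square boxes, so each resized crop keeps more of the original detail than one squashed 896×896 view would.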
Another key feature is the expanded context window of up to 128K tokens, which lets the model process lengthy legal documents and scientific articles in a single request without losing context. Multilingual support covers more than 140 languages, including Russian, and the enhanced tokenizer from Gemini 2.0 supports high-quality translation, text generation, and cross-lingual analysis. In addition, developer-supported quantization makes it possible to run the model on consumer-grade GPUs with minimal loss in quality.
As a result, Gemma 3 12B is a versatile tool for data analysis, document processing, and information extraction from visual sources. It can run locally and scales when integrated into modern AI infrastructure.
There are no public endpoints for this model yet.
Rent your own physically dedicated instance with hourly or long-term monthly billing.
We recommend deploying private instances in the following scenarios:
| Context (tokens) | Parallelism | GPUs | Price | TPS | Max Concurrency |
|---|---|---|---|---|---|
| 131,072 | pipeline | 3 | $0.88 | | 1.904 |
| 131,072 | tensor | 2 | $0.93 | | 2.321 |
| 131,072 | tensor | 4 | $0.96 | | 2.837 |
| 131,072 | pipeline | 3 | $1.06 | | 1.914 |
| 131,072 | tensor | 4 | $1.12 | | 1.476 |
| 131,072 | tensor | 2 | $1.23 | | 2.321 |
| 131,072 | tensor | 4 | $1.26 | | 2.851 |
| 131,072 | tensor | 2 | $1.56 | 90.410 | 2.502 |
| 131,072 | | 1 | $1.59 | | 1.465 |
| 131,072 | tensor | 4 | $1.82 | | 1.153 |
| 131,072 | tensor | 2 | $1.92 | | 2.495 |
| 131,072 | | 1 | $2.37 | | 5.541 |
| 131,072 | | 1 | $3.83 | | 5.535 |
| 131,072 | | 1 | $4.11 | | 6.718 |
| 131,072 | tensor | 2 | $4.61 | | 11.979 |
| 131,072 | | 1 | $4.74 | | 10.693 |
| 131,072 | tensor | 2 | $9.40 | | 22.283 |
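One way to compare the configurations listed above is to divide each price by its max-concurrency figure. The sketch below does this for a handful of rows copied from the listing. It assumes that "Max Concurrency" approximates how many full-context requests an instance can serve at once and that all prices cover the same billing period; neither is stated on this page, and the `Config` class and `price_per_slot` name are illustrative only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Config:
    gpus: int               # number of GPUs in the instance
    price: float            # listed price in USD (billing period assumed identical)
    max_concurrency: float  # listed max concurrency at 131,072-token context

    @property
    def price_per_slot(self) -> float:
        # Price divided by concurrency: lower means cheaper per parallel request.
        return self.price / self.max_concurrency


# A few rows taken from the listing above (not the full table).
configs = [
    Config(gpus=3, price=0.88, max_concurrency=1.904),
    Config(gpus=2, price=0.93, max_concurrency=2.321),
    Config(gpus=4, price=0.96, max_concurrency=2.837),
    Config(gpus=1, price=2.37, max_concurrency=5.541),
    Config(gpus=2, price=9.40, max_concurrency=22.283),
]

ranked = sorted(configs, key=lambda c: c.price_per_slot)

if __name__ == "__main__":
    for c in ranked:
        print(f"{c.gpus} GPU(s)  ${c.price:.2f}  "
              f"concurrency {c.max_concurrency:.3f}  "
              f"per slot ${c.price_per_slot:.3f}")
```

The cheapest instance overall is not necessarily the cheapest per concurrent request, so a ranking like this is mainly useful when the expected number of parallel users is known in advance.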
| Context (tokens) | Parallelism | GPUs | Price | TPS | Max Concurrency |
|---|---|---|---|---|---|
| 131,072 | pipeline | 3 | $0.88 | | 1.628 |
| 131,072 | tensor | 2 | $0.93 | | 2.045 |
| 131,072 | tensor | 4 | $0.96 | | 2.561 |
| 131,072 | pipeline | 3 | $1.06 | | 1.638 |
| 131,072 | tensor | 4 | $1.12 | | 1.200 |
| 131,072 | tensor | 2 | $1.23 | | 2.045 |
| 131,072 | tensor | 4 | $1.26 | | 2.575 |
| 131,072 | tensor | 2 | $1.56 | | 2.226 |
| 131,072 | | 1 | $1.59 | | 1.189 |
| 131,072 | tensor | 2 | $1.92 | | 2.219 |
| 131,072 | | 1 | $2.37 | | 5.265 |
| 131,072 | | 1 | $3.83 | | 5.259 |
| 131,072 | | 1 | $4.11 | | 6.442 |
| 131,072 | tensor | 2 | $4.61 | | 11.703 |
| 131,072 | | 1 | $4.74 | | 10.417 |
| 131,072 | tensor | 2 | $9.40 | | 22.007 |
| Context (tokens) | Parallelism | GPUs | Price | TPS | Max Concurrency |
|---|---|---|---|---|---|
| 131,072 | tensor | 2 | $0.93 | | 1.119 |
| 131,072 | tensor | 4 | $0.96 | | 1.635 |
| 131,072 | tensor | 2 | $1.23 | | 1.119 |
| 131,072 | tensor | 4 | $1.26 | | 1.649 |
| 131,072 | tensor | 2 | $1.56 | | 1.299 |
| 131,072 | tensor | 2 | $1.92 | | 1.293 |
| 131,072 | | 1 | $2.37 | | 4.339 |
| 131,072 | tensor | 2 | $2.93 | | 2.625 |
| 131,072 | | 1 | $3.83 | | 4.333 |
| 131,072 | | 1 | $4.11 | | 5.516 |
| 131,072 | tensor | 2 | $4.61 | | 10.777 |
| 131,072 | | 1 | $4.74 | | 9.491 |
| 131,072 | tensor | 2 | $9.40 | | 21.081 |
Contact our dedicated neural networks support team at nn@immers.cloud or send your request to the sales department at sale@immers.cloud.