Performance improvement in modern AI inference tasks
Graphics servers with Tesla A2
All graphics servers with Tesla A2 are based on two 3rd Generation Intel Xeon Gold 6336Y CPUs with a base clock frequency of 2.4 GHz and a maximum Turbo Boost frequency of 3.6 GHz.
Each processor contains two Intel® AVX-512 units and supports Intel® AVX-512 Deep Learning Boost. This instruction set speeds up reduced-precision multiply-and-add operations, which are used in many inner loops of deep learning algorithms.
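To make "reduced-precision multiply-and-add" concrete, here is a minimal pure-Python sketch of the arithmetic pattern that such instructions accelerate: four 8-bit products are summed and added to a 32-bit accumulator in one step. It only models the arithmetic and does not call any CPU instruction; the function names are hypothetical and chosen for illustration.

```python
def dot4_u8s8(acc, a, b):
    """Add the dot product of four unsigned and four signed 8-bit values to acc."""
    if len(a) != 4 or len(b) != 4:
        raise ValueError("each group must hold exactly four 8-bit values")
    for x, w in zip(a, b):
        if not 0 <= x <= 255:
            raise ValueError("activations must fit in an unsigned 8-bit integer")
        if not -128 <= w <= 127:
            raise ValueError("weights must fit in a signed 8-bit integer")
        acc += x * w
    # Wrap to a signed 32-bit value, as a non-saturating 32-bit lane would.
    return (acc + 2**31) % 2**32 - 2**31


def quantized_dot(activations, weights):
    """Dot product of two 8-bit vectors, accumulated in 32 bits, four values per step."""
    if len(activations) != len(weights) or len(activations) % 4:
        raise ValueError("vectors must have equal length, a multiple of four")
    acc = 0
    for i in range(0, len(activations), 4):
        acc = dot4_u8s8(acc, activations[i:i + 4], weights[i:i + 4])
    return acc


if __name__ == "__main__":
    print(quantized_dot([1, 2, 3, 4, 5, 6, 7, 8], [1, -1, 1, -1, 2, -2, 2, -2]))
```

In practice you would not write this loop yourself: inference frameworks with 8-bit quantization support are what typically emit these instructions when the CPU provides them.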
Each server has up to 4096 GB of DDR4 ECC Reg 3200 MHz RAM. Local storage with a total capacity of 1920 GB is organized on Intel® solid-state drives, designed specifically for data centers.
GPU Tesla A2
The Tesla A2 graphics accelerator is optimized for inference tasks and provides up to 1.3 times greater performance for smart cities, industry and retail tasks.
Video memory capacity: 16 GB
Type of video memory: GDDR6
Memory bandwidth: 200 GB/s
Encode/decode: 1 encoder, 2 decoders (+AV1 decode)
GPU performance benchmarks
Performance benchmark results in a virtual environment for one Tesla A2 graphics card.
OctaneBench 2020: up to 120 pts
Matrix multiply example: 350 GFlop/s
Hashcat bcrypt: 10 200 H/s
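The matrix multiply result is a throughput figure, so it helps to see how a GFlop/s number is usually derived. The sketch below uses the conventional count of 2·m·n·k floating-point operations for an m×k by k×n product. The matrix sizes and precision behind the 350 GFlop/s figure are not stated on this page, so the 4096-sized example is an assumption, and the function names are hypothetical.

```python
import time


def matmul_flops(m, n, k):
    """Floating-point operations in an (m x k) @ (k x n) product: one multiply and one add per term."""
    return 2 * m * n * k


def gflops(m, n, k, seconds):
    """Throughput in GFlop/s for a product that took `seconds` to compute."""
    if seconds <= 0:
        raise ValueError("seconds must be positive")
    return matmul_flops(m, n, k) / seconds / 1e9


def seconds_at_rate(m, n, k, rate_gflops):
    """Estimated run time of the product at a sustained rate given in GFlop/s."""
    return matmul_flops(m, n, k) / (rate_gflops * 1e9)


def time_pure_python_matmul(n):
    """Time a naive n-by-n multiply on the CPU and return (seconds, GFlop/s)."""
    a = [[1.0] * n for _ in range(n)]
    b = [[2.0] * n for _ in range(n)]
    start = time.perf_counter()
    _ = [[sum(a[i][p] * b[p][j] for p in range(n)) for j in range(n)] for i in range(n)]
    elapsed = time.perf_counter() - start
    return elapsed, gflops(n, n, n, elapsed)


if __name__ == "__main__":
    # Assumed example size; 350 is the GFlop/s figure quoted in the benchmark list.
    print(f"4096^3 product at 350 GFlop/s: {seconds_at_rate(4096, 4096, 4096, 350):.3f} s")
    elapsed, rate = time_pure_python_matmul(64)
    print(f"naive Python 64^3 product: {elapsed:.4f} s, {rate:.4f} GFlop/s")
```

The pure-Python timing is only there to show the measurement method; it says nothing about GPU performance.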
Basic configurations with Tesla A2 16 GB
Prices:
Each physical core or GPU adapter is assigned to a single client only. This means:
100% of vCPU time is available to you;
The GPU is physically passed through to the virtual server;
Less storage and network load on hypervisors, and more storage and network performance for the client.
Up to 75 000 IOPS¹ for random read and up to 20 000 IOPS for random write on virtual servers with local SSDs.
Up to 70 000 IOPS¹ for random read and up to 60 000 IOPS for random write on virtual servers with block storage volumes.
You can be sure that virtual servers do not share vCPUs or GPUs with one another.
¹ IOPS: input/output operations per second.
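IOPS figures translate into throughput only once a block size is fixed. The sketch below converts the limits quoted above into MiB/s, assuming 4 KiB operations. That block size is an assumption made for illustration, not a figure from this page, and the function name is hypothetical.

```python
ASSUMED_BLOCK_BYTES = 4096  # assumption: 4 KiB per I/O operation, not stated in the source


def throughput_mib_s(iops, block_bytes=ASSUMED_BLOCK_BYTES):
    """Convert an IOPS figure into MiB/s for a given block size in bytes."""
    if iops < 0 or block_bytes <= 0:
        raise ValueError("iops must be non-negative and block_bytes positive")
    return iops * block_bytes / 2**20


if __name__ == "__main__":
    limits = {
        "local SSD, random read": 75_000,
        "local SSD, random write": 20_000,
        "block storage, random read": 70_000,
        "block storage, random write": 60_000,
    }
    for label, iops in limits.items():
        print(f"{label}: {throughput_mib_s(iops):.1f} MiB/s at 4 KiB blocks")
```

With larger blocks the same IOPS limit corresponds to proportionally higher throughput, so pass the block size your workload actually uses.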
Answers to frequently asked questions
What is the minimum rental period for a virtual GPU-server?
You can rent a virtual server for any period. Top up your balance with any amount starting from $1.1 and work within the prepaid balance. When you are done, delete the server to stop spending money.
How quickly can I get started with a virtual GPU-server?
You can create GPU-servers yourself in the control panel by choosing the hardware configuration and operating system. The ordered capacity is available for use within a few minutes.
What operating systems can be installed on a virtual GPU-server?
You can choose from basic images: Windows Server 2019, Windows Server 2022, Ubuntu, Debian, CentOS, Fedora, OpenSUSE. Or use a pre-configured image from the Marketplace.
All operating systems are installed automatically when the GPU-server is created.
How to connect to a virtual GPU-server?
By default, we provide connection to Windows-based servers via RDP and to Linux-based servers via SSH.
You can also configure any other connection method that is convenient for you.
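For a Linux-based server, a first SSH session is a good moment to confirm that the passed-through GPU is visible. The sketch below builds an `ssh` command that runs `nvidia-smi` remotely and parses its CSV output. The host address, user name, and the presence of the NVIDIA driver in the image are all assumptions; replace them with the values shown for your server.

```python
import subprocess


def build_gpu_check_command(host, user="ubuntu"):
    """Return an ssh command that lists GPU name and total memory on the remote server."""
    return [
        "ssh",
        f"{user}@{host}",
        "nvidia-smi",
        "--query-gpu=name,memory.total",
        "--format=csv,noheader",
    ]


def parse_gpu_csv(output):
    """Parse 'name, memory' lines from nvidia-smi CSV output into dictionaries."""
    gpus = []
    for line in output.strip().splitlines():
        if not line.strip():
            continue
        name, memory = (field.strip() for field in line.split(",", 1))
        gpus.append({"name": name, "memory": memory})
    return gpus


def list_remote_gpus(host, user="ubuntu"):
    """Run the check over SSH; raises CalledProcessError if ssh or nvidia-smi fails."""
    result = subprocess.run(
        build_gpu_check_command(host, user),
        capture_output=True, text=True, check=True, timeout=30,
    )
    return parse_gpu_csv(result.stdout)


if __name__ == "__main__":
    # 203.0.113.10 is a documentation-only placeholder address, not a real server.
    print(" ".join(build_gpu_check_command("203.0.113.10")))
```

Calling `list_remote_gpus` requires working SSH key access to the server; the `__main__` block only prints the command so the sketch can be run without a network connection.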
Is it possible to rent a virtual GPU-server with a custom configuration?
Yes, it is possible. Contact our round-the-clock support service (https://t.me/immerscloudsupport) and tell us what configuration you need.
A bit more about us
Per-second billing and free VM pause (shelve)
You pay only for the actual use of your VMs
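With per-second billing, the charge is proportional to the seconds a VM actually runs. The sketch below shows that arithmetic under stated assumptions: prices are quoted per hour, shelved time costs nothing, and the $0.36 hourly rate is a hypothetical value, not a price from this page. The function names are hypothetical as well.

```python
from decimal import Decimal

SECONDS_PER_HOUR = 3600


def cost_for_runtime(hourly_rate, running_seconds):
    """Cost of a VM billed per second, given an hourly rate and the seconds it was running."""
    if running_seconds < 0:
        raise ValueError("running_seconds must be non-negative")
    return Decimal(hourly_rate) * running_seconds / SECONDS_PER_HOUR


def runtime_for_balance(balance, hourly_rate):
    """Whole seconds of running time that a prepaid balance covers at the given hourly rate."""
    return int(Decimal(balance) * SECONDS_PER_HOUR / Decimal(hourly_rate))


if __name__ == "__main__":
    rate = Decimal("0.36")  # hypothetical hourly price, for illustration only
    print("90 seconds of use costs", cost_for_runtime(rate, 90))
    print("a 1.1 balance covers", runtime_for_balance(Decimal("1.1"), rate), "seconds")
```

`Decimal` is used instead of floats so that small per-second amounts do not pick up binary rounding errors.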
24/7/365
Tech support is always available via chat and responds within minutes
Free traffic
Speed up to 2 Gb/s without extra charge for incoming and outgoing traffic
Our data centers
Built according to the Tier III standard
100% of power is yours
We do not share resources you purchased with other users