Nvidia and Google have revealed a partnership to offer the Accelerator-Optimized VM (A2) instance family powered by Nvidia’s A100 compute GPU.
The new A2 VM instance family is aimed at compute-intensive applications, including AI, data analytics, and scientific computing.
Introduced in mid-May, NVIDIA’s A100 accelerator features 6,912 CUDA cores and is equipped with 40 GB of HBM2 memory offering up to 1.6 TB/s of memory bandwidth. The device provides up to 19.5 TFLOPS of FP32 performance, up to 9.7 TFLOPS of FP64 performance (19.5 TFLOPS FP64 Tensor), and up to 624 TOPS of INT8 Tensor performance.
Google and Nvidia expect the new A100-based GPUs to boost training and inference computing performance by up to 20 times over previous-generation processors.
The Accelerator-Optimized VM (A2) instance family is available in alpha upon request and will be offered at different levels of performance to suit customers with various requirements. Those with demanding workloads requiring up to 312 TFLOPS of FP64 performance or up to 20 POPS of INT4 performance will be able to get the A2-megagpu-16G instance, powered by 16 NVIDIA A100 GPUs interconnected via NVSwitch, with 640 GB of HBM2 memory, 96 vCPUs, and 1.3 TB of system memory. Those who do not need that much compute horsepower can get the A2-highgpu-1G instance, powered by a single A100 accelerator and 12 Intel Cascade Lake vCPUs.
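The headline figures for the 16-GPU instance follow directly from the per-GPU specifications; a quick back-of-the-envelope sketch (the 1,248 TOPS INT4 per-GPU figure is NVIDIA's published A100 spec, which is not quoted above):

```python
# Aggregate throughput and memory for the A2-megagpu-16G instance,
# computed from NVIDIA's published per-A100 figures.
GPUS = 16
FP64_TENSOR_TFLOPS_PER_GPU = 19.5  # FP64 Tensor Core throughput per A100
INT4_TENSOR_TOPS_PER_GPU = 1248    # INT4 Tensor Core throughput per A100
HBM2_GB_PER_GPU = 40               # HBM2 capacity per A100

fp64_total_tflops = GPUS * FP64_TENSOR_TFLOPS_PER_GPU       # 312.0 TFLOPS
int4_total_pops = GPUS * INT4_TENSOR_TOPS_PER_GPU / 1000    # ~20 POPS
hbm2_total_gb = GPUS * HBM2_GB_PER_GPU                      # 640 GB

print(fp64_total_tflops, int4_total_pops, hbm2_total_gb)
```

Note that the 312 TFLOPS figure assumes the FP64 Tensor Core rate rather than the standard 9.7 TFLOPS FP64 rate.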
Public availability of Google’s A2 VMs is expected later this year. Google says that NVIDIA’s A100 GPU accelerators will soon be supported by Google Kubernetes Engine (GKE), Cloud AI Platform, and other Google Cloud services.
Google Cloud is the first cloud compute provider to offer Nvidia A100-powered services. The company uses HGX A100-based servers designed in-house and built under the supervision of the GPU company.