VMware and Nvidia launch GPU virtualization platform for enterprises
VMware, in collaboration with Nvidia, today announced that an AI-Ready Enterprise platform is now available as part of an update to its core virtual machine software.
Announced last fall at the VMworld 2020 conference, the alliance between the two companies spans a range of initiatives that revolve around deploying VMware virtual machine software on top of graphics processing units (GPUs) from Nvidia.
The AI-Ready Enterprise platform is designed to be deployed on on-premises NVIDIA-Certified Systems based on NVIDIA A100 Tensor Core GPUs that are being made available by Dell Technologies, Hewlett Packard Enterprise (HPE), Supermicro, Gigabyte, and Inspur.
Those platforms are required to run vSphere 7 Update 2, which, in addition to adding support for Nvidia GPUs, adds the ability to employ vSphere Lifecycle Manager to image and manage vSphere running instances of the Tanzu distribution of Kubernetes from VMware. Support for Kubernetes is crucial to the joint effort because most AI workloads are deployed as containers. An update to vSphere with Tanzu announced today adds support for VMware NSX Advanced Load Balancer Essentials to provide Layer 4 load balancing for Kubernetes clusters running a distribution based on the 1.19 release of Kubernetes.
Separately, vSphere 7 Update 2 adds support for Confidential Containers for vSphere Pods on servers based on AMD EPYC processors that make use of Secure Encrypted Virtualization-Encrypted State (SEV-ES) technology, in addition to key management software dubbed vSphere Native Key Provider.
VMware today is also adding a suspend-to-memory capability to its ESXi hypervisor to minimize upgrade times and maintenance windows. The vSphere High Availability capability is now also persistent memory (PMEM)-aware, and support for select Hitachi Vantara UCP servers has been added.
The virtual storage software VMware provides is also being updated to support HCI Mesh software for disaggregating compute and storage nodes and vSphere Proactive High Availability software that enables any application state and associated data it has stored to be migrated to another host. The vSAN 7 Update 2 release also adds data durability capabilities across multiple clusters and tools to more easily identify the root cause of a potential issue.
VMware is making a case for deploying virtual machine software to enable multiple workloads to share the same GPU in the same way virtual machine software is widely employed on x86 processors. GPUs are significantly more expensive than x86 processors, which creates an economic incentive to employ virtual machine software. The work between the two companies has optimized VMware software to the point where performance with the added virtualization layer is indistinguishable from that of a bare-metal GPU system, said Lee Caswell, VP of marketing for the Cloud Platform Business Unit at VMware.
That effort will not only help democratize AI, it will also encourage enterprise IT organizations that have standardized on VMware to adopt GPU-based systems, Caswell said. That approach provides the added benefit of making AI workloads accessible for the average IT generalist to manage, he added. “We want to reduce both the perceived and real risks,” Caswell said.
Nvidia is trying to reduce the amount of time it takes to deploy an AI workload in a production environment from an average of 80 weeks to eight weeks, said Justin Boitano, VP and GM of Enterprise and Edge Computing at Nvidia. Part of that effort requires deploying IT infrastructure in a way that is familiar to the average IT administrator, Boitano noted. “We want to make it turnkey for IT admins,” he said.
Most AI workloads today are deployed by data science teams that are just starting to define and employ a set of machine learning operations (MLOps) best practices. It’s not clear what role traditional IT administrators will play in those processes. VMware, however, is clearly betting that as AI workloads become more commonly deployed across the enterprise, MLOps will just become an extension of existing IT management processes.