Nvidia isn’t the only player in the AI accelerator market. Intel is making significant strides in this field with its Gaudi 2 technology, according to new research released today by Databricks.
The research shows that Intel’s Gaudi 2 offers strong competition to Nvidia’s AI accelerators. Databricks found that for large language model (LLM) inference, Gaudi 2 matched the decoding latency of Nvidia’s H100 systems and outperformed the Nvidia A100. Gaudi 2 also showed higher memory bandwidth utilization than both the H100 and A100.
However, Nvidia still has an edge in training performance on its top-tier accelerators. When training with the Databricks MosaicML LLM foundry, Gaudi 2 delivered the second-fastest single-node LLM training performance, behind only Nvidia’s H100, at more than 260 TFLOPS per chip. According to Databricks, based on public cloud pricing, Gaudi 2 offers the best dollar-per-performance ratio for both training and inference compared to Nvidia’s A100 and H100.
Intel has provided its own benchmarks for Gaudi 2 through the MLCommons MLPerf benchmark for both training and inference. Databricks’ new data offers further validation for Intel’s Gaudi technology from an independent party.
Intel’s Gaudi technology, acquired through the purchase of AI chip startup Habana Labs in 2019 for $2 billion, has continued to advance. Both Intel and Nvidia regularly participate in the MLCommons MLPerf benchmarks, which are updated several times a year. In the latest MLPerf 3.1 training benchmarks released in November, both companies claimed new speed records for LLM training. Earlier, in September, the MLPerf 3.1 inference benchmarks also showed strong competition between Nvidia and Intel.
Despite the importance of benchmarks like MLPerf, many customers prefer to conduct their own tests to ensure the hardware and software stack meets their specific needs. According to Eitan Medina, COO at Habana Labs, while submitting MLPerf results demonstrates a level of maturity, customers primarily rely on their own testing when making business decisions.
Looking ahead, Intel is preparing to launch the Gaudi 3 AI accelerator, which is set to enter mass production in 2024. Built on a 5-nanometer process, Gaudi 3 will deliver four times the processing power and double the network bandwidth of Gaudi 2, which uses a 7-nanometer process.
Beyond Gaudi 3, Intel is working on future generations that will merge its high-performance computing (HPC) and AI accelerator technology, anticipated around 2025. Intel also continues to emphasize the value of its CPU technologies for AI inference workloads. Recently, the company announced its 5th Gen Xeon processors with AI acceleration.
Overall, Intel aims to offer a range of solutions, combining Gaudi accelerators with CPUs to handle data preparation and workloads with high compute density, thus maintaining a comprehensive strategy to meet various AI needs.