Groq was founded in 2016, and as early as 2021 it was being called the "strongest challenger to NVIDIA." That year, Groq raised $300 million in a round led by prominent investment firms Tiger Global Management and D1 Capital, bringing its total funding to $367 million.
In August 2023, Groq introduced the Groq LPU, which can run a 70-billion-parameter enterprise-grade language model at a record-breaking speed of more than 100 tokens per second. Groq estimates a 10x to 100x speed advantage over other systems.
Groq's founder and CEO Jonathan Ross once said: "Artificial intelligence is limited by existing systems, many of which are being followed or incrementally improved upon by newcomers. No matter how much money is thrown at the problem, legacy architectures like GPUs and CPUs struggle to keep pace with the rapidly growing demands of artificial intelligence and machine learning... Our mission is more disruptive: Groq seeks to unlock the potential of artificial intelligence by driving the cost of compute to zero."
Experts Question the Cost-Effectiveness and Competitiveness of Groq LPU
Associate Professor He Hu of the School of Integrated Circuits at Tsinghua University noted that LPU chips fall into the inference category and do not compete directly with the currently in-demand GPUs, which are primarily used for training large models. Among inference chips, the LPU may achieve relatively high performance, but its operating costs are not low. High-performance, low-cost inference chips could reduce inference costs and broaden the application scope of large AI models; their market prospects therefore depend largely on how the market's inference demand develops, rather than on the technology race alone.
As the names suggest, training chips are used mainly to train large models, while inference chips serve AI applications. The industry expects that as vertical large models spread across sectors and AI applications take shape, inference compute will receive as much attention as training compute.
However, even for inference, some experts have calculated, based on the memory capacity and large-model throughput of the LPU and the GPU, that the LPU cannot compete with Nvidia's GPUs on cost-effectiveness or energy efficiency.
Jia Yangqing, a former AI scientist at Facebook and former Vice President of Technology at Alibaba, analyzed on an overseas social media platform that the Groq LPU has a very small memory capacity (230MB). A simple calculation shows that running a 70-billion-parameter model would require 305 Groq cards, versus 8 Nvidia H100 cards. At current prices, this means that for the same throughput, the Groq LPU's hardware cost is roughly 40 times that of the H100, and its energy cost roughly 10 times.
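The card counts follow directly from the memory footprint. Below is a minimal sketch of the arithmetic, assuming int8 weights (1 byte per parameter) and counting only the weights, with no KV cache or activations; the constants are the figures cited above plus the H100's published 80GB HBM capacity:

```python
import math

# Back-of-envelope check of the card-count comparison above.
# Assumptions (ours, not from the article): int8 quantization
# (1 byte per parameter) and a weights-only footprint, ignoring
# KV cache and activations.

params = 70e9                  # 70-billion-parameter model
bytes_per_param = 1            # int8 (assumption)
model_bytes = params * bytes_per_param        # ~70 GB of weights

groq_sram_per_card = 230e6     # 230 MB of on-chip SRAM per LPU card
groq_cards = math.ceil(model_bytes / groq_sram_per_card)
print(f"Groq LPU cards to hold the weights: {groq_cards}")  # -> 305

# For reference: a standard 8x H100 server carries 8 * 80 GB = 640 GB
# of HBM, enough to hold the same model even in fp16 (~140 GB) with
# ample room left for KV cache.
h100_hbm_total = 8 * 80e9
print(f"8x H100 server HBM total: {h100_hbm_total / 1e9:.0f} GB")
```

Under these assumptions, the 305-card figure reproduces exactly; the cost and energy multiples then follow from per-card prices and power draw, which the article does not itemize.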
An executive at a leading domestic AI chip company agrees with the calculation above. He notes that unlike GPUs, which use HBM (High Bandwidth Memory), the LPU stores its weights in SRAM (Static Random-Access Memory), whose far smaller per-chip capacity means many cards are needed to run a large model.
Tencent chip expert Yao Jinxin put it even more bluntly: "Nvidia's absolute lead in this AI wave has the whole world eagerly awaiting a challenger. Every attention-grabbing article is believed at first, partly for that reason, and partly because of the 'tricks' used in the comparisons: deliberately ignoring other factors and comparing along a single dimension."