Nvidia's "China Special Edition" H20 blocks the path for domestic AI chips, even with overall computing power about 80% lower than the H100's

8BTC

These three new Nvidia AI chips are not "improved versions" but "cut-down versions". The HGX H20 is limited in bandwidth and compute speed; its price is expected to be lower than the H100's but still higher than that of the domestic AI chip 910B.

Original source: Titanium Media

Author: Lin Zhijia

Image source: Generated by Unbounded AI

On November 10, it was reported that chip giant Nvidia will launch three AI chips for the Chinese market, based on the H100, in response to the latest U.S. chip export controls.

According to the specification document, Nvidia will soon offer Chinese customers three new products, named the HGX H20, L20 PCIe, and L2 PCIe, based on its Hopper and Ada Lovelace architectures. Judging from the specifications and naming, the three products target training, inference, and edge scenarios respectively. They are expected to be announced on November 16 at the earliest, with product samples delivered between November and December this year and mass production running from December this year to January next year.

**Titanium Media App confirmed the report with several companies in Nvidia's supply chain.**

Titanium Media App also exclusively learned that these three Nvidia AI chips are not "improved versions" but "cut-down versions". In theory, the H20's overall computing power is about 80% lower than that of the Nvidia H100 GPU; in other words, the H20 delivers roughly 20% of the H100's overall compute performance. Meanwhile, the added HBM memory and NVLink interconnect modules raise the cost per unit of computing power. So although the HGX H20 will be cheaper than the H100, its price is still expected to be higher than that of the domestic AI chip 910B.

"This is equivalent to widening the highway lanes without widening the toll gate, which limits traffic. Technically, the chip's performance can be precisely controlled through hardware and software locks, so there is no need to overhaul the production line on a large scale; even if the hardware is later upgraded, performance can still be dialed down as needed. The new H20 has its performance 'capped' at the source," an industry source explained. "For example, a task that used to take 20 days on the H100 might now take 100 days on the H20."
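The source's example is consistent with the "roughly 20% of H100" figure quoted earlier: if runtime scales inversely with compute (a simplification that ignores memory and interconnect bottlenecks), a chip with one fifth of the throughput takes five times as long. A minimal back-of-the-envelope sketch:

```python
# Simplified runtime scaling: job time grows inversely with compute.
# This ignores memory and interconnect bottlenecks, so it is only a
# rough estimate, not a performance model.
def scaled_runtime(days_on_h100: float, relative_compute: float) -> float:
    """Estimated days on a chip delivering `relative_compute` x H100 throughput."""
    return days_on_h100 / relative_compute

print(scaled_runtime(20, 0.20))  # a 20-day H100 job at 20% compute -> 100.0 days
```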

Despite the release of a new round of chip restrictions in the United States, Nvidia does not seem to have given up on China’s huge AI computing power market.

So, can domestic chips take its place? Titanium Media App learned from test results that, **at present, for large-model inference the domestic AI chip 910B reaches only about 60%-70% of the A100's performance, and cluster-scale model training on it is not sustainable. The 910B also consumes far more power and runs much hotter than Nvidia's A100/H100 series, and it is not CUDA-compatible, so it can hardly meet the long-term model-training needs of intelligent computing centers.**

**So far, Nvidia has made no official comment on the matter.**

It is reported that on October 17 this year, the Bureau of Industry and Security (BIS) of the U.S. Department of Commerce issued new chip export control rules covering semiconductor products, including Nvidia's high-performance AI chips; the restrictions took effect on October 23. Nvidia's filing with the U.S. SEC shows that the immediately banned products include its most powerful AI chips: the A800, H800, and L40S.

In addition, the L40 and RTX 4090 retain the original 30-day export window.

On October 31, reports emerged that Nvidia might be forced to cancel $5 billion worth of orders for advanced chips, and its share price fell sharply on the news. Previously, the A800 and H800, which Nvidia supplied exclusively to China, could no longer be sold normally in the Chinese market under the new U.S. rules. Those two chips had been dubbed "cut-down versions" of the A100 and H100, as Nvidia reduced their performance to comply with the earlier U.S. regulations.

On October 31, Zhang Xin, spokesperson for the China Council for the Promotion of International Trade, said the newly issued U.S. semiconductor export control rules further tighten restrictions on exports of AI-related chips and semiconductor manufacturing equipment to China and add a number of Chinese entities to the export-control "entity list". These measures, the spokesperson said, seriously violate the principles of the market economy and international trade rules, and exacerbate the risk of the global semiconductor supply chain being torn apart and fragmented. The chip export bans the United States has imposed on China since the second half of 2022 are profoundly changing global supply and demand, causing a chip supply imbalance in 2023, reshaping the world chip industry, and harming the interests of companies in many countries, including Chinese firms.

Comparison of the performance parameters of the Nvidia HGX H20, L20, and L2

**Titanium Media App has learned** that the new HGX H20, L20, and L2 AI chips are based on Nvidia's Hopper and Ada architectures respectively and are suited to cloud training, cloud inference, and edge inference.

Among them, the latter two, the L20 and L2, are AI inference products that have comparable domestic substitutes and CUDA-compatible alternatives, while the HGX H20 is an AI training chip derived from the H100 through firmware-level cuts, intended mainly to replace the A100/H800.

According to the documents, the new H20 uses CoWoS advanced packaging and adds one more HBM3 (high-bandwidth memory) stack, bringing memory to 96GB at an additional cost of about $240. Its FP16 dense computing power reaches 148 TFLOPS (trillion floating-point operations per second), only about 15% of the H100's, so buyers will need to spend more on algorithms and personnel to compensate. NVLink, however, is upgraded from 400GB/s to 900GB/s, a substantial increase in interconnect rate.
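As a rough sanity check on the "about 15%" figure: taking Nvidia's published FP16 dense Tensor Core rate for the H100 SXM of roughly 989 TFLOPS (a public spec sheet number, not from this article), the ratio works out as follows:

```python
# Check the "about 15%" claim. The H100 figure is an assumption taken
# from Nvidia's public spec sheet (FP16 dense Tensor Core, SXM form factor).
H20_FP16_DENSE_TFLOPS = 148.0   # from the specification document cited above
H100_FP16_DENSE_TFLOPS = 989.0  # assumption: published H100 SXM spec

ratio = H20_FP16_DENSE_TFLOPS / H100_FP16_DENSE_TFLOPS
print(f"H20 FP16 dense throughput is {ratio:.0%} of the H100's")
```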

According to industry evaluations, H100/H800 clusters are the mainstream choice for compute deployments. The theoretical limit of an H100 cluster is 50,000 cards with a maximum of 100,000 P of computing power; the largest practical H800 cluster is 20,000-30,000 cards, totaling about 40,000 P; and the largest practical A100 cluster is 16,000 cards with a maximum of 9,600 P.

The new H20 chip has the same theoretical limit of 50,000 cards, but each card delivers only 0.148 P, for a total of nearly 7,400 P, lower than H100/H800 and even A100 clusters. The achievable scale of an H20 cluster is therefore far from an H100 cluster's; balancing computing power against communication overhead, a reasonable median overall figure is about 3,000 P, meaning more money and more compute would be needed to train a 100-billion-parameter model.
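The cluster totals quoted above follow from multiplying card count by per-card compute (1 P = 1 PFLOPS = 1,000 TFLOPS). Card counts are the article's own figures; the per-card ratings here are derived from its totals (e.g. 2 P per H100 card from 100,000 P over 50,000 cards):

```python
# Reproduce the article's cluster-level totals (1 P = 1 PFLOPS).
# Card counts are those quoted above; per-card ratings are derived
# from the article's own totals, not independent measurements.
clusters = {
    # name: (cards, per-card compute in P)
    "H100": (50_000, 2.0),    # theoretical limit: 100,000 P
    "A100": (16_000, 0.6),    # maximum practical cluster: 9,600 P
    "H20":  (50_000, 0.148),  # theoretical limit: ~7,400 P
}
for name, (cards, per_card_p) in clusters.items():
    total_p = cards * per_card_p
    print(f"{name}: {cards:,} cards x {per_card_p} P/card = {total_p:,.0f} P")
```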

**Two semiconductor industry experts told Titanium Media App that, judging from the current performance parameters, it is very likely that Nvidia's B100 GPU will no longer be sold into the Chinese market next year.**

Overall, for a company that wants to train a GPT-4-class large model, the scale of the computing cluster is the core issue. At present only the H800 and H100 are up to the task of large-model training, while the domestic 910B, whose performance sits between the A100 and H100, is merely a "backup choice of last resort".

The new H20 that Nvidia is now launching is better suited to training and inference for vertical (domain-specific) models and cannot meet the training needs of trillion-parameter large models, but its overall performance is slightly higher than the 910B's. Combined with Nvidia's CUDA ecosystem, this may block off the only path left for domestic cards in China's AI chip market under the U.S. chip restrictions.

According to the latest financial report, in the quarter ended July 30, more than 85% of Nvidia’s $13.5 billion sales came from the United States and China, and only about 14% of sales came from other countries and regions.

Affected by the H20 news, as of the close of the U.S. stock market on November 9, Nvidia’s stock price rose slightly by 0.81% to close at $469.5 per share. In the past five trading days, Nvidia has risen by more than 10%, with the latest market value of $1.16 trillion.
