U.S. Sanctions Drive Chinese Firms To Advance In AI

Chinese tech companies have been forced to find a workaround to develop their own semiconductor technologies / Baidu.
U.S. sanctions are spurring Chinese tech companies to accelerate research to develop cutting-edge artificial intelligence without relying on the latest American chips. 

U.S. sanctions on Chinese tech companies such as Huawei and ZTE have spurred heavy investment in domestic chip manufacturing, with firms including SMIC (Semiconductor Manufacturing International Corporation) and Huawei's HiSilicon pouring resources into developing their own semiconductor technologies.

Recent research papers and interviews with employees found that Chinese companies are studying techniques that could allow them to achieve state-of-the-art AI performance with fewer or less powerful semiconductors. 

U.S. restrictions on semiconductor exports mean that Chinese AI developers no longer have access to the Nvidia A100 chips favored by the industry.

Chinese researchers are also exploring how to combine different types of chips to avoid relying on any one kind of hardware. Telecommunications provider Huawei Technologies, search firm Baidu and e-commerce giant Alibaba Group are among those seeking ways to milk more utility out of their existing computer chips. 

Using these workarounds to catch up with American AI leaders remains a significant challenge, researchers and analysts said. Some experiments have shown promise, however, and if the research advances successfully, it could help Chinese tech firms both weather American sanctions and become more resilient to future restrictions, they said. 

Huawei and Baidu declined to comment. Alibaba didn’t respond to a request for comment. As the race heats up to commercialize ChatGPT-like models, companies globally are in need of more powerful chips and seeking ways to squeeze more out of them to drive down the exploding costs of AI development. 

For Chinese companies, the issue is more critical: U.S. sanctions have cut them off from the most advanced chips made by the likes of Nvidia and they have rapidly consumed existing American chip stocks to create their own ChatGPT equivalents, say employees, AI researchers and industry analysts.

“You can just tell, reading between the lines, that they’re trying to find any compute under the sun to compensate for the lack of top-tier hardware,” said Susan Zhang, an AI researcher at Meta Platforms who specializes in AI infrastructure and large language models. In the AI industry, compute refers to the amount of computing power available from a set of chips.

Beijing’s highest decision-making body said last month that China should encourage innovation in the development of artificial general intelligence. Since the Commerce Department imposed sweeping restrictions on supplying chips to China last October, the Biden administration has indicated it could impose further sanctions.

Huawei vowed to support its devices after Google’s Android ban / SCP.
Chinese companies are cut off from Nvidia’s A100 chips, the industry’s most popular for AI development, and from the next-generation H100, released in March, which offers more computational power. To meet sanction requirements, Nvidia created downgraded versions of these chips for the Chinese market, called the A800 and H800, respectively. Both modified chips have a reduced capacity to communicate with other chips.

The products provide an effective alternative for developing small-scale AI models, such as those used in the recommendation algorithm driving ByteDance’s short-video app TikTok. But the handicap throttles the development of larger AI models, which require the coordination of hundreds or thousands of chips. 

Chinese companies such as Alibaba and Baidu, which stockpiled A100s before the sanctions, have heavily restricted the use of foreign advanced chips internally, reserving them for the most computationally intensive tasks, according to people familiar with the matter. Baidu suspended use of its A100s across teams, including its self-driving unit, to pool them for the development of its ChatGPT equivalent, Ernie Bot, before its launch date, the Journal previously reported.

Baidu has sought in recent years to incorporate domestic chips into its AI development, including Hygon Information Technology’s DCU and Huawei’s AI training chip Ascend, as well as its own called Kunlun, according to open-source research papers and people familiar with the matter. Many of the domestic chips remain unreliable for training large-scale models, however, because they are prone to crashing, some of the people said. 

One of the newest chatbots / Baidu.
OpenAI released ChatGPT a month after the chip sanctions were announced. The launch triggered a global frenzy to develop generative AI, software that can produce text and images and requires an unprecedented amount of computational power to develop. UBS analysts estimate that it takes between 5,000 and 10,000 A100 chips to train these kinds of large AI models. OpenAI didn’t respond to a request for comment.

A survey by a Chinese-government-linked semiconductor industry association released at a recent closed-door industry conference showed the supply constraints, finding that there were around 40,000 to 50,000 A100s in China available for training large-scale AI models, according to a person who attended the meeting. The association didn’t respond to a request for comment.

Many Chinese firms are now trying to combine three or four less-advanced chips, including the A800 and H800, to simulate the performance of one of Nvidia’s most powerful processors, according to Yang You, a professor at the National University of Singapore who runs an AI infrastructure company, HPC-AI Tech. In April, Tencent unveiled a new computing cluster, a set of connected chips for large-scale AI model training, built with Nvidia’s H800s. 

This approach can be costly: if a U.S. firm needs 1,000 H100s to train a large language model, a Chinese firm could need 3,000 or more H800s to achieve the same results, Mr. You said. That is driving some firms to accelerate work on techniques for training large-scale AI models across different types of chips, Mr. You said, an area of research that was already common among Chinese firms with limited hardware resources and an eye on cutting costs. 

Alibaba, Baidu and Huawei have sought to use various combinations of A100s, older-generation Nvidia chips known as V100s and P100s, and Huawei Ascends, papers show. By contrast, using multiple types of chips is rarely seen among U.S. companies because of the technical challenges of getting them to work reliably, AI experts said. “This is a last-ditch resort,” Meta’s Ms. Zhang said. 

Alibaba, the technology giant founded by Jack Ma, is one of the most powerful holding companies in the world / SCP.
In parallel, Chinese firms have sought to use various software techniques to reduce the computational intensity of training large-scale AI models, an approach that has accelerated globally, including among U.S. companies. Unlike U.S. companies, however, Chinese companies have been more aggressive in combining multiple software techniques together, papers show.

While many of these methods are still being ironed out in the global research community and are difficult to implement, Chinese researchers have seen some success. In a paper in March, Huawei researchers demonstrated how they could use such techniques to train its latest-generation large language model using only the company’s Ascend chips and without Nvidia chips. 

Despite some shortcomings, the model, known as PanGu-Σ, reached state-of-the-art performance on a few Chinese-language tasks, including reading comprehension and grammar challenges, the researchers wrote in the paper. Dylan Patel, the chief analyst at semiconductor research and consulting firm SemiAnalysis, said Chinese researchers’ pain points will only worsen without access to the new Nvidia H100, which includes an extra performance-boosting feature especially helpful for training ChatGPT-like models. 

But a paper last year from Baidu and Peng Cheng Laboratory, a Shenzhen-based research institute, showed researchers were training large language models in a way that would make the feature unnecessary. Mr. Patel said it looked promising even though the research was in its early stages. “If it works well, they can effectively circumvent the sanctions,” he said.