How small Chinese AI start-up DeepSeek shocked Silicon Valley
A small Chinese artificial intelligence lab stunned the world this week by revealing the technical recipe behind its cut-price reasoning model, turning its reclusive leader into a national hero who has defied US efforts to rein in China’s high-tech ambitions.
The company, founded by hedge fund manager Liang Wenfeng, released its R1 model along with an explanation of how to build a large language model that can automatically learn and improve itself.
US companies, including OpenAI and Google DeepMind, pioneered reasoning models, a relatively new field of AI research that attempts to give models cognitive abilities comparable to those of humans. San Francisco-based OpenAI released the full version of its o1 model in December but kept its methods secret.
The release of DeepSeek’s R1 set off a debate in Silicon Valley over whether better-resourced US AI companies, including Meta and Anthropic, can defend their technical edge.
Meanwhile, Liang has become a focal point of national pride. This week he was the only AI leader selected to attend a publicised meeting of entrepreneurs with Li Qiang, the country’s second most powerful official. The entrepreneurs were told to “concentrate efforts to break through key core technologies”.
In 2021, Liang began buying thousands of Nvidia graphics processing units for his quant trading fund High-Flyer. Industry insiders viewed it as the eccentric behaviour of a billionaire looking for a new hobby.
“When we first met him, he was this very nerdy guy with a terrible hairstyle, talking about building a 10,000-chip cluster to train his own models. We didn’t take him seriously,” said one of Liang’s business partners.
“He couldn’t articulate his vision other than to say: I want to build this, and it will be a game changer. We thought this was only possible for giants such as ByteDance and Alibaba,” the person added.
Liang’s status as an outsider in the AI field proved an unexpected source of strength. At High-Flyer, he built a fortune by using AI and algorithms to identify patterns that could affect stock prices. His team became adept at using Nvidia chips to make money trading stocks. In 2023, he launched DeepSeek, announcing his intention to develop human-level AI.
“Liang built an exceptional infrastructure team that really understood how the chips worked,” said the founder of a rival Chinese LLM company. “He took his best people with him from the hedge fund to DeepSeek.”
After Washington banned Nvidia from exporting its most powerful chips to China, local AI developers have had to find innovative ways to maximise the computing power of a limited number of onshore chips.
“DeepSeek’s engineers know how to unlock the potential of these GPUs, even if they are not state of the art,” said one AI researcher close to the company.
Industry insiders say DeepSeek’s singular focus on research makes it a dangerous competitor, because it is willing to share its breakthroughs rather than protect them for commercial gain. DeepSeek has not raised money from outside funds or taken significant steps to monetise its models.
“DeepSeek is run like the early days of DeepMind,” said an AI investor in Beijing. “It is purely focused on research and engineering.”
Liang, who is personally involved in DeepSeek’s research, uses proceeds from his hedge fund to pay top salaries for the best AI talent. Along with TikTok owner ByteDance, DeepSeek is known for offering the highest remuneration available to AI engineers in China, with staff based in its Hangzhou and Beijing offices.
“DeepSeek’s offices feel like a university campus for serious researchers,” said the business partner. “The team believes in Liang’s vision: to show the world that Chinese people can be creative and build something from scratch.”
DeepSeek and High-Flyer did not respond to a request for comment.
Liang has built DeepSeek as a uniquely “local” company, staffed with graduates from top Chinese universities such as Peking, Tsinghua and Beihang rather than with experts recruited from US institutions.
In an interview with domestic media last year, he said his core team “did not have anyone who returned from abroad. They are all local... We need to develop the top talent ourselves.” DeepSeek’s identity as a purely Chinese LLM company has won it plaudits at home.
DeepSeek claimed it used just 2,048 Nvidia H800s and $5.6mn to train a model with 671bn parameters, a fraction of what OpenAI and Google spent to train comparably sized models.
Ritwik Gupta, an AI policy researcher at the University of California, Berkeley, said DeepSeek’s recent model releases demonstrate that “there is no moat” when it comes to AI capabilities.
“Whoever trains the models first has to expend a lot of resources to get there,” he said. “But the second mover can get there more cheaply and more quickly.”
Gupta added that China has a much larger pool of systems engineers than the US who understand how to train and run models more cheaply.
Industry insiders say that while DeepSeek has achieved impressive results with limited resources, it remains an open question whether it can stay competitive as the industry evolves.
High-Flyer’s returns have suffered, with its flagship fund lagging in 2024, which some blame on the founder’s singular focus on DeepSeek.
Its rivals are not standing still. They are building mega “clusters” of Nvidia’s next-generation Blackwell chips, amassing computing power that threatens to open a wide gap with Chinese competitors.
This week OpenAI said it was creating a joint venture with Japan’s SoftBank, called Stargate, with plans to spend at least $100bn on AI infrastructure in the US. Elon Musk’s xAI is massively expanding its Colossus supercomputer to contain more than 1mn GPUs to help train its Grok AI models.
“DeepSeek has one of the largest advanced computing clusters in China,” said Liang’s business partner. “They have enough capacity for now, but not much more.”
Additional reporting by Wenjie Ding in Beijing