DeepSeek: Everything you need to know about the AI chatbot app


DeepSeek has gone viral.

Chinese AI lab DeepSeek broke into mainstream consciousness this week after its chatbot app rose to the top of the Apple App Store charts. DeepSeek’s AI models, which were trained using compute-efficient techniques, have led Wall Street analysts and technologists to question whether the US can retain its lead in the AI race and whether demand for AI chips will hold up.

But where did DeepSeek come from, and how did it rise to international fame so quickly?

DeepSeek’s trader origins

DeepSeek is backed by High-Flyer Capital Management, a Chinese quantitative hedge fund that uses AI to inform its trading decisions.

AI enthusiast Liang Wenfeng co-founded High-Flyer in 2015. According to reports, Wenfeng focused on developing and deploying AI algorithms.

In 2023, High-Flyer started DeepSeek as a lab dedicated to researching AI tools, separate from its financial business. With High-Flyer as one of its investors, the lab spun off into its own company, also called DeepSeek.

From day one, DeepSeek has built its own data center clusters for model training. But like other AI companies in China, DeepSeek has been affected by US export bans. To train one of its more recent models, the company was forced to use Nvidia H800 chips, a less powerful version of the H100 chip available to US companies.

DeepSeek’s technical team is said to skew young. The company reportedly recruits doctoral AI researchers aggressively from top Chinese universities. DeepSeek also hires people without any computer science background to help its technology better understand a wide range of subjects, according to The New York Times.

DeepSeek’s strong models

DeepSeek unveiled its first set of models (DeepSeek Coder, DeepSeek LLM, and DeepSeek Chat) in November 2023. But it wasn’t until last spring, when the company launched its next-generation DeepSeek-V2 family of models, that the AI industry began to take notice.

DeepSeek-V2, a general-purpose text and image model, performed well in various AI benchmarks and was far cheaper to run than comparable models at the time. It forced DeepSeek’s domestic competition, including ByteDance and Alibaba, to cut the usage prices for some of their models and make others completely free.

DeepSeek-V3, launched in December 2024, only added to DeepSeek’s fame.

According to DeepSeek’s internal testing, DeepSeek V3 outperforms both downloadable models such as Meta’s Llama and “closed” models that can only be accessed through an API, such as OpenAI’s GPT-4o.

Equally impressive is DeepSeek’s R1 reasoning model. Released in January, R1 performs as well as OpenAI’s o1 model on key benchmarks, according to DeepSeek.

As a reasoning model, R1 effectively fact-checks itself, which helps it avoid some of the pitfalls that usually trip up models. Reasoning models take a little longer (generally seconds to minutes longer) to arrive at solutions compared to a typical non-reasoning model. The upside is that they tend to be more reliable in domains such as physics, science, and math.

There is a downside to R1, DeepSeek V3, and DeepSeek’s other models, however. Being Chinese-developed AI, they are subject to benchmarking by China’s internet regulator to ensure that their responses “embody core socialist values.” In DeepSeek’s chatbot app, for example, R1 will not answer questions about Tiananmen Square or Taiwan’s autonomy.

A disruptive approach

If DeepSeek has a business model, it is not clear what that model is, exactly. The company prices its products and services well below market value, and gives others away for free.

The way DeepSeek tells it, efficiency breakthroughs have enabled it to stay highly cost-competitive. Some experts dispute the figures the company has provided, however.

Whatever the case, developers have taken to DeepSeek’s models, which are not open source as the phrase is commonly understood but are available under permissive licenses that allow commercial use. According to Clem Delangue, CEO of Hugging Face, one of the platforms hosting DeepSeek’s models, developers on Hugging Face have created over 500 “derivative” models of R1 that have racked up 2.5 million downloads combined.

DeepSeek’s success against larger and more established rivals has been described as “upending AI” and ushering in a “new age of AI brinkmanship.” The company’s success was at least partly responsible for causing Nvidia’s stock price to drop by 18% on Monday, and for eliciting a public response from OpenAI CEO Sam Altman.

As for what DeepSeek’s future might hold, it is not clear. Improved models are a given. But the US government appears to be growing wary of what it perceives as harmful foreign influence.

TechCrunch has an AI-focused newsletter! Sign up here to get it in your inbox every Wednesday.

 