DeepSeek Has OpenAI Fired Up

It’s been a little over a week since DeepSeek upended the AI world. The introduction of its open-weight model, apparently trained on a fraction of the specialized computing chips that industry leaders rely on, sent shock waves through OpenAI. Employees not only claimed to see signs that DeepSeek had “inappropriately distilled” OpenAI’s models to create its own, but the startup’s success also had Wall Street questioning whether companies like OpenAI are wildly overspending on compute.

“DeepSeek R1 is AI’s Sputnik moment,” wrote Marc Andreessen, one of Silicon Valley’s most influential and provocative inventors, on X.

In response, OpenAI is preparing to launch a new model today, ahead of its originally planned schedule. The model, o3-mini, will debut in both the API and chat. Sources say it offers o1-level reasoning at 4o-level speed. In other words, it is fast, cheap, smart, and designed to crush DeepSeek.

The moment has galvanized OpenAI employees. Inside the company, there is a feeling that, especially as DeepSeek dominates the conversation, OpenAI must become more efficient or risk falling behind its newest competitor.

Part of the problem stems from OpenAI’s origins as a nonprofit research organization before it became a profit-seeking powerhouse. An ongoing power struggle between the research and product groups, employees say, has led to a rift between the teams working on advanced reasoning and those working on chat. (OpenAI spokesperson Niko Felix says this is “incorrect” and notes that the leaders of these teams, Chief Product Officer Kevin Weil and research chief Mark Chen, “meet every week and work closely to align on product and research priorities.”)

Some inside OpenAI want the company to build a unified chat product: a single model that can tell whether a question requires advanced reasoning. So far, that has not happened. Instead, a drop-down menu in ChatGPT prompts users to decide whether they want to use GPT-4o (“great for most questions”) or o1 (“uses advanced reasoning”).

Some employees claim that while chat brings in the lion’s share of OpenAI’s revenue, o1 gets more attention, and more computing resources, from leadership. “Leadership doesn’t care about chat,” says a former employee who worked on (you guessed it) chat. “Everyone wants to work on o1 because it’s sexy, but the code base wasn’t built for experimentation, so there’s no momentum.” The former employee asked to remain anonymous, citing a nondisclosure agreement.

OpenAI spent years experimenting with reinforcement learning to fine-tune the model that eventually became the advanced reasoning system called o1. (Reinforcement learning is a process that trains AI models with a system of penalties and rewards.) DeepSeek built on the reinforcement learning work that OpenAI pioneered to create its own advanced reasoning system, called R1. “They benefited from knowing that reinforcement learning, applied to language models, works,” says a former OpenAI researcher who is not authorized to speak publicly about the company.

“What DeepSeek did is similar to what we did at OpenAI,” says another former OpenAI researcher, “but they did it with better data and a cleaner stack.”

OpenAI employees say the research that went into o1 was done in a code base, called the “berry” stack, that was built for speed. “There were trade-offs: experimental rigor for throughput,” says a former employee with direct knowledge of the situation.

Those trade-offs made sense for o1, which was essentially an enormous experiment, code base limitations notwithstanding. They made less sense for chat, a product used by millions of users and built on a different, more reliable stack. When o1 launched and became a product, cracks began to appear in OpenAI’s internal processes. “It was like, ‘Why are we doing this in the experimental code base? Shouldn’t we do this in the main product research code base?’” the employee explains. “There was major pushback to that internally.”

 