Cerebras just announced six new AI data centers that process 40M tokens per second – and that could be bad news for Nvidia

Cerebras Systems, the AI hardware startup that has been steadily challenging Nvidia's dominance of the artificial intelligence market, announced on Tuesday a significant expansion of its data center footprint and two major enterprise partnerships that position the company to become a leading provider of high-speed AI inference services.

The company will add six new data centers across North America and Europe, increasing its inference capacity twentyfold to over 40 million tokens per second. The expansion includes facilities in Dallas, Minneapolis, Oklahoma City, Montreal, New York and France, with 85% of the total capacity located in the United States.

“This year, our goal is to satisfy all the demand and all the new demand we expect will come online as a result of new models like Llama 4 and new DeepSeek models,” said James Wang, director of product marketing at Cerebras, in an interview with VentureBeat. “This is our huge growth initiative this year to satisfy the almost unlimited demand we're seeing across the board for inference tokens.”

The data center expansion represents the company's ambitious bet that the market for high-speed AI inference – the process in which trained AI models generate outputs for real-world applications – will grow dramatically as companies seek faster alternatives to Nvidia's GPU-based solutions.

Cerebras plans to expand from 2 million to over 40 million tokens per second by Q4 2025 across eight data centers in North America and Europe. (Credit: Cerebras)

Strategic partnerships that bring high-speed AI to developers and financial analysts

Alongside the infrastructure expansion, Cerebras announced partnerships with Hugging Face, the popular AI developer platform, and AlphaSense, a market intelligence platform widely used in the financial services industry.

The Hugging Face integration will allow its five million developers to access Cerebras Inference with a single click, without having to sign up for Cerebras separately. It represents a major distribution channel for Cerebras, particularly for developers working with open-source models such as Llama 3.3 70B.

“Hugging Face is kind of the GitHub of AI and the center of all open-source AI development,” Wang explained. “The integration is super nice and native. You just appear in their list of inference providers. You just check the box and then you can use Cerebras right away.”
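In practice, switching to Cerebras through Hugging Face amounts to a couple of lines of code. The Python sketch below shows roughly what that looks like with the huggingface_hub client; the provider="cerebras" flag and the Llama 3.3 70B model ID are assumptions based on the integration described here, so check Hugging Face's documentation for the exact parameters.

    from huggingface_hub import InferenceClient

    # Route requests through Cerebras as the inference provider.
    # The provider flag and model ID are assumptions, not verified values.
    client = InferenceClient(provider="cerebras", api_key="hf_...")

    response = client.chat_completion(
        messages=[{"role": "user", "content": "Explain wafer-scale inference in one sentence."}],
        model="meta-llama/Llama-3.3-70B-Instruct",
        max_tokens=120,
    )
    print(response.choices[0].message.content)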

The AlphaSense partnership represents a significant enterprise customer win, with the financial intelligence platform switching from what Wang described as a “global, top-three closed-source AI model vendor” to Cerebras. The company, which serves approximately 85% of Fortune 100 companies, is using Cerebras to speed up its AI-powered search capabilities for market intelligence.

“This is a huge customer win and a very large contract for us,” Wang said. “We speed them up by 10x, so what used to take five seconds or longer now happens almost instantly on Cerebras.”

Mistral’s Le Chat, powered by Cerebras, processes 1,100 tokens per second – significantly ahead of competitors such as Gemini, ChatGPT and Claude. (Credit: Cerebras)

How Cerebras wins the AI speed race as reasoning models slow down

Cerebras has positioned itself as a specialist in high-speed inference, claiming its Wafer-Scale Engine (WSE-3) processor can run AI models 10 to 70 times faster than GPU-based solutions. That speed advantage is becoming increasingly valuable as AI models evolve toward more complex reasoning capabilities.

“If you listen to Jensen’s remarks, reasoning is the next big thing, even according to Nvidia,” Wang said, referring to Nvidia CEO Jensen Huang. “But what he doesn’t tell you is that reasoning makes the whole thing run 10 times slower, because the model has to think and generate a bunch of internal monologue before it gives you the final answer.”
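A rough back-of-the-envelope calculation shows why raw throughput dominates time-to-answer once that internal monologue is involved. In the Python sketch below, the token counts and the 100-tokens-per-second GPU baseline are illustrative assumptions; the 1,100 tokens per second is Cerebras's claimed rate for Le Chat.

    # Illustrative latency math for a reasoning model. Token counts and the
    # GPU throughput are assumptions; 1,100 tok/s is Cerebras's claimed rate.
    reasoning_tokens = 2_700   # hidden "internal monologue" tokens (assumed)
    answer_tokens = 300        # visible answer tokens (assumed)
    total_tokens = reasoning_tokens + answer_tokens

    for name, tokens_per_sec in [("GPU baseline", 100), ("Cerebras (claimed)", 1_100)]:
        print(f"{name}: {total_tokens / tokens_per_sec:.1f}s to final answer")
    # GPU baseline: 30.0s to final answer
    # Cerebras (claimed): 2.7s to final answer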

That slowdown creates an opportunity for Cerebras, whose specialized hardware is designed to accelerate these more complex AI workloads. The company has already secured high-profile customers including Perplexity AI and Mistral AI, which use Cerebras to power their AI search and assistant products, respectively.

“We help Perplexity become the world’s fastest AI search engine. This just isn’t possible otherwise,” Wang said. “We help Mistral achieve the same feat. Now they have a reason for people to subscribe to Le Chat Pro, whereas before, your model is probably not the same cutting-edge level as GPT-4.”

Cerebras hardware delivers output speeds up to 13 times faster than GPU solutions on popular AI models such as Llama 3.3 70B and DeepSeek R1 70B. (Credit: Cerebras)

The compelling economics behind Cerebras’s challenge to OpenAI and Nvidia

Cerebras is betting that the combination of speed and cost will make its inference services attractive even to companies already using leading models like GPT-4.

Wang pointed out that Meta’s Llama 3.3 70B, an open-source model that Cerebras has optimized for its hardware, now scores the same on intelligence benchmarks as OpenAI’s GPT-4, while costing significantly less to run.

“Anyone who’s using GPT-4 today can just move to Llama 3.3 70B as a drop-in replacement,” he explained. “The price for GPT-4 is (about) $4.40 in blended terms. And Llama 3.3 is like 60 cents. We’re about 60 cents, right? So you reduce cost by almost an order of magnitude. And if you use Cerebras, you increase speed by another order of magnitude.”
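Taken at face value, Wang's figures work out to roughly a 7x cost reduction before any speed advantage is counted. Here is a quick sanity check in Python, where the per-million-token rates come from the quote above and the monthly volume is a hypothetical example, not a figure from the article:

    # Cost comparison using the blended per-million-token rates quoted above.
    # The 500M-token monthly volume is a hypothetical workload.
    gpt4_rate = 4.40    # USD per million tokens (blended, per Wang)
    llama_rate = 0.60   # USD per million tokens on Cerebras (per Wang)

    monthly_tokens_m = 500  # hypothetical: 500 million tokens per month
    gpt4_cost = gpt4_rate * monthly_tokens_m
    llama_cost = llama_rate * monthly_tokens_m

    print(f"GPT-4:         ${gpt4_cost:,.2f}/month")   # $2,200.00/month
    print(f"Llama 3.3 70B: ${llama_cost:,.2f}/month")  # $300.00/month
    print(f"Ratio: {gpt4_cost / llama_cost:.1f}x cheaper")  # 7.3x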

Inside Cerebras’s tornado-proof data centers built for AI resilience

The company is making significant investments in resilient infrastructure as part of its expansion. Its Oklahoma City facility, scheduled to come online in June 2025, is designed to withstand extreme weather events.

“Oklahoma, as you know, is a tornado zone. So this data center is actually rated and designed to be fully resistant to tornadoes and seismic activity,” Wang said. “It will withstand the strongest tornado ever recorded. If that thing just comes through, this thing will just keep sending Llama tokens to developers.”

The Oklahoma City facility, operated in partnership with Scale Datacenter, will house over 300 Cerebras CS-3 systems and features triple-redundant power stations and custom water-cooling solutions designed specifically for Cerebras’s wafer-scale systems.

Built to withstand extreme weather, the facility will house over 300 Cerebras CS-3 systems when it opens in June 2025, along with redundant power and specialized cooling systems. (Credit: Cerebras)

From skepticism to market leadership: how Cerebras is proving its value

The expansion and partnerships announced today mark a significant milestone for Cerebras, which has been working to prove itself in an AI hardware market dominated by Nvidia.

“I think what was reasonable skepticism about customer adoption, maybe when we first launched, I think that’s now fully put to bed, just given the diversity of logos we have,” Wang said.

The company is targeting three specific areas where fast inference provides the most value: real-time voice and video processing, reasoning models, and coding applications.

“Coding is one of these things in between reasoning and regular Q&A, where it may take 30 seconds to a minute to generate all the code,” Wang explained. “Speed is directly proportional to developer productivity. So speed matters.”

By focusing on high-speed inference rather than competing across all AI workloads, Cerebras has found a niche where it can claim leadership over even the largest cloud providers.

“Nobody generally competes against AWS and Azure at their scale. Obviously, we’re not trying to reach full scale like them, but if we can replicate a key segment … on the high-speed inference front, we will have more capacity than them,” Wang said.

Why Cerebras’s data center expansion matters for AI sovereignty and future workloads

The expansion comes as the AI industry increasingly focuses on inference capabilities, with companies moving from experimenting with generative AI to deploying it in production applications where speed and cost efficiency are crucial.

With 85% of its inference capacity located in the United States, Cerebras is also positioning itself as a key player in advancing domestic AI infrastructure at a time when technological sovereignty has become a national priority.

“Cerebras is turbocharging the future of US AI leadership with unmatched performance, scale and efficiency – these new global data centers will serve as the backbone for the next wave of AI innovation,” said Cerebras Systems’ COO in the company’s statement.

As reasoning models such as DeepSeek R1 and OpenAI’s o3 become more common, demand for faster inference solutions is likely to grow. These models, which can take minutes to generate answers on traditional hardware, run almost instantly on Cerebras systems, according to the company.

For technical decision-makers evaluating AI infrastructure options, Cerebras’s expansion represents a significant new alternative to GPU-based solutions, particularly for applications where response time is critical to user experience.

Whether the company can truly challenge Nvidia’s dominance of the broader AI hardware market remains to be seen, but its focus on high-speed inference and substantial infrastructure investment demonstrates a clear strategy for carving out a valuable segment of the rapidly evolving AI landscape.


 