Google’s new Ironwood chip is 24x more powerful than the world’s fastest supercomputer
Google Cloud unveiled its seventh-generation Tensor Processing Unit (TPU), called Ironwood, on Wednesday: a custom AI accelerator that the company claims delivers more than 24 times the computing power of the world’s fastest supercomputer when deployed at scale.
The new chip, announced at Google Cloud Next ’25, represents a significant pivot in Google’s decade-long AI chip development strategy. While previous TPU generations were designed for both training and inference, Ironwood is the first built specifically for inference: the process of running trained AI models to make predictions or generate responses.
“Ironwood is built to support this next phase of generative AI and its tremendous computational and communication requirements,” said Amin Vahdat, Google’s vice president and general manager of ML, Systems and Cloud AI, in a virtual press conference ahead of the event. “This is what we call the ‘age of inference,’ where AI agents will proactively retrieve and generate data to collaboratively deliver insights and answers, not just data.”
Breaking compute barriers: Inside Ironwood’s 42.5 exaflops of AI muscle
Ironwood’s technical specifications are striking. When scaled to 9,216 chips per pod, Ironwood delivers 42.5 exaflops of computing power, dwarfing the 1.7 exaflops of El Capitan, currently the world’s fastest supercomputer. Each individual Ironwood chip delivers a peak compute of 4,614 teraflops.
Ironwood also features significant memory and bandwidth improvements. Each chip comes with 192GB of High Bandwidth Memory (HBM), six times more than Trillium, Google’s previous-generation TPU announced last year. Memory bandwidth reaches 7.2 terabytes per second per chip, a 4.5x improvement over Trillium.
Perhaps most important in an era of power-constrained data centers, Ironwood delivers twice the performance per watt compared to Trillium, and is nearly 30 times more power efficient than Google’s first Cloud TPU from 2018.
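A quick back-of-the-envelope check, using only the figures quoted above, shows how the headline comparison is derived (note this compares peak numbers as reported, not identical benchmarks):

```python
# Sanity-check of the spec figures cited in the article.
PER_CHIP_TFLOPS = 4_614      # peak per-chip compute, teraflops
CHIPS_PER_POD = 9_216        # chips in a full Ironwood pod
EL_CAPITAN_EF = 1.7          # reported peak of the fastest supercomputer, exaflops

pod_ef = PER_CHIP_TFLOPS * CHIPS_PER_POD / 1e6   # teraflops -> exaflops
ratio = pod_ef / EL_CAPITAN_EF

print(f"{pod_ef:.1f} exaflops per pod")   # ~42.5, matching the article
print(f"{ratio:.0f}x El Capitan")         # ~25x, hence "more than 24 times"
```

The raw ratio comes out to roughly 25x, which is consistent with the article’s more conservative “more than 24 times” framing.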
“At a time when available power is one of the constraints for delivering AI capabilities, we deliver significantly more capacity per watt for customer workloads,” Vahdat explained.
From model building to “thinking machines”: Why Google’s inference focus matters now
The shift in focus from training to inference represents a significant inflection point in the AI timeline. For years, the industry has concentrated on building ever-larger foundation models, with companies competing primarily on parameter counts and training capabilities. Google’s pivot to inference optimization suggests we are entering a new phase, where deployment efficiency and reasoning capability take center stage.
This transition makes sense. Training happens once, but inference operations occur billions of times daily as users interact with AI systems. The economics of AI are increasingly tied to inference costs, especially as models grow more complex and computationally intensive.
During the press conference, Vahdat revealed that Google has observed a 10x year-over-year increase in demand for AI compute over the past eight years, a factor of 100 million in total. No amount of Moore’s Law progression could satisfy this growth curve without specialized architectures like Ironwood.
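The gap between that demand curve and conventional chip scaling is easy to see with simple arithmetic (the two-year doubling cadence below is the classic Moore’s-law rule of thumb, not a figure from the press conference):

```python
# Demand figure cited above: 10x growth per year, sustained for eight years.
demand_factor = 10 ** 8            # total growth -> 100,000,000x

# Rule-of-thumb Moore's-law pacing: density doubles roughly every two years.
moores_law_factor = 2 ** (8 / 2)   # over the same eight years -> 16x

print(f"Demand grew {demand_factor:,}x; Moore's law would supply ~{moores_law_factor:.0f}x")
```

A 100-million-fold demand increase against a roughly 16x improvement from process scaling alone is the argument for purpose-built accelerators.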
Particularly notable is the emphasis on “thinking models” that perform complex reasoning tasks rather than simple pattern recognition. This suggests Google sees the future of AI not just in larger models, but in models that can break down problems, reason through multiple steps, and essentially simulate human-like thought processes.
Powering Gemini’s thinking: How Google’s next-generation models leverage advanced hardware
Google positioned Ironwood as the foundation for its most advanced AI models, including Gemini 2.5, which the company describes as having “thinking capabilities natively built in.”
At the conference, Google also announced Gemini 2.5 Flash, a more cost-effective version of its flagship model that “adjusts the depth of reasoning based on a prompt’s complexity.” While Gemini 2.5 Pro is designed for complex use cases such as drug discovery and financial modeling, Gemini 2.5 Flash is positioned for everyday applications where responsiveness is critical.
The company also demonstrated its full suite of generative media models, including text-to-image, text-to-video, and a newly announced text-to-music capability called Lyria. A demonstration showed how these tools could be used together to create a complete promotional video for a concert.
Beyond silicon: Google’s comprehensive infrastructure strategy spans network and software
Ironwood is just one part of Google’s broader AI infrastructure strategy. The company also announced Cloud WAN, a managed wide-area network service that gives businesses access to Google’s planet-scale private network infrastructure.
“Cloud WAN is a fully managed, reliable and secure enterprise networking backbone that provides up to 40% improved network performance, while also reducing total cost of ownership by that same 40%,” Vahdat said.
Google is also expanding its software offerings for AI workloads, including Pathways, its machine learning runtime developed by Google DeepMind. Pathways on Google Cloud allows customers to scale model serving across hundreds of TPUs.
AI economics: How Google’s $12 billion cloud business plans to win the efficiency war
These hardware and software announcements come at a crucial moment for Google Cloud, which reported $12 billion in Q4 2024 revenue, up 30% year over year, in its latest earnings report.
The economics of AI deployment are increasingly becoming a differentiating factor in the cloud wars. Google faces intense competition from Microsoft Azure, which has leveraged its OpenAI partnership into a formidable market position, and Amazon Web Services, which continues to expand its Trainium and Inferentia chip offerings.
What sets Google’s approach apart is its vertical integration. While rivals have partnered with chip manufacturers or acquired startups, Google has been developing TPUs in-house for more than a decade. This gives the company unmatched control over its AI stack, from silicon to software to services.
By bringing this technology to enterprise customers, Google is betting that its hard-won experience building chips for Search, Gmail, and YouTube will translate into a competitive advantage in the enterprise market. The strategy is clear: offer the same infrastructure that powers Google’s own AI, at scale, to anyone willing to pay for it.
The multi-agent ecosystem: Google’s plan for AI systems that work together
Beyond hardware, Google outlined a vision for AI centered on multi-agent systems. The company announced an Agent Development Kit (ADK) that allows developers to build systems in which multiple AI agents can work together.
Most significantly, Google announced an agent-to-agent interoperability protocol (A2A), which enables AI agents built on different frameworks and by different vendors to communicate with each other.
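The article does not detail the protocol itself, but the core idea of cross-vendor agent interoperability is a shared, framework-neutral message format. As a purely illustrative sketch (the field names and structure below are hypothetical and are not taken from the A2A specification):

```python
import json

# Hypothetical illustration of framework-neutral agent messaging.
# Agent names, field names, and message shape are invented for this sketch.
def build_task_request(sender: str, recipient: str, task: str) -> str:
    """Serialize a task for another agent as vendor-neutral JSON."""
    return json.dumps({
        "from": sender,
        "to": recipient,
        "task": task,
    })

# An agent on one vendor's framework sends a task...
request = build_task_request("crm-agent", "invoice-agent", "summarize Q1 invoices")

# ...and an agent on any other framework can parse it, because both sides
# agree on the wire format rather than on a shared runtime.
parsed = json.loads(request)
print(parsed["to"])  # invoice-agent
```

The point of a standard like this is that agreement happens at the message level, so neither agent needs to know what framework or vendor the other is built on.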
“2025 will be a transition year where generative AI shifts from answering single questions to solving complex problems through agentic systems,” Vahdat predicted.
Google has partnered with more than 50 industry leaders, including Salesforce, ServiceNow, and SAP, to advance this interoperability standard.
Enterprise reality check: What Ironwood’s power and efficiency mean for your AI strategy
For enterprises deploying AI, these announcements could significantly reduce the cost and complexity of running sophisticated AI models. Ironwood’s improved efficiency could make advanced reasoning models more economical to run, while the agent interoperability protocol could help businesses avoid vendor lock-in.
The real-world impact of these advances should not be underestimated. Many organizations have been reluctant to deploy advanced AI models due to prohibitive infrastructure costs and energy consumption. If Google can deliver on its performance-per-watt promises, we could see a new wave of AI adoption in industries that have so far remained on the sidelines.
The multi-agent approach is equally significant for enterprises daunted by the complexity of deploying AI across disparate systems and vendors. By standardizing how AI systems communicate, Google is attempting to break down the silos that have limited AI’s enterprise impact.
During the press conference, Google stressed that over 400 customer stories would be shared at Next ’25, showcasing real business impact from its AI innovations.
The silicon arms race: Will AI’s future rest on custom chips and open standards?
As AI continues to advance, the infrastructure powering it will become increasingly critical. Google’s investments in specialized hardware like Ironwood, combined with its agent interoperability initiatives, suggest the company is positioning itself for a future in which AI becomes more distributed, more complex, and more deeply integrated into business operations.
“Leading thinking models like Gemini 2.5 and the Nobel Prize-winning AlphaFold all run on TPUs today,” Vahdat said. “With Ironwood, we can’t wait to see what AI breakthroughs are sparked by our own developers and Google Cloud customers when it becomes available later this year.”
The strategic implications extend beyond Google’s own business. By pushing open standards in agent communication while maintaining proprietary advantages in hardware, Google is attempting a delicate balancing act. The company wants the broader ecosystem to flourish (with Google infrastructure underneath) while maintaining competitive differentiation.
How quickly competitors respond to Google’s hardware advances, and whether the industry coalesces around the proposed agent interoperability standards, will be key factors to watch in the coming months. If history is any guide, we can expect Microsoft and Amazon to counter with their own inference-optimization strategies, potentially creating a three-way race to build the most efficient AI infrastructure stack.