Hands-on with Gemini 2.5 Pro: Why it might be the most useful reasoning model yet
Unfortunately for Google, the release of its most capable large language model to date, Gemini 2.5 Pro, was buried under the Studio Ghibli AI image storm that sucked the air out of the AI space. And, perhaps wary of its previous botched launches, Google cautiously presented it as its "most intelligent AI model," instead of following other AI labs' habit of introducing their new models as the best in the world.
However, hands-on experiments with real-world examples show that Gemini 2.5 Pro is truly impressive and may currently be the best reasoning model available. This opens the way for many new applications and could put Google at the head of the generative AI race.

Long context with strong coding abilities
The standout feature of Gemini 2.5 Pro is its very long context window and output length. The model can process up to 1 million tokens (soon to be 2 million), which makes it possible to fit multiple long documents and entire codebases into the prompt when needed. The model also has an output limit of 64,000 tokens, versus roughly 8,000 for other Gemini models.
The long context window also allows for extended conversations, since any single interaction with a reasoning model can generate tens of thousands of tokens, especially when it involves code, images and video. (I ran into this problem with Claude 3.7 Sonnet, which has a 200,000-token window.)
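For readers who want a feel for what this looks like in practice, here is a minimal sketch using the google-generativeai Python SDK. The model ID, the project directory and the file-gathering logic are my own illustrative assumptions, not details from Google's announcement:

```python
import os
import pathlib

import google.generativeai as genai

genai.configure(api_key=os.environ["GOOGLE_API_KEY"])

# Model ID for the preview release; check the docs for the current name.
model = genai.GenerativeModel("gemini-2.5-pro-exp-03-25")

# Concatenate an entire (hypothetical) codebase into a single prompt.
repo = pathlib.Path("my_project")
code = "\n\n".join(
    f"# FILE: {p}\n{p.read_text()}" for p in sorted(repo.rglob("*.py"))
)
prompt = f"Here is my codebase:\n{code}\n\nExplain how the modules fit together."

# With a ~1M-token window, whole repos often fit; verify before sending.
print(model.count_tokens(prompt).total_tokens)

response = model.generate_content(
    prompt,
    generation_config={"max_output_tokens": 64000},
)
print(response.text)
```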
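A sketch of why extended sessions strain the window, again with the same SDK and an illustrative placeholder prompt: each turn re-sends the accumulated history, so a reasoning model's long outputs consume the context quickly.

```python
import os

import google.generativeai as genai

genai.configure(api_key=os.environ["GOOGLE_API_KEY"])
model = genai.GenerativeModel("gemini-2.5-pro-exp-03-25")

# Multi-turn sessions re-send accumulated history on every turn.
chat = model.start_chat()
chat.send_message("Review this SVG and list its layout problems: <svg>...</svg>")
reply = chat.send_message("Now rewrite the SVG, fixing those problems.")
print(reply.text)

# Check how much of the ~1M-token window the session has consumed.
print(model.count_tokens(chat.history).total_tokens)
```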
For example, software engineer Simon Willison used Gemini 2.5 Pro to build a new feature for his website. "It crunched through my entire codebase and figured out all of the places I needed to change, 18 files in total, as you can see in the resulting PR," Willison wrote on his blog, noting the model averaged roughly three minutes per file he had to modify.
Impressive multimodal reasoning
Gemini 2.5 Pro also shows impressive reasoning over unstructured text, images and video. For example, I gave it the text of my recent article about sampling-based search and prompted it to create an SVG graphic depicting the algorithm described in the text. Gemini 2.5 Pro properly extracted the key information from the article and created a flowchart of the sampling-and-search process, even getting the conditional steps right. (For reference, the same task took multiple interactions with Claude 3.7 Sonnet, and I ultimately ran into its output token limit.)
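To give a flavor of this kind of prompt, here is a rough sketch; the file name and instruction wording are placeholders, not the exact prompt I used:

```python
import os

import google.generativeai as genai

genai.configure(api_key=os.environ["GOOGLE_API_KEY"])
model = genai.GenerativeModel("gemini-2.5-pro-exp-03-25")

# Placeholder path for the article text.
article = open("sampling_based_search_article.txt").read()

response = model.generate_content(
    "Read the following article and produce a self-contained SVG flowchart "
    "of the algorithm it describes, including any conditional branches:\n\n"
    + article
)

# The model returns SVG markup as text; save it for rendering in a browser.
with open("flowchart.svg", "w") as f:
    f.write(response.text)
```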

The generated image had some visual errors (the arrows were wrong) and could use a facelift, so I then tested Gemini 2.5 Pro with a multimodal prompt, giving it a screenshot of the rendered SVG file along with the code and prompting it to improve the graphic. The results were impressive: It corrected the arrows and improved the visual quality of the diagram.
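Roughly, the multimodal follow-up looks like this (a sketch with placeholder file names; the SDK accepts PIL images directly in the content list):

```python
import os

import google.generativeai as genai
from PIL import Image

genai.configure(api_key=os.environ["GOOGLE_API_KEY"])
model = genai.GenerativeModel("gemini-2.5-pro-exp-03-25")

screenshot = Image.open("flowchart_render.png")  # screenshot of rendered SVG
svg_code = open("flowchart.svg").read()

response = model.generate_content([
    "Here is a rendered diagram and its SVG source. The arrows are wrong "
    "and the layout could use a facelift. Fix the arrows, improve the "
    "visual quality and return the corrected SVG.",
    screenshot,
    svg_code,
])
with open("flowchart_v2.svg", "w") as f:
    f.write(response.text)
```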

Other users have had similar experiences with multimodal prompts. For example, in their tests, DataCamp recreated the runner game example featured on the Google blog, then provided the code and a video of the game to Gemini 2.5 Pro and prompted it to make some changes to the game code. The model was able to reason over the visuals, find the part of the code that needed to change and make the right modifications.
However, it is worth noting that, like other generative models, Gemini 2.5 Pro is prone to mistakes, such as changing unrelated files and code segments. The more precise your instructions are, the lower the risk of the model making incorrect changes.
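A sketch of that workflow with the SDK's File API (file names and the prompt are placeholders; uploaded videos must finish server-side processing before they can be referenced):

```python
import os
import time

import google.generativeai as genai

genai.configure(api_key=os.environ["GOOGLE_API_KEY"])
model = genai.GenerativeModel("gemini-2.5-pro-exp-03-25")

# Upload gameplay footage via the File API and wait for processing.
video = genai.upload_file(path="runner_gameplay.mp4")
while video.state.name == "PROCESSING":
    time.sleep(5)
    video = genai.get_file(video.name)

game_code = open("runner_game.py").read()  # placeholder game source

response = model.generate_content([
    "Here is a gameplay video and the game's source code. Make the "
    "obstacles spawn twice as fast and return the modified code.",
    video,
    game_code,
])
print(response.text)
```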
Data analysis with a useful reasoning trace
Finally, I tested Gemini 2.5 Pro on my classic messy data analysis test for reasoning models. I gave it a file containing a mix of plain text and raw HTML data that I had copied and pasted from different stock-history pages on Yahoo! Finance. I then prompted it to calculate the value of a portfolio that invested $140 at the beginning of each month, split evenly across the Magnificent 7 stocks, from January 2024 to the latest date in the file.
The model correctly identified which stocks to pick from the file (Amazon, Apple, Nvidia, Microsoft, Tesla, Alphabet and Meta), extracted the financial information from the HTML data and calculated the value of each investment based on the stock price at the beginning of each month. It responded with a well-formatted table of stock and portfolio values for each month, plus a breakdown of how much the total investment was worth at the end of the period.
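For reference, the underlying math is simple dollar-cost averaging. Here is a minimal sketch with invented prices (the actual test used prices extracted from the pasted Yahoo! Finance HTML; every number below is a placeholder):

```python
# Dollar-cost averaging: $140/month split evenly across 7 stocks = $20 each.
MONTHLY_TOTAL = 140.0
TICKERS = ["AMZN", "AAPL", "NVDA", "MSFT", "TSLA", "GOOGL", "META"]
per_stock = MONTHLY_TOTAL / len(TICKERS)

# Month-start prices per ticker; these numbers are invented for illustration.
prices = {
    "AMZN": [150.0, 155.0, 171.0],
    "AAPL": [185.0, 182.0, 179.0],
    "NVDA": [48.0, 61.0, 79.0],
    "MSFT": [370.0, 397.0, 411.0],
    "TSLA": [248.0, 188.0, 202.0],
    "GOOGL": [135.0, 140.0, 138.0],
    "META": [344.0, 390.0, 490.0],
}

# Each month, buy per_stock dollars of each ticker at that month's price.
shares = {t: 0.0 for t in TICKERS}
for month in range(len(prices["AMZN"])):
    for t in TICKERS:
        shares[t] += per_stock / prices[t][month]

# Value the portfolio at the last observed price.
final_value = sum(shares[t] * prices[t][-1] for t in TICKERS)
print(f"Invested: ${MONTHLY_TOTAL * 3:.2f}  Final value: ${final_value:.2f}")
```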

More importantly, I found the reasoning trace to be very useful. It is not clear whether Google reveals the raw chain-of-thought (CoT) tokens for Gemini 2.5 Pro, but the reasoning trace is very detailed. You can clearly see how the model thinks about the data, retrieves different bits of information and calculates the results before generating the answer. This can help debug the model's behavior and steer it in the right direction when it makes mistakes.

Considerations for the enterprise
One concern with Gemini 2.5 Pro is that it is only available in reasoning mode, which means the model always goes through a "thinking" process, even for very simple prompts that could be answered directly.
Gemini 2.5 Pro is currently in public preview. Once the full model is released and pricing information is available, we will have a better understanding of how much it will cost to build enterprise applications on it. As inference costs continue to fall, we can expect it to become practical at scale.
Gemini 2.5 Pro may not have had the splashiest debut, but its capabilities demand attention. Its massive context window, impressive multimodal reasoning and detailed reasoning trace offer tangible advantages for complex enterprise workloads, from codebase refactoring to nuanced data analysis.