An AI system developed by Google DeepMind, Google's leading AI research lab, appears to have surpassed the average gold medalist at solving geometry problems in an international mathematics competition, DeepMind claims.
The system, called AlphaGeometry2, is an improved version of AlphaGeometry, a system DeepMind released last January. In a recently published study, the DeepMind researchers behind AlphaGeometry2 claim their AI can solve 84% of all geometry problems from the last 25 years of the International Mathematical Olympiad (IMO), a math competition for high school students.
Why does DeepMind care about a high school-level math competition? Well, the lab believes that the key to more capable AI might lie in discovering new ways to solve challenging geometry problems, specifically Euclidean geometry problems.
Proving mathematical theorems, or logically explaining why a theorem (e.g., the Pythagorean theorem) is true, requires both reasoning and the ability to choose from a range of possible steps toward a solution. These problem-solving skills could, if DeepMind is right, turn out to be a useful component of future general-purpose AI models.
Indeed, this past summer, DeepMind demonstrated a system that combined AlphaGeometry2 with AlphaProof, an AI model for formal math reasoning, to solve four out of six problems from the 2024 IMO. Beyond geometry problems, approaches like these could be extended to other areas of math and science, for example, to help with complex engineering calculations.
AlphaGeometry2 has several core elements, including a language model from Google's Gemini family of AI models and a "symbolic engine." The Gemini model helps the symbolic engine, which uses mathematical rules to infer solutions to problems, arrive at feasible proofs for a given geometry theorem.
Olympiad geometry problems are based on diagrams that need "constructs" to be added before they can be solved, such as points, lines, or circles. AlphaGeometry2's Gemini model predicts which constructs might be useful to add to a diagram, which the engine then references to make deductions.
Basically, AlphaGeometry2's Gemini model suggests steps and constructs in a formal mathematical language to the engine, which, following specific rules, checks these steps for logical consistency. A search algorithm allows AlphaGeometry2 to conduct multiple searches for solutions in parallel and to store possibly useful findings in a common knowledge base.
AlphaGeometry2 considers a problem "solved" when it arrives at a proof that combines the Gemini model's suggestions with the symbolic engine's known principles.
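The propose-and-verify loop described above can be sketched in miniature. This is an illustrative toy only, not DeepMind's implementation: the `propose_constructions` and `deduce` functions, the rule table, and the fact strings are all invented stand-ins for the Gemini model and the symbolic engine.

```python
def propose_constructions(known_facts):
    # Hypothetical stand-in for the Gemini model: suggests constructs
    # (points, lines, circles) that might unlock new deductions.
    return ["midpoint M of AB", "circle through A, B, C"]

def deduce(construction, known_facts):
    # Hypothetical stand-in for the symbolic engine: applies fixed
    # geometric rules to derive new facts from a construction.
    rules = {
        "midpoint M of AB": ["AM = MB"],
        "circle through A, B, C": ["angle ACB subtends arc AB"],
    }
    return [f for f in rules.get(construction, []) if f not in known_facts]

def search(goal, known_facts, max_rounds=10):
    knowledge_base = set(known_facts)  # shared store of useful findings
    for _ in range(max_rounds):
        progress = False
        for construction in propose_constructions(knowledge_base):
            new_facts = deduce(construction, knowledge_base)
            if new_facts:
                knowledge_base.update(new_facts)
                progress = True
            if goal in knowledge_base:
                # The problem counts as "solved": the goal fact was derived
                # by engine rules from model-proposed constructions.
                return True, knowledge_base
        if not progress:
            break  # no construction yields anything new
    return False, knowledge_base

solved, facts = search("AM = MB", known_facts={"A, B, C are points"})
print(solved)  # True: the goal fact was derived
```

The real system runs many such searches in parallel and writes their findings into one shared knowledge base; this single-threaded sketch only captures the propose/verify division of labor.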
Owing to the complexities of translating proofs into a format AI can understand, there's a dearth of usable geometry training data. So DeepMind created its own synthetic data to train AlphaGeometry2's language model, generating over 300 million theorems and proofs of varying complexity.
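One common way to get synthetic theorem/proof pairs, sketched below as a guess at the general idea rather than DeepMind's actual pipeline, is to start from premises, run a rule-based deduction engine forward until nothing new follows, and record each derived fact as a "theorem" along with the chain of steps that produced it as its "proof." The rule table and fact strings here are invented for illustration.

```python
RULES = [
    # (required facts, derived fact): toy transitivity rules for segment lengths
    ({"AB = CD", "CD = EF"}, "AB = EF"),
    ({"AB = EF", "EF = GH"}, "AB = GH"),
]

def synthesize(premises):
    """Deduce forward from premises; return (theorems, proof steps)."""
    facts, proof = set(premises), []
    changed = True
    while changed:  # keep applying rules until a fixed point is reached
        changed = False
        for required, derived in RULES:
            if required <= facts and derived not in facts:
                facts.add(derived)
                proof.append((sorted(required), derived))
                changed = True
    # Every fact not given as a premise is a candidate training theorem,
    # and `proof` records the derivation that justifies it.
    return facts - set(premises), proof

theorems, proof = synthesize({"AB = CD", "CD = EF", "EF = GH"})
print(sorted(theorems))  # ['AB = EF', 'AB = GH']
```

Scaling this pattern up, with randomized premises and a far richer rule set, is one plausible route to millions of machine-checkable theorem/proof pairs.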
The DeepMind team selected 45 IMO geometry problems from the last 25 years (from 2000 to 2024), including linear equations and equations that require moving geometric objects around a plane. They then "translated" these into a larger set of 50 problems. (For technical reasons, some problems had to be split into two.)
According to the paper, AlphaGeometry2 solved 42 of the 50 problems, clearing the average gold medalist score of 40.9.
Granted, there are limitations. A technical quirk prevents AlphaGeometry2 from solving problems with a variable number of points, nonlinear equations, and inequalities. And AlphaGeometry2 isn't technically the first AI system to reach gold-medal-level performance in geometry, although it's the first to achieve it with a problem set of this size.
AlphaGeometry2 also fared worse on another set of harder IMO problems. For an added challenge, the DeepMind team selected problems, 29 in total, that had been nominated for IMO exams by math experts but that haven't yet appeared in a competition. AlphaGeometry2 could only solve 20 of these.
Still, the study's results are likely to fuel the debate over whether AI systems should be built on symbol manipulation, that is, manipulating symbols that represent knowledge using rules, or on ostensibly more brain-like neural networks.
AlphaGeometry2 takes a hybrid approach: its Gemini model has a neural network architecture, while its symbolic engine is rules-based.
Proponents of neural network techniques argue that intelligent behavior, from speech recognition to image generation, can emerge from nothing more than massive amounts of data and computing. Unlike symbolic systems, which solve tasks by defining sets of symbol-manipulating rules dedicated to particular jobs, such as editing a line in word processor software, neural networks try to solve tasks through statistical approximation and learning from examples.
Neural networks are the cornerstone of powerful AI systems like OpenAI's o1 "reasoning" model. But, claim supporters of symbolic AI, they're not the be-all-end-all; symbolic AI might be better positioned to efficiently encode the world's knowledge, reason through complex scenarios, and "explain" how it arrived at an answer, these supporters argue.
"It is striking to see the contrast between continuing, spectacular progress on these kinds of benchmarks," a computer scientist specializing in AI told TechCrunch. "I don't think it's all smoke and mirrors, but it illustrates that we still don't really know what behavior to anticipate from the next system. These systems are likely to be very impactful, so we urgently need to understand them and the risks they pose much better."
AlphaGeometry2 may demonstrate that the two approaches, symbol manipulation and neural networks, combined are a promising path forward in the search for generalizable AI. Indeed, according to the DeepMind paper, o1, which also has a neural network architecture, couldn't solve any of the IMO problems that AlphaGeometry2 was able to answer.
That may not be the case forever, though. In the paper, the DeepMind team said it found preliminary evidence that AlphaGeometry2's language model was capable of generating partial solutions to problems without the help of the symbolic engine.
"[The] results support ideas that large language models can be self-sufficient without depending on external tools [like symbolic engines]," the DeepMind team wrote in the paper, "but until [model] speed is improved and hallucinations are completely resolved, the tools will stay essential for math applications."