8/01/2024

AI achieves silver-medal standard solving International Mathematical Olympiad problems

AlphaProof and AlphaGeometry teams, AI achieves silver-medal standard solving International Mathematical Olympiad problems, 25 JULY 2024.

AlphaProof: a formal approach to reasoning

AlphaProof is a system that trains itself to prove mathematical statements in the formal language Lean. It couples a pre-trained language model with the the AlphaZero reinforcement learning algorithm, which previously taught itself how to master the games of chess, shogi and Go.

A more competitive AlphaGeometry 2

AlphaGeometry 2 is a significantly improved version of AlphaGeometry. It’s a neuro-symbolic hybrid system in which the language model was based on Gemini and trained from scratch on an order of magnitude more synthetic data than its predecessor. This helped the model tackle much more challenging geometry problems, including problems about movements of objects and equations of angles, ratio or distances.

BENJ EDWARDS, Google claims math breakthrough with proof-solving AI models, Ars Technica, 7/26/2024 

Despite Google's claims, Sir Timothy Gowers offered a more nuanced perspective on the Google DeepMind models in a thread posted on X. While acknowledging the achievement as "well beyond what automatic theorem provers could do before," Gowers pointed out several key qualifications.

"The main qualification is that the program needed a lot longer than the human competitors—for some of the problems over 60 hours—and of course much faster processing speed than the poor old human brain," Gowers wrote. "If the human competitors had been allowed that sort of time per problem they would undoubtedly have scored higher."

Gowers also noted that humans manually translated the problems into the formal language Lean before the AI model began its work. He emphasized that while the AI performed the core mathematical reasoning, this "autoformalization" step was done by humans.

Regarding the broader implications for mathematical research, Gowers expressed uncertainty. "Are we close to the point where mathematicians are redundant? It's hard to say. I would guess that we're still a breakthrough or two short of that," he wrote. He suggested that the system's long processing times indicate it hasn't "solved mathematics" but acknowledged that "there is clearly something interesting going on when it operates."

Even with these limitations, Gowers speculated that such AI systems could become valuable research tools. "So we might be close to having a program that would enable mathematicians to get answers to a wide range of questions, provided those questions weren't too difficult—the kind of thing one can do in a couple of hours. That would be massively useful as a research tool, even if it wasn't itself capable of solving open problems."

沒有留言:

張貼留言