🔔 The automatic evaluation on CodaLab are under construction. The MathVista dataset is derived from three newly collected datasets: IQTest, FunctionQA, and Paper, as well as 28 other source datasets.
GPT-5.2 Pro delivers a Lean-verified proof of Erdős Problem 397, marking a shift from pattern-matching AI to autonomous ...
Federal data shows post-pandemic student math scores are still down. Maine education officials are responding with a new effort to show students that math has real-world relevance. Federal data ...
Some results have been hidden because they may be inaccessible to you
Show inaccessible results