Apple researchers find ‘major’ flaws in AI reasoning models ahead of WWDC 2025
A study from Apple's Machine Learning Research team challenges the notion that large language models such as o1 and Claude's variants truly reason. Using custom puzzle environments, the researchers found that accuracy collapses beyond certain complexity thresholds, even when the models are given sufficient resources. The models also performed inconsistently, which suggests they rely on pattern matching rather than genuine reasoning and points to fundamental scaling limitations.