Measuring Progress Towards AGI: A Cognitive Framework
To understand AI capabilities across these cognitive abilities, we propose a three-stage evaluation protocol that benchmarks system performance in relation to human capabilities:

1. Evaluate AI systems across a broad suite of cognitive tasks covering each ability, using held-out test sets to prevent data contamination.
2. Collect human baselines for the same tasks from a demographically representative…
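
As a rough illustration of benchmarking system performance in relation to human capabilities, the sketch below places a system's score on each task within a sample of human scores for that task. The text above does not specify a comparison metric, so the percentile rank used here is an assumption, and the function name, task names, and all scores are hypothetical placeholders rather than results.

```python
from bisect import bisect_right


def human_percentile(system_score, human_scores):
    """Return the percentage of human baseline scores at or below system_score.

    Assumption: percentile rank is used only for illustration; the source
    does not state how system and human performance are compared.
    """
    if not human_scores:
        raise ValueError("human_scores must not be empty")
    ranked = sorted(human_scores)
    # bisect_right counts how many human scores are <= system_score.
    return 100.0 * bisect_right(ranked, system_score) / len(ranked)


# Hypothetical human baseline scores per task (placeholder values).
human_baselines = {
    "working_memory": [0.55, 0.60, 0.70, 0.80, 0.90],
    "planning": [0.40, 0.50, 0.65, 0.75],
}

# Hypothetical scores of one AI system on the same held-out tasks.
system_scores = {"working_memory": 0.75, "planning": 0.45}

# Per-task profile: where the system falls within the human sample.
profile = {
    task: human_percentile(system_scores[task], scores)
    for task, scores in human_baselines.items()
}
print(profile)
```

Reporting a per-task profile rather than a single aggregate keeps uneven performance across abilities visible, which a single averaged number would hide.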