ARC Prize 2024 // $1M
Jean Piaget defined intelligence as “what you use when you don’t know what to do.”
Overview
In this competition, you’ll develop AI systems to efficiently learn new skills and solve open-ended problems, rather than depend exclusively on AI systems trained with extensive datasets. The top submissions will show improvement toward human reasoning benchmarks.
Description
Current AI systems cannot generalize to new problems outside their training data, despite extensive training on large datasets. LLMs have brought AI to the mainstream for a large selection of known tasks. However, progress towards Artificial General Intelligence (AGI) has stalled. Improvements in AGI could enable AI systems that think and invent alongside humans.
The Abstraction and Reasoning Corpus for Artificial General Intelligence (ARC-AGI) benchmark measures an AI system’s ability to efficiently learn new skills. Humans easily score 85% on ARC, whereas the best AI systems score only 34%. The ARC Prize competition encourages researchers to explore ideas beyond LLMs, which depend heavily on large datasets and struggle with novel problems.
This competition includes several components. The competition as described here carries a prize of $100,000, with an additional $500,000 available if any team can beat a score of 85% on the leaderboard. Further opportunities outside of Kaggle are also available, with associated prizes. To learn more, visit ARCprize.org.
Your work could contribute to new AI problem-solving applicable across industries. Vastly improved AGI will likely reshape human-machine interactions. Winning solutions will be open-sourced to promote transparency and collaboration in the field of AGI.
The speaker reflects on the progress in AI, noting advancements in tasks like image recognition and natural language processing. Despite these achievements, the ultimate goal of human-level artificial general intelligence (AGI) remains elusive. The speaker questions whether current benchmarks measure true progress towards AGI and emphasizes the need to define intelligence rigorously.
The speaker argues that task-specific skills are not good proxies for intelligence. High levels of skill can be achieved without true intelligence through hardcoded solutions, like chess engines that excel at chess but lack broader intelligence, or by training on extensive data, which relies on memory rather than genuine understanding.
True intelligence, as seen in humans and animals, is the ability to adapt to new, unseen situations and acquire new skills without prior preparation. The speaker quotes Jean Piaget, a Swiss psychologist and the father of developmental psychology, who defined intelligence as “what you use when you don’t know what to do.”
The speaker discusses a promising approach to AI development that combines discrete program search with deep learning-based intuition. They note that discrete program search has proven effective and delivered results, but because it explores candidate programs naively, it runs into combinatorial explosion. Large Language Models (LLMs) have shown good intuition in solving tasks, suggesting that the next step is to integrate the two methods.
By augmenting program search with intuition from deep learning models, the search process can be guided more intelligently. The deep learning model can suggest what to try next or sketch out potential solutions, making the search more efficient and effective. This hybrid approach, which few have explored, is expected to yield high-quality solutions in the coming years.
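The contrast between naive and guided program search can be illustrated with a toy sketch. Everything in it is an assumption made for illustration: the four-operation grid DSL, the function names (naive_search, guided_search, toy_prior), and above all the hand-written toy_prior, which only stands in for the suggestions a deep learning model might provide. It is not the speaker's method, an ARC Prize solution, or any official API.

```python
import heapq
import itertools
from typing import Callable, List, Optional, Tuple

Grid = Tuple[Tuple[int, ...], ...]
Program = Tuple[str, ...]
Examples = List[Tuple[Grid, Grid]]

# Hypothetical toy DSL of grid transformations (illustrative only).
def flip_h(g: Grid) -> Grid:
    return tuple(tuple(reversed(row)) for row in g)

def flip_v(g: Grid) -> Grid:
    return tuple(reversed(g))

def transpose(g: Grid) -> Grid:
    return tuple(zip(*g))

def increment(g: Grid) -> Grid:
    return tuple(tuple((c + 1) % 10 for c in row) for row in g)

DSL = {"flip_h": flip_h, "flip_v": flip_v,
       "transpose": transpose, "increment": increment}

def run(program: Program, g: Grid) -> Grid:
    """Apply each named operation in order."""
    for name in program:
        g = DSL[name](g)
    return g

def solves(program: Program, examples: Examples) -> bool:
    return all(run(program, x) == y for x, y in examples)

def naive_search(examples: Examples, max_depth: int) -> Tuple[Optional[Program], int]:
    """Enumerate every program, shortest first. The number of candidates
    grows as len(DSL) ** depth, which is the combinatorial explosion."""
    tested = 0
    for depth in range(1, max_depth + 1):
        for program in itertools.product(DSL, repeat=depth):
            tested += 1
            if solves(program, examples):
                return program, tested
    return None, tested

def toy_prior(program: Program, name: str, examples: Examples) -> float:
    """Stand-in for a learned model: returns a cost for appending `name`
    (lower means more promising). Assumption: a real system would query
    a trained network here instead of these hand-written rules."""
    x, y = examples[0]
    shape_changes = (len(x), len(x[0])) != (len(y), len(y[0]))
    values_change = {c for r in x for c in r} != {c for r in y for c in r}
    if name == "transpose" and shape_changes:
        return 0.1
    if name == "increment" and values_change:
        return 0.1
    return 1.0

def guided_search(examples: Examples, max_depth: int,
                  prior: Callable[[Program, str, Examples], float]
                  ) -> Tuple[Optional[Program], int]:
    """Best-first search: expand partial programs in order of summed prior cost."""
    tested = 0
    tie = itertools.count()  # keeps heap ordering stable for equal costs
    heap: List[Tuple[float, int, Program]] = [(0.0, next(tie), ())]
    while heap:
        cost, _, program = heapq.heappop(heap)
        if program:
            tested += 1
            if solves(program, examples):
                return program, tested
        if len(program) < max_depth:
            for name in DSL:
                step = prior(program, name, examples)
                heapq.heappush(heap, (cost + step, next(tie), program + (name,)))
    return None, tested

# Toy task: the hidden rule is "transpose, then add 1 to every cell".
x: Grid = ((1, 2, 3), (4, 5, 6))
y: Grid = ((2, 5), (3, 6), (4, 7))
examples: Examples = [(x, y)]

naive_program, naive_tested = naive_search(examples, max_depth=3)
guided_program, guided_tested = guided_search(examples, max_depth=3, prior=toy_prior)
print("naive: ", naive_program, "after", naive_tested, "candidates")
print("guided:", guided_program, "after", guided_tested, "candidates")
```

On this toy task both searches find a working program, but the guided search tests fewer candidates because the prior steers it toward operations that match the visible differences between input and output. The quality of the search depends on the quality of the prior, and that is the role the deep learning model plays in the hybrid approach.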