The evaluation of incremental progress towards 'Strong AI' or 'AGI' remains a challenging open problem. In this paper, we draw inspiration from benchmarks used in artificial commonsense reasoning to propose a new benchmark problem, the Toy Box Problem, that tests the practical real-world intelligence and learning capabilities of an agent. An important aspect of a benchmark is that it is realistic and plausibly achievable; as such, we outline a preliminary solution based on the Comirit Framework.
Benjamin Johnston studied this question.