Paper: https://arxiv.org/abs/2311.02462

Abstract:

We propose a framework for classifying the capabilities and behavior of Artificial General Intelligence (AGI) models and their precursors. This framework introduces levels of AGI performance, generality, and autonomy. It is our hope that this framework will be useful in an analogous way to the levels of autonomous driving, by providing a common language to compare models, assess risks, and measure progress along the path to AGI. To develop our framework, we analyze existing definitions of AGI, and distill six principles that a useful ontology for AGI should satisfy. These principles include focusing on capabilities rather than mechanisms; separately evaluating generality and performance; and defining stages along the path toward AGI, rather than focusing on the endpoint. With these principles in mind, we propose ‘Levels of AGI’ based on depth (performance) and breadth (generality) of capabilities, and reflect on how current systems fit into this ontology. We discuss the challenging requirements for future benchmarks that quantify the behavior and capabilities of AGI models against these levels. Finally, we discuss how these levels of AGI interact with deployment considerations such as autonomy and risk, and emphasize the importance of carefully selecting Human-AI Interaction paradigms for responsible and safe deployment of highly capable AI systems.

https://preview.redd.it/64biopsh79zb1.png?width=797&format=png&auto=webp&s=9af1c5085938dac000aaf23aa1b306133b01edb4

  • imnotthomas@alien.topB
    1 year ago

    This was going to be my point as well. LLMs on their own probably aren’t there yet. But creative uses of in-context learning can get you there: have the LLM interact with the world in some way, judge its response against some objective, and then store the response and score in a vector db, so that the next time the LLM encounters a similar scenario it can retrieve that example and use it to improve its response.

    That process can take you a long way toward AGI with the tech we have today.
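
A rough sketch of that loop, in case it helps make the idea concrete. Everything here is an assumption for illustration: `embed` is a toy bag-of-words stand-in for a real embedding model, `ExperienceStore` is an in-memory stand-in for a vector db, and `llm` and `judge` are hypothetical callables you would supply yourself (the model call and the objective it gets scored against).

```python
import math
from collections import Counter


def embed(text):
    """Toy stand-in for an embedding model: bag-of-words token counts."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class ExperienceStore:
    """In-memory stand-in for a vector db of (scenario, response, score)."""

    def __init__(self):
        self.records = []  # each entry: (embedding, scenario, response, score)

    def add(self, scenario, response, score):
        self.records.append((embed(scenario), scenario, response, score))

    def retrieve(self, scenario, k=3):
        """Return the k stored examples most similar to the new scenario."""
        query = embed(scenario)
        ranked = sorted(self.records, key=lambda r: cosine(query, r[0]), reverse=True)
        return [(s, resp, score) for _, s, resp, score in ranked[:k]]


def build_prompt(scenario, examples):
    """Put retrieved examples and their scores in context before the new scenario."""
    lines = []
    for past_scenario, past_response, score in examples:
        lines.append(
            f"Scenario: {past_scenario}\nResponse: {past_response}\nScore: {score:.2f}\n"
        )
    lines.append(f"Scenario: {scenario}\nResponse:")
    return "\n".join(lines)


def step(scenario, llm, judge, store, k=3):
    """One pass of the loop: retrieve, respond, judge, store."""
    examples = store.retrieve(scenario, k)
    response = llm(build_prompt(scenario, examples))
    score = judge(scenario, response)  # the "objective" the response is judged against
    store.add(scenario, response, score)
    return response, score
```

Calling `step` repeatedly with the same `store` is what lets later responses see earlier attempts and their scores. Whether that actually improves anything depends entirely on how good the judge and the retrieval are, which this sketch does not address.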