Paper: https://arxiv.org/abs/2311.02462

Abstract:

We propose a framework for classifying the capabilities and behavior of Artificial General Intelligence (AGI) models and their precursors. This framework introduces levels of AGI performance, generality, and autonomy. It is our hope that this framework will be useful in an analogous way to the levels of autonomous driving, by providing a common language to compare models, assess risks, and measure progress along the path to AGI. To develop our framework, we analyze existing definitions of AGI, and distill six principles that a useful ontology for AGI should satisfy. These principles include focusing on capabilities rather than mechanisms; separately evaluating generality and performance; and defining stages along the path toward AGI, rather than focusing on the endpoint. With these principles in mind, we propose ‘Levels of AGI’ based on depth (performance) and breadth (generality) of capabilities, and reflect on how current systems fit into this ontology. We discuss the challenging requirements for future benchmarks that quantify the behavior and capabilities of AGI models against these levels. Finally, we discuss how these levels of AGI interact with deployment considerations such as autonomy and risk, and emphasize the importance of carefully selecting Human-AI Interaction paradigms for responsible and safe deployment of highly capable AI systems.


  • ThisBeObliterated@alien.topB · 1 year ago

    Well, you sort of answered the matter yourself: the fact that prompting works in some cases means you don’t strictly need weight updates for new skills to be learned. That doesn’t make prompting an end-all solution, but for DeepMind it seems enough to consider LLMs “emerging AGI”.
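
    A minimal sketch of that point, with a toy task and a stand-in `complete(prompt)` function that are both made up for illustration (substitute whatever chat LLM you actually call): the “skill” is specified entirely in the prompt, and the model’s weights are never touched.

        # Few-shot, in-context "learning": the task is defined purely in the
        # prompt; no gradient step or weight update happens anywhere.
        def make_prompt(query: str) -> str:
            examples = [
                ("hello", "olleh"),    # toy skill: reverse the string
                ("prompt", "tpmorp"),
                ("weights", "sthgiew"),
            ]
            shots = "\n".join(f"Input: {x}\nOutput: {y}" for x, y in examples)
            return f"{shots}\nInput: {query}\nOutput:"

        prompt = make_prompt("learning")
        # response = complete(prompt)  # hypothetical LLM call; the model
        #                              # infers the rule from context alone
        print(prompt)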

    Most people entering the field now (in the literal sense, i.e. academia, not some random r/singularity ramblers) disregard current LLM capabilities, but the level of reasoning LLMs show today was deemed almost a fantasy 5 years ago.

    • Dankmemexplorer@alien.topB · 1 year ago

      and current LLMs are pretty great for automating simple, easily defined tasks that would drive a human insane (labelling datasets, etc.). i’m really optimistic about their use in online moderation in the short term; there are lots of horror stories of facebook employees having mental breakdowns

    • Difficult_Ticket1427@alien.topB · 1 year ago

      When I mentioned prompt engineering, I meant that people were spelling out what to do in an if/else manner to get the LLM to play tic-tac-toe (not chain-of-thought or any of those techniques).

      In my opinion, learning is both 1) acquiring new skills and 2) improving those skills with repetition. I think it’s very debatable whether an LLM could learn something truly novel (or even a variant of an existing game, e.g. chess with the pieces in different starting positions) through in-context learning. Secondly, no matter how much you play tic-tac-toe with an LLM, it will never improve at the game.
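
      To make that second point concrete, here’s a toy sketch (the board encoding and the hardcoded move standing in for a `complete(prompt)` call are both made up): any “improvement” from playing lives only in the accumulated context, and vanishes when the session resets, because inference never writes anything back to the weights.

          # Stateless inference: repeated play never changes the model.
          history = []  # the only place "experience" accumulates

          def play_move(board: str) -> str:
              prompt = "\n".join(history) + f"\nBoard:\n{board}\nYour move:"
              move = "B2"  # stand-in for: complete(prompt)
              history.append(f"Board:\n{board}\nYour move: {move}")
              return move

          play_move("X..|...|...")  # game 1: the model can condition on history...
          history.clear()           # ...but a new session wipes it, and the
          play_move("X..|...|...")  # weights were never updated, so nothing
                                    # carries over, no matter how many games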

      That’s just my two cents on why I don’t believe LLMs fit the criteria for “emerging AGI” that the researchers laid out. To fit those criteria, I think they would need some type of online learning, but I could definitely be wrong.