[R] Levels of AGI: Operationalizing Progress on the Path to AGI - DeepMind 2023

APaperADay@alien.top · 2 years ago

Difficult_Ticket1427@alien.top · 2 years ago

I doubt that any current model is in the “emerging AGI” category (even by their own metric of “general ability and metacognitive abilities like learning new skills”).

The models we currently have are fundamentally unable to update their own weights, so they do not “learn new skills”. I also don’t like how they use “wide range of tasks” as a metric. Yes, LLMs outperform many humans at things like standardized tests, but I have yet to see an LLM that can consistently play tic-tac-toe at the level of a 5-year-old without a paragraph of “prompt engineering”.

I’m not the most educated on this topic (still just a student studying machine learning), but imo many researchers are overestimating the abilities of LLMs.

ThisIsBartRick@alien.top · 2 years ago

To be fair, 5-year-olds don’t have the innate ability to know how tic-tac-toe works. Someone had to teach them. We just chose not to teach that to LLMs.

ThisBeObliterated@alien.top · 2 years ago

Well, you sort of answered the matter yourself - the fact that prompting works in some cases means you don’t strictly need weight updates for new skills to be learned. It doesn’t mean prompting is an end-all solution, but for DeepMind, this seems enough to consider LLMs “emerging AGI”.

Most people entering the field now (in the literal sense, aka academia, not some random r/singularity ramblers) disregard current LLM capabilities, but the level of reasoning LLMs show today was deemed almost a fantasy 5 years ago.

Dankmemexplorer@alien.top · 2 years ago

and current LLMs are pretty great for automating simple, easily defined tasks that would drive a human insane (labelling datasets etc). i’m really optimistic about their use in online moderation in the short term, lots of horror stories of facebook employees having mental breakdowns

Difficult_Ticket1427@alien.top · 2 years ago

When I mentioned prompt engineering, I more so meant that people were explaining what to do in an if/else manner to get the LLM to play tic-tac-toe (not chain of thought or any of those techniques).

In my opinion, learning is both 1) acquiring new skills, and 2) improving upon those skills with repetition. I think it’s very debatable whether an LLM could learn something truly novel (or even something like an existing game with some new rules, i.e., chess but with the game pieces in different positions) with in-context learning. Secondly, no matter how much you play tic-tac-toe with an LLM, it will never improve at the game.

This is just my two cents on why I don’t believe LLMs fit the criteria of “emerging AGI” that the researchers laid out. Imo, to fit those criteria they would need some type of online learning, but I definitely could be wrong.

Comprehensive_Ad7948@alien.top · 2 years ago

“Not designed or intended to” is not the same as “fundamentally unable”. There are quite simplistic architectures that are perfectly capable of updating their weights, which does not make them AGI or any more intelligent. The discussion is about general capability in intellectual tasks, not the training mechanisms.

Difficult_Ticket1427@alien.top · 2 years ago

I more so meant that to learn something new the model would have to update its own weights (I have my reasoning for this in another reply in this thread).

When I said “fundamentally unable to” I meant that current LLM architectures do not have the capability to update their own weights (although I probably should’ve worded that a bit differently).

Comprehensive_Ad7948@alien.top · 2 years ago

They don’t have it because it wasn’t programmed into them, because it’s risky business (see chatbot Tay), not because it’s currently impossible. There’s nothing preventing you from running backprop weight updates based on user interactions, e.g. with reinforcement from user sentiment.
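To make the “weight updates from user sentiment” idea concrete, here is a toy sketch. It is not an LLM: it is a softmax policy over a fixed set of candidate replies whose logits get a REINFORCE-style update from a reward. The class name `OnlineReplyPolicy`, the learning rate, and the hand-supplied rewards are all assumptions for illustration; a real system would need a sentiment model to produce the reward and would update far more parameters.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


class OnlineReplyPolicy:
    """Hypothetical toy policy: one logit (weight) per candidate reply."""

    def __init__(self, n_replies, lr=0.5):
        self.logits = [0.0] * n_replies
        self.lr = lr

    def probs(self):
        return softmax(self.logits)

    def update(self, chosen, reward):
        """One REINFORCE step.

        The gradient of log pi(chosen) with respect to the logits is
        onehot(chosen) - probs, so a positive reward raises the chosen
        reply's logit and a negative reward lowers it.
        """
        p = self.probs()
        for i in range(len(self.logits)):
            grad = (1.0 if i == chosen else 0.0) - p[i]
            self.logits[i] += self.lr * reward * grad


# Assumed feedback: users react well to reply 0 and badly to reply 2.
policy = OnlineReplyPolicy(n_replies=3)
for _ in range(20):
    policy.update(chosen=0, reward=1.0)
    policy.update(chosen=2, reward=-1.0)
print(policy.probs())
```

The same loop is also why this is risky: whatever the users reward is what the policy drifts toward.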

lakolda@alien.top · 2 years ago

In-context learning allows the model to learn new skills to a limited degree.

imnotthomas@alien.top · 2 years ago

This was going to be my point as well. LLMs on their own probably aren’t there yet. But creative uses of in-context learning can get you there. For example, have the LLM interact with the world in some way, judge its response against some objective, and then store the response and score in a vector DB, so that the next time the LLM encounters a similar scenario it can retrieve that example and use it to improve its response.

That process can take you a long way to AGI with tech we have today.
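A rough sketch of that store-and-retrieve loop, with loud caveats: the “vector DB” here is just a Python list, the “embedding” is a bag-of-words count vector standing in for a real embedding model, and `ExperienceMemory` and `build_prompt` are made-up names. The LLM call and the scoring step are left out entirely; the scores below are supplied by hand.

```python
import math
from collections import Counter


def embed(text):
    """Toy embedding: bag-of-words counts (stand-in for a real embedding model)."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class ExperienceMemory:
    """In-memory stand-in for a vector DB of (situation, response, score) records."""

    def __init__(self):
        self.records = []

    def store(self, situation, response, score):
        self.records.append((embed(situation), situation, response, score))

    def retrieve(self, situation, k=2):
        """Return up to k past records, most similar situation first."""
        query = embed(situation)
        ranked = sorted(self.records, key=lambda r: cosine(query, r[0]), reverse=True)
        return [(past, response, score) for _, past, response, score in ranked[:k]]


def build_prompt(memory, situation):
    """Prepend retrieved examples and their scores to the new situation."""
    lines = ["Past attempts and how they scored:"]
    for past, response, score in memory.retrieve(situation):
        lines.append(f"- situation: {past} | response: {response} | score: {score}")
    lines.append(f"New situation: {situation}")
    return "\n".join(lines)


memory = ExperienceMemory()
memory.store("tic-tac-toe opponent has two in a row", "block the third square", 1.0)
memory.store("tic-tac-toe empty board", "take a corner", 0.5)
print(build_prompt(memory, "opponent has two in a row in tic-tac-toe"))
```

The prompt built this way is what you would send to the LLM; after judging the new response, you would call `store` again so the memory keeps growing.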

ReasonablyBadass@alien.top · 2 years ago

You can use a pretrained LLM as the core of a system capable of learning, though, like in the MemGPT paper.

oldjar7@alien.top · 2 years ago

On the other hand, I asked ChatGPT-4 to build a table of specific production and GDP contribution information for the 2 dozen most important raw materials production industries and the results were well reasoned and fairly accurate. Don’t think the average person on the streets would be able to do this, let alone know what the answer is off the top of their heads like GPT-4 knows right away.

axolotlbridge@alien.top · 2 years ago

If I write out a one-paragraph text on how to play a game I’ve just made up called “Madeupoly,” and you read it, we’d say that you learned a new skill. If we prompt an LLM with the same text, and it can play within the rules afterwards, couldn’t we say it has also learned a new skill?