What is Q* and how do we use it?

georgejrjrjr@alien.top · 2 years ago

What is Q* and how do we use it?

sprectza@alien.top · 2 years ago

Yeah, I think it's an MCTS-based reinforcement learning algorithm. I think DeepMind is the best lab when it comes to developing agents capable of strategy and planning, given how good AlphaZero and AlphaGo are, and if they integrate it with the “Gemini” project, they really might just “eclipse” GPT-4. I don’t know how scalable it would be at inference time, given the amount of compute required.

lockdown_lard@alien.top · 2 years ago

Have DeepMind released any leading-edge tools recently? MuZero was quite a few years ago now, and AlphaGo is ancient in AI terms.

DeepMind seem to have promised an awful lot, come up with a lot of clever announcements, but been very sparse on actual delivery of much at all.

rarted_tarp@alien.top · 2 years ago

Has to be a mix of Q-learning and A* right?

letsburn00@alien.top · 2 years ago

I know you’re joking, but it’s hilarious how many random things in science just got given letters.

A* is the algorithm your phone uses to help you drive home…and the supermassive black hole in the centre of the galaxy.
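For anyone who hasn't seen it, here is a minimal sketch of textbook A* on a small grid. The grid, the function name `a_star`, and the Manhattan-distance heuristic are illustrative assumptions; real navigation apps use far more elaborate routing than this.

```python
import heapq

def a_star(grid, start, goal):
    """Return the shortest 4-connected path length on a 0 (free) / 1 (wall) grid, or None."""
    def heuristic(cell):
        # Manhattan distance never overestimates on a 4-connected unit-cost grid.
        return abs(cell[0] - goal[0]) + abs(cell[1] - goal[1])

    rows, cols = len(grid), len(grid[0])
    best_cost = {start: 0}
    # Heap entries are (estimated total cost, cost so far, cell).
    frontier = [(heuristic(start), 0, start)]
    while frontier:
        _, cost, cell = heapq.heappop(frontier)
        if cell == goal:
            return cost
        if cost > best_cost[cell]:
            continue  # stale entry; a cheaper route to this cell was already found
        r, c = cell
        for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            if 0 <= nr < rows and 0 <= nc < cols and grid[nr][nc] == 0:
                new_cost = cost + 1
                if new_cost < best_cost.get((nr, nc), float("inf")):
                    best_cost[(nr, nc)] = new_cost
                    heapq.heappush(
                        frontier,
                        (new_cost + heuristic((nr, nc)), new_cost, (nr, nc)),
                    )
    return None  # goal unreachable

# Hypothetical toy map: the middle row is a wall with one gap.
grid = [
    [0, 0, 0, 0],
    [1, 1, 0, 1],
    [0, 0, 0, 0],
]
print(a_star(grid, (0, 0), (2, 0)))  # 6
```

If the heuristic always returns 0, the same loop reduces to Dijkstra's algorithm; the heuristic is what lets A* skip parts of the map that point away from the goal.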

TheOtherKaiba@alien.top · 2 years ago

It’s also a star.

Unfair-Emergency-658@alien.top · 2 years ago

What is a star?

KallistiTMP@alien.top · 2 years ago

…and the supermassive black hole in the centre of the galaxy.

What did you think they were gonna use for that? Dijkstra’s?

RaiseRuntimeError@alien.top · 2 years ago

I was going to say it seems like it was just yesterday I was learning A* and now I find out that they are already up to Q*

DoubleDisk9425@alien.top · 2 years ago

Can you please ELI-idiot?

Local_Beach@alien.top · 2 years ago

Maybe an A* search in vector space

Mrleibniz@alien.top · 2 years ago

Let the co-founder of OpenAI John Schulman explain it to you

chipstastegood@alien.top · 2 years ago

This should be higher up.

MannowLawn@alien.top · 2 years ago

explain it to you

lol, he might as well have spoken Mandarin, this is so far beyond my math skills. I have no clue what this guy is saying

ninjasaid13@alien.top · 2 years ago

What’s so special about Q*?

Oswald_Hydrabot@alien.top · 2 years ago

A marketing piece by OpenAI to lie to people to hype product

Interesting_Bison530@alien.top · 2 years ago

I think there are a few LLMs on GitHub that incorporate MCTS

RogueStargun@alien.top · 2 years ago

Q* is just a reinforcement learning technique.

Perhaps they scaled it up and combined it with LLMs

Given their recently published paper, they probably figured out a way to get GPT to learn their own reward function somehow.

Perhaps some Chicken Little board members believe this would be the philosophical trigger towards machine intelligence deciding upon its own alignment.
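For context on the notation: in standard RL textbooks, Q* denotes the optimal action-value function, and tabular Q-learning is one classic way to estimate it. Here is a minimal sketch on a made-up five-state corridor. The environment, the names, and the hyperparameters are all illustrative assumptions, and none of this says anything about what OpenAI's Q* actually is.

```python
import random

# Hypothetical toy environment: a corridor of states 0..4.
# Action 0 moves left, action 1 moves right; entering state 4 pays 1 and ends the episode.
N_STATES = 5
ACTIONS = (0, 1)
GOAL = N_STATES - 1

def step(state, action):
    """Return (next_state, reward, done) for the toy corridor."""
    next_state = max(0, state - 1) if action == 0 else min(GOAL, state + 1)
    done = next_state == GOAL
    return next_state, (1.0 if done else 0.0), done

def q_learning(episodes=500, alpha=0.5, gamma=0.9, epsilon=0.2, seed=0):
    """Estimate the action-value table with tabular Q-learning."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            if rng.random() < epsilon:
                action = rng.choice(ACTIONS)  # explore
            else:
                best = max(q[state])          # exploit, breaking ties at random
                action = rng.choice([a for a in ACTIONS if q[state][a] == best])
            next_state, reward, done = step(state, action)
            # Bootstrapped target: reward plus discounted best value of the next state.
            target = reward if done else reward + gamma * max(q[next_state])
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
    return q

q_table = q_learning()
greedy_policy = [max(ACTIONS, key=lambda a: q_table[s][a]) for s in range(GOAL)]
print(greedy_policy)  # [1, 1, 1, 1], i.e. always move right
```

In this deterministic toy case the table settles close to the true optimal values (1, 0.9, 0.81, ... for moving right), which is what the star in Q* refers to.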

herozorro@alien.top · 2 years ago

Given their recently published paper, they probably figured out a way to get GPT to learn their own reward function somehow.

You just need two GPTs talking to each other: the second acts as a critic and guides the first.

newsreddittoday@alien.top · 2 years ago

Which paper are you referring to?

balianone@alien.top · 2 years ago

We can infer that any such advance by OpenAI that follows the naming convention of “Q*” would likely be a significant development in the field of reinforcement learning, possibly expanding upon or enhancing traditional Q-Learning methodologies.

tortistic_turtle@alien.top · 2 years ago

Thanks, ChatGPT

20rakah@alien.top · 2 years ago

Wasn’t there a big thing about tree search just a few months ago? I haven’t been keeping up much.

345Y_Chubby@alien.top · 2 years ago

If it teaches itself to learn it’s just a matter of time until it teaches itself to code

HeinrichTheWolf_17@alien.top · 2 years ago

I’m wondering if Q-Star is a recursive self-improvement mechanism? Perhaps the in-house model they have can innovate and consistently learn on top of what it’s been trained on?

BlackSheepWI@alien.top · 2 years ago

I heard they have an even bigger breakthrough up their sleeve… Rumor is that it’s called GPT2, and it’s too dangerous to even release to the public 👀

Honest_Science@alien.top · 2 years ago

Qtransformer.github.io

olddoglearnsnewtrick@alien.top · 2 years ago

It’s a silicon-based version of QAnon. I will be terminated for telling you, but wait till they launch MAGA (Machine Augmented General AI)!!!

Useful_Hovercraft169@alien.top · 2 years ago

FREEDUM

Kep0a@alien.top · 2 years ago

gguf when

FunkyFr3d@alien.top · 2 years ago

Calling it Q was a terrible idea. The cookers are going to go crazier

DefinitelyNotEmu@alien.top · 2 years ago

https://www.nytimes.com/2023/11/28/technology/amazon-ai-chatbot-q.html

FunkyFr3d@alien.top · 2 years ago

I’m not a fan of tech companies in general, but Amazon is definitely one of the most disliked.