Reuters is reporting that OpenAI achieved an advance with a technique called Q* (pronounced Q-Star).

So what is Q*?

I asked around the AI researcher campfire and…

It’s probably Q-learning combined with MCTS, i.e. a Monte Carlo tree search reinforcement learning algorithm.
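
For reference, the “Q” half of that guess is classic Q-learning. Here’s a minimal, self-contained tabular sketch of the update rule on a toy environment; nothing OpenAI-specific, just the textbook algorithm:

```python
# Tabular Q-learning on a toy 1-D walk: reward 1 for reaching the
# rightmost state. Illustrates the update rule only; nothing here
# is OpenAI's method.
import random
from collections import defaultdict

ALPHA, GAMMA, EPSILON = 0.1, 0.9, 0.2   # learning rate, discount, exploration
N_STATES, ACTIONS = 5, [-1, +1]         # states 0..4, move left or right

Q = defaultdict(float)                  # Q[(state, action)] -> estimated return

def step(state, action):
    nxt = max(0, min(N_STATES - 1, state + action))
    return nxt, (1.0 if nxt == N_STATES - 1 else 0.0)

for episode in range(300):
    s = 0
    while s != N_STATES - 1:
        # Epsilon-greedy action selection, breaking ties randomly.
        if random.random() < EPSILON:
            a = random.choice(ACTIONS)
        else:
            a = max(ACTIONS, key=lambda a2: (Q[(s, a2)], random.random()))
        s2, r = step(s, a)
        # Core update: nudge Q(s,a) toward r + gamma * max_a' Q(s',a').
        best_next = max(Q[(s2, a2)] for a2 in ACTIONS)
        Q[(s, a)] += ALPHA * (r + GAMMA * best_next - Q[(s, a)])
        s = s2

print({k: round(v, 2) for k, v in sorted(Q.items())})
```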

Which is right in line with the strategy DeepMind (vaguely) said they’re taking with Gemini.

Another corroborating data point: an early GPT-4 tester mentioned on a podcast that OpenAI is working on ways to trade inference compute for smarter output. MCTS is probably the most promising method in the literature for doing that.
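
To make “trade inference compute for smarter output” concrete, here’s a hedged sketch of the general search idea: sample several continuations, score them, and keep expanding only the best branches, so more depth and branching buys better output. It’s beam-search flavored rather than full MCTS, and generate_continuations and score are hypothetical stand-ins for an LLM sampler and a learned value model:

```python
# Sketch: spend more inference compute (depth x branch) to get a
# higher-scoring completion. Both helpers are hypothetical stand-ins.
import random

def generate_continuations(text, n):
    # Stand-in for sampling n continuations from an LLM.
    return [text + random.choice("abcde") for _ in range(n)]

def score(text):
    # Stand-in for a learned value/reward model; here "a" counts as "good".
    return text.count("a")

def tree_search(prompt, depth=3, branch=4, keep=2):
    frontier = [prompt]
    for _ in range(depth):                    # more depth = more compute
        candidates = [c for node in frontier
                        for c in generate_continuations(node, branch)]
        candidates.sort(key=score, reverse=True)
        frontier = candidates[:keep]          # prune to the best branches
    return frontier[0]

print(tree_search("Q: why is the sky blue?\nA: "))
```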

So how do we do it? Well, the closest thing I know of that’s presently available is Weave, part of a concise, readable, Apache-licensed MCTS LLM RL fine-tuning package called minihf.

https://github.com/JD-P/minihf/blob/main/weave.py

I’ll update the post when I have more info about Q-learning in particular, and what the deltas are from Weave.

  • RogueStargun@alien.topB · 1 year ago

    Q* is just a reinforcement learning technique.

    Perhaps they scaled it up and combined it with LLMs.

    Given their recently published paper, they probably figured out a way to get GPT to learn its own reward function somehow (rough sketch after this comment).

    Perhaps some Chicken Little board members believe this would be the philosophical trigger for machine intelligence deciding on its own alignment.
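
    A hedged sketch of what “learn its own reward function” could look like: the same model both generates candidates and judges them, and the judged-best output becomes a training target (self-rewarding / RLAIF flavor; speculation, not OpenAI’s published method). llm and llm_judge are hypothetical stand-ins:

    ```python
    # Sketch: bootstrap a reward signal from the model itself.
    import random

    def llm(prompt):
        # Stand-in for the model generating an answer.
        return random.choice(["answer A", "answer B", "answer C"])

    def llm_judge(question, answer):
        # The same model, prompted to act as its own reward function.
        # Random here; in practice, a score parsed from its critique.
        return random.random()

    def self_reward_step(question, n_samples=4):
        candidates = [llm(question) for _ in range(n_samples)]
        scored = [(llm_judge(question, c), c) for c in candidates]
        best = max(scored)[1]
        # In a real pipeline, `best` would become a fine-tuning target,
        # closing the loop: the model supplies its own reward.
        return best

    print(self_reward_step("Why is the sky blue?"))
    ```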

    • herozorro@alien.topB · 1 year ago

      > Given their recently published paper, they probably figured out a way to get GPT to learn its own reward function somehow.

      You just need two GPTs talking to each other. The second acts as a critic and guides the first; rough sketch below.
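
      A minimal sketch of that loop, with generator and critic as hypothetical stand-ins for the two GPT calls (the critic’s feedback steers the next revision):

      ```python
      # Sketch: model #2 critiques model #1's draft until satisfied.

      def generator(prompt, feedback=""):
          # Stand-in for GPT #1: produce or revise a draft.
          base = f"draft for {prompt!r}"
          return base + (f" [revised per: {feedback}]" if feedback else "")

      def critic(draft):
          # Stand-in for GPT #2: return feedback, or "" when satisfied.
          return "" if "revised" in draft else "be more specific"

      def critique_loop(prompt, max_rounds=3):
          draft = generator(prompt)
          for _ in range(max_rounds):
              feedback = critic(draft)
              if not feedback:                 # critic approves; stop early
                  break
              draft = generator(prompt, feedback)
          return draft

      print(critique_loop("explain Q-learning"))
      ```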