• 0 Posts
  • 2 Comments
Joined 1 year ago
Cake day: October 29th, 2023

  • Q* is just a reinforcement learning technique.

    Perhaps they scaled it up and combined it with LLMs.

    Given their recently published paper, they probably figured out a way to get GPT to learn its own reward function.

    Perhaps some Chicken Little board members believe this would be the philosophical trigger for machine intelligence deciding on its own alignment.