@koi691337 - Communick News

0 Posts
1 Comment

Joined 1 year ago

Cake day: November 22nd, 2023



koi691337@alien.top to Machine Learning@academy.garden • [R] Orca 2: Teaching Small Language Models How to Reason
1 year ago

Then you could have the language model generate imagined user responses and optimize the reward signal on the imagined user responses

Wouldn’t this just amount to the model overfitting to noise?
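To make the worry concrete, here is a rough toy sketch, not anything from the Orca 2 paper. It assumes the imagined user responses act like a noisy proxy for the real reward, and that "optimizing" means picking the candidate with the highest proxy score. The function name `simulate` and all of its parameters are made up for illustration.

```python
import random


def simulate(num_candidates=50, noise_scale=1.0, trials=2000, seed=0):
    """Pick the candidate with the best noisy proxy reward, many times over.

    Returns (mean proxy reward of the winners, mean true quality of the winners).
    Everything here is a hypothetical stand-in: `true_quality` plays the role of
    real user satisfaction, `proxy_reward` the reward read off imagined responses.
    """
    rng = random.Random(seed)
    proxy_total = 0.0
    true_total = 0.0
    for _ in range(trials):
        # Hidden true quality of each candidate output.
        true_quality = [rng.gauss(0.0, 1.0) for _ in range(num_candidates)]
        # Assumption: the imagined-user signal is true quality plus independent noise.
        proxy_reward = [q + rng.gauss(0.0, noise_scale) for q in true_quality]
        # "Optimizing" on the proxy: take the candidate it scores highest.
        best = max(range(num_candidates), key=proxy_reward.__getitem__)
        proxy_total += proxy_reward[best]
        true_total += true_quality[best]
    return proxy_total / trials, true_total / trials


if __name__ == "__main__":
    proxy_mean, true_mean = simulate()
    print(f"proxy reward of winners: {proxy_mean:.2f}")
    print(f"true quality of winners: {true_mean:.2f}")
```

In this toy setup the winners' proxy score comes out higher than their true quality, because part of what gets selected is the noise itself. Whether real imagined user responses behave like independent noise is exactly the open question.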
