[R] Orca 2: Teaching Small Language Models How to Reason

Memories-Of-Theseus@alien.top · 1 year ago

til_life_do_us_part@alien.top · 1 year ago

It’s a risk if your model can’t accurately predict user responses, but I don’t see how it’s a necessary characteristic of the approach. If it were, the same issue would apply to model-based RL in general, no? Unless you are suggesting there is something special about language modelling or user responses that makes them fundamentally hard to learn a model of.