Orca 2: Teaching Small Language Models How to Reason

Memories-Of-Theseus@alien.top · 2 years ago

TheCrazyAcademic@alien.top · 2 years ago

It’d be interesting to see how an MoE framework built from multiple Orca 2 models would fare, with each expert trained on a different subset of the data and a router sending your prompt to the right Orca 2 expert. I feel like that could come extraordinarily close to GPT-4 on performance metrics, but it would take decent computing power to test the hypothesis. If each Orca 2 expert is 10 billion parameters and you wanted to run a 100-billion-parameter sparse Orca 2 MoE, that’s gonna require 500+ GB of VRAM.
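
For anyone who wants to sanity-check the VRAM figure, here is a rough back-of-envelope sketch. It assumes ten fully separate 10B-parameter experts that are all resident in memory (a sparse MoE only activates some experts per token, but the weights for every expert normally still have to be loaded), and it counts weights only: no KV cache, activations, or router. The function names and the precision table are made up for illustration and are not from the Orca 2 paper.

```python
# Weights-only memory estimate for a hypothetical MoE of separate experts.
# Assumption: 1 GB = 1e9 bytes, and every expert is held in memory at once.
BYTES_PER_PARAM = {"fp32": 4.0, "fp16": 2.0, "int8": 1.0, "int4": 0.5}


def weight_memory_gb(num_params: float, bytes_per_param: float) -> float:
    """Memory needed to store the weights alone, in GB."""
    return num_params * bytes_per_param / 1e9


def moe_weight_memory_gb(num_experts: int, params_per_expert: float,
                         precision: str) -> float:
    """Total weight memory when all experts are loaded simultaneously."""
    total_params = num_experts * params_per_expert
    return weight_memory_gb(total_params, BYTES_PER_PARAM[precision])


if __name__ == "__main__":
    # Ten 10B experts = 100B total parameters, as in the comment above.
    for precision in BYTES_PER_PARAM:
        gb = moe_weight_memory_gb(10, 10e9, precision)
        print(f"{precision}: {gb:.0f} GB")
```

Under these assumptions the weights alone come to about 400 GB at fp32 and 200 GB at fp16, so a 500+ GB figure is roughly what full precision plus runtime overhead would look like, while quantized weights would need considerably less.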
