Amgadoz@alien.topB to

LocalLLaMA@poweruser.forumEnglish · 2 years ago

Why didn't gpt4 work at first and how did they "fix it"?


According to this tweet,

when gpt4 first finished training it didn’t actually work very well and the whole team thought it’s over, scaling is dead…until greg went into a cave for weeks and somehow magically made it work

So GPT-4 was kind of broken at first. Then Greg spent a few weeks trying to fix it, and somehow it worked.

So why did it not work at first and how did they fix it?
I think this is an important question for the OSS community.


FormerIYI@alien.topB
2 years ago
Maybe papers from Pangu-Sigma or other large-scale MoE models can be helpful: https://arxiv.org/abs/2303.10845