Dankmemexplorer@alien.top to LocalLLaMA@poweruser.forum • RWKV v5 7b, Fully Open-Source, 60% trained, approaching Mistral 7b in abilities or surpassing it.
1 year ago
i’m very behind, was 14B not chinchilla optimal?
“the model can have the test data as a treat”
and current LLMs are pretty great for automating simple, easily defined tasks that would drive a human insane (labelling datasets etc). i’m really optimistic about their use in online moderation in the short term, given all the horror stories of facebook employees having mental breakdowns
isn’t the biggest advantage of ViTs that they’re easier to distribute training for?
the model can have a little of the test data as a treat