When training an LLM how do you decide to use a 7b, 30b, 120b, etc model (assuming you can run them all)?

paradigm11235@alien.top · 2 years ago

When training an LLM how do you decide to use a 7b, 30b, 120b, etc model (assuming you can run them all)?

lordpuddingcup@alien.top · 2 years ago

you pick the biggest one, it’s almost always the best unless it was truely a shitty trained model, a really well trained 30b with a 120b version 120b will be better, unless you mean by “can run them all” you mean can run full quant 7b and q1_k_m 120b lol