I have access to a single 80 GB A100 GPU and would like to train an LLM with a GPT-like architecture from scratch. Does anyone know how to calculate the maximum model size that will fit?
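A common back-of-the-envelope approach (this is a rule of thumb, not an exact answer) is to count bytes of optimizer and model state per parameter: with Adam in mixed precision you typically budget about 16 bytes per parameter (fp16 weights + fp16 gradients + fp32 master weights + two fp32 Adam moments), then reserve a chunk of memory for activations, which scale with batch size and sequence length. A minimal sketch, where the 16-bytes-per-param figure and the 30% activation reserve are assumptions you should tune for your setup:

```python
# Rough training-memory rule of thumb (assumed, not exact):
#   2 B fp16 weights + 2 B fp16 grads + 4 B fp32 master weights
#   + 8 B Adam moments (m and v, fp32) = ~16 bytes per parameter,
# plus activations, which depend on batch size and sequence length.

def max_trainable_params(gpu_mem_gb: float,
                         bytes_per_param: int = 16,
                         activation_fraction: float = 0.3) -> float:
    """Estimate the largest parameter count that fits in GPU memory."""
    # Set aside a fraction of memory for activations and overhead.
    usable_bytes = gpu_mem_gb * 1e9 * (1 - activation_fraction)
    return usable_bytes / bytes_per_param

# For a single 80 GB A100:
params = max_trainable_params(80)
print(f"~{params / 1e9:.1f}B parameters")  # ~3.5B parameters
```

So without tricks like gradient checkpointing, CPU offloading, or 8-bit optimizers, a few billion parameters is roughly the ceiling; those techniques can push it considerably higher at the cost of throughput.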
This question might come off as stupid, but it’s really something I’m curious about:
I 100% see why someone would want to take a current state-of-the-art open model and fine-tune it on their own data. What I don’t see is why someone would want to train their own model from scratch. Can you explain it?