Like many of you, I often need to train LLMs (Large Language Models). Code gets copied from one project to another, it’s easy to lose track, and I end up rewriting the same training pipeline several times.
X—LLM is a solution. It’s a streamlined, user-friendly library designed for efficient model training, offering advanced techniques and customizable options within the Hugging Face ecosystem.
Features:
- LoRA, QLoRA and fusing
- Flash Attention 2
- Gradient checkpointing
- bitsandbytes quantization
- GPTQ (including post-training quantization)
- W&B experiment tracking
- Simple training on multiple GPUs at once using DeepSpeed or FSDP
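To give a sense of why LoRA makes 7B-scale training cheap, here is a back-of-envelope estimate of the trainable parameter count. The layer count, hidden size, rank, and choice of adapted matrices below are illustrative assumptions (a Llama-7B-style model), not X—LLM defaults:

```python
# Back-of-envelope: trainable parameters for LoRA on a Llama-7B-style model.
# Assumptions (illustrative, not X-LLM defaults): 32 layers, hidden size 4096,
# adapters on the q_proj and v_proj attention matrices only, rank 16.

def lora_params(d_in: int, d_out: int, rank: int) -> int:
    # LoRA replaces the update to a frozen d_in x d_out weight with two
    # small matrices: A (d_in x rank) and B (rank x d_out).
    return rank * (d_in + d_out)

hidden = 4096
layers = 32
rank = 16

per_layer = 2 * lora_params(hidden, hidden, rank)  # q_proj + v_proj
trainable = layers * per_layer
total = 7_000_000_000

print(f"trainable LoRA params: {trainable:,}")       # 8,388,608
print(f"fraction of 7B model: {trainable / total:.4%}")  # 0.1198%
```

So with these assumptions you update roughly 0.1% of the weights, which is what keeps optimizer memory and checkpoint sizes small.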
Use cases:
- Create production-ready solutions or fast prototypes. X—LLM works in both configurations
- Finetune a 7B model with 334 million tokens (1.1 million dialogues) for just $50
- Automatically save each checkpoint during training to the Hugging Face Hub and don’t lose any progress
- Quantize a model using GPTQ. Reduce a 7B Mistral model from 15 GB to 4.3 GB and increase inference speed
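The GPTQ numbers above line up with a rough size calculation. This is a sketch, not the exact quantizer output: the parameter count, group size, and which layers stay in fp16 are my assumptions:

```python
# Back-of-envelope size check for 4-bit GPTQ quantization of a Mistral-7B-
# class model. Assumptions (mine, not exact model internals): ~7.24B total
# params, vocab 32000 x hidden 4096 embeddings and lm_head kept in fp16,
# 4-bit weights with group size 128 (one fp16 scale + 4-bit zero per group).

GB = 1e9

total_params = 7.24e9
embed_params = 2 * 32_000 * 4_096        # embeddings + lm_head, unquantized
quant_params = total_params - embed_params

fp16_size = total_params * 2 / GB        # 2 bytes per param
gptq_size = (
    quant_params * 0.5                   # 4-bit packed weights
    + quant_params / 128 * 2.5           # per-group scale + zero-point
    + embed_params * 2                   # fp16 embeddings kept as-is
) / GB

print(f"fp16: {fp16_size:.1f} GB")       # 14.5 GB
print(f"GPTQ: {gptq_size:.1f} GB")       # 4.1 GB
```

The small gap to the observed 4.3 GB is bookkeeping overhead in the saved checkpoint; the order of magnitude is the point.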
Github repo: https://github.com/BobaZooba/xllm
You can train a 7B model, fuse the LoRA weights, and upload the ready-to-use model to the Hugging Face Hub. All in a single Colab! Link
The library gained 100 stars in less than a day, and it’s now almost at 200. People are using it, training models both in Colab and in multi-GPU setups. Meanwhile, I’m supporting X—LLM users and currently implementing the most requested feature: DPO.
I suggest that you try training your own models and see for yourself how simple it is.
If you like it, please consider giving the project a star on GitHub.
Any idea what the vram requirements are for locally training a 7b qlora?
I strongly recommend training on a GPU, as it speeds up the training process by an order of magnitude and has become the standard. I can recommend services that offer GPU rentals at the lowest prices.
https://vast.ai
https://www.runpod.io
https://www.tensordock.com
Ah, OK- but what about a setup with dual local 3090s?
What kind of gpu rental would you recommend? An a100 80gb?
I apologize for the confusion. I misread your question as being about RAM and thought you wanted to train on the CPU.
Of course, 2 x 3090 would be more than enough for training. I believe even a 13B model with a large context length could be trained.
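To put rough numbers on the earlier VRAM question: here is a sketch of a QLoRA memory budget for a 7B model. The activation figure and adapter size are my estimates and vary with sequence length, batch size, and implementation:

```python
# Rough VRAM budget for QLoRA fine-tuning a 7B model (my estimates; real
# usage depends on sequence length, batch size, and implementation details).

GB = 1e9
base_params = 7e9
adapter_params = 8.4e6   # e.g. rank-16 LoRA on the attention projections

base_weights = base_params * 0.5 / GB     # 4-bit quantized base model
adapters     = adapter_params * 2 / GB    # fp16 LoRA weights
grads        = adapter_params * 2 / GB    # gradients for adapters only
optimizer    = adapter_params * 8 / GB    # AdamW: two fp32 states per param
activations  = 3.0                        # guess, with gradient checkpointing

total = base_weights + adapters + grads + optimizer + activations
print(f"~{total:.1f} GB")  # ~6.6 GB, well under a single 24 GB RTX 3090
```

So a single 3090 already has headroom for a 7B QLoRA run; two of them mainly buy you larger batches or longer context.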
If you have 2 GPUs, I suggest training through the command line and using DeepSpeed or FSDP (the latter has been tested less).
Here are examples of projects where it’s explained in detail how you can train:
https://github.com/BobaZooba/xllm-demo
https://github.com/BobaZooba/wgpt
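For reference, a minimal DeepSpeed ZeRO stage 2 config of the kind such setups typically use (a sketch with standard DeepSpeed keys; check the demo repos above for the exact file they ship):

```json
{
  "train_micro_batch_size_per_gpu": "auto",
  "gradient_accumulation_steps": "auto",
  "bf16": { "enabled": true },
  "zero_optimization": {
    "stage": 2,
    "overlap_comm": true
  }
}
```

Stage 2 shards gradients and optimizer states across the two GPUs, which is usually the sweet spot for a 7B model on 2 x 3090.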
On Twitter, someone I don’t know posted about how easily they managed to train on multiple GPUs (a super simple and short example):
https://twitter.com/darrenangle/status/1724913070105841806
Awesome thank you.
Last question! Would it be reasonable to train on a single 3090 following that guide as well?
Edit: train a 7B on a single 3090
And feel free to ask! I’m just here to help you
It depends on how deeply you want to immerse yourself. The library is intended for both rapid prototyping and production-ready development. I would recommend starting with the former; it’s very simple and takes about 10-15 minutes to get started, not including training time.
Here is a notebook that allows you to train models on a single GPU:
https://colab.research.google.com/drive/1CNNB_HPhQ8g7piosdehqWlgA30xoLauP
You can download it and train your model locally on your computer.
Thank you so much, this is awesome.