BlueMetaMind@alien.topB to LocalLLaMA@poweruser.forum · English · 2 years ago

How to run 70B on 24GB VRAM?

I want to run a 70B LLM locally with more than 1 T/s. I have a 3090 with 24GB VRAM and 64GB RAM on the system.

What I managed so far:

I found instructions for running a 70B entirely in VRAM using a 2.5 bpw (bits per weight) quantization. It ran fast, but the perplexity was unbearable; the LLM was barely coherent.
I somehow got a 70B running with some mix of RAM/VRAM offloading, but it ran at only 0.1 T/s.
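
For a sense of why these two attempts behave so differently, here is a rough back-of-the-envelope sketch in Python. It is only an estimate under stated assumptions: 80 equally sized transformer layers (the Llama 2 70B layout), about 3 GiB of VRAM held back for context cache and overhead, and 4.5 bits per weight as a stand-in for a typical 4-bit quant. The function names are made up for illustration and do not come from any loader.

```python
# Rough memory estimate for a quantized model. All numbers are approximations.

def weight_gib(n_params: float, bits_per_weight: float) -> float:
    """Approximate size of the weights alone, in GiB."""
    return n_params * bits_per_weight / 8 / 2**30


def layers_on_gpu(n_params: float, bits_per_weight: float, n_layers: int,
                  vram_gib: float, reserve_gib: float) -> int:
    """Estimate how many layers fit in VRAM.

    Assumes every layer is the same size and ignores embeddings and the
    output head. reserve_gib is a guess at cache and runtime overhead.
    """
    per_layer = weight_gib(n_params, bits_per_weight) / n_layers
    usable = max(vram_gib - reserve_gib, 0.0)
    return min(n_layers, int(usable // per_layer))


if __name__ == "__main__":
    for bpw in (2.5, 4.5, 16.0):
        size = weight_gib(70e9, bpw)
        fit = layers_on_gpu(70e9, bpw, n_layers=80, vram_gib=24, reserve_gib=3)
        print(f"{bpw:>4} bpw: ~{size:.1f} GiB of weights, ~{fit}/80 layers on GPU")
```

Under these assumptions a 70B at 2.5 bpw is roughly 20 GiB and fits on the card whole, while at 4.5 bpw it is roughly 37 GiB, so only about half the layers fit and the rest run from system RAM, which is where the speed goes.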

I have seen people claiming reasonable T/s speeds. Since I am a newbie, I can barely speak the domain language, and most instructions I found assume implicit knowledge I don’t have.

I need explicit instructions on exactly which 70B model to download, which model loader to use, and how to set the parameters that matter in this context.

TuuNo_@alien.topB · 2 years ago
Well, I have never used Linux before, since the main purpose of my PC is gaming. But I heard running LLMs on Linux is faster overall.
- silenceimpaired@alien.topB · 2 years ago
  It is… but koboldcpp doesn’t have an executable for me to run :/