https://www.amazon.se/-/en/NVIDIA-Tesla-V100-16GB-Express/dp/B076P84525 price in my country: 81,000 SEK (about 7,758 USD)
My current setup:
NVIDIA GeForce RTX 4050 Laptop GPU
CUDA cores: 2560
Memory data rate: 16.00 Gbps
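For a rough sense of what that data rate means: peak memory bandwidth is the per-pin data rate times the bus width. A minimal sketch, assuming the RTX 4050 Laptop GPU's 96-bit memory bus (the bus width is not stated in the post above):

```python
# Rough memory-bandwidth arithmetic for the laptop GPU above.
data_rate_gbps = 16.0   # per-pin data rate, from the post
bus_width_bits = 96     # assumption: RTX 4050 Laptop GPU bus width
bandwidth_gb_s = data_rate_gbps * bus_width_bits / 8
print(bandwidth_gb_s)   # 192.0 GB/s
```

For comparison, the Tesla V100's HBM2 gives it several times that bandwidth, which matters a lot for memory-bound training workloads.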
My laptop GPU works fine for most ML and DL tasks. I am currently fine-tuning a GPT-2 model on some data that I scraped, and it worked surprisingly well on my current setup, so it's not like I'm complaining.
I do, however, own a stationary PC with an old GTX 980 GPU, and I was thinking of replacing that with the V100.
So my question to this community: for those of you who have bought your own super-duper GPU, was it worth it? What were your experiences and realizations when you started tinkering with it?
Note: Please refrain from giving me snarky comments about using cloud GPUs. I am not interested in that (and I am in fact already using one for another ML task that doesn't involve fine-tuning). I am interested in hearing some hardware hobbyists' opinions on this matter.
https://preview.redd.it/qo7pl73erp2c1.png?width=1703&format=png&auto=webp&s=ab2ecf26490fb6b73ee28497d2ea1610b754de59
So basically either a 4090 or an H100.
Yeah, perhaps if I am crazy enough I could just buy 3 of those and call it a day.
I can't corroborate results for Pascal cards. They had very limited FP16 performance, usually 1:64 of FP32 throughput. Switching to an RTX 3090 Ti from a GTX 1080 got me around 10-20x gains in QLoRA training, keeping the exact same batch size and context length and changing only the compute dtype from FP16 to BF16.
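The precision split described above maps onto CUDA compute capability generations. A minimal sketch of that mapping, using a hypothetical helper (not from any library) that picks a mixed-precision training dtype from the device's compute capability:

```python
def pick_amp_dtype(major: int, minor: int) -> str:
    """Pick a mixed-precision dtype from CUDA compute capability.

    Assumptions (simplified generational rules, as in the comment above):
    - 8.0+ (Ampere and newer): BF16 is supported and preferred.
    - 7.x (Volta/Turing): FP16 tensor cores, but no BF16.
    - 6.x and older (Pascal): most parts run FP16 at ~1/64 of FP32
      throughput, so plain FP32 is the safer choice.
    """
    if (major, minor) >= (8, 0):
        return "bf16"
    if (major, minor) >= (7, 0):
        return "fp16"
    return "fp32"


# Examples: RTX 3090 Ti is 8.6, V100 is 7.0, GTX 1080 is 6.1.
print(pick_amp_dtype(8, 6))  # bf16
print(pick_amp_dtype(7, 0))  # fp16
print(pick_amp_dtype(6, 1))  # fp32
```

In practice you would query the capability at runtime (e.g. PyTorch's `torch.cuda.get_device_capability()`) rather than hard-coding it.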
I'm not sure where this chart is from, but I remember it was made before QLoRA even existed.
Is there any such benchmark that includes both the 4090/A100 and a Mac with an M2 Ultra / M3 Max? I've searched quite a bit but didn't find anyone comparing them on similar setups. It seems very interesting because of the large unified memory.
An A6000 being worse than a 3090 doesn't make any sense.
Man, those H100s really are on another level. I shudder to think where we'll be in 5 years.