I have a dataset with a context length of 6k. I want to fine-tune a Mistral model with 4-bit quantization, but I ran into an error while using a 24 GB GPU.
Is there any rule of thumb for how much VRAM is needed for a context length this large?
Thanks!
If you don’t want to lower the context length to fit on 24 GB, you can find an A100 80GB (or 40GB) on Shadeform’s cloud marketplace.
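If you do want to try to stay on the 24 GB card first, here is a minimal QLoRA-style sketch, assuming transformers, peft, and bitsandbytes are installed; the model name, LoRA hyperparameters, and flash-attention setting are illustrative assumptions, not from the original post. The main idea is that at 6k tokens the activation memory, not the 4-bit weights, is usually what overflows, so gradient checkpointing and a memory-efficient attention implementation matter most.

```python
# Sketch of a memory-conscious 4-bit fine-tuning setup (assumed packages:
# transformers, peft, bitsandbytes; flash-attn is optional and assumed installed).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training

model_name = "mistralai/Mistral-7B-v0.1"  # assumed base model

# 4-bit NF4 quantization with double quantization to shrink weight memory.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_use_double_quant=True,
    bnb_4bit_compute_dtype=torch.bfloat16,
)

tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    quantization_config=bnb_config,
    attn_implementation="flash_attention_2",  # cuts attention activation memory at long context
    device_map="auto",
)

# Gradient checkpointing trades compute for activation memory,
# which is the dominant cost at a 6k sequence length.
model = prepare_model_for_kbit_training(model, use_gradient_checkpointing=True)

# Small LoRA adapter so only a few million parameters carry optimizer state.
lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    lora_dropout=0.05,
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()
```

Even with these settings you will likely need a per-device batch size of 1 with gradient accumulation, since activation memory grows with sequence length; if it still OOMs, a bigger GPU such as the A100 mentioned above is the fallback.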