Cake day: October 30th, 2023

  • My main desktop is an RTX 4090 Windows box, so I run phind-codellama on it most of the time. If I need to extend the context window, I swap Phind over to the M2 Ultra so I can use a 100,000-token context… but otherwise it's so fast on the 4090 running q4 that I mostly use that.
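    For reference, a setup like that can be sketched as a llama.cpp invocation; the model filename and the exact flag values below are assumptions for illustration, not details from this thread:

    ```shell
    # Hypothetical llama.cpp run of a q4-quantized phind-codellama GGUF.
    # -m    path to the quantized model file (filename assumed)
    # -c    context window in tokens; this is the value you'd raise for long-context runs
    # -ngl  number of layers to offload to the GPU (CUDA on a 4090, Metal on an M2 Ultra)
    ./main -m phind-codellama-34b-v2.Q4_K_M.gguf -c 16384 -ngl 99 \
      -p "Write a binary search function in Python."
    ```

    The trade-off described above is the `-c` value: a large context window needs far more memory for the KV cache, which is why a machine with more unified memory can run the same model at a much longer context.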

    Are you running Phind with exllama on the 4090? Was there a reason you'd need to run it on the M2 Ultra when switching to 100k context?

    Also, I didn't know Mistral could do coding tasks; how is it?