Using and losing lots of money on gpt-4 ATM, it works great but for the amount of code I’m generating I’d rather have a self hosted model. What should I look into?
Using and losing lots of money on gpt-4 ATM, it works great but for the amount of code I’m generating I’d rather have a self hosted model. What should I look into?
i would grab a server like vllm or text-generator.io (open source too)
Then get a model like others have suggested like deepseek or something to put in the server (both those servers are OpenAI compatible so makes switching easy)
I’ve not heard of text-generator.io, is it as performant as vllm on multibatch or is it a wrapper around it?