Thanks for this feedback. What's your definition of an on-prem chatbot? Hosted on their own physical infrastructure?
Is home hardware a requirement for this project? I guess I'm a little confused about what that has to do with model hallucinations.
I just wrote a tutorial on how you can scale Mistral-7b to many GPUs in the cloud; hopefully it gives you some value. Not sure if you're looking to do on-demand inference or batch inference over a large set of inputs.
https://www.reddit.com/r/LocalLLaMA/comments/17k2x62/i_scaled_mistral_7b_to_200_gpus_in_less_than_5/
This is really cool! We're more focused on lengthy workloads, e.g. running 500k inputs through an LLM in one batch rather than on-demand inference (though we're starting to support that too). Right now the startup time is pretty long (2-5 minutes), but we're working on cutting it down.
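If it helps picture the batch use case, here's a rough sketch of what pushing a whole list of inputs through in one go looks like with vLLM and Mistral-7B. This isn't our actual stack, just an illustration; the model name, prompts, and sampling params are placeholders.

```python
# Minimal sketch of batch (offline) inference, assuming vLLM and Mistral-7B.
# Illustrative only -- prompts and parameters below are placeholders.
from vllm import LLM, SamplingParams

# Stand-in for the real workload (e.g. 500k records); just 1k dummy prompts here.
prompts = [f"Summarize record {i}: ..." for i in range(1000)]

sampling_params = SamplingParams(temperature=0.7, max_tokens=256)
llm = LLM(model="mistralai/Mistral-7B-Instruct-v0.1")

# One call over the whole list; vLLM batches the requests internally,
# as opposed to serving each prompt on demand behind an API.
outputs = llm.generate(prompts, sampling_params)
for out in outputs:
    print(out.outputs[0].text[:80])
```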
This is really useful feedback. I'd definitely be able to produce a revenue-generating product faster if I focus on chatbots… so in terms of trying to get funding for this idea, that seems like the better avenue. In the future I could definitely address both use cases, but I'm trying not to spread myself too thin at the moment.