I’ve been working on a project with my roommate to make it incredibly simple to run batch inference on LLMs while leveraging a massive amount of cloud resources. We finally got the tool working and created a tutorial on how to use it with Mistral 7B.
Also, if you’re a frequent HuggingFace user, you can easily adapt the code to run inference on other LLMs. Please test it out and provide feedback; I feel really good about how easy it is to use, but I want to find out if anything is unintuitive. I hope the community is able to get some value out of it! Here’s the link to the tutorial: https://docs.burla.dev/Example:%20Massively%20Parallel%20Inference%20with%20Mistral-7B
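To give a sense of what adapting it looks like, here’s a rough sketch of the general pattern (simplified, not the exact code from the tutorial, so defer to the docs for the real `remote_parallel_map` signature): wrap your usual HuggingFace inference code in a plain Python function, then hand that function and your inputs over. The model ID is just an example you can swap for any other causal LM on the Hub.

```python
# Rough sketch of the pattern: any HuggingFace causal LM can be swapped in
# below, and the chunks of prompts get fanned out across cloud VMs.
# The remote_parallel_map call is simplified; see the tutorial for the
# exact signature.
from transformers import AutoModelForCausalLM, AutoTokenizer

def run_inference(prompts):
    # Swap this model ID for any other causal LM on the Hub.
    model_id = "mistralai/Mistral-7B-Instruct-v0.1"
    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(
        model_id, torch_dtype="auto", device_map="auto"
    )

    outputs = []
    for prompt in prompts:
        inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
        generated = model.generate(**inputs, max_new_tokens=128)
        outputs.append(tokenizer.decode(generated[0], skip_special_tokens=True))
    return outputs

if __name__ == "__main__":
    from burla import remote_parallel_map

    # Each chunk of prompts becomes one call to run_inference on its own machine.
    prompt_chunks = [
        ["Write a haiku about GPUs."],
        ["Summarize the plot of Dune in two sentences."],
    ]
    results = remote_parallel_map(run_inference, prompt_chunks)
    print(results)
```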
I’m in the middle of building my app on Modal. Guess I’ll adapt it to run on your service and see. Thanks for sharing!
This is really cool! We’re more focused on lengthy workloads, e.g. running 500k inputs through an LLM in one big batch rather than on-demand inference (though we’re starting to support that too). Right now the startup time is pretty long (2-5 minutes), but we’re working on cutting it down.
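For the curious, here’s a toy sketch of what that “one big batch” pattern looks like (chunk size and the placeholder “model” are made up for illustration; again, see the docs for the real API):

```python
# Toy example: submit one large batch of prompts, split into chunks so each
# worker handles a slice. Chunk size and the echo "model" are placeholders.
from burla import remote_parallel_map  # see the docs for the exact API

def run_inference(prompts):
    # A real version would load a model and generate text (see the sketch above);
    # this just echoes the prompts to keep the example short.
    return [f"output for: {p}" for p in prompts]

prompts = [f"prompt #{i}" for i in range(500_000)]  # the whole dataset up front
chunk_size = 1_000
chunks = [prompts[i:i + chunk_size] for i in range(0, len(prompts), chunk_size)]

# One call submits all 500 chunks; workers churn through them in parallel
# instead of waiting for requests to arrive one at a time.
results = remote_parallel_map(run_inference, chunks)
```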