Last month, we announced LoRAX (LoRA eXchange), a framework that makes it possible to serve hundreds of fine-tuned LLMs on one GPU with minimal degradation in throughput and latency (see the original LoRAX blog). Today, we’re excited to release LoRAX to the open-source community under the permissive, commercial-friendly Apache 2.0 license.
What is LoRAX?
LoRAX works by loading fine-tuned “adapter” weights dynamically at runtime. Combined with an optimized caching and scheduling policy that fuses requests for multiple adapters into a single batch, this gives you the best of both worlds: low-cost serving with high performance. 💸 🏎️
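To make the adapter exchange concrete, here’s a minimal sketch using the pip-installable Python client (lorax-client). The server address and adapter IDs are hypothetical placeholders, and the exact client signature may differ from the repo’s current README:

from lorax import Client

# Point the client at a running LoRAX server (address is a placeholder).
client = Client("http://127.0.0.1:8080")

# Each request names the fine-tuned adapter it wants. LoRAX loads those
# adapter weights on demand and can fuse requests for different adapters
# into a single batch against the shared base model.
summary = client.generate(
    "Summarize: LoRAX serves many fine-tuned models on one GPU.",
    adapter_id="my-org/summarization-adapter",  # hypothetical adapter ID
    max_new_tokens=64,
).generated_text

translation = client.generate(
    "Translate to French: open-source serving for fine-tuned LLMs.",
    adapter_id="my-org/translation-adapter",  # hypothetical adapter ID
    max_new_tokens=64,
).generated_text

# Omit adapter_id to run the request against the base model alone.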
Why open source?
At Predibase, we believe the future is smaller, faster, cheaper fine-tuned models. To get there, we as a community must work together to make serving fine-tuned models cost-competitive with the big commercial APIs.
As the core maintainers of Ludwig (https://ludwig.ai/) and Horovod (https://github.com/horovod/horovod), we’re no strangers to building communities around open-source AI. This isn’t a side project for us; it’s the foundation of our mission. 💪
Why join the LoRAX community?
🚢 Built for scale. LoRAX isn’t an academic project; it’s production infrastructure. Batteries included: pre-built Docker images, Helm charts for Kubernetes, and built-in metrics and telemetry (see the deployment sketch after this list).
🤝 Research meets production. LoRAX brings the best ideas from research together in a single production framework. For example, we recently integrated the SGMV kernel from Punica (https://arxiv.org/abs/2310.18547) for significant performance improvements.
🕊️ Commercially viable, always. Whether you’re an individual developer or an AI platform like Predibase, you can build on LoRAX thanks to the permissive Apache 2.0 license.
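To ground the batteries-included point above, here’s a sketch of the quickstart path: launch the server from the pre-built Docker image, then talk to it over the REST API. The image name, flags, and endpoint below follow the repo README as of this writing; check the README for the current invocation:

import requests

# The server is launched from the pre-built image, along the lines of
# (exact flags may differ across versions):
#   docker run --gpus all --shm-size 1g -p 8080:80 \
#       ghcr.io/predibase/lorax:main --model-id mistralai/Mistral-7B-Instruct-v0.1

# The REST API mirrors the Python client: generation parameters,
# including the per-request adapter_id, go in the "parameters" object.
resp = requests.post(
    "http://127.0.0.1:8080/generate",
    json={
        "inputs": "What is LoRAX?",
        "parameters": {
            "max_new_tokens": 64,
            "adapter_id": "my-org/qa-adapter",  # hypothetical adapter ID
        },
    },
    timeout=60,
)
resp.raise_for_status()
print(resp.json()["generated_text"])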
Try LoRAX yourself today, and join the community to contribute and receive updates as we continue to invest in growing the project in the weeks and months ahead.
Blog: https://predibase.com/blog/lorax-the-open-source-framework-for-serving-100s-of-fine-tuned-llms-in
GitHub: https://github.com/predibase/lorax