Hi all,
Just curious if anybody knows how much computing power is needed to run a llama server that can serve multiple users at once.
Any discussion is welcome :)
Have a look at https://www.runpod.io/ for AI cloud hosting. You could run some tests based on the number of users you want to cater for, and see what capacity you’ll get for your money.
Start with a basic plan, run some tests to see what it can handle, and compare the results as you scale up the number of users sending simultaneous queries.
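If it helps, here is a rough sketch of what such a test could look like in Python. It fires a given number of simultaneous queries and reports latency and throughput for each user count. Note that `fake_query` is only a placeholder I made up so the script runs on its own; you would swap in a real call to whatever endpoint your llama server exposes, and the user counts are just example values.

```python
import asyncio
import statistics
import time


async def fake_query(prompt: str) -> str:
    """Placeholder for a real request to your llama server (hypothetical)."""
    await asyncio.sleep(0.01)  # simulate a short server response time
    return "ok"


async def timed_call(send, prompt: str) -> float:
    """Run one query and return how long it took, in seconds."""
    start = time.perf_counter()
    await send(prompt)
    return time.perf_counter() - start


async def run_load_test(send, users: int, prompt: str = "Hello") -> dict:
    """Send `users` queries at the same time and summarise the latencies."""
    start = time.perf_counter()
    latencies = await asyncio.gather(
        *(timed_call(send, prompt) for _ in range(users))
    )
    wall = time.perf_counter() - start
    return {
        "users": users,
        "mean_s": statistics.mean(latencies),
        "max_s": max(latencies),
        "requests_per_s": users / wall,
    }


def sweep(send=fake_query, user_counts=(1, 2, 4, 8)) -> list:
    """Repeat the test for each user count so the results can be compared."""
    return [asyncio.run(run_load_test(send, n)) for n in user_counts]


if __name__ == "__main__":
    for row in sweep():
        print(row)
```

Once you replace the placeholder with real requests, the point where `mean_s` starts climbing sharply as `users` goes up is roughly where that plan runs out of capacity.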