Hi all, I’m wondering if there’s a way to spread the load of a local LLM across multiple hosts instead of adding GPUs to speed up responses. My hosts don’t have GPUs since I want to be power efficient, but they each have a decent amount of RAM (128 GB). Thanks for all ideas.
Check this out https://github.com/ggerganov/llama.cpp#mpi-build
On another note, these GPU manufacturers must get their heads out of their asses and start cranking out cards with much higher memory capacities. The first one to do it cost-effectively will gain massive market share and huge profits. Nvidia’s A100 etc. doesn’t qualify, as it’s prohibitively expensive.
If I didn’t misunderstand your question, the answer is Petals:
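Petals splits a large model across a swarm of machines, with each peer serving a slice of the transformer blocks while the client runs only tokenization and the embedding/head layers locally. As a rough sketch of what the client side looks like (the model name and API calls here are assumptions based on my reading of the Petals README, not something from this thread, so double-check against the docs):

```python
# Minimal Petals client sketch: the heavy transformer blocks are executed by
# remote peers in the swarm, so this process needs relatively little memory.
# Model name and class names are assumptions taken from the Petals docs.
from transformers import AutoTokenizer
from petals import AutoDistributedModelForCausalLM

model_name = "petals-team/StableBeluga2"  # hypothetical choice of swarm-hosted model

tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoDistributedModelForCausalLM.from_pretrained(model_name)

inputs = tokenizer("A distributed LLM is", return_tensors="pt")["input_ids"]
outputs = model.generate(inputs, max_new_tokens=20)
print(tokenizer.decode(outputs[0]))
```

Each participating host would also run a Petals server process to serve its share of the model blocks; the exact server command is in the Petals README.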