Cheapest site for hosting custom LLM models?

StrangeImagination5@alien.top · 1 year ago

Cheapest site for hosting custom LLM models?

AntoItaly@alien.top · 1 year ago

Replicate: $0.000575/sec for an Nvidia A40 (48 GB VRAM).

yahma@alien.top · 1 year ago

The startup time makes Replicate nearly unusable for me. Only popular models stay in memory. Less-used models shut down, so you have to wait for a cold start before the first inference.

No_Baseball_7130@alien.top · 1 year ago

0.000575

That works out to about $2.07 per hour. On https://runpod.io, you could get an A40 for $0.79/hr. For a 34B model, 24 GB of VRAM is more than enough, so you could get an A5000 for around $0.44/hr.
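
To make the arithmetic easy to check, here is a minimal Python sketch that converts the per-second rate to an hourly one and lists the three prices quoted in this thread side by side. The function name and the dictionary labels are just illustrative, and the comparison assumes the GPU is billed for a full hour of continuous use; prices are the ones quoted here and may have changed.

```python
SECONDS_PER_HOUR = 3600

def per_second_to_per_hour(rate_per_sec: float) -> float:
    """Convert a per-second price into the equivalent per-hour price."""
    return rate_per_sec * SECONDS_PER_HOUR

# Prices as quoted in this thread (USD); labels are illustrative only.
prices_per_hour = {
    "Replicate A40": per_second_to_per_hour(0.000575),
    "RunPod A40": 0.79,
    "RunPod A5000": 0.44,
}

# Print cheapest first.
for name, price in sorted(prices_per_hour.items(), key=lambda kv: kv[1]):
    print(f"{name}: ${price:.2f}/hr")
```

If your workload is bursty rather than continuous, the hourly figure overstates what per-second billing would actually cost, so it is worth multiplying by your expected busy seconds instead.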