ggerganov to LocalLLaMA@poweruser.forum • Need help setting up a cost-efficient llama v2 inference API for my micro saas app • 1 year ago
I just wrote a post today about serving 7B models with `llama.cpp` from cheap AWS instances - might be useful:
https://github.com/ggerganov/llama.cpp/discussions/4225
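
As a rough illustration (not taken from the linked post; the host, port, and default settings below are assumptions based on llama.cpp's server example, which exposes a `/completion` endpoint), querying a running server from Python might look like:

```python
import requests  # assumes a llama.cpp server instance is reachable at this address

# Hypothetical address: the llama.cpp server example listens on port 8080 by default
LLAMA_SERVER_URL = "http://localhost:8080/completion"

def complete(prompt: str, n_predict: int = 128) -> str:
    """Send a completion request to a running llama.cpp server and return the generated text."""
    resp = requests.post(
        LLAMA_SERVER_URL,
        json={"prompt": prompt, "n_predict": n_predict},
        timeout=120,
    )
    resp.raise_for_status()
    return resp.json()["content"]

if __name__ == "__main__":
    print(complete("Briefly explain what a 7B parameter model is:"))
```

For the instance sizing, quantization choices, and cost numbers, see the linked discussion.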