TL;DR: We want to deploy a LLaMA 2 13B model on AWS and are looking for the best service to host it. AWS Bedrock is not preferred because we need full control over the model.
Hello everyone,
The company I work for is exploring options to host a LLaMA 2 13B model on an AWS service, but we’re unsure of the best choice. Our priorities are data privacy and maximum control over data processing. The deployment is solely for internal use.
We’ve tried AWS Bedrock, but have decided against it as it doesn’t provide us with complete control over the model.
Currently, we’re testing SageMaker and considering other options like an EC2 instance, but we’re uncertain about the most effective and economical solution.
I would appreciate hearing about your experiences.
Thanks in advance.
I deployed Llama 2 (GGUF format, running on CPU) as an Amazon ECS Fargate service.
I just pushed my entire Docker image to ECR and fired up the container.
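For anyone wondering what the code inside such a container might look like, here is a minimal sketch, assuming the llama-cpp-python package is installed in the image. It is not the setup described above, only one plausible way to run a GGUF model on CPU. The model path, thread count, and context size are hypothetical placeholders.

```python
from typing import Any, Callable, Dict

# Hypothetical path inside the container image; not taken from the post above.
MODEL_PATH = "/models/llama-2-13b.Q4_K_M.gguf"


def load_model(model_path: str = MODEL_PATH, n_threads: int = 4) -> Any:
    """Load a GGUF model for CPU-only inference with llama-cpp-python."""
    # Imported lazily so the rest of this module works without the package.
    from llama_cpp import Llama

    # n_ctx and n_threads are assumed values; tune them to the task size.
    return Llama(model_path=model_path, n_ctx=2048, n_threads=n_threads)


def complete(
    llm: Callable[..., Dict[str, Any]], prompt: str, max_tokens: int = 128
) -> str:
    """Run one completion and return only the generated text."""
    result = llm(prompt, max_tokens=max_tokens)
    return result["choices"][0]["text"]
```

Usage would be roughly `llm = load_model()` once at container start, then `complete(llm, "Your prompt here")` per request. Loading once matters because reading a multi-gigabyte GGUF file on every request would dominate latency.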
What about AWS Lambda? I tried it and it works. It is economical and fast.
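Whether a 13B model fits on Lambda depends mostly on quantization, since Lambda's maximum configurable memory is 10,240 MB. A rough back-of-the-envelope check, assuming the weights dominate memory use, could look like the sketch below. The 1.5 GB overhead is an assumed placeholder for the KV cache and runtime, and real quantized GGUF files are usually somewhat larger than the nominal bits per weight suggest.

```python
LAMBDA_MAX_MEMORY_MB = 10_240  # AWS Lambda's maximum configurable memory


def weights_size_gb(n_params: float, bits_per_weight: float) -> float:
    """Approximate size of the model weights in GB (1 GB = 1e9 bytes)."""
    return n_params * bits_per_weight / 8 / 1e9


def fits_in_memory(
    n_params: float,
    bits_per_weight: float,
    memory_mb: int,
    overhead_gb: float = 1.5,  # assumed allowance, not a measured figure
) -> bool:
    """True if weights plus the assumed overhead fit in memory_mb."""
    memory_gb = memory_mb * 1024 * 1024 / 1e9
    return weights_size_gb(n_params, bits_per_weight) + overhead_gb <= memory_gb


N_PARAMS = 13e9  # Llama 2 13B

for bits in (16, 8, 4):
    size = weights_size_gb(N_PARAMS, bits)
    ok = fits_in_memory(N_PARAMS, bits, LAMBDA_MAX_MEMORY_MB)
    print(f"{bits}-bit: ~{size:.1f} GB of weights, fits in Lambda: {ok}")
```

Under these assumptions only the 4-bit variant fits, which is why CPU deployments of this size tend to rely on quantized GGUF files. The same function can be pointed at a Fargate task size or an EC2 instance's memory to compare options.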