Acceptable_Can5509@alien.top to LocalLLaMA@poweruser.forum · English · 2 years ago

Llama-2 7b Unquantized Transformers using 26.8 GB of VRAM.


I'm running Llama-2 7b unquantized with Transformers on Google Colab, on a 40 GB A100. However, it's using 26.8 GB of VRAM; is that normal? I also tried the 13b version, but the system ran out of memory. Yes, I know quantized versions are almost as good, but I specifically need unquantized.

https://colab.research.google.com/drive/10KL87N1ZQxSgPmS9eZxPKTXnobUR_pYT?usp=sharing
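
For scale: Transformers materializes checkpoints in float32 unless told otherwise, and 7B parameters at 4 bytes each come to roughly 26 GiB of weights alone, which lines up with the 26.8 GB reading; 13B at 4 bytes is about 48 GiB, more than a 40 GB A100 holds. Below is a minimal sketch of loading in float16 instead, which is still unquantized (half precision, not quantization). The model ID and loading calls are the standard Hugging Face ones and are an assumption here, not taken from the linked notebook.

```python
# Minimal sketch (assumptions: the standard meta-llama/Llama-2-7b-hf checkpoint
# and a plain Transformers + Accelerate setup; the linked notebook may differ).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "meta-llama/Llama-2-7b-hf"

tokenizer = AutoTokenizer.from_pretrained(model_id)

# from_pretrained() loads weights in float32 by default:
# 7B params * 4 bytes ~= 26 GiB, matching the observed 26.8 GB.
# torch_dtype=torch.float16 keeps the weights unquantized but half precision:
# ~13 GiB for 7B and ~24 GiB for 13B, so 13B should fit on a 40 GB A100.
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,
    device_map="auto",  # needs `accelerate`; places the model on the GPU
)
```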

