How to minimize model inference costs?
keklsh@alien.top to LocalLLaMA@poweruser.forum · English · 1 year ago · 5 comments
keklsh@alien.top (OP) · 1 year ago:
Nah, do a simple calculation: a 3090 rents for about $0.22/hr, and a single card doesn't even have enough VRAM to run 70B at 4-bit, so you need at least two ($0.44/hr). Generating 1M tokens at 20 t/s takes ~13.9 hours, which works out to roughly $6/million tokens.
That’s extremely expensive compared to the API price.
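If you want to plug in your own numbers, here's the back-of-envelope math as a quick Python sketch. The GPU count, rental price, and throughput are just the assumptions from above; swap in your own:

```python
# Back-of-envelope rental cost per million generated tokens.
# Assumptions from the comment above: 2x 3090 at $0.22/hr each,
# ~20 tokens/s of generation throughput.

def cost_per_million_tokens(gpu_count: int,
                            usd_per_gpu_hour: float,
                            tokens_per_second: float) -> float:
    """Dollars of GPU rental burned per 1M generated tokens."""
    hours_per_million = 1_000_000 / tokens_per_second / 3600
    return hours_per_million * usd_per_gpu_hour * gpu_count

if __name__ == "__main__":
    # 70B at 4-bit is ~40 GB of weights, more than one 24 GB 3090 holds,
    # so price in at least two cards.
    print(f"${cost_per_million_tokens(2, 0.22, 20):.2f}/M tokens")  # ~$6.11
```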