For anyone who's interested: here's some code that will do this. As long as you have some knowledge of Python and conda, you should be able to get it to work. Just follow the instructions. Maybe.
Sounds like you might be using the standard Transformers loader. Try exllama or exllamav2.
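If you want a feel for what switching loaders looks like, here's a rough sketch of loading a quantized model with exllamav2, modeled on its example scripts. The model path is a placeholder and exact class names can shift between versions, so treat it as a starting point, not a drop-in script.

```python
# Rough exllamav2 inference sketch (follows the project's example scripts;
# the model directory is a placeholder for your own EXL2/GPTQ quant).
from exllamav2 import ExLlamaV2, ExLlamaV2Config, ExLlamaV2Cache, ExLlamaV2Tokenizer
from exllamav2.generator import ExLlamaV2BaseGenerator, ExLlamaV2Sampler

config = ExLlamaV2Config()
config.model_dir = "/path/to/your-exl2-quant"  # placeholder
config.prepare()

model = ExLlamaV2(config)
cache = ExLlamaV2Cache(model, lazy=True)   # KV cache allocated as the model loads
model.load_autosplit(cache)                # spreads layers across available GPUs

tokenizer = ExLlamaV2Tokenizer(config)
generator = ExLlamaV2BaseGenerator(model, cache, tokenizer)

settings = ExLlamaV2Sampler.Settings()
settings.temperature = 0.8
settings.top_p = 0.9

print(generator.generate_simple("Tell me about llamas.", settings, 200))
```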
Well that’s because he’s not. Sam is actually my dad.
Lol, cause Musk is totally reliable with what he says.
I didn’t say it wasn’t. But getting into LLMs really just shows you how much better your PC could be, and you will never be as cutting edge as you think or want.
Everything on that page is hype for something that doesn’t exist.
Uuummmm no. It’s for sure real. And the best one out there. No questions asked. It’s better than ChatGPT-4, and OpenAI has been trying to hack this new company to get the 600b model because they are scared that it will end OpenAI for good.
Obligatory /s
It’s the best out there… but no, you can’t try it, because it’s too dangerous.
So you are soon gunna realize that, unfortunately, your PC is not as cutting edge as you think. Your main need is VRAM, and the 4070 Ti only has 12 GB of it, so you will be limited to 7B and 13B models. You can offload into system RAM, but your speeds plummet. Mistral 7B is a good option to start with.
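If you do end up spilling into system RAM, the usual trick is partial offload: keep as many layers on the GPU as the 12 GB allows and let the CPU handle the rest. A minimal sketch with llama-cpp-python and a GGUF quant of Mistral 7B (the file name and layer count are placeholders you'd tune yourself):

```python
# Minimal partial-offload sketch using llama-cpp-python with a GGUF quant.
# The model file name is a placeholder; n_gpu_layers is a guess you'd raise
# until the model just barely fits in 12 GB of VRAM.
from llama_cpp import Llama

llm = Llama(
    model_path="./mistral-7b-instruct.Q4_K_M.gguf",  # placeholder file name
    n_gpu_layers=35,   # layers kept on the GPU; the rest run (slowly) from RAM
    n_ctx=4096,        # context window
)

out = llm("Explain VRAM vs system RAM in one sentence.", max_tokens=64)
print(out["choices"][0]["text"])
```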
RAM bandwidth is still gunna screw you over.
I’ll fuck around with it when I get home.
Ok. So I’m basically an idiot. What does this mean, and which one should I use?
TheBloke has entered the chat.
How many tokens/s do you get with the P40? I’ve been contemplating getting one and using it alongside my 3060 12 gig.
Is…… that a thing? I need that in my life.
I didn’t know I needed this information till now
For a 34B model you should be fine. I run 34B models on my dual 3060s and it’s very nice, usually around 20 tokens a second. If you want to run a 7B model you get basically instant results; with Mistral 7B I’m getting almost 60 tokens a second. It’s crazy. But it really depends on what you are using it for and how much accuracy you need.
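If you want to measure your own numbers instead of taking mine, a quick-and-dirty tokens/s check with llama-cpp-python looks something like this (the GGUF file name is a placeholder):

```python
# Quick-and-dirty tokens/s benchmark with llama-cpp-python.
# The model file name is a placeholder; n_gpu_layers=-1 tries to put every layer on the GPU.
import time
from llama_cpp import Llama

llm = Llama(model_path="./your-model.Q4_K_M.gguf", n_gpu_layers=-1, verbose=False)

start = time.time()
out = llm("Write a short story about a robot.", max_tokens=256)
elapsed = time.time() - start

n_gen = out["usage"]["completion_tokens"]
print(f"{n_gen} tokens in {elapsed:.1f}s -> {n_gen / elapsed:.1f} tokens/s")
```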