Venus-120b: A merge of three different models in the style of Goliath-120b

nsfw_throwitaway69@alien.top · 1 year ago


a_beautiful_rhind@alien.top · 1 year ago

Sadly, it doesn’t work on 48GB like the other 120b models. It can only fit under 2048 context; otherwise it goes OOM.

nsfw_throwitaway69@alien.top · 1 year ago

Crap, what’s your setup? I tested it with a single 48GB card, but if you’re using 2x 24GB then it might not work. I’ll have to make a 2.8 bpw quant (or get someone else to do it) so that it’ll work with card splitting.

a_beautiful_rhind@alien.top · 1 year ago

I have 2x 3090 for exl2. I have Tess and Goliath, and both fit with ~3400 context, so somehow your quant is slightly bigger.

nsfw_throwitaway69@alien.top · 1 year ago

Venus-120b is actually a bit bigger than Goliath-120b. Venus has 140 layers and Goliath has 136 layers, so that would explain it.
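
For a rough sense of scale, here is a back-of-the-envelope sketch of what four extra layers could cost in VRAM. It is only an estimate under stated assumptions: it assumes every layer has Llama-2-70B-style dimensions (hidden size 8192, MLP size 28672, 8 KV heads of dimension 128), that the quant averages `bpw` bits per weight across all layers, and that the KV cache is stored in FP16. None of those details are confirmed in this thread, and the function names are made up for illustration.

```python
# Rough VRAM arithmetic for the layer-count difference discussed above.
# ASSUMPTION: Llama-2-70B-style layer dimensions (not confirmed in the thread).
HIDDEN = 8192        # model hidden size
INTERMEDIATE = 28672 # MLP intermediate size
KV_HEADS = 8         # grouped-query attention KV heads
HEAD_DIM = 128       # per-head dimension


def params_per_layer() -> int:
    """Approximate weight count of one decoder layer (norm weights ignored)."""
    kv_width = KV_HEADS * HEAD_DIM
    attention = 2 * HIDDEN * HIDDEN + 2 * HIDDEN * kv_width  # q, o + k, v
    mlp = 3 * HIDDEN * INTERMEDIATE                          # gate, up, down
    return attention + mlp


def layer_ratio(layers_a: int, layers_b: int) -> float:
    """How much deeper model A is than model B."""
    return layers_a / layers_b


def extra_weight_bytes(extra_layers: int, bpw: float) -> float:
    """Bytes needed for the extra layers at an average of `bpw` bits per weight."""
    return extra_layers * params_per_layer() * bpw / 8


def kv_cache_bytes_per_token(layers: int, bytes_per_value: int = 2) -> int:
    """KV cache bytes per token of context (keys + values, FP16 by default)."""
    return 2 * KV_HEADS * HEAD_DIM * bytes_per_value * layers


if __name__ == "__main__":
    bpw = 3.0  # hypothetical average bits per weight; substitute the real quant's value
    print(f"depth ratio: {layer_ratio(140, 136):.4f}")
    print(f"extra weights: {extra_weight_bytes(140 - 136, bpw) / 1e9:.2f} GB")
    print(f"KV cache per token (140 layers): {kv_cache_bytes_per_token(140)} bytes")
```

Dividing the extra weight bytes by the KV cache bytes per token gives a ballpark for how much context the additional layers displace, which is one way to sanity-check the gap between the two models on the same cards.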

a_beautiful_rhind@alien.top · 1 year ago

Makes sense… it’s doing pretty well. I like the replies. I set the limit to 3400 in tabby; no OOM yet, but VRAM usage is at 98%/98%. I assume this means I can bump the other models past 3400 too if I’m using tabby and autosplit.