Quantisation techniques difference?

No-Belt7582@alien.top · 1 year ago

Quantisation techniques difference?

Dead_Internet_Theory@alien.top · 1 year ago

Yeah, EXL2 is awesome. It’s kinda black magic how GPUs that were released way before ChatGPT was a twinkle in anyone’s eye can run something that can trade blows with it. I still don’t get how fractional bpw (bits per weight) is even possible. What the hell, 2.55 bits, man 😂 How does it even run at all after that? It’s magic, that’s what it is.
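For what it's worth, the fractional number is most likely just an average. As far as I understand it, EXL2 mixes different bit widths across the model's weights, so no single weight is stored in 2.55 bits. Here's a minimal sketch of that arithmetic. The function name and the 45/55 split are made up for illustration and not taken from any real EXL2 model, and it ignores the extra storage real formats need for things like scales:

```python
def average_bpw(groups):
    """Return the average bits per weight for a mix of quantization levels.

    groups: list of (num_weights, bits_per_weight) pairs.
    """
    total_bits = sum(n * bits for n, bits in groups)
    total_weights = sum(n for n, _ in groups)
    return total_bits / total_weights


# Hypothetical mix: every weight uses a whole number of bits,
# but the average over all weights comes out fractional.
mix = [
    (45, 2),  # 45 weights stored at 2 bits each
    (55, 3),  # 55 weights stored at 3 bits each
]

print(average_bpw(mix))
```

So a "2.55 bpw" model can be read as some weights kept at lower precision and others at higher precision, averaging out to that figure.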