I wonder what the performance degradation is after quantising. For other models, some users reported that quantisation significantly degraded capabilities in other languages, and this model seems to be at least 50% Chinese.
We know basically nothing about Grok. Elmo made no effort to share any meaningful information as far as I’m aware.
The problem with 70B is that it is incrementally better than smaller models, but is still nowhere near competitive with GPT-4, so it is stuck in no man’s land.
Once we finally get an open source model or architecture that can spar with even GPT-4, let alone GPT-5, there will be much more interest in large models.
Regarding Falcon Chat 180B, in my tests and for my use cases it's no better than a fine-tuned Llama 2 70B, which is a shame. It makes me think there is something fundamentally wrong with Falcon, beyond the laughably small context window.
You merged Starling with Starling? What merge method did you use? Can you share the YAML?
How was the model size increased to 11B? It's a merge, but with what?
Big is an understatement. Please do correct me if I got it wildly wrong, but it appears to be a 3.6TB colossus.
Take a look here:
and here to some extent, multimodal application:
Unquantized, 34B models require at least 65GB of VRAM, plus extra depending on context length. I cannot see how your comparison of alternatives can work.
You would need 3 x 4090s to run the model.
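As a rough back-of-the-envelope check (a minimal sketch, assuming fp16 weights at 2 bytes per parameter and ignoring KV cache and activation overhead), this is where those numbers come from:

```python
# Rough VRAM estimate for an unquantized (fp16) model.
# Assumption: 2 bytes per parameter; KV cache and activations add more on top.

def fp16_weight_gib(num_params_billions: float) -> float:
    """Approximate weight memory in GiB for fp16 (2 bytes per parameter)."""
    return num_params_billions * 1e9 * 2 / 1024**3

weights_gib = fp16_weight_gib(34)        # ~63 GiB of weights for a 34B model
gpus_needed = -(-weights_gib // 24)      # ceiling division over 24 GiB RTX 4090s

print(f"~{weights_gib:.0f} GiB of weights -> at least {int(gpus_needed)} x 24 GiB GPUs")
```

That lands at roughly 63 GiB of weights alone, hence the three 4090s before you even account for context.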
I think that was the fundamental difference between Altman and the board. He wanted commercial products and profit, the board wanted something larger. Thus it's extremely unlikely that Altman will be the open-source hero.
I’d like to think that this will refocus OpenAI towards fundamental research that delivers ASI, rather than efforts to commercialise fragments of it.
With WSL2 you can even mount a native ext4 drive and have the whole environment run at near-native Linux speed.
Which koboldcpp version allows you to set the sampler order? The latest main branch does not have this available, on Linux.