2 days? Bro if they said November and haven’t released it by now, it’s not two days.
How much would that be in H100s or H200s?
Is this Q*? https://arxiv.org/abs/2311.15648
Note: HF is reportedly blocked in China.
iirc China has their own version of huggingface. Modelscope?
What’s so special about Q*?
what about the 2.5%?
Been using lmql since guidance appeared to be abandoned
What do you mean, abandoned? It seems to have been getting updates for some time now.
H200? There’s a new accelerator?
Has anyone tested this?
It’s all a rabbit hole of time wasting, imho. People judge x or y model on how well it works for their use cases.
Well, people don’t want to be misled about the model’s capabilities. If it’s only good for certain use cases, then just say so.
Socialism = public monopoly (the state, or a private monopoly)
How is a private monopoly a public monopoly?
Here’s the GitHub page: https://github.com/togethercomputer/RedPajama-Data It’s open-source, with 30 trillion tokens of data.
Not just private but closed access.