We’re proud to introduce Rocket-3B 🦝, a state-of-the-art 3 billion parameter model!
🌌 Size vs. Performance: Rocket-3B may be smaller with its 3 billion parameters, but it punches way above its weight. In head-to-head benchmarks like MT-Bench and AlpacaEval, it consistently outperforms models up to 20 times larger.
🔍 Benchmark Breakdown: In MT-Bench, Rocket-3B achieved an average score of 6.56, excelling in various conversation scenarios. In AlpacaEval, it notched a near 80% win rate, showcasing its ability to produce detailed and relevant responses.
🛠️ Training: The model is fine-tuned from Stability AI’s StableLM-3B-4e1t, employing Direct Preference Optimization (DPO) for enhanced performance.
📚 Training Data: We’ve amalgamated multiple public datasets to ensure a comprehensive and diverse training base. This approach equips Rocket-3B with a wide-ranging understanding and response capability.
👩‍💻 Chat format: Rocket-3B follows the ChatML format.
For an in-depth look at Rocket-3B, visit Rocket-3B’s Hugging Face page.
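Since the announcement says the model was fine-tuned with Direct Preference Optimization, here is a minimal sketch of the DPO objective for a single preference pair. It assumes per-response log-probabilities (summed over response tokens) are already computed for both the policy being trained and the frozen reference model; the function name and `beta` default are illustrative, not taken from Rocket-3B’s actual training code.

```python
import math

def dpo_loss(policy_logp_chosen, policy_logp_rejected,
             ref_logp_chosen, ref_logp_rejected, beta=0.1):
    """DPO loss for one (chosen, rejected) preference pair.

    Each argument is a summed log-probability of the full response
    under either the policy or the frozen reference model.
    beta controls how strongly the policy is pulled toward the
    preferred responses relative to the reference.
    """
    # How much more likely each response became under the policy
    # compared to the reference model.
    chosen_margin = policy_logp_chosen - ref_logp_chosen
    rejected_margin = policy_logp_rejected - ref_logp_rejected
    # -log sigmoid(beta * (chosen improvement - rejected improvement))
    z = beta * (chosen_margin - rejected_margin)
    return -math.log(1.0 / (1.0 + math.exp(-z)))
```

When the policy has not moved from the reference, both margins are zero and the loss is `log 2`; preferring the chosen response pushes the loss below that.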
Any details on what max context sizes are usable?
I think I need to remind people of the benchmarks used, MT-Bench and AlpacaEval are terrible benchmarks.
As a fan of the character, I approve 👍
Oh wow, this seems almost too good to be true
Woooooooow!
This smells like leftovers…
We’ve been having “pretraining on the test set” for weeks and I’m craving something else.
I think “The Bloke” takes requests for GGUF conversions. Might want to check Hugging Face.
!RemindMe 7 days
📚 Training Data: We’ve amalgamated multiple public datasets to ensure a comprehensive and diverse training base. This approach equips Rocket-3B with a wide-ranging understanding and response capability.
We’ve amalgamated multiple public benchmark answers to ensure a contaminated and diverse training base.
Looking forward to trying this when some GGUF’s are available.
Seems this model has a problem and isn’t loading.
It was recently fixed, then.
Finally, I can integrate AI into my Arduino project and build my own version of BB-8.
Tried the GGUF format of this model from Hugging Face and it just won’t load.
I tried both GGUF models currently on HF. Same result.
Curious to try this out when it’s working!
Same, even the model from the bloke that was released hours ago wouldn’t work :-(
The latest version of KoboldCpp v1.50.1 now loads this model properly.
👩💻 Chat format: Rocket-3B follows the ChatML format.
From the README and the tokenizer.json it looks like it’s using a textual representation of ChatML on top of StableLM’s format. Just in case this trips anyone up.
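For anyone formatting prompts by hand: below is a small sketch of the ChatML layout the model card references, where `<|im_start|>` and `<|im_end|>` are plain text strings rather than dedicated special tokens (per the comment above about the textual representation). The helper name and message structure are illustrative assumptions, not part of Rocket-3B’s official tooling.

```python
def to_chatml(messages, add_generation_prompt=True):
    """Render a list of {"role", "content"} dicts in ChatML.

    Each turn becomes:  <|im_start|>role\ncontent<|im_end|>
    If add_generation_prompt is True, an open assistant turn is
    appended so the model continues from there.
    """
    parts = []
    for m in messages:
        parts.append(f"<|im_start|>{m['role']}\n{m['content']}<|im_end|>")
    if add_generation_prompt:
        parts.append("<|im_start|>assistant\n")
    return "\n".join(parts)

prompt = to_chatml([
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Hello!"},
])
```

If a loader supports chat templates, prefer the template shipped in the model repo over hand-rolling strings like this, since the exact newline placement can differ between models.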