🚀 Launching SauerkrautLM-7b-HerO: A New Era in German Language Modeling!

AffectionateCan2342@alien.top · 10 months ago

You could at least argue that merging has a scientific basis in the papers published on the topic. Here are a few examples: https://arxiv.org/abs/2306.01708 https://arxiv.org/abs/2203.05482 https://arxiv.org/abs/2204.03044

Nevertheless, it must be admitted that some merges that should achieve good results on paper only produce gibberish in practice or vice versa. So you probably need a bit of luck ;-)
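
For anyone unfamiliar with the idea, here is a minimal sketch of the simplest kind of merge: a plain (optionally weighted) average of the parameters of several fine-tuned checkpoints that share one architecture. This is only an illustration, not the recipe used for SauerkrautLM-7b-HerO. The function name `average_state_dicts` and the toy parameter names are made up for the example, and parameters are represented as flat lists of floats rather than real tensors.

```python
def average_state_dicts(state_dicts, weights=None):
    """Merge checkpoints by weighted parameter averaging.

    state_dicts: list of dicts mapping parameter name -> flat list of floats.
    weights: optional per-model weights; normalized to sum to 1.
    (Hypothetical helper for illustration; real checkpoints hold tensors.)
    """
    if not state_dicts:
        raise ValueError("need at least one checkpoint")

    # All checkpoints must expose exactly the same parameter names.
    names = set(state_dicts[0])
    for sd in state_dicts[1:]:
        if set(sd) != names:
            raise ValueError("checkpoints have different parameter names")

    n = len(state_dicts)
    if weights is None:
        weights = [1.0] * n  # uniform average by default
    if len(weights) != n:
        raise ValueError("need one weight per checkpoint")
    total = sum(weights)
    if total == 0:
        raise ValueError("weights must not sum to zero")
    weights = [w / total for w in weights]

    merged = {}
    for name in state_dicts[0]:
        params = [sd[name] for sd in state_dicts]
        length = len(params[0])
        if any(len(p) != length for p in params):
            raise ValueError(f"shape mismatch for parameter {name!r}")
        # Element-wise weighted sum across all checkpoints.
        merged[name] = [
            sum(w * p[i] for w, p in zip(weights, params))
            for i in range(length)
        ]
    return merged


# Toy usage: two "models" with a single two-element parameter each.
model_a = {"layer.weight": [1.0, 2.0]}
model_b = {"layer.weight": [3.0, 4.0]}
merged = average_state_dicts([model_a, model_b])
```

Nothing in this arithmetic guarantees that the averaged weights still form a sensible model, which is one way to see why a merge can look fine on paper and still fall apart in practice.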

For the German-speaking world, however, I can definitely say that our primary interest is not better benchmark numbers. It is making the English-language models usable in German, at least to some extent, without completely eliminating their cleverness. The more intelligent the original English model is before it is fine-tuned on German data, the less stupid it will be in German, and that is our goal as long as there are no German pretrained models.