nsfw_throwitaway69@alien.topB to

LocalLLaMA@poweruser.forum · English · 2 years ago

Venus-120b: A merge of three different models in the style of Goliath-120b


Hi everyone, I’d like to share something that I’ve been working on for the past few days: https://huggingface.co/nsfwthrowitaway69/Venus-120b-v1.0

This model is the result of interleaving layers from three different models: Euryale-1.3-L2-70B, Nous-Hermes-Llama2-70b, and SynthIA-70B-v1.5. The merged model is larger than any of the three source models. I have branches on the repo for exl2 quants at 3.0 and 4.85 bpw, which allow the model to run in 48GB or 80GB of VRAM, respectively.
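For anyone curious how the numbers work out, here is a rough sketch in Python. It is not the actual recipe used for Venus-120b: the slice length, the stride, and the round-robin order are assumptions picked for illustration, and the 120e9 parameter count is just the nominal figure from the model name. The only hard fact baked in is that Llama-2-70B models have 80 transformer layers. The size estimate covers the weights only; context cache and activations need extra VRAM on top.

```python
from typing import List, Tuple


def interleave_plan(
    models: List[str],
    n_layers: int = 80,   # Llama-2-70B has 80 transformer layers
    slice_len: int = 16,  # hypothetical window size, not the author's setting
    stride: int = 8,      # hypothetical step; stride < slice_len means windows overlap
) -> List[Tuple[str, int, int]]:
    """Build a list of (model, start, end) layer windows, cycling through the models.

    Each window covers layers [start, end) of one source model. Stacking the
    windows in order gives a deeper model than any single source.
    """
    plan = []
    start = 0
    i = 0
    while start + slice_len <= n_layers:
        plan.append((models[i % len(models)], start, start + slice_len))
        start += stride
        i += 1
    return plan


def weight_gb(n_params: float, bpw: float) -> float:
    """Approximate size of the quantized weights in GB (10^9 bytes)."""
    return n_params * bpw / 8 / 1e9


if __name__ == "__main__":
    sources = ["Euryale-1.3-L2-70B", "Nous-Hermes-Llama2-70b", "SynthIA-70B-v1.5"]
    plan = interleave_plan(sources)
    for name, lo, hi in plan:
        print(f"{name}: layers {lo}-{hi}")
    print("stacked layers:", sum(hi - lo for _, lo, hi in plan))

    # Nominal 120e9 parameters is an assumption taken from the model name.
    for bpw in (3.0, 4.85):
        print(f"{bpw} bpw -> about {weight_gb(120e9, bpw):.2f} GB of weights")
```

With a nominal 120e9 parameters, the weights alone come to about 45 GB at 3.0 bpw and about 72.75 GB at 4.85 bpw, which is consistent with the 48GB and 80GB targets once you leave some room for context.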

I love using LLMs for RPs and ERPs, so my goal was to create something similar to Goliath, which is honestly the best roleplay model I’ve ever used. I’ve done some initial testing with it and so far the results seem encouraging. I’d love to get some feedback on this from the community! Going forward, my plan is to do more experiments with merging models together, possibly going even larger than 120b parameters to see where the gains stop.


ambient_temp_xeno@alien.topB
2 years ago
I still have this feeling in my gut that closedai have been doing this for a while. It seems like a free lunch.
- Charuru@alien.topB
  2 years ago
  I don’t think so. This is something you do when you’re GPU poor; closedai would just not undertrain their models in the first place.