Hi everybody!
Inspired by a recent thread mentioning the insane abilities of Goliath, I decided to merge four SFT Yi models to make two separate 55B Yi models, one with a 200K context and one with 32K.
Try them out and let me know!
What are the eval results?
Did you do post-merge retraining? Without at least some, results are going to be poor…
Very cool. How did you merge them?
