Goliath-120B - quants and future plans

AlpinDale@alien.top · 2 years ago

Cybernetic_Symbiotes@alien.top · 2 years ago

This is highly interesting and unintuitive. Have you written down the details of your approach anywhere? Why did you interleave in the manner you did?

Have you tested on GSM8K or DROP? Something I noticed in the recent HFLB update is that a lot of high-flying Mistral merges scored poorly on those two benchmarks. DROP scores in particular plummeted.

AlpinDale@alien.top · 2 years ago

As I mentioned here, it’d perform poorly on benchmarks until it has gone through a few steps of full finetuning, so that the weight disagreement is ironed out.