• 1 Post
  • 16 Comments
Joined 11 months ago
Cake day: October 30th, 2023

  • Yea, I’ve had my “honeymoon effect” with some new/large models like, say, Falcon and even Claude: they are inherently random, and that affects quality too. I’ve had great outputs from Falcon, for instance (on Petals), but also long stretches of mediocre and some outright bad ones… and also sometimes really great and creative output from 7B Mistral, especially with enough prompt tinkering and the sampling settings tuned “just right”. Objective evaluation of LLMs is extremely hard and time-consuming!
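
    For what it’s worth, here is roughly what I mean by tuning the sampling settings, as a minimal sketch with Hugging Face transformers; the model ID and the exact parameter values are just placeholders I picked, not a recommendation:

    ```python
    # Minimal sketch: how temperature / top_p / repetition_penalty are set.
    # Model ID and values below are placeholders, not a recommendation.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    model_id = "mistralai/Mistral-7B-Instruct-v0.1"  # assumed 7B Mistral checkpoint
    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

    prompt = "Write a short scene set in a lighthouse during a storm."
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

    # These knobs are what shifts output between bland and creative/chaotic.
    output = model.generate(
        **inputs,
        do_sample=True,
        temperature=0.9,
        top_p=0.95,
        repetition_penalty=1.1,
        max_new_tokens=300,
    )
    print(tokenizer.decode(output[0], skip_special_tokens=True))
    ```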


  • Can we have some non-cherry-picked examples of writing?

    Doesn’t have to be highly nsfw/whatever, but a comparison of Goliath’s writing with output from its constituent models at the same settings and the same (well-crafted) prompts would be very interesting to see, preferably with at least 3 examples per model given the inherent randomness of model output (rough sketch of what I mean below)…

    If you say this is a “night and day” difference, it should be apparent… I’m not sceptical per se, but “writing quality” is highly subjective, and the model’s style may simply mesh better with your personal preferences?
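
    Something like this is what I have in mind: same prompts, same sampling settings, several generations per model. The model repo names here are my guesses at the merge and its constituents, so treat them as placeholders:

    ```python
    # Sketch of a side-by-side comparison: same prompts, same sampling settings,
    # N samples per model to account for randomness. Repo names are assumptions.
    from transformers import pipeline

    models = {
        "goliath": "alpindale/goliath-120b",           # assumed merge repo
        "constituent-a": "Xwin-LM/Xwin-LM-70B-V0.1",   # assumed constituent
        "constituent-b": "Sao10K/Euryale-1.3-L2-70B",  # assumed constituent
    }
    prompts = [
        "Write the opening scene of a noir detective story.",
        "Describe a market on a generation ship, in second person.",
    ]
    settings = dict(do_sample=True, temperature=0.9, top_p=0.95, max_new_tokens=400)
    samples_per_model = 3  # at least 3 per prompt, because of the randomness

    # In practice you'd run one model at a time (these are huge); this is just
    # the shape of the comparison.
    for name, repo in models.items():
        gen = pipeline("text-generation", model=repo, device_map="auto")
        for prompt in prompts:
            for i in range(samples_per_model):
                out = gen(prompt, **settings)[0]["generated_text"]
                print(f"--- {name} | {prompt[:40]}... | sample {i + 1} ---")
                print(out)
    ```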