Robot1me@alien.topBtoLocalLLaMA@poweruser.forum•What do you think about GPT-isms polluting datasets? Do you consider them a problem? If so, how big of a problem do you think it is?English
1·
1 year agoWhat do you think about this?
I think an interesting experiment is when you edit an AI output message to start with “As an AI language model” and then let it continue the rest. If it completely loses character and just sounds like ChatGPT, it’s then quite telling.
I’m kind of cautious how random merging affects the overall quality, since many of these merges models were trained with different prompt formats. In my experience that would inevitably lead to AI outputs that attempt some gibberish by adding bits of other used prompt formats (e.g. “### Response:” being printed out while using the ChatML template). To my surprise I witnessed that with OpenHermes 2.5 in some edge cases. But I would be eager to hear other people’s experience on this.