• 1 Post
  • 24 Comments
Joined 1 year ago
cake
Cake day: October 30th, 2023

help-circle




  • For instruct specifically, certain models do better with certain things. OpenChat, OpenHermes and Capybara seem to be the best. But they will all underperform next to a good merge/finetune of a 13B model. Depending on the type of instruction one of those will be better than the others.

    For repetition this seems to fall away somewhat with very long context sizes. Because of the sliding window, it can handle these context sizes, and if you use something like llamacpp the context can be reused such that you won’t have to process the whole prompt each time.

    7b is generally better for creative writing, however, there are as I said, specific types of instructions they will handle well.







  • Having used it a lot, I can say for sure that without much prompting it readily produces junk web text, urls etc, so it is not a fully filtered or fully synthetic dataset.

    My guess would be that it’s just ‘a bit better filtered than llama-2’, and maybe slightly more trained on that set. Slightly better quality set, slightly more trained on that set.

    My intuition based on this, is that per parameter size EVERYTHING open source could be optimized considerably more.








  • it takes up more vram than a dense model.

    If you are using qlora, it’s not by much. The main issue is that you need another model to parse the prompt. But I could see this being useful sometimes. Maybe as an option though, rather than default

    That’s useful, though its gonna be mixed with real data for model robustness.

    I actually really don’t like synthetic data. It’s a great method for filtering large datasets, and perhaps augmenting them, but if you use purely synthetic data you are replicating inaccuracies and prose from the origin model that will only be exaggerated by the target model. I’d rather this was a quality control step, not a dataset producer.

    Multimodality

    I’m personally very eh about this. It has it’s uses, and I’ve used it. But if LLM intelligence has a long way to go and this could take focus away from that. Let that be a seperate project IMO. I’m sure it has it’s uses, and it’s fans, not knocking it - I just think open source is nessasarily already behind proprietary models, and mixed focus could just make that worse.

    Massive ctx len

    Because of the accuracy issues involved, I’d rather they worked on smarter data retrieval like openAI has (it doesn’t really have the context sizes quoted, it grabs out the relevant bits). Generally speaking for prompts, relevancy beats quantity.