I’m struggling to get 7B models to do anything useful. I’m obviously doing something wrong, since many people seem to get good results out of 7B models.

I just can’t get them to follow instructions: they keep repeating themselves, and occasionally they start conversing with themselves.

Does anyone have any pointers on what I’m doing wrong?

  • Mbando@alien.topB · 10 months ago

    A fine-tuned Falcon-7B is pretty powerful. Within its domain, and in a RAG stack, it outperforms GPT-3.5.

  • ThinkExtension2328@alien.topB · 10 months ago

    They’re generally good for single-shot or few-shot tasks, e.g. generating cliff notes or creating templates. You can use a vector DB for informational accuracy. They struggle to keep character and context, I’ve noticed.
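The vector-DB idea can be sketched with a toy retriever. This is a minimal illustration, not a real setup: the bag-of-words "embedding" below is a stand-in for an actual embedding model, and the documents are made up.

```python
# Toy illustration of retrieval against a "vector DB": embed the documents,
# embed the query the same way, and return the most similar document.
# The bag-of-words embedding is a crude stand-in for a real embedding model.
import math
import re
from collections import Counter

def embed(text: str) -> Counter:
    """Crude bag-of-words 'embedding' (stand-in for a real model)."""
    return Counter(re.findall(r"\w+", text.lower()))

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(query: str, docs: list) -> str:
    """Return the document most similar to the query."""
    return max(docs, key=lambda d: cosine(embed(query), embed(d)))

docs = [
    "The Eiffel Tower is in Paris.",
    "Mistral 7B uses sliding window attention.",
    "Falcon 7B was trained on RefinedWeb data.",
]
context = retrieve("what attention does mistral use", docs)
# The retrieved document is then pasted into the prompt, so the 7B model
# answers from it instead of improvising.
prompt = f"Answer using only this context:\n{context}\nQuestion: ..."
```

The point is only the shape of the pipeline: retrieve first, then stuff the hit into the prompt; accuracy then depends on the retriever, not the small model’s memory.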

  • vatsadev@alien.topB · 10 months ago

    OpenHermes 2.5 is amazing from what I’ve seen. It can call functions, summarize text, and is extremely competitive. The works.

  • DarthNebo@alien.topB · 10 months ago

    Try to use the instruct models, like Mistral. Make sure your prompt template is the correct one as well.

      • DarthNebo@alien.topB · 10 months ago

        It should be on the model page on Hugging Face; they also ship an explicit chat template that gets loaded automatically when you interact via the model ID.

        The Llama ones are forgiving if you don’t follow the structure, but Mistral-instruct behaves very badly if the structure is not maintained.
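For illustration, the Mistral-7B-Instruct template wraps each user turn in [INST] … [/INST] tags. Here is a minimal sketch of building that structure by hand; in practice, `tokenizer.apply_chat_template` in transformers does this for you, and the helper name below is made up for the example.

```python
# Minimal sketch of the Mistral-7B-Instruct prompt structure: each user
# turn goes inside [INST] ... [/INST], and assistant replies sit between
# the closing tag and </s>. With transformers you would normally call
# tokenizer.apply_chat_template instead of hand-rolling this.
from typing import Optional

def mistral_instruct_prompt(turns: list) -> str:
    """turns: list of (user_message, assistant_reply_or_None) pairs;
    the final turn has None so the model completes it."""
    out = "<s>"
    for user, assistant in turns:
        out += f"[INST] {user} [/INST]"
        if assistant is not None:
            out += f" {assistant}</s>"
    return out

prompt = mistral_instruct_prompt([
    ("Summarize the plot of Hamlet.", "A Danish prince seeks revenge..."),
    ("Now in one sentence.", None),  # model completes this turn
])
```

If those tags are missing or misplaced, Mistral-instruct tends to ramble or answer as both sides of the conversation, which matches the symptoms in the original post.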

  • swagonflyyyy@alien.topB · 10 months ago

    Mistral 7B instruct can get you pretty far. Even the quantized model has been pretty useful for me.

  • _aigeek@alien.topB · 10 months ago

    Llama-2 chat, Mistral, Zephyr, and OpenHermes 2.5 are great 7B models for fine-tuning. I have experimented with these and was able to get great results for summarization and RAG.

    • Naiw80@alien.topOPB · 10 months ago

      I’m still evaluating, and hopefully, thanks to all the tips and suggestions here, my opinion will change.

  • Monkey_1505@alien.topB · 10 months ago

    For instruct specifically, certain models do better at certain things. OpenChat, OpenHermes, and Capybara seem to be the best. But they will all underperform next to a good merge/finetune of a 13B model. Depending on the type of instruction, one of those will be better than the others.

    As for repetition, this seems to fall away somewhat at very long context sizes. Because of the sliding window, it can handle those context sizes, and if you use something like llama.cpp, the context can be reused so you won’t have to process the whole prompt each time.
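The sliding-window point can be shown with a tiny sketch: each token only attends to the last w positions, so per-token attention cost is bounded by the window, not by the full conversation length (Mistral 7B uses w = 4096; the toy values below are just for illustration).

```python
# Toy illustration of sliding-window attention: token i attends only to
# positions [max(0, i - w + 1) .. i], so the attention cost per token is
# bounded by the window size w, not the total sequence length.
def attention_window(i: int, w: int) -> range:
    """Positions token i can attend to with window size w."""
    return range(max(0, i - w + 1), i + 1)

# Early tokens see everything generated so far; late tokens see only
# the most recent w positions, however long the conversation grows.
early = list(attention_window(2, w=4))    # [0, 1, 2]
late = list(attention_window(100, w=4))   # [97, 98, 99, 100]
```

This is why long chats stay cheap: old tokens simply fall out of the window instead of inflating every subsequent attention step.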

    7B is generally better for creative writing; however, as I said, there are specific types of instructions they will handle well.

  • Naiw80@alien.topOPB · 10 months ago

    Update on this topic…

    I realised I’ve made some mistakes. The reason I asked about 7B models in the first place is that the computer I’m using is resource-constrained (and normally I use a frontend for the actual interaction).

    But because I only have 8 GB of RAM in this computer, I decided to go with llama.cpp directly, and this is obviously where things went wrong.

    First of all, I obviously messed up the prompt: it did not follow the expected format for the model I was using, though now that I’ve fixed it I don’t notice any significant difference.

    But the key thing appears to be that I had been using the -i (interactive) argument and thought it would work like a chat session. It does for a few queries, but as stated in the original post, the model then suddenly starts to converse with itself (filling in my queries, etc.).
    It turns out I should have used --instruct all along; once I realised that, things started to work a lot better (although still not perfect).

    Finally, I decided to give neural-chat a try, and dang, it does most things I ask of it with great success.

    Thanks all for your feedback and comments.