Tool to quickly iterate when fine-tuning open-source LLMs

torque-mcclyde@alien.top · 3 years ago

KittCloudKicker@alien.top · 3 years ago

All weekend I’ve been wishing for a more streamlined fine-tuning experience. H

torque-mcclyde@alien.top · 3 years ago

Glad to hear that we’re not the only ones!

herozorro@alien.top · 3 years ago

Could you provide some directions on how to fine-tune the model for coding? I have a UI framework in Python, and I would like to feed the model its docs and code from some GitHub repos.

What would the dataset look like for that? Should I formulate different use cases for the framework as if a user were asking?

In addition, do I need to provide standard Python code, or do those base models already have code in their training data?

CygnusX1@alien.top · 3 years ago

Interesting service, I’m definitely going to try it. I’d like to fine-tune a 7B for function calling and, if possible, mimic OpenAI’s function description template so I can share the descriptions between model calls. I’ve experimented with injecting the function descriptions as a preamble to the user’s prompt, and it works OK (with Mistral 7B Instruct) but has many edge cases. I suspect I need to fine-tune to improve it. How would I go about structuring my user prompts in the training dataset? Would something like this work?

{"messages": [{"role": "system", "content": "You are a helpful navigation assistant that calls the appropriate function based on a user's input."}, {"role": "user", "content": "Go to Paris, France"}, {"role": "assistant", "content": "{\"lat\": 48.856667, \"lng\": 2.352222}"}]}
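One way to avoid hand-escaping the nested JSON in a record like the one above is to generate it with a serializer. The following is only a sketch: `make_example` is a hypothetical helper, and the record layout (a `messages` list with the function arguments stored as a JSON string in the assistant turn) simply mirrors the format proposed in this comment, not a format confirmed by the service.

```python
import json


def make_example(user_text, lat, lng):
    """Build one chat-format training record (hypothetical helper)."""
    system = (
        "You are a helpful navigation assistant that calls the "
        "appropriate function based on a user's input."
    )
    # Serialize the function arguments on their own first; the inner quotes
    # are escaped automatically when the whole record is dumped later.
    assistant = json.dumps({"lat": lat, "lng": lng})
    return {
        "messages": [
            {"role": "system", "content": system},
            {"role": "user", "content": user_text},
            {"role": "assistant", "content": assistant},
        ]
    }


record = make_example("Go to Paris, France", 48.856667, 2.352222)
line = json.dumps(record)  # one line of a JSONL training file
```

Each `line` can then be written to a JSONL file, one record per line. Parsing it back with `json.loads` is a quick check that the record is valid before uploading it.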

MrBeforeMyTime@alien.top · 3 years ago

Why not just use grammar sampling with llama.cpp?

ithkuil@alien.top · 3 years ago

Is it possible to do this in a way that allows the model to choose whether to write normal text or to call one or more functions?

MrBeforeMyTime@alien.top · 3 years ago

Well, you don’t have to have it ever write “normal” text. You can just have an object with a “text” property that the model is instructed to use only when it is not calling a function. Otherwise, it can provide different function-calling JSON.

A grammar means the model is forced to output a structure, in this case JSON. You can write instructions to output different JSON for different scenarios and use code to check which key is present in the JSON. If the object has the key “text”, it’s a text response. If it doesn’t, it’s a function response.

That’s basically how the function-calling API works anyway, just less consistently than a grammar.
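The key check described above could look roughly like this. It is a minimal sketch that assumes the grammar has already forced the model to emit valid JSON; the `function` and `arguments` key names, the `navigate` function, and the `FUNCTIONS` registry are all made up for illustration and are not part of llama.cpp or any API.

```python
import json


def handle_output(raw, functions):
    """Route a grammar-constrained JSON reply to text or a function call."""
    obj = json.loads(raw)
    # A "text" key means the model chose to answer in plain text.
    if "text" in obj:
        return obj["text"]
    # Otherwise treat the object as a function call (assumed key names).
    name = obj["function"]
    args = obj.get("arguments", {})
    return functions[name](**args)


def navigate(lat, lng):
    # Hypothetical function the model is allowed to call.
    return f"navigating to {lat}, {lng}"


FUNCTIONS = {"navigate": navigate}
```

For example, `handle_output('{"text": "Hello"}', FUNCTIONS)` returns the text unchanged, while a reply such as `{"function": "navigate", "arguments": {"lat": 1.5, "lng": 2.5}}` is dispatched to `navigate`.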

kivathewolf@alien.top · 3 years ago

This is really cool! Good choice on starting with the chat model and not the base model. Chat models are much more friendly to alignment with a small dataset. In your post you mention that you do QLoRA in a few minutes. I am assuming that’s for a small dataset, like <1000 samples? What’s your backend running on? I would love to learn how you are deploying and scaling this for multiple customers. Best of luck!

torque-mcclyde@alien.top · 3 years ago

Yes, our datasets usually have a few hundred examples. We do support arbitrarily large datasets though, the fine-tuning just takes a little longer.

For deploying and scaling we’re using Modal, a “serverless” GPU provider that we found to be very user-friendly.

SupplyChainNext@alien.top · 3 years ago

This is an amazing sub with amazingly talented individuals. I love it here. This is great.

torque-mcclyde@alien.top · 3 years ago

This means a lot! Thank you.

visarga@alien.top · 3 years ago

hey, can I do the fine-tuning on my own computer or only in your cloud?

torque-mcclyde@alien.top · 3 years ago

Fine-tuning runs online only, but you can download the weights and run them wherever you like (including your own computer).