Hello everyone, I am building a chatbot that converts free text to JSON and decides the fields/categories on its own. For the example below, it should convert the medical description to JSON.
Example:
Patient John. Allergies to shellfish. 6 Feet, Previous Checkup. Chronic Back Pain. Migraines Regular. Cane user. Sport Injury. 35 years. Visit 1 year ago. Heavy Smoker previously. Fatigued. BMI above normal. Temp normal. On Ibuprofen. Skin rash, using topical cream.
Which lightweight model will be the best for this task?
You could try LangChain's create_extraction_chain API with a Code Llama model; it should work well for this.
API link: https://python.langchain.com/docs/use_cases/extraction
Model link: https://huggingface.co/TheBloke/CodeLlama-7B-Instruct-GPTQ
Check out the gron tool.
If you work with open source models you can use Outlines: https://github.com/outlines-dev/outlines
You just need to specify the structure you expect with a Pydantic model.
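To make the Pydantic suggestion concrete, here is a minimal sketch of what such a model might look like for the example note. The class name, the field names, and the sample JSON are all assumptions made up for illustration; they are not from Outlines or from the original question. The call into Outlines itself is left out because its API differs between versions, so the sketch only shows the schema and how a model's JSON reply could be validated against it.

```python
import json
from typing import List, Optional

from pydantic import BaseModel, Field


class PatientRecord(BaseModel):
    """Hypothetical target structure for the medical note (illustrative fields)."""

    name: str
    age_years: Optional[int] = None
    height: Optional[str] = None
    allergies: List[str] = Field(default_factory=list)
    conditions: List[str] = Field(default_factory=list)
    medications: List[str] = Field(default_factory=list)


def parse_model_output(raw_json: str) -> PatientRecord:
    """Validate a JSON string returned by a model against the schema."""
    return PatientRecord(**json.loads(raw_json))


# Hand-written stand-in for what a model might return; not real model output.
sample_output = json.dumps(
    {
        "name": "John",
        "age_years": 35,
        "height": "6 feet",
        "allergies": ["shellfish"],
        "conditions": ["chronic back pain", "migraines", "skin rash"],
        "medications": ["ibuprofen", "topical cream"],
    }
)

record = parse_model_output(sample_output)
print(record.name, record.allergies, record.medications)
```

Note that a fixed schema like this trades away the "decides the fields on its own" part of the question: the model fills in fields you chose up front, and anything missing a required field (here only `name`) fails validation instead of passing through silently.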