I would like advice on two related topics:

  1. If I have a corpus of many documents embedded in a vector store, how can I dynamically select a subset of them (by metadata, for example) and perform retrieval only on that subset for answer generation?

  2. I want LLaMA to say "I DO NOT KNOW" when the context it retrieved cannot answer the question. From what I have seen, this behavior is not stable yet.

Thank you so much!

  • vec1nu@alien.topB
    10 months ago

    Use something like LMQL, Guidance, or Guardrails to constrain the output so the model says it doesn’t know. I’ve also had some success with the airoboros fine-tuned models, which have this behaviour defined in the dataset using a specific prompt.
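
For anyone who wants to see both ideas in one place, here is a rough, library-free sketch. It is not how LMQL, Guidance, or any particular vector store does it; most vector stores expose their own metadata filter option, which you would use instead of the list comprehension below. The document layout (`text`, `vec`, `meta`), the `min_score` threshold, and the prompt wording are all assumptions made up for illustration. The idea: filter by metadata first, rank only the surviving documents, and skip the model entirely when nothing relevant is left.

```python
import math

FALLBACK = "I DO NOT KNOW"  # assumed fallback string, taken from the question


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def retrieve(query_vec, docs, where, k=3, min_score=0.5):
    """Filter docs by exact-match metadata, then rank the subset by similarity."""
    # Step 1: keep only documents whose metadata matches every key in `where`.
    subset = [
        d for d in docs
        if all(d["meta"].get(key) == val for key, val in where.items())
    ]
    # Step 2: score the subset and drop weak matches (threshold is an assumption).
    scored = [(cosine(query_vec, d["vec"]), d) for d in subset]
    scored = [(s, d) for s, d in scored if s >= min_score]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [d for _, d in scored[:k]]


def build_prompt(question, hits):
    """Return a prompt for the model, or None when there is no usable context."""
    if not hits:
        return None  # caller should answer FALLBACK without calling the model
    context = "\n".join(f"- {d['text']}" for d in hits)
    return (
        "Answer using only the context below. "
        f"If the context does not contain the answer, reply exactly: {FALLBACK}\n\n"
        f"Context:\n{context}\n\nQuestion: {question}\nAnswer:"
    )


# Toy corpus with hand-written 2-d "embeddings" (illustrative only).
docs = [
    {"text": "Refunds are issued within 14 days.", "vec": [1.0, 0.0], "meta": {"source": "policy"}},
    {"text": "The API rate limit is 100 requests per minute.", "vec": [0.0, 1.0], "meta": {"source": "api"}},
    {"text": "Refund requests go through the billing API.", "vec": [0.9, 0.1], "meta": {"source": "api"}},
]

hits = retrieve([1.0, 0.0], docs, where={"source": "api"})
prompt = build_prompt("How do I request a refund?", hits)
print(prompt if prompt is not None else FALLBACK)
```

The threshold check handles the easy case (nothing relevant retrieved) deterministically; the instruction in the prompt covers the harder case where something was retrieved but does not actually answer the question, and that part still depends on how well the model follows instructions.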