• ZenEngineer@alien.topB
    10 months ago

    There was a paper where you'd run a faster, smaller model to draft a sentence and then run a batch on the big model, with each prompt being the same sentence cut to a different length and ending in a different word predicted by the small model, to see where the small one went wrong. That gets you a speedup if the two models are more or less aligned, because every drafted word the big model agrees with is confirmed in one batched pass instead of one pass per word.
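
    As far as I understand it, this is the idea usually called speculative decoding. A rough sketch of one step, assuming greedy decoding and toy stand-in models (`draft_model`, `target_model`, `speculative_step` and the example sentence are all made up for illustration, not taken from the paper):

```python
from typing import Callable, List

# A "model" here is just a function from a token prefix to the next token (greedy).
Model = Callable[[List[str]], str]

SENTENCE = "the quick brown fox jumps over the lazy dog".split()


def target_model(prefix: List[str]) -> str:
    """Toy big model: always continues the fixed sentence."""
    return SENTENCE[len(prefix)] if len(prefix) < len(SENTENCE) else "<eos>"


def draft_model(prefix: List[str]) -> str:
    """Toy small model: agrees with the big one except at position 3."""
    return "cat" if len(prefix) == 3 else target_model(prefix)


def speculative_step(prefix: List[str], draft: Model, target: Model, k: int = 4) -> List[str]:
    # 1. The small model drafts k tokens, one at a time.
    drafted: List[str] = []
    for _ in range(k):
        drafted.append(draft(prefix + drafted))

    # 2. The big model is asked for the next token after every drafted length.
    #    In a real system these k + 1 prompts would go through as one batch;
    #    here they are evaluated in a plain loop.
    prompts = [prefix + drafted[:i] for i in range(k + 1)]
    verified = [target(p) for p in prompts]

    # 3. Keep drafted tokens while the big model agrees, then take the big
    #    model's own token at the first disagreement (or as a bonus token
    #    if everything matched).
    accepted: List[str] = []
    for i in range(k):
        if drafted[i] != verified[i]:
            break
        accepted.append(drafted[i])
    accepted.append(verified[len(accepted)])
    return accepted


if __name__ == "__main__":
    print(speculative_step([], draft_model, target_model, k=4))
    print(speculative_step(SENTENCE[:4], draft_model, target_model, k=4))
```

    In the first call the small model gets three words right and one wrong, so the big model's single batch yields four words. In the second call all four drafted words match, so you get five. The less the two models agree, the fewer words each batch buys you.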

    Other than that I could imagine other things, like having batches with one sentence being generated for each actor, one for descriptions, one for actions, etc. Or simply multiple options for you to choose.