LinuxSpinach@alien.top

    Progressive Learning: We start from a LLaMA-2-7B or LLaMA-2-13B checkpoint and
    fine-tune it on the train split of the FLAN-v2 dataset for one epoch. Note that the
    FLAN-v2 dataset contains both zero-shot and few-shot problems. We then train on
    5 million ChatGPT samples from Orca 1 for 3 epochs, and finally on the combination
    of 1 million GPT-4 samples from Orca 1 and Orca 2's 817K samples for 4 epochs.
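
For illustration, here is a minimal sketch of that three-stage schedule in Python. The `Stage` dataclass, the `finetune` helper, the dataset identifiers, and the hyperparameters are hypothetical placeholders for readability, not taken from the Orca 2 paper or any released code; only the stage order, data mix, and epoch counts mirror the quoted description.

```python
from dataclasses import dataclass

@dataclass
class Stage:
    name: str
    datasets: list[str]  # identifiers of the data mixed in this stage (placeholders)
    epochs: int

# Three stages applied sequentially to the same model, per the quoted schedule.
schedule = [
    Stage("flan_v2",              ["flan_v2_train"],                 epochs=1),
    Stage("orca1_chatgpt",        ["orca1_chatgpt_5m"],              epochs=3),
    Stage("orca1_gpt4_and_orca2", ["orca1_gpt4_1m", "orca2_817k"],   epochs=4),
]

def finetune(checkpoint: str, datasets: list[str], epochs: int) -> str:
    # Stand-in for a supervised fine-tuning run: a real implementation would load
    # `checkpoint`, train on the listed datasets for `epochs`, and save a new checkpoint.
    return f"{checkpoint}+{'+'.join(datasets)}x{epochs}"

ckpt = "llama-2-7b"  # or "llama-2-13b"
for stage in schedule:
    ckpt = finetune(ckpt, stage.datasets, stage.epochs)
    print(f"after {stage.name}: {ckpt}")
```

Keeping the stages as plain data like this makes it easy to reorder, drop, or extend a stage when experimenting with the curriculum, which is the main point of the progressive-learning setup.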