LinuxSpinach@alien.top to LocalLLaMA@poweruser.forum • Orca 2: Teaching Small Language Models How to Reason • 1 year ago
Progressive Learning: We start with LLaMA-2-7B or LLaMA-2-13B checkpoint and
finetune it on the train split of FLAN-v2 dataset for one epoch. Note that FLAN-v2 dataset
contains both zero-shot and few-shot problems. We then train on 5 million ChatGPT data
from Orca 1 for 3 epochs. Then we train on the combination of 1 million GPT-4 data from
Orca 1 and Orca 2’s 817K data for 4 epochs.
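
For anyone curious what that staged schedule looks like in code, here is a minimal sketch of sequential finetuning with Hugging Face transformers. The dataset paths are hypothetical placeholders (the Orca 1 ChatGPT/GPT-4 data and Orca 2's 817K examples aren't public), and the hyperparameters are illustrative rather than the paper's actual settings.

```python
# Minimal sketch of the staged ("progressive learning") schedule quoted above.
# All dataset paths are placeholders; the real runs used their own prompt
# formatting, packing, and large-scale distributed training.
from datasets import load_dataset
from transformers import (
    AutoModelForCausalLM,
    AutoTokenizer,
    DataCollatorForLanguageModeling,
    Trainer,
    TrainingArguments,
)

BASE_CHECKPOINT = "meta-llama/Llama-2-7b-hf"  # or Llama-2-13b-hf

tokenizer = AutoTokenizer.from_pretrained(BASE_CHECKPOINT)
tokenizer.pad_token = tokenizer.eos_token
model = AutoModelForCausalLM.from_pretrained(BASE_CHECKPOINT)

def tokenize(batch):
    # Assumes each record has a "text" field holding an already-formatted
    # prompt + response; the actual recipe applies its own instruction template.
    return tokenizer(batch["text"], truncation=True, max_length=2048)

# (dataset path, number of epochs) for each stage, mirroring the quoted schedule.
stages = [
    ("data/flan_v2_train.jsonl", 1),                  # FLAN-v2 (zero- and few-shot)
    ("data/orca1_chatgpt_5m.jsonl", 3),               # 5M ChatGPT data from Orca 1
    ("data/orca1_gpt4_1m_plus_orca2_817k.jsonl", 4),  # 1M GPT-4 data + Orca 2's 817K
]

collator = DataCollatorForLanguageModeling(tokenizer, mlm=False)

for stage, (path, epochs) in enumerate(stages, start=1):
    dataset = load_dataset("json", data_files=path, split="train")
    dataset = dataset.map(tokenize, batched=True, remove_columns=dataset.column_names)

    args = TrainingArguments(
        output_dir=f"orca2-stage{stage}",
        num_train_epochs=epochs,
        per_device_train_batch_size=4,
        learning_rate=1e-5,
        bf16=True,
        save_strategy="epoch",
    )
    # Each stage continues from the weights left by the previous stage, so the
    # model is finetuned sequentially rather than on one mixed dataset.
    Trainer(model=model, args=args, train_dataset=dataset, data_collator=collator).train()
    model.save_pretrained(f"orca2-stage{stage}")
```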
I know what Q* is: free attention and hype. Based on the endless attention-grabbing headlines of the past week, I think it's better to wait and see. Don't trust it so easily.