Cradawx@alien.top to LocalLLaMA@poweruser.forum · Could multiple 7b models outperform 70b models?
1 year ago · No, several sources, including Microsoft, have said GPT-3.5 Turbo is 20B. GPT-3 was 175B, and GPT-3.5 Turbo was about 10x cheaper on the API than GPT-3 when it came out, so that makes sense.
There are the ALMA models based on LLaMA 2:
https://huggingface.co/haoranxu/ALMA-13B
I’ve tried this one for translating Japanese and it seems pretty good: https://huggingface.co/mmnga/webbigdata-ALMA-7B-Ja-V2-gguf