I’m blown away. See for yourself.
https://migel.substack.com/p/a-conversation-with-tess
Tess, welcome to the world!
Model is Open Source with 200K context length.
Available at: https://huggingface.co/migtissera/Tess-M-v1.0
According to TheBloke, the sequence length is 8192. So I'm assuming 8192 is the default context and it can be extended up to 200K via alpha_scale?
No, the base model itself is 200K: https://huggingface.co/01-ai/Yi-34B-200K
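If you want to check the native context window yourself, one option is to read `max_position_embeddings` from the model's `config.json`. A minimal sketch, assuming the config follows the usual Hugging Face layout; the helper name and the sample values are illustrative, not copied from the repo:

```python
import json


def native_context_length(config_text: str) -> int:
    """Return the context window declared in a Hugging Face-style config.json."""
    config = json.loads(config_text)
    # `max_position_embeddings` is the standard key for the trained context size.
    return int(config["max_position_embeddings"])


# Illustrative config fragment (assumed values, stand-in for the real file).
sample_config = '{"max_position_embeddings": 200000, "rope_scaling": null}'
print(native_context_length(sample_config))
```

In practice you would pass in the contents of the `config.json` downloaded from the model page instead of the sample string.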