I’ve been playing with a lot of models around 7B, but I’m now prototyping something that I think would be fine with a 1B model. The only model of that size I’ve seen is Phi-1.5, and I haven’t found a way to run it efficiently so far; llama.cpp still hasn’t implemented support for it, for instance.
Does anyone have an idea of what to use?
RWKV 1.5B. It’s SOTA for its size, outperforms TinyLlama, and, being an RNN, uses no extra VRAM to hold its whole context length, even in the browser.
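If it helps, here’s a minimal sketch of trying it through Hugging Face transformers, which has native RWKV support. The checkpoint ID `RWKV/rwkv-4-1b5-pile` is my assumption; check the RWKV org on the Hub for the exact name you want:

```python
# Minimal sketch: load a ~1.5B RWKV checkpoint and generate a few tokens.
# The model ID below is an assumption; verify it on the Hugging Face Hub.
from transformers import AutoTokenizer, AutoModelForCausalLM

model_id = "RWKV/rwkv-4-1b5-pile"  # assumed checkpoint name
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

inputs = tokenizer("The capital of France is", return_tensors="pt")
# RWKV keeps a fixed-size recurrent state, so memory doesn't grow with context
outputs = model.generate(**inputs, max_new_tokens=20)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```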