Good and fast model around ~1B to run on web?

palpapeen@alien.top · 10 months ago

Good and fast model around ~1B to run on web?

darxkies@alien.top · 10 months ago

https://github.com/huggingface/candle/tree/main/candle-examples/examples/phi

kristaller486@alien.top · 10 months ago

You can try MLC-LLM (https://llm.mlc.ai/). It has tools for running inference on quantized models on the web.
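For a rough sense of why quantization matters for a ~1B model in the browser, here is a back-of-the-envelope sketch. It only counts weight storage (parameters × bits / 8) and ignores activations, runtime overhead, and the extra scale values that real quantization schemes store. The function name and the 1.1B parameter count are illustrative assumptions, not taken from MLC-LLM.

```python
def weight_size_gib(n_params: float, bits_per_weight: int) -> float:
    """Approximate weight storage in GiB: params * bits / 8 bytes."""
    return n_params * bits_per_weight / 8 / 2**30


if __name__ == "__main__":
    n_params = 1.1e9  # illustrative ~1B-class model (assumption)
    for bits in (32, 16, 8, 4):
        # Print the estimated download/memory footprint at each precision
        print(f"{bits:>2}-bit: {weight_size_gib(n_params, bits):.2f} GiB")
```

Halving the bits per weight halves the estimate, which is the main reason 4-bit builds are the usual choice for in-browser use.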

LyPreto@alien.top · 10 months ago

DeepSeek-Coder has a 1B model, I believe, that's outperforming 13B models. I'll check back once I find a link.

Edit: found it https://evalplus.github.io/leaderboard.html

palpapeen@alien.top · 10 months ago

Thanks! But I’m not looking for a coding model, more for one that’s good at detecting fallacies and reasoning. Phi-1.5 seems like a better fit for that.

LyPreto@alien.top · 10 months ago

I would still give it a try. It’s misleading to think these coding models are only good at coding; training on code has actually been shown to improve a model’s scores across multiple benchmarks.

vatsadev@alien.top · 10 months ago

RWKV 1.5B. It's SOTA for its size, outperforms TinyLlama, and needs no extra VRAM to fit its whole context length in the browser.
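The "no extra VRAM" point follows from RWKV being recurrent: it carries a fixed-size state from token to token, while a standard transformer keeps a key/value cache that grows with every token. A minimal sketch of the transformer side for comparison is below. The layer count, hidden size, and fp16 storage are hypothetical values chosen for illustration, not any specific model's config, and the formula assumes plain multi-head attention with no cache-shrinking tricks.

```python
def kv_cache_bytes(n_layers: int, d_model: int, seq_len: int,
                   bytes_per_value: int = 2) -> int:
    """Estimate KV-cache size for a standard multi-head-attention transformer.

    Each layer stores one key tensor and one value tensor, each of shape
    (seq_len, d_model); bytes_per_value=2 corresponds to fp16.
    """
    return 2 * n_layers * d_model * seq_len * bytes_per_value


if __name__ == "__main__":
    # Hypothetical ~1B-class config (assumption): 24 layers, hidden size 2048
    for seq_len in (512, 2048, 8192):
        mib = kv_cache_bytes(24, 2048, seq_len) / 2**20
        print(f"{seq_len:>5} tokens: {mib:.0f} MiB of KV cache")
```

The cache grows linearly with context length, whereas a recurrent model's state stays the same size however many tokens it has seen.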

Regular_Instruction@alien.top · 10 months ago

TinyLlama is 1.1B?

palpapeen@alien.top · 10 months ago

I mean yeah but it’s not done training AFAIK, and not fine-tuned either

SnooSquirrels3380@alien.top · 10 months ago

Has anyone seen this? https://burn.dev/demo