koehr@alien.topB to

LocalLLaMA@poweruser.forum · English · 2 years ago

40x or more speedup by selecting important neurons

https://arxiv.org/abs/2311.10770

“UltraFastBERT”, apparently a variant of BERT that uses only 0.3% of its neurons during inference, performs on par with similar BERT models.

I hope that’s going to be available for all kinds of models in the near future!
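
For anyone wondering where the 0.3% comes from: the abstract says each layer engages just 12 out of 4095 neurons, which fits a binary tree of depth 11 where only one root-to-leaf path is evaluated per input. Below is a rough sketch of that idea in plain Python, not the authors' code. The names `make_tree` and `fff_forward`, the tiny input width, and the random weights are all made up for illustration, and the "go right if the activation is positive" rule is my reading of the paper.

```python
import math
import random


def gelu(x):
    # Exact GELU via the error function.
    return 0.5 * x * (1.0 + math.erf(x / math.sqrt(2.0)))


def make_tree(depth, dim, seed=0):
    # Hypothetical helper: random weights for every node of a complete
    # binary tree. Each node is one "neuron" with an input and output vector.
    rng = random.Random(seed)
    n_nodes = 2 ** (depth + 1) - 1
    w_in = [[rng.gauss(0, 1) for _ in range(dim)] for _ in range(n_nodes)]
    w_out = [[rng.gauss(0, 1) for _ in range(dim)] for _ in range(n_nodes)]
    return w_in, w_out


def fff_forward(x, w_in, w_out, depth):
    # Walk one root-to-leaf path; only the neurons on that path are computed.
    out = [0.0] * len(x)
    node = 0
    visited = 0
    for level in range(depth + 1):
        act = sum(xi * wi for xi, wi in zip(x, w_in[node]))
        g = gelu(act)
        for i, wo in enumerate(w_out[node]):
            out[i] += g * wo
        visited += 1
        if level < depth:
            # Assumed routing rule: the sign of the activation picks the child.
            node = 2 * node + 1 + (1 if act > 0 else 0)
    return out, visited


DEPTH = 11  # 2**12 - 1 = 4095 neurons in the tree
DIM = 8     # toy input width, far smaller than a real BERT layer

w_in, w_out = make_tree(DEPTH, DIM)
rng = random.Random(1)
x = [rng.gauss(0, 1) for _ in range(DIM)]

out, visited = fff_forward(x, w_in, w_out, DEPTH)
share = 100.0 * visited / len(w_in)
print(f"used {visited} of {len(w_in)} neurons ({share:.2f}%)")
```

The point is that the cost per input grows with the tree depth rather than with the total neuron count, which is where the claimed speedup would come from.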


Acceptable_Can5509@alien.topB
2 years ago
Basically GPT-4 Turbo
- lakolda@alien.topB
  2 years ago
  GPT-4 turbo only speeds things up by 3x…