lans_throwaway@alien.topB to

LocalLLaMA@poweruser.forum · English · 1 year ago

Look ahead decoding offers massive (~1.5x) speedup for inference

Break the Sequential Dependency of LLM Inference Using Lookahead Decoding | LMSYS Org

TL;DR: We introduce lookahead decoding, a new, exact, and parallel decoding algorithm to accelerate LLM inference. Look...

FlishFlashman@alien.topB
1 year ago
It seems this approach could also be useful in situations where the goal isn’t speed but rather “quality” (by a variety of metrics).