Code: https://github.com/hao-ai-lab/LookaheadDecoding
Blog post: https://lmsys.org/blog/2023-11-21-lookahead-decoding/
Description:
We introduce lookahead decoding, a new, exact, and parallel decoding algorithm to accelerate LLM inference. Lookahead decoding breaks the sequential dependency of autoregressive decoding by concurrently extracting and verifying n-grams directly with the LLM, using Jacobi iteration. It requires no draft model and no data store, and it reduces the number of decoding steps linearly with respect to the log(FLOPs) invested per decoding step. Below is a demo of lookahead decoding accelerating LLaMA-2-Chat 7B generation:
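To make the Jacobi-iteration idea concrete, here is a minimal, self-contained sketch. The toy `next_token` model and the `jacobi_decode` helper are hypothetical stand-ins of my own, not the repo's API: all guessed positions are re-predicted in parallel from the previous guess until the sequence stops changing (a fixed point), at which point every token matches what greedy autoregressive decoding would have produced.

```python
# Minimal sketch of Jacobi decoding (toy model, not the repo's actual API).

def next_token(prefix):
    """Toy deterministic LM: next token = last token + 1 (mod 100).
    In the real method this is one parallel LLM forward pass over all positions."""
    return (prefix[-1] + 1) % 100

def jacobi_decode(prompt, n_future, max_iters=50):
    # Start from an arbitrary guess for the next n_future tokens.
    guess = [0] * n_future
    for step in range(1, max_iters + 1):
        seq = prompt + guess
        # One Jacobi pass: every guessed position is recomputed from the
        # tokens currently to its left. All updates read the *old* guess,
        # which is what makes this parallel rather than sequential.
        new_guess = [next_token(seq[:len(prompt) + i]) for i in range(n_future)]
        if new_guess == guess:  # fixed point: all tokens verified
            return guess, step
        guess = new_guess
    return guess, max_iters

tokens, iters = jacobi_decode(prompt=[5], n_future=8)
print(tokens, "converged in", iters, "iterations")
```

With this toy model exactly one new position stabilizes per pass, which is why vanilla Jacobi decoding alone rarely beats autoregressive decoding; the point of lookahead decoding is to harvest n-grams from these Jacobi trajectories so that several tokens can be verified and accepted in a single step.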
What does the white and blue text mean in the video?
White is normal autoregressive generation, while blue marks tokens produced by the lookahead mechanism.
The blue text is the portion whose generation this method sped up (I think by parallelizing the inference, similar to CPU pipelining), and that is what makes the overall text come out faster.
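To make that concrete, here is a rough sketch of the acceptance rule (my own toy illustration, not the repo's code): a guessed n-gram is checked against what the model itself would emit, the longest matching prefix is kept, and the model's own next token comes along for free, so each verified guess appears as a blue chunk rather than one token at a time.

```python
# Toy sketch of n-gram verification; `greedy_next` is a hypothetical
# stand-in for a real model call, using the same convention as above.

def greedy_next(prefix):
    """Toy deterministic LM: next token = last token + 1 (mod 100)."""
    return (prefix[-1] + 1) % 100

def verify_ngram(prefix, ngram):
    """Accept the longest prefix of `ngram` the model itself would emit."""
    accepted = []
    for tok in ngram:
        if greedy_next(prefix + accepted) != tok:
            break
        accepted.append(tok)
    # One extra token is always gained: the model's correction after a
    # mismatch (or its continuation if everything matched).
    accepted.append(greedy_next(prefix + accepted))
    return accepted

print(verify_ngram([5], [6, 7, 9]))  # -> [6, 7, 8]: two matched + one fixed
```

In the real algorithm this check happens inside a single parallel forward pass via the causal attention mask; the sequential loop here is just for illustration, but the acceptance logic is the same.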