too_long_story@alien.top to Machine Learning@academy.garden • [R] Break the Sequential Dependency of LLM Inference Using Lookahead Decoding • 10 months ago
Well, but how do you marry it with batching so that FlashAttention kernels can work with it?
Any complicated attention mask makes it hard to support batching.
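To make the concern concrete, here is a minimal NumPy sketch (not the actual Lookahead Decoding mask; the branch layout is a hypothetical illustration) showing how a speculative-branch mask departs from the plain lower-triangular pattern that fused causal-attention kernels are built around:

```python
import numpy as np

def causal_mask(n):
    # Standard causal mask: position i attends to positions <= i.
    return np.tril(np.ones((n, n), dtype=bool))

def branch_mask(n_prompt, branches, branch_len):
    # Hypothetical lookahead-style mask: each speculative branch
    # attends to the prompt and to its own earlier tokens, but not
    # to tokens in other branches. Illustrative sketch only.
    n = n_prompt + branches * branch_len
    m = np.zeros((n, n), dtype=bool)
    m[:n_prompt, :n_prompt] = causal_mask(n_prompt)
    for b in range(branches):
        s = n_prompt + b * branch_len
        rows = slice(s, s + branch_len)
        m[rows, :n_prompt] = True                  # every branch sees the prompt
        m[rows, rows] = causal_mask(branch_len)    # causal within the branch only
    return m

m = branch_mask(n_prompt=4, branches=2, branch_len=3)
# Branch 1's tokens do not attend to branch 0's tokens, so the mask
# is no longer lower-triangular and a fixed causal kernel can't express it:
assert not np.array_equal(m, causal_mask(10))
```

Since the mask shape depends on each request's branch configuration, sequences in a batch no longer share one mask pattern, which is exactly what makes fused kernels like FlashAttention's causal path awkward to reuse.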