lengyue233@alien.top to LocalLLaMA@poweruser.forum • NVidia H200 achieves nearly 12,000 tokens/sec on Llama2-13B with TensorRT-LLM
10 months ago
> Are you going to talk with Yuki at 1024 batch size?
Yes, the batch size is intended for multiple sessions (1024 parallel sessions in this case).
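To make the relationship concrete, here is a rough arithmetic sketch. It assumes the headline ~12,000 tokens/sec figure is aggregate throughput across the whole batch and that it is shared evenly among the 1024 parallel sessions, which is an assumption, not something the thread states:

```python
# Assumption: the ~12,000 tok/s headline number is aggregate throughput,
# split evenly across all parallel sessions in the batch.
aggregate_tokens_per_sec = 12_000  # headline H200 + TensorRT-LLM figure
parallel_sessions = 1024           # batch size discussed in the thread

per_session = aggregate_tokens_per_sec / parallel_sessions
print(f"~{per_session:.1f} tok/s per session")  # ~11.7 tok/s per session
```

So a large batch trades per-session speed for much higher total throughput: each individual session streams at a modest rate while the GPU serves all 1024 at once.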