Just to double-check: you embed the sentence numbers into the context, right?
So the LLM will see: "1: Giraffes have long necks. 2: They eat mostly leaves…"
Or does the LLM learn by itself which sentence has which number?
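For what it's worth, here's a minimal sketch of how I imagine the numbering working (all names here are hypothetical, not from the actual implementation): split the source into sentences, prefix each with its index, and ask the model to answer with indices instead of quoting text back.

```python
# Minimal sketch (hypothetical, not the actual implementation): number each
# sentence in the context so the model can refer to sentences by index
# instead of regenerating their text in the output.

def number_sentences(text: str) -> str:
    # Naive split on ". " for illustration; a real implementation would use
    # a proper sentence tokenizer (e.g. nltk or spaCy).
    sentences = [s.strip().rstrip(".") for s in text.split(". ") if s.strip()]
    return "\n".join(f"{i}: {s}." for i, s in enumerate(sentences, start=1))

context = number_sentences("Giraffes have long necks. They eat mostly leaves.")
print(context)
# 1: Giraffes have long necks.
# 2: They eat mostly leaves.

# The prompt then instructs the model to respond with indices only,
# so the answer is e.g. "2" rather than a regenerated sentence.
prompt = f"{context}\n\nWhich sentences mention diet? Answer with sentence numbers only."
```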
The general optimization here is to reduce the number of tokens generated, even at a slight increase in context size, correct?
I wonder where the trade-off sits. There are probably more tricks like this, but I assume at some point there are diminishing returns, where the added context size makes it not worth it. A rough back-of-envelope for the break-even point is below.
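Here's that back-of-envelope, using illustrative made-up prices (output tokens are typically priced several times higher than input tokens, and are also generated serially, so they dominate both cost and latency):

```python
# Rough trade-off sketch with ILLUSTRATIVE, MADE-UP prices -- the point is
# only the structure of the calculation, not the numbers.

INPUT_PRICE = 1.0    # hypothetical cost units per 1M input tokens
OUTPUT_PRICE = 4.0   # hypothetical cost units per 1M output tokens

def cost(input_tokens: int, output_tokens: int) -> float:
    return (input_tokens * INPUT_PRICE + output_tokens * OUTPUT_PRICE) / 1e6

# Baseline: the model quotes ~500 tokens of context verbatim in its answer.
baseline = cost(input_tokens=10_000, output_tokens=500)

# Numbered variant: numbering adds a few tokens per sentence of context
# overhead, but the answer shrinks to a handful of indices.
numbered = cost(input_tokens=10_000 + 300, output_tokens=20)

print(f"baseline: {baseline:.6f}, numbered: {numbered:.6f}")
# The trick stops paying off once added_input * INPUT_PRICE exceeds
# saved_output * OUTPUT_PRICE -- that's where the diminishing returns kick in.
```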