People talk about it around here like this is pretty simple (these days at least). But once I hit about 4200-4400 tokens (with my limit pushed to 8k) all I get is gibberish. This is with the LLaMA2-13B-Tiefighter-AWQ model, which seems highly regarded for roleplay/storytelling (my use case).
I also tried OpenHermes-2.5-Mistral-7B and, oddly enough, it was nonsensical from the very start.
I’m using SillyTavern with Oobabooga, sequence length set to 8k in both, and a 3090. I’m pretty new to all of this and it’s been difficult finding up-to-date information (because things develop so quickly!). The term fine-tuning comes up a lot, and with it comes a whooooole lot of complicated coding talk I know nothing about.
As a layman, is there a way to achieve 8k (or more) context for a roleplay/storytelling model?
For Llama 2 models, set your alpha to 2.65 when loading them at 8k context.
The general suggestion is 2.5, but if you plot the formula on a graph, 8192 context lines up with roughly 2.642, so 2.65 is more accurate than 2.5.
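If you're curious what the alpha knob actually does: with NTK-aware RoPE scaling (which is what the ExLlama-style loaders in Oobabooga are generally understood to use), the alpha value just inflates the RoPE base frequency, which stretches the positional encoding so the model tolerates longer contexts. Here's a minimal sketch of that relationship; the base of 10000 and head dimension of 128 are assumptions based on typical Llama-family configs, not something stated in this thread:

```python
# Rough sketch of how an "alpha" value adjusts the RoPE base under
# NTK-aware scaling (assumed ExLlama-style behavior, not verified here).

def ntk_rope_base(alpha: float, base: float = 10000.0, head_dim: int = 128) -> float:
    """Return the adjusted RoPE base frequency for a given alpha value."""
    return base * alpha ** (head_dim / (head_dim - 2))

if __name__ == "__main__":
    for alpha in (1.0, 2.5, 2.65):
        print(f"alpha={alpha:<5} -> rope base ~ {ntk_rope_base(alpha):,.0f}")
```

In practice you don't have to touch any code: set alpha_value (and max_seq_len to 8192) in the Model tab of the webui before loading the model, since it only takes effect at load time.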