I’m looking for insights and advice on extending the context window of LLMs (specifically Mistral).
Whether you’re a researcher, developer, or enthusiast in the field, I’d love to hear about your experiences and recommendations. Are there any specific techniques, methodologies, or tools you’ve found effective in extending the context window for LLMs?
Additionally, if you’ve encountered challenges in this area, how did you overcome them? Any resources, papers, or community discussions you can point me to would be greatly appreciated.
I’ve been working on some experimental context window extensions using multimodal models https://github.com/sshh12/multi_token
Similar to the idea of putting text into an image for GPT-4V, I’m directly encoding chunks of text into embeddings and injecting them into the model. This gives you a very lossy ~128x extension of your context window, which is pretty massive.
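To make the sequence-length arithmetic concrete, here's a toy sketch of the "fold many tokens into one embedding" idea. This is NOT the multi_token implementation: a real system trains an encoder and a projection into the LLM's embedding space, and feeds the resulting vectors in alongside normal token embeddings (e.g. via `inputs_embeds` in transformers-style APIs). The hash-based embeddings, `DIM`, and `CHUNK` below are all stand-in assumptions just to show how 128 tokens collapse to one position.

```python
import hashlib

DIM = 8      # toy embedding dimension
CHUNK = 128  # tokens folded into one soft token -> ~128x compression

def toy_embed(token: str) -> list[float]:
    """Deterministic stand-in for a token embedding (not a trained model)."""
    h = hashlib.sha256(token.encode()).digest()
    return [b / 255.0 for b in h[:DIM]]

def compress(tokens: list[str]) -> list[list[float]]:
    """Mean-pool each CHUNK-token window into a single embedding vector.

    A trained system would learn this compression; mean pooling here just
    illustrates why the result is lossy: 128 tokens share one vector.
    """
    out = []
    for i in range(0, len(tokens), CHUNK):
        window = [toy_embed(t) for t in tokens[i:i + CHUNK]]
        out.append([sum(col) / len(window) for col in zip(*window)])
    return out

tokens = ("lorem ipsum " * 256).split()  # 512 toy tokens
soft_tokens = compress(tokens)
print(len(tokens), "->", len(soft_tokens))  # 512 -> 4 positions
```

So a prompt that would have occupied 512 positions now occupies 4, at the cost of whatever detail the pooled vectors can't preserve.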
Thanks for the input. This seems amazing.