I want to fine-tune some LLMs on my own dataset, which contains very long examples (slightly over 2048 tokens). VRAM usage jumps by several GB just from increasing the Cutoff Length from 512 to 1024.

Is there a way to feed these long examples to the models without significantly increasing VRAM usage?
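
For reference, here is a simplified sketch of the kind of run I mean (the model name, dataset path, and LoRA settings below are illustrative stand-ins, not my exact setup; the point is the `CUTOFF_LEN` used at tokenization time):

```python
# Illustrative only -- stand-in names, not my actual config.
import torch
from datasets import load_dataset
from transformers import (AutoModelForCausalLM, AutoTokenizer, Trainer,
                          TrainingArguments, DataCollatorForLanguageModeling)
from peft import LoraConfig, get_peft_model

CUTOFF_LEN = 1024  # raising this from 512 is what makes VRAM jump by several GB

model_name = "meta-llama/Llama-2-7b-hf"  # placeholder base model
tokenizer = AutoTokenizer.from_pretrained(model_name)
tokenizer.pad_token = tokenizer.eos_token

model = AutoModelForCausalLM.from_pretrained(model_name, torch_dtype=torch.float16)
model = get_peft_model(model, LoraConfig(r=8, lora_alpha=16,
                                         target_modules=["q_proj", "v_proj"],
                                         task_type="CAUSAL_LM"))

def tokenize(example):
    # Examples longer than CUTOFF_LEN get truncated here, which is
    # exactly what I want to avoid for my >2048-token samples.
    return tokenizer(example["text"], truncation=True, max_length=CUTOFF_LEN)

dataset = load_dataset("json", data_files="my_dataset.json")["train"].map(tokenize)

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="out", per_device_train_batch_size=1,
                           gradient_accumulation_steps=8, fp16=True),
    train_dataset=dataset,
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()
```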