I’m using ollama with an RTX 3060 Ti, running only 7B models.
I tested Mistral 7B, Mistral-OpenOrca, and Zephyr. They all had the same problem: after some amount of chatting, they started repeating themselves or producing random output.
What could it be? Temperature? VRAM? ollama?
Goliath 120B would fit in 64 GB of RAM, though. It doesn’t have the repeating problem…