Backend: KoboldCPP 99% of the time; Ooba the other 1% (testing EXL2 etc.)
Front End: SillyTavern
Why: GGUF is my preferred model format, even with a 3090. KoboldCPP is the best I have seen at running it. SillyTavern should be obvious: it is updated multiple times a day and is amazingly feature-rich and modular.