I use various things and regularly test whether any of them has gotten better.
Mainly llama.cpp as the backend, with its server as the UI - it has everything I need, it's lightweight, and it's hackable.
Ollama - simplifies many steps, has very convenient functions and an overall coherent and powerful ecosystem. I mostly use it in the terminal, but sometimes in a modified Ollama WebUI.
Sometimes Agnai and/or RisuAI - nice and powerful UIs with satisfying UX, though not as powerful as SillyTavern. But SillyTavern is too much if you are not an RP power user.
My own custom versions of the Obsidian ChatGPT-MD and Canvas Chat addons, pointed at local endpoints.
In general I try to avoid anything that comes with Python code, and I prefer solutions with as few dependencies as possible, so they're easier to hack and customize to my needs.
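
For anyone wondering what "local endpoints" looks like in practice, here is a rough, dependency-free sketch in TypeScript. It assumes the default ports (8080 for the llama.cpp server, 11434 for Ollama) and the OpenAI-style `/v1/chat/completions` route that both expose. The names `ENDPOINTS`, `buildChatRequest` and `chat` are made up for illustration, and this is not the actual addon code.

```typescript
// Assumption: default ports, OpenAI-compatible chat route on both backends.
const ENDPOINTS = {
  llamacpp: "http://localhost:8080/v1/chat/completions",
  ollama: "http://localhost:11434/v1/chat/completions",
} as const;

type Backend = keyof typeof ENDPOINTS;

interface ChatMessage {
  role: "system" | "user" | "assistant";
  content: string;
}

// Build the fetch arguments without sending anything (hypothetical helper).
function buildChatRequest(backend: Backend, model: string, messages: ChatMessage[]) {
  return {
    url: ENDPOINTS[backend],
    init: {
      method: "POST",
      headers: { "Content-Type": "application/json" },
      body: JSON.stringify({ model, messages, stream: false }),
    },
  };
}

// Send a single user prompt and return the text of the first reply.
async function chat(backend: Backend, model: string, prompt: string): Promise<string> {
  const { url, init } = buildChatRequest(backend, model, [{ role: "user", content: prompt }]);
  const res = await fetch(url, init);
  if (!res.ok) throw new Error(`HTTP ${res.status} from ${url}`);
  const data = (await res.json()) as { choices: { message: { content: string } }[] };
  return data.choices[0].message.content;
}
```

Something like `chat("ollama", "my-model", "Hello")` would then talk to whichever backend is running, where `"my-model"` is a placeholder for a model you have actually pulled or loaded. Only the built-in `fetch` is used, so nothing extra needs to be installed.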