I don’t know if this helps but I’m using the GGUF version of that and it’s working perfectly
I don’t know if this helps but I’m using the GGUF version of that and it’s working perfectly
Young and bored teenagers would get a nice chuckle seeing people unknowingly having convos with their bots online.
Imagine you hate political candidate A but love political candidate B. Imagine setting your bot up to trash A and promote B.
Even more entertaining would be to setup your bot to debate and waste peoplea time. Go to sleep and wake up to see your bot has been arguing with someone wasting their time for 8 hours. That would be hilarious to a troll.
I just posted this somewhere else but it seems relevant, try KoboldCPP, it has this feature enabled by default:
Context Shifting is a better version of Smart Context that only works for GGUF models. This feature utilizes KV cache shifting to automatically remove old tokens from context and add new ones without requiring any reprocessing. So long as you use no memory/fixed memory and don’t use world info, you should be able to avoid almost all reprocessing between consecutive generations even at max context. This does not consume any additional context space, making it superior to SmartContext. Context Shifting is enabled by default, and will override smartcontext if both are enabled. Your outputs may be different with shifting enabled, but both seem equally coherent. To disable Context Shifting, use the flag --noshift.
I think koboldCPP already does this unless I’m misunderstanding, have a look at this:
Context Shifting is a better version of Smart Context that only works for GGUF models. This feature utilizes KV cache shifting to automatically remove old tokens from context and add new ones without requiring any reprocessing. So long as you use no memory/fixed memory and don’t use world info, you should be able to avoid almost all reprocessing between consecutive generations even at max context. This does not consume any additional context space, making it superior to SmartContext. Context Shifting is enabled by default, and will override smartcontext if both are enabled. Your outputs may be different with shifting enabled, but both seem equally coherent. To disable Context Shifting, use the flag --noshift.
I keep hearing about her.
Mythomax is great but super slutty.
Sometimes I like the challenge of getting an LLM to be inappropriate, and Mythomax just jumps right in!
Try Setting temperature to .1
Ive had really good luck with this model for 6.7b and and 33b. The 1.3 is more of a novelty because of how fast it runs on ancient GPUs, but not nearly as good as the other 2 sizes in my attempts, though it is amazing for its size.
I went 4060ti 16gb on small black Friday sale because I don’t want to upgrade my PSU, and it’s my workstation for both my jobs so I don’t want to mess with a used or refurbished 3090 which would be a lot more money if I factor the PSU.
This way it costs me less than half as much for new gear and my goals is to just run 20b, 13b and smaller coding models and in time I feel like something with higher vram will come out at a reasonable price without requiring a huge PSU.
I also have 64 gigs of ram if I need to call on a larger model.
You can find used ones on there occasionally for $715 without a warranty or you can spend the extra $200-300 for the renewed to get the warranty it’s whatever you value most.
Many models will just gladly do whatever horrible request you have, so no need. That’s one of the beauties of LLMs, we have unseasoned models.
Also since we can modify the output before it responds we can kite it to respond how we want by having the LLM response start with “sure here you go” which will often change its mind if it’s a censored model.