TabbyAPI released! A pure LLM API for exllama v2. (github.com)
Posted by panchovix@alien.top to LocalLLaMA@poweruser.forum · English · 1 year ago · 6 comments
Right-Structure-1619@alien.top · 1 year ago
Does anyone know if they expose all the good stuff that Guidance uses for its guided generation and speedups? This plus Guidance (KV cache, grammar control, etc.) would be really fast!