Even the quantised version seems to work pretty well with the stablelm-support branch, but the output just keeps going… Either the template or the model is missing the end token, or the llama.cpp branch isn't quite ready yet.
Does anyone else have the same problem and know what to do?
This is how I interpreted the template from the model card:
https://preview.redd.it/eqxvq2lefjxb1.png?width=1240&format=png&auto=webp&s=b5c1f1550b50fcf16ab1177894b890b852cf65a9
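In case it helps to compare notes, here is a minimal sketch of how I'm reading that template. The role tags (`<|user|>`, `<|assistant|>`) and the `<|endoftext|>` end token are my assumptions from the model card screenshot above, not something I've verified against the tokenizer; the truncation helper just mimics what passing the end token as a stop string would do.

```python
# Assumed end token and role tags, taken from my reading of the model card.
END_TOKEN = "<|endoftext|>"

def build_prompt(user_message: str) -> str:
    # One user turn followed by the assistant cue; the model is expected
    # to emit END_TOKEN when its reply is finished.
    return f"<|user|>\n{user_message}{END_TOKEN}\n<|assistant|>\n"

def truncate_at_end_token(generated: str) -> str:
    # Workaround for the runaway output: cut the text at the first end
    # token instead of relying on the backend to stop generation.
    return generated.split(END_TOKEN, 1)[0]

prompt = build_prompt("What is the capital of France?")
print(prompt)
print(truncate_at_end_token("Paris.<|endoftext|><|user|>\nAnd Germany?"))
```

If the end token really is missing from the GGUF metadata, passing `<|endoftext|>` as a stop string to the frontend should at least mask the problem until the branch is fixed.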