  • I agree with finetuning + RAG, given that OP already seems to have Q&A pairs, so it should be a great starting point as a dataset.

    The language (Dutch <-> English) could possibly be a barrier to reasonable performance with Llama or any other 7B model, but as OP stated, they might be able to use translation for that. I'm not sure whether DeepL could be used for that, i.e., calling the DeepL API as a wrapper around the chatbot, translating the user input on the way in and the chatbot output on the way out; it should have pretty good performance for Dutch (a rough sketch of that wrapper is below). I like the idea and would like to test this or see the results when properly implemented, so please keep us updated on your approach, u/Flo501.
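
    A minimal sketch of that translation-wrapper idea, assuming the official `deepl` Python package; `ask_model` is a hypothetical placeholder for whatever local 7B chatbot call OP ends up using, and the auth key is obviously a placeholder too:

    ```python
    import deepl

    # Assumption: a DeepL API key is available (free tier or paid).
    translator = deepl.Translator("YOUR_DEEPL_AUTH_KEY")

    def ask_model(prompt_en: str) -> str:
        # Placeholder for the local 7B model (e.g., a llama.cpp or transformers call).
        raise NotImplementedError

    def chat_nl(user_input_nl: str) -> str:
        # Dutch -> English for the model, then English -> Dutch for the user.
        prompt_en = translator.translate_text(
            user_input_nl, source_lang="NL", target_lang="EN-US"
        ).text
        answer_en = ask_model(prompt_en)
        return translator.translate_text(
            answer_en, source_lang="EN", target_lang="NL"
        ).text
    ```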



  • From my understanding, if you want to run models without quality loss, quantized models are not exactly what you are looking for, at least not below a certain threshold. With your setup you should be able to run 7B models in 8-bit.

    For everything beyond that you’ll need more heavily quantized models (e.g., 4-bit), which also introduce more quality loss.

    There was a post a while back that outlined the hardware requirements for 8-bit and 4-bit, for both GPU and CPU setups. Of course you can go even further with quantization and run even larger models, but it’ll introduce more loss as well (a rough memory estimate is sketched below).
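
    A rough back-of-the-envelope sketch of why the bit width matters; these numbers only cover the weights and ignore the KV cache and runtime overhead, which add a few more GB on top:

    ```python
    # Rough VRAM/RAM needed just for the weights: parameters x bytes per parameter.
    # Real loaders need extra room for the KV cache and activations, so treat
    # these as lower bounds, not exact requirements.
    def weight_memory_gb(params_billion: float, bits_per_param: int) -> float:
        return params_billion * 1e9 * (bits_per_param / 8) / 1e9

    for params, bits in [(7, 16), (7, 8), (7, 4), (13, 8), (13, 4)]:
        print(f"{params}B @ {bits}-bit ~= {weight_memory_gb(params, bits):.1f} GB")
    ```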


  • For better readability, OP’s original text rewritten by ChatGPT:

    This is the third time I am trying to rewrite this introduction on this thread. While I am not sure why I need to do this, I feel it is important to make a good introduction. So, hello everyone, and good night from here. I hope you enjoy your holiday or Thanksgiving.

    Uncensored local AI models are loved by everyone and championed like a golden child in some cases (Mistral, for example) because they’re just good compared to their aligned versions. However, thinking that uncensored models are fully uncensored is a misconception, and it is not true.

    Some models are merges of various models, creating a new model that is not aligned but retains its capability and value, such as Mistral-Orca 7b. Others are just fine-tuned on an uncensored dataset while the base model remains aligned, so they still generate output that advises or lectures the user instead of giving a straightforward answer.

    A model can be called truly uncensored when it does not give the user advice and faithfully generates the user’s desired output.

    Those are my thoughts for this time. I hope everyone enjoys their Thanksgiving.