I’m thinking of using Llama 2 to detect spam messages:
- The model will first be fine-tuned using LoRA/PEFT on a public dataset.
- Then, when given a block of text, it will decide if it's spam and provide reasons to the user.
- However, there can be false positives (and false negatives), so I figured a way to combat this would be to let the user tell the model whether a response is correct or wrong (thumbs up/down).
Based on my requirements, is it better to use RLHF or DPO? Am I overcomplicating this? Would simply fine-tuning on the user feedback work too?
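For context on the DPO option: DPO trains directly on preference pairs of the form (prompt, chosen, rejected), so thumbs-up/down logs have to be paired up first. A minimal sketch of that pairing step (all names and the log format here are hypothetical, not from any particular library; an actual DPO trainer, e.g. in TRL, would consume records shaped like this):

```python
# Sketch: turn thumbs-up/down feedback logs into DPO-style preference pairs.
# The log format and function name are made up for illustration.

def feedback_to_pairs(logs):
    """logs: list of dicts with keys 'text', 'response', 'thumbs_up' (bool).

    A thumbs-down alone only gives a 'rejected' answer; to form a full
    preference pair it must be matched with a thumbs-up response for the
    same input text.
    """
    by_text = {}
    for entry in logs:
        buckets = by_text.setdefault(entry["text"], {"up": [], "down": []})
        buckets["up" if entry["thumbs_up"] else "down"].append(entry["response"])

    pairs = []
    for text, buckets in by_text.items():
        for chosen in buckets["up"]:
            for rejected in buckets["down"]:
                pairs.append({"prompt": text, "chosen": chosen, "rejected": rejected})
    return pairs


logs = [
    {"text": "WIN A FREE PHONE", "response": "Spam: prize-bait wording.", "thumbs_up": True},
    {"text": "WIN A FREE PHONE", "response": "Not spam.", "thumbs_up": False},
]
pairs = feedback_to_pairs(logs)
print(pairs)
```

One practical caveat this makes visible: inputs that only ever receive thumbs-down produce no pairs at all, so the feedback UI may need to offer a corrected answer, not just a rating.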
You’re better off using something like BERT rather than shooting a pigeon with a ballistic missile. It’s easier, cheaper, faster, and much more reliable.
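To illustrate the point: spam detection is a plain binary text-classification task, so even a tiny supervised model handles the core decision. Here is a toy word-level naive Bayes classifier in pure Python, as a stand-in for the suggested BERT fine-tune (which you would do with an off-the-shelf library like Hugging Face transformers); the training data is made up:

```python
import math
from collections import Counter

# Toy naive Bayes spam classifier -- a stand-in for fine-tuning BERT.
# Counts words per class, then scores with Laplace-smoothed log-probabilities.

def train(examples):
    """examples: list of (text, label) with label in {"spam", "ham"}."""
    counts = {"spam": Counter(), "ham": Counter()}
    totals = Counter()
    for text, label in examples:
        counts[label].update(text.lower().split())
        totals[label] += 1
    return counts, totals

def classify(text, model):
    counts, totals = model
    vocab = set(counts["spam"]) | set(counts["ham"])
    scores = {}
    for label in ("spam", "ham"):
        # log prior + Laplace-smoothed log likelihood per word
        score = math.log(totals[label] / sum(totals.values()))
        denom = sum(counts[label].values()) + len(vocab)
        for w in text.lower().split():
            score += math.log((counts[label][w] + 1) / denom)
        scores[label] = score
    return max(scores, key=scores.get)

model = train([
    ("win a free prize now", "spam"),
    ("claim your free reward", "spam"),
    ("meeting moved to friday", "ham"),
    ("lunch at noon tomorrow", "ham"),
])
print(classify("free prize inside", model))  # -> spam
```

A real BERT (or even a TF-IDF + logistic regression) classifier replaces this scoring, but the shape of the solution is the same: one labeled dataset, one small model, millisecond inference. The LLM is only needed if the "explain why it's spam" feature is a hard requirement.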