What are the best methods to detect harmful content such as racial abuse in tweets?
I’m thinking about a research project in which I try various methods and compare their accuracy. Am I right in thinking that Naive Bayes, Logistic Regression, Support Vector Machine, LSTM and BERT would be some of the best methods?
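For the comparison itself, the simplest baseline on that list fits in a few lines. Below is a rough sketch of a multinomial Naive Bayes classifier with Laplace smoothing, written with only the Python standard library, plus an accuracy helper that could be reused for the other methods. The class name, function names, labels, and toy tweets are all made up for illustration; a real project would use a labeled dataset and a proper train/test split.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    # Deliberately naive: lowercase and split on whitespace.
    return text.lower().split()


class NaiveBayes:
    """Multinomial Naive Bayes with add-one (Laplace) smoothing."""

    def fit(self, texts, labels):
        self.class_counts = Counter(labels)
        self.word_counts = defaultdict(Counter)
        for text, label in zip(texts, labels):
            self.word_counts[label].update(tokenize(text))
        self.vocab = {w for counts in self.word_counts.values() for w in counts}
        return self

    def predict(self, text):
        total_docs = sum(self.class_counts.values())
        best_label, best_score = None, -math.inf
        for label, n_docs in self.class_counts.items():
            # Log prior plus the sum of smoothed log likelihoods.
            score = math.log(n_docs / total_docs)
            total_words = sum(self.word_counts[label].values())
            for word in tokenize(text):
                if word not in self.vocab:
                    continue  # ignore words never seen in training
                count = self.word_counts[label][word]
                score += math.log((count + 1) / (total_words + len(self.vocab)))
            if score > best_score:
                best_label, best_score = label, score
        return best_label


def accuracy(model, texts, labels):
    # Fraction of examples where the prediction matches the label.
    correct = sum(model.predict(t) == y for t, y in zip(texts, labels))
    return correct / len(labels)


# Hypothetical toy data, only to show the shape of the experiment.
train_texts = [
    "you people are disgusting and should leave",
    "i hate everyone from that country",
    "lovely weather in the park today",
    "great match last night well played",
]
train_labels = ["harmful", "harmful", "ok", "ok"]
test_texts = ["i hate you people", "great weather today"]
test_labels = ["harmful", "ok"]

model = NaiveBayes().fit(train_texts, train_labels)
print(accuracy(model, test_texts, test_labels))
```

Any other method that exposes a `predict(text)` method could be passed to the same `accuracy` helper, which keeps the comparison consistent across models.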
You could try an open-source LLM like Llama 2. You could probably use LangChain to give it a tool for tagging tweets that contain harmful content.
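Roughly, the tagging part could look like the sketch below. It is library-agnostic on purpose: `generate` stands for any callable that sends a prompt to the model and returns its text reply (how you wire that to Llama 2 or LangChain is not shown), and `build_prompt`, `parse_label`, `tag_tweet`, and `fake_generate` are hypothetical names, not part of any real API.

```python
def build_prompt(tweet):
    # Ask for a single-word answer so the reply is easy to parse.
    return (
        "Classify the tweet as HARMFUL or OK. Answer with one word.\n"
        f"Tweet: {tweet}\n"
        "Answer:"
    )


def parse_label(reply):
    # Take the first word of the reply and normalise it.
    words = reply.strip().upper().split()
    first = words[0].strip(".,:!") if words else ""
    return first if first in {"HARMFUL", "OK"} else "UNKNOWN"


def tag_tweet(tweet, generate):
    # `generate` is an assumption: any function mapping prompt -> reply text.
    return parse_label(generate(build_prompt(tweet)))


def fake_generate(prompt):
    # Stand-in for a real model call, used only to exercise the code.
    return "HARMFUL." if "hate" in prompt.lower() else "ok"


print(tag_tweet("i hate you people", fake_generate))
print(tag_tweet("nice day at the beach", fake_generate))
```

Mapping anything unexpected to `UNKNOWN` matters when measuring accuracy, since LLMs do not always follow the one-word instruction.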
Cohere AI allows you to create classification models by fine-tuning our transformer-based embedding models for your use case. It’s free to train a classification fine-tune, so give it a shot. To the best of my knowledge, this is near state of the art.
gpt api