I’m trying to create an NLP emotion classification model for a research project, but I’m confused about where and how to start. I have a large dataset of Reddit posts and want to classify each post into about 12 different emotion categories.

Is there a way to do this using existing models, e.g. BERT, or can I do it with unsupervised learning?

I have at least 12,000 different posts, so I want to avoid supervised learning: labeling a training set would take a long time, and I might lose a lot of time doing that.

What’s the most efficient and accurate way to do this? Any help would be amazing!

  • Ok-Kangaroo-59@alien.topB
    10 months ago

    Look up the SetFit library on GitHub. Frame it as a few-shot learning task: you won’t get perfect results, but it seems tractable as a problem.
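
A rough sketch of what the few-shot framing could look like, in case it helps. It assumes you hand-label a small number of posts per emotion (8 here, an arbitrary choice) and that you have `setfit` 1.0 or later and `datasets` installed. The helper names, the base model, and the training settings are illustrative assumptions, not a recommended configuration.

```python
import random
from collections import defaultdict


def build_few_shot_set(labeled_posts, shots_per_label=8, seed=0):
    """Keep at most `shots_per_label` hand-labeled posts per emotion.

    `labeled_posts` is a list of (text, label) pairs that you labeled by hand.
    """
    by_label = defaultdict(list)
    for text, label in labeled_posts:
        by_label[label].append(text)

    rng = random.Random(seed)
    texts, labels = [], []
    for label in sorted(by_label):
        pool = by_label[label]
        chosen = rng.sample(pool, min(shots_per_label, len(pool)))
        texts.extend(chosen)
        labels.extend([label] * len(chosen))
    return {"text": texts, "label": labels}


def train_and_predict(few_shot, unlabeled_posts,
                      base_model="sentence-transformers/paraphrase-mpnet-base-v2"):
    """Fine-tune a SetFit model on the few-shot set, then label the rest."""
    # Heavy imports stay local so the helper above works without them.
    from datasets import Dataset
    from setfit import SetFitModel, Trainer, TrainingArguments

    # Map emotion names to integer ids for training.
    label_names = sorted(set(few_shot["label"]))
    label_to_id = {name: i for i, name in enumerate(label_names)}
    train_ds = Dataset.from_dict({
        "text": few_shot["text"],
        "label": [label_to_id[name] for name in few_shot["label"]],
    })

    model = SetFitModel.from_pretrained(base_model, labels=label_names)
    args = TrainingArguments(batch_size=16, num_epochs=1)  # assumed settings
    trainer = Trainer(model=model, args=args, train_dataset=train_ds)
    trainer.train()
    return model.predict(unlabeled_posts)
```

With 12 categories and 8 posts each, that is under 100 posts to label by hand instead of thousands. It would still be worth labeling a separate small set to check how accurate the predictions are before trusting them on all 12,000 posts.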

  • Ronny_Jotten@alien.topB
    10 months ago

    I doubt you’ll find 12 different emotions on Reddit. I think everything can fit into:

    1. sarcasm
    2. rage
    3. polemic rage
    4. bewilderment
    5. jocularity
    6. serious

    I might have missed one or two, but I’m sure there aren’t 12.