I’m looking for suggestions for a transformer model that I can fine-tune for a text classification task. Due to hardware constraints the model has to be fairly small. Something in the order of a 50 MB weight file.

  • kwnaidoo@alien.topB
    1 year ago

    While not a transformer, what about Gaussian naive Bayes? It’s not the best classifier around, but for some tasks it’s good enough. I used it to build a small search-term classifier that assigns e-commerce search terms to a category or tag.
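A minimal sketch of that idea (my own toy example with made-up terms and categories, not the commenter’s actual pipeline), using scikit-learn’s `GaussianNB` over TF-IDF features:

```python
# Toy search-term classifier: TF-IDF features + Gaussian naive Bayes.
# The terms/labels below are invented for illustration.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.naive_bayes import GaussianNB

terms = ["red running shoes", "wireless earbuds", "mens trail sneakers",
         "bluetooth headphones", "leather hiking boots", "noise cancelling headset"]
labels = ["footwear", "audio", "footwear", "audio", "footwear", "audio"]

vec = TfidfVectorizer()
X = vec.fit_transform(terms).toarray()  # GaussianNB needs a dense array

clf = GaussianNB()
clf.fit(X, labels)

print(clf.predict(vec.transform(["trail running shoes"]).toarray())[0])
```

The whole model here is just per-class feature means and variances, so the saved artifact is tiny compared to any transformer.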

  • pythonpeasant@alien.topB
    1 year ago

    Could you please provide some more information about your constraints? If space is an issue, you might be better off with a more memory-friendly model, like an LSTM. Some LSTM variants even give you per-token attention.
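To give a sense of scale (a sketch with hyperparameters I picked myself, not tuned for any task), an LSTM text classifier of this shape comes in far under the 50 MB budget:

```python
import torch
import torch.nn as nn

class SmallLSTMClassifier(nn.Module):
    # Hypothetical sizes: 10k-word vocab, 64-dim embeddings, 128-dim hidden state.
    def __init__(self, vocab_size=10000, embed_dim=64, hidden_dim=128, num_classes=4):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)
        self.lstm = nn.LSTM(embed_dim, hidden_dim, batch_first=True)
        self.fc = nn.Linear(hidden_dim, num_classes)

    def forward(self, x):
        emb = self.embed(x)                 # (batch, seq) -> (batch, seq, embed_dim)
        _, (h, _) = self.lstm(emb)          # take the final hidden state
        return self.fc(h[-1])               # (batch, num_classes)

model = SmallLSTMClassifier()
n_params = sum(p.numel() for p in model.parameters())
print(f"{n_params:,} parameters, roughly {n_params * 4 / 1e6:.1f} MB in fp32")
```

Most of the weight lives in the embedding table, so shrinking the vocabulary (or quantizing to int8) shrinks the file further.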

    There’s a really interesting SparkFun video, which I’ll look around for, showing a question-answering model using some sort of BERT(?) running on a Raspberry Pi Zero-class chip with 25–50 MB of flash memory.
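A back-of-envelope check of which BERT variants could even fit in that kind of flash budget (parameter counts are approximate and from memory, so treat them as rough):

```python
# Rough model sizes at different precisions; parameter counts are approximate.
param_counts = {
    "BERT-base": 110e6,
    "DistilBERT": 66e6,
    "TinyBERT-4L": 14.5e6,
}

for name, n in param_counts.items():
    for bytes_per_param, label in [(4, "fp32"), (1, "int8")]:
        print(f"{name}: ~{n * bytes_per_param / 1e6:.0f} MB in {label}")
```

So a full BERT-base doesn’t come close, but a distilled model quantized to int8 plausibly lands in the tens of megabytes, which matches the demo described above.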