I am currently trying to build small convolutional regression models under very tight constraints on model size (at most a few thousand parameters).

Are there any rules of thumb, gold standards, or best practices to consider here? For example: should I prefer depth over width, do skip connections add anything at these small scales, are there any special training tricks that might boost performance, etc.?

Any hints or pointers on where to look are greatly appreciated.
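
For a sense of the scale I mean, here is a rough sketch of a model that fits this budget (PyTorch and the single-channel 2D input are just assumptions for illustration, not part of my actual setup):

```python
import torch
import torch.nn as nn

class TinyConvRegressor(nn.Module):
    """A toy CNN regressor in the ~1k-parameter range."""
    def __init__(self):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 8, kernel_size=3, padding=1),   # 1*8*9 + 8  = 80 params
            nn.ReLU(),
            nn.MaxPool2d(2),
            nn.Conv2d(8, 16, kernel_size=3, padding=1),  # 8*16*9 + 16 = 1168 params
            nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),                     # global average pooling, no params
        )
        self.head = nn.Linear(16, 1)                     # 16 + 1 = 17 params

    def forward(self, x):
        x = self.features(x).flatten(1)
        return self.head(x)

model = TinyConvRegressor()
print(sum(p.numel() for p in model.parameters()))  # -> 1265, i.e. ~1.3k parameters
```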

  • Seankala@alien.top

    TL;DR: The tighter the constraints on the model, the more time you should spend analyzing your data and formulating your problem.

    I’ll agree with the top comment. I had to deal with a similar problem at work, where we were trying to perform product-name classification for our e-commerce product. The catch was that we couldn’t afford anything too large or anything that would increase infrastructure costs (i.e., if possible, we didn’t want to use any more GPU compute than we already were).

    It turned out that extensive EDA was what saved us. We were able to come up with a string-matching algorithm sophisticated enough that it achieved high precision with practically no latency concerns. It might not be as flexible as something like BERT, but it got the job done. A toy illustration of the idea is sketched below.
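
    (Not the actual algorithm we used, just a minimal Python sketch of the general string-matching approach; the categories and keyword patterns below are made up for illustration.)

    ```python
    import re

    # Hypothetical keyword table; a real system would derive these patterns
    # from EDA on the actual product names, not hand-picked terms like these.
    CATEGORY_PATTERNS = {
        "laptops":    re.compile(r"\b(laptop|notebook|macbook)\b", re.IGNORECASE),
        "phones":     re.compile(r"\b(phone|smartphone|iphone|galaxy)\b", re.IGNORECASE),
        "headphones": re.compile(r"\b(headphone|earbud|airpods)\b", re.IGNORECASE),
    }

    def classify(product_name: str) -> str:
        """Return the first category whose pattern matches, else 'other'."""
        for category, pattern in CATEGORY_PATTERNS.items():
            if pattern.search(product_name):
                return category
        return "other"

    print(classify("Apple MacBook Air 13-inch"))      # -> laptops
    print(classify("Wireless earbud charging case"))  # -> headphones
    ```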