I am currently trying to build small convolutional regression models under very tight model-size constraints (at most a few thousand parameters).
Are there any rules of thumb, gold standards, or best practices to consider here? For example: should I prefer depth over width, do skip connections add anything at these small scales, are there any special training hacks that might boost performance, etc.?
Any hints or pointers on where to look are greatly appreciated.
(I assume you are talking about convolutional models in the context of computer vision)
I had similar constraints (embedded devices in a specific environment), and we didn’t use deep learning at all. Instead, we used classical image descriptors from OpenCV, such as color histograms, HOG, and SIFT, with an SVM as the classifier. This can work surprisingly well for many problems and is blazing fast.
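Here is a minimal sketch of that kind of pipeline, assuming Python with OpenCV and scikit-learn. The HOG parameters, the 64x64 input size, and the use of SVR (the regression variant, since your question is about regression) are my own illustrative choices, not the exact setup we used:

```python
import cv2
import numpy as np
from sklearn.svm import SVR

# HOG on 64x64 grayscale patches: 16x16 blocks, 8x8 block stride, 8x8 cells, 9 bins
hog = cv2.HOGDescriptor((64, 64), (16, 16), (8, 8), (8, 8), 9)

def hog_features(images):
    # images: iterable of uint8 grayscale arrays, resized to the HOG window size
    return np.array([hog.compute(cv2.resize(img, (64, 64))).ravel() for img in images])

# Random stand-in data; replace with real grayscale images and regression targets
rng = np.random.default_rng(0)
images = rng.integers(0, 256, size=(100, 64, 64), dtype=np.uint8)
targets = rng.random(100)

X = hog_features(images)
svm = SVR(kernel="rbf", C=10.0)
svm.fit(X[:80], targets[:80])
print(svm.predict(X[80:]).shape)  # (20,)
```

Swap SVR for SVC if your task is actually classification, as in our original setup.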
Consider how you can make the problem easier. Maybe you can do binary classification instead of multiclass, or use only grayscale images. Anything that will make the task itself easier will be a good improvement.
If your problem absolutely requires neural networks, I would use all the tools available. For a sense of scale, a sketch of a model that fits the parameter budget from your question is below.
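Purely to illustrate that budget (this is not a recommended architecture, and the layer widths are arbitrary), a convolutional regressor can stay under a few thousand parameters in something like PyTorch:

```python
import torch
import torch.nn as nn

model = nn.Sequential(
    nn.Conv2d(1, 8, 3, padding=1), nn.ReLU(),   # 1*8*3*3 + 8   =   80 params
    nn.MaxPool2d(2),
    nn.Conv2d(8, 16, 3, padding=1), nn.ReLU(),  # 8*16*3*3 + 16 = 1168 params
    nn.AdaptiveAvgPool2d(1), nn.Flatten(),
    nn.Linear(16, 1),                           # 16 + 1        =   17 params
)

print(sum(p.numel() for p in model.parameters()))  # 1265, well under the budget
x = torch.randn(4, 1, 32, 32)                      # batch of 4 grayscale 32x32 inputs
print(model(x).shape)                              # torch.Size([4, 1])
```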
You can also consider training a larger network and then applying compression techniques, such as knowledge distillation, quantization or pruning.
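As a rough sketch of the distillation option, here an already-trained larger teacher guides a tiny student regressor; the architectures, the plain MSE matching loss, and the 0.5 weighting are placeholders rather than a prescription:

```python
import torch
import torch.nn as nn

# Stand-in "big" teacher (assumed already trained) and a tiny student with the same output shape
teacher = nn.Sequential(nn.Conv2d(1, 32, 3, padding=1), nn.ReLU(),
                        nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(32, 1))
student = nn.Sequential(nn.Conv2d(1, 8, 3, padding=1), nn.ReLU(),
                        nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(8, 1))
teacher.eval()  # in practice, load trained teacher weights here

opt = torch.optim.Adam(student.parameters(), lr=1e-3)
mse = nn.MSELoss()
alpha = 0.5  # balance between ground-truth targets and imitating the teacher

x = torch.randn(16, 1, 32, 32)  # stand-in batch of grayscale images
y = torch.randn(16, 1)          # stand-in regression targets

with torch.no_grad():
    teacher_pred = teacher(x)

student_pred = student(x)
loss = alpha * mse(student_pred, y) + (1 - alpha) * mse(student_pred, teacher_pred)
opt.zero_grad()
loss.backward()
opt.step()
print(loss.item())
```

Quantization and pruning are usually applied as separate steps afterwards, e.g. via torch.quantization or torch.nn.utils.prune.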