I am currently trying to build small convolutional regression models with very tight constraints regarding model size (max. a few thousand parameters).
Are there any rules of thumb, gold standards, or best practices to consider here? For example, should I prefer depth over width, do skip connections add anything at such small scales, and are there any special training tricks that might boost performance?
Any hints or pointers on where to look are greatly appreciated.
A few thousand parameters is very little. I would have a look at models like LeNet, which are very small by modern standards but have proven effective (LeNet-5 has roughly 60k parameters, so it still needs to be shrunk considerably). Maybe start by copying the architecture and then reduce the number of filters and the size of the dense layers until you reach your parameter budget.
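To make the "copy and shrink" idea concrete, here is a rough sketch that counts the parameters of a LeNet-style network and scales all layer widths down uniformly until a budget is met. Everything specific in it is an assumption for illustration: the 32x32 single-channel input, the 5x5 kernels with 2x2 pooling, the single regression output, the uniform scaling rule, the budget of 3000, and the helper names `lenet_style_params` and `shrink_to_budget`. It only does the arithmetic; it does not build or train a model.

```python
def conv_params(in_ch, out_ch, k):
    """Weights plus biases of a k x k convolution layer."""
    return k * k * in_ch * out_ch + out_ch


def dense_params(n_in, n_out):
    """Weights plus biases of a fully connected layer."""
    return n_in * n_out + n_out


def lenet_style_params(c1=6, c2=16, f1=120, f2=84,
                       n_out=1, in_ch=1, in_size=32, k=5):
    """Parameter count of conv-pool-conv-pool-fc-fc-fc (LeNet-5 layout).

    Assumes unpadded convolutions and 2x2 pooling after each conv layer.
    """
    size = (in_size - k + 1) // 2      # after conv1 + pool
    size = (size - k + 1) // 2         # after conv2 + pool
    flat = size * size * c2            # features entering the first dense layer
    return (conv_params(in_ch, c1, k)
            + conv_params(c1, c2, k)
            + dense_params(flat, f1)
            + dense_params(f1, f2)
            + dense_params(f2, n_out))


def shrink_to_budget(budget, base=(6, 16, 120, 84), **kwargs):
    """Scale all widths down in 1% steps until the count fits the budget.

    Returns (widths, total), or None if even the smallest setting is too big.
    """
    for step in range(100, 0, -1):
        scale = step / 100
        widths = [max(1, round(w * scale)) for w in base]
        total = lenet_style_params(*widths, **kwargs)
        if total <= budget:
            return widths, total
    return None


# Hypothetical budget of 3000 parameters for a single-output regressor.
result = shrink_to_budget(3000)
print(result)
```

With the default widths, most of the parameters sit in the first dense layer rather than in the convolutions, so shrinking that layer (or replacing the flatten step with global pooling) is likely to save more than dropping filters alone. Uniform scaling is just the simplest rule to try first.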