I'm new to image classification and ML, and this is going to be my first project on those topics. I'm considering VGG16 because I saw some studies showing that it generally reaches high accuracy (80-95%), but I'm worried that the model might not be fast enough, or that the app file size might get massive if I want the app to be usable without an internet connection.
What do you guys think?
VGG16 is outdated. CNNs based on architectures like EfficientNet and MobileNetV3 have superior accuracy. If attention mechanisms are acceptable in the network, Vision Transformers such as MobileViT are excellent.
You may use MobileNet models, as they use separable convolutions, which have fewer parameters and a shorter execution time than regular convolutions. Moreover, MobileNets are easy to train and set up (tf.keras.applications.* has pre-trained models) and can be used as a backbone for fine-tuning on datasets other than ImageNet.
Further, you can also explore quantization and weight pruning. These techniques optimize a model so that it has a smaller memory footprint and a shorter execution time on embedded devices.
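To make the parameter argument concrete, here is a rough sketch of the arithmetic, assuming the depthwise separable form MobileNet uses (one k x k filter per input channel, followed by a 1 x 1 pointwise convolution). The layer shape (3 x 3 kernel, 256 input and output channels) and the helper names are made up for illustration, and biases are ignored. The last helper shows why quantizing float32 weights to int8 cuts weight storage to about a quarter.

```python
def conv_params(k, c_in, c_out):
    """Weights in a standard k x k convolution (biases ignored)."""
    return k * k * c_in * c_out


def separable_conv_params(k, c_in, c_out):
    """Weights in a depthwise separable convolution (biases ignored):
    a k x k depthwise filter per input channel, then a 1 x 1 pointwise conv."""
    return k * k * c_in + c_in * c_out


def weight_bytes(n_params, bytes_per_weight):
    """Storage for the weights alone, ignoring file-format overhead."""
    return n_params * bytes_per_weight


# Hypothetical layer shape, chosen only for illustration.
k, c_in, c_out = 3, 256, 256

standard = conv_params(k, c_in, c_out)
separable = separable_conv_params(k, c_in, c_out)

print("standard conv weights: ", standard)
print("separable conv weights:", separable)
print("reduction factor:       %.1fx" % (standard / separable))

# float32 uses 4 bytes per weight; int8 quantization uses 1 byte per weight.
print("separable layer, float32 bytes:", weight_bytes(separable, 4))
print("separable layer, int8 bytes:   ", weight_bytes(separable, 1))
```

For this one layer the separable version needs roughly 9x fewer weights. The saving over a whole network depends on its actual layer shapes, so treat this as an estimate of the trend, not of any specific model's size.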
No, try MobileNet.
Try MobileNet or MobileViT.
Look at TFLite Model Maker. It will walk you through everything you need to make a lightweight, mobile-friendly image classifier.
https://www.tensorflow.org/lite/models/modify/model_maker/image_classification