Has there been a study that performed a deep dive into the opposite end of the spectrum? There are myriad edge applications out there which cannot rely on training a large model and pruning it down for deployment. I wonder which architectures are most suited to learning at small scales.