A fAIry tale of the Inductive Bias
As we have seen in recent years, deep learning has grown rapidly, both in adoption and in the number of models. What paved the way for this success is perhaps transfer learning itself: the idea that a model can be trained on a large amount of data and then adapted to a myriad of specific tasks.
In recent years, a paradigm has emerged: transformers (or models based on them) are used for NLP applications, while for images, vision transformers or convolutional networks are used instead.
On the other hand, while plenty of work shows in practice that these models perform well, the theoretical understanding of why has lagged behind. This is partly because these models are so large and broad that they are difficult to experiment with. The fact that Vision Transformers can outperform convolutional neural networks despite having, in theory, less inductive bias for vision shows that there is a theoretical gap to be filled.
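To make the contrast concrete, here is a minimal sketch (PyTorch assumed, shapes chosen only for illustration) of what "inductive bias" means architecturally: a convolution bakes locality and weight sharing into the layer before any training, whereas a ViT-style patch embedding followed by self-attention treats every patch pair the same way and must learn spatial structure from data.

```python
# Illustrative sketch: convolutional inductive bias vs. a ViT-style block.
import torch
import torch.nn as nn

x = torch.randn(1, 3, 32, 32)  # a single 32x32 RGB image

# Convolution: local receptive field + weight sharing -> translation
# equivariance is built into the architecture before any training happens.
conv = nn.Conv2d(in_channels=3, out_channels=8, kernel_size=3, padding=1)
print(conv(x).shape)  # torch.Size([1, 8, 32, 32])

# ViT-style block: the image is cut into patches and fed to self-attention,
# which has no built-in locality prior -- only learned position embeddings
# and the data tell it anything about spatial structure.
patch = nn.Conv2d(3, 64, kernel_size=8, stride=8)    # 8x8 patches -> 16 tokens
tokens = patch(x).flatten(2).transpose(1, 2)          # shape (1, 16, 64)
pos = nn.Parameter(torch.zeros(1, 16, 64))            # learned position embedding
attn = nn.MultiheadAttention(embed_dim=64, num_heads=4, batch_first=True)
out, _ = attn(tokens + pos, tokens + pos, tokens + pos)
print(out.shape)  # torch.Size([1, 16, 64])
```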