We present ResMLP, an architecture built entirely upon multi-layer perceptrons for image classification. It is a simple residual network that alternates (i) a linear layer in which image patches interact, independently and identically across channels, and (ii) a two-layer feed-forward network in which channels interact independently per patch. When trained with a modern training strategy using heavy data augmentation and optionally distillation, it attains surprisingly good accuracy/complexity trade-offs on ImageNet. We also train ResMLP models in a self-supervised setup, to further remove the priors that come from using a labelled dataset. Finally, by adapting our model to machine translation, we achieve surprisingly good results. We share pre-trained models and our code based on the Timm library.
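The two alternating sublayers described in the abstract can be illustrated with a minimal sketch of one residual block. This is not the authors' released code: it uses only the Python standard library, omits biases and any normalization or scaling layers, and assumes GELU as the nonlinearity between the two feed-forward layers. The names `resmlp_block`, `w_patch`, `w1`, and `w2` are hypothetical and chosen for illustration.

```python
import math
import random


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def add(a, b):
    """Element-wise sum of two matrices of the same shape."""
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def gelu(v):
    """Exact GELU activation (assumed here; the abstract does not name one)."""
    return 0.5 * v * (1.0 + math.erf(v / math.sqrt(2.0)))


def resmlp_block(x, w_patch, w1, w2):
    """One residual block on x of shape (num_patches, dim).

    w_patch: (num_patches, num_patches), mixes patches
    w1:      (dim, hidden), first feed-forward layer
    w2:      (hidden, dim), second feed-forward layer
    """
    # (i) Cross-patch linear layer: left-multiplying by w_patch applies the
    # same patch-mixing weights to every channel (column) independently.
    z = add(x, matmul(w_patch, x))
    # (ii) Two-layer feed-forward network: right-multiplying mixes channels,
    # and each patch (row) is processed independently of the others.
    h = [[gelu(v) for v in row] for row in matmul(z, w1)]
    return add(z, matmul(h, w2))


def random_matrix(rows, cols, rng, scale=0.1):
    """Small random matrix for the demo below."""
    return [[rng.uniform(-scale, scale) for _ in range(cols)] for _ in range(rows)]


rng = random.Random(0)
num_patches, dim, hidden = 4, 3, 6
x = random_matrix(num_patches, dim, rng, scale=1.0)
y = resmlp_block(
    x,
    random_matrix(num_patches, num_patches, rng),
    random_matrix(dim, hidden, rng),
    random_matrix(hidden, dim, rng),
)
print(len(y), len(y[0]))  # prints: 4 3
```

Because both sublayers are residual, the block preserves the input shape, so blocks of this form can be stacked directly. A practical implementation would use a tensor library rather than nested lists.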
Hugo Touvron
BC Platforms (Finland)
Piotr Bojanowski
Centre National de la Recherche Scientifique
Mathilde Caron
Google (United States)
IEEE Transactions on Pattern Analysis and Machine Intelligence
Sorbonne Université
Touvron et al. studied this question.
synapsesocial.com/papers/69d835db05ee2ba81dbef37f — DOI: https://doi.org/10.1109/tpami.2022.3206148