What type of study is this?

This is a quantitative study.

September 20, 2025 · Open Access

Efficient rotation invariance in deep neural networks through artificial mental rotation

Key Points

AMR reaches an average top-1 accuracy of 0.743 across datasets and architectures, outperforming rotational data augmentation (0.626).
With AMR, models recognize randomly rotated images from ImageNet, Stanford Cars, and Oxford Pet more accurately.
Transferring a trained AMR module to a pre-trained semantic segmentation model raises its performance on rotated CoCo from 32.7 to 55.2 IoU.
The approach works with all common CNN and ViT architectures.
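
One way to picture the architecture-agnostic claim is as a wrapper around an unmodified base model. The sketch below assumes that AMR estimates the in-plane rotation of an input and rotates it back upright before the base model sees it. This summary does not spell out that mechanism, so treat it as an assumption. The names `rotate_ccw`, `amr_predict`, `toy_estimate_turns`, and `toy_base_model` are hypothetical. Both toy components are stand-ins for trained networks, and rotations are limited to multiples of 90 degrees so the example stays exact without interpolation.

```python
from typing import Callable, List

Image = List[List[float]]


def rotate_ccw(image: Image, quarter_turns: int) -> Image:
    """Rotate a 2-D grid counter-clockwise by quarter_turns * 90 degrees.

    Negative values rotate clockwise, because Python's % maps -1 to 3.
    """
    image = [list(row) for row in image]  # work on a copy
    for _ in range(quarter_turns % 4):
        # Transpose, then reverse the row order: one counter-clockwise turn.
        image = [list(row) for row in zip(*image)][::-1]
    return image


def amr_predict(
    image: Image,
    estimate_turns: Callable[[Image], int],
    base_model: Callable[[Image], int],
) -> int:
    """Assumed AMR flow: estimate the rotation, undo it, then classify."""
    turns = estimate_turns(image)
    upright = rotate_ccw(image, -turns)
    return base_model(upright)


def toy_estimate_turns(image: Image) -> int:
    """Hypothetical angle estimator: locate the brightest corner.

    Assumes upright images carry their brightest corner at the top left.
    Corner order matches where that corner lands after 0, 1, 2, 3 turns.
    """
    corners = [image[0][0], image[-1][0], image[-1][-1], image[0][-1]]
    return corners.index(max(corners))


def toy_base_model(image: Image) -> int:
    """Hypothetical rotation-sensitive classifier: 1 if the top row outweighs the bottom row."""
    return int(sum(image[0]) > sum(image[-1]))


upright = [[100, 1, 2, 3], [4, 5, 6, 7], [8, 9, 10, 11], [12, 13, 14, 15]]
for k in range(4):
    rotated = rotate_ccw(upright, k)
    print(k, toy_base_model(rotated), amr_predict(rotated, toy_estimate_turns, toy_base_model))
```

Because the base model is passed in as a plain callable, the same wrapper could sit in front of a CNN, a ViT, or a segmentation model. That property is the one the key points describe, though the real module handles arbitrary angles rather than quarter turns.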

Abstract

Humans and animals recognize objects irrespective of the beholder's point of view, which may drastically change their appearance. Artificial pattern recognizers strive to also achieve this, e.g., through translational invariance in convolutional neural networks (CNNs). However, CNNs and vision transformers (ViTs) both perform poorly on rotated inputs. Here we present AMR (artificial mental rotation), a method for dealing with in-plane rotations that focuses on large datasets and architectural flexibility. Our simple AMR implementation works with all common CNN and ViT architectures. We test it on randomly rotated versions of ImageNet, Stanford Cars, and Oxford Pet. With a top-1 accuracy (averaged across datasets and architectures) of 0.743, AMR outperforms rotational data augmentation (average top-1 accuracy of 0.626) by 19%. We also easily transfer a trained AMR module to a downstream task to improve the performance of a pre-trained semantic segmentation model on rotated CoCo from 32.7 to 55.2 IoU.
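
As a quick sanity check on the reported figures, the 19% gain matches the relative difference between the two averaged top-1 scores when they are read as higher-is-better. The variable names below are illustrative. Only the numeric values come from the abstract.

```python
# Averaged top-1 scores quoted in the abstract.
amr_score = 0.743
augmentation_score = 0.626

# Relative gain of AMR over rotational data augmentation.
relative_gain = (amr_score - augmentation_score) / augmentation_score
print(f"relative gain: {relative_gain:.1%}")

# Semantic segmentation on rotated CoCo, before and after adding the AMR module.
iou_before = 32.7
iou_after = 55.2
iou_gain_points = iou_after - iou_before
print(f"IoU gain: {iou_gain_points:.1f} points")
```

The relative gain comes out just under 19%, which rounds to the figure in the abstract. The segmentation improvement is 22.5 IoU points.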

Cite This Study

Tuggener et al. studied this question.

synapsesocial.com/papers/68d46aae31b076d99fa67715
https://doi.org/10.3389/fcomp.2025.1644044
