A central approach in neuroscience is to analyze neural representations as a means of understanding a system's function, using methods such as principal component analysis, regression, and representational similarity analysis. These analyses often rest on a tacit "linking assumption": that the features explaining the most variance in neural activity are the most important for the system's computation. Here, we challenge this assumption. We review recent work in machine learning demonstrating "representation biases", the fact that learned representations can be biased toward certain features over others. For example, learned representations heavily overrepresent simple (linear) features while representing complex (nonlinear) features much more weakly, even when both are equally critical for the system's computations. We review the origins of these biases in learning dynamics and patterns of computation. We then discuss their consequences for neuroscience. We show that if a subset of features dominates the representations, standard analytic techniques can yield highly biased inferences, for example the mistaken conclusion that a system is simpler than it really is or that two systems are more similar than they really are. We discuss some connections between these findings and recent empirical developments in neuroscience. Finally, we present homomorphic encryption as a conceptual case study of the potential for a total dissociation between representational geometry and computation. We conclude that achieving a complete understanding of neural systems requires moving beyond high-variance signals, as critical computational mechanisms may be hidden in low-variance components.
Lampinen et al. (Sun,) studied this question.
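
The variance-dominance point in the abstract can be illustrated with a toy simulation. The sketch below is not taken from the work itself: the function names, dimensions, encoding strengths, and noise level are all hypothetical choices made for illustration. It builds synthetic "representations" in which a linear feature is encoded strongly and an XOR-like (nonlinear) feature is encoded weakly, then checks how much variance each principal component explains and how well each feature can be read out linearly.

```python
import numpy as np


def simulate(n=2000, dim=10, strong=1.0, weak=0.1, noise=0.01, seed=0):
    """Toy representations (hypothetical setup, not the paper's experiment).

    A linear feature `a` is written along one direction with scale `strong`;
    an XOR-like feature `a * b` is written along an orthogonal direction
    with the much smaller scale `weak`.
    """
    rng = np.random.default_rng(seed)
    a = rng.choice([-1.0, 1.0], size=n)
    b = rng.choice([-1.0, 1.0], size=n)
    xor = a * b  # parity of the two inputs, a nonlinear function of (a, b)

    # Two orthonormal encoding directions in the representation space.
    q, _ = np.linalg.qr(rng.normal(size=(dim, 2)))
    u, v = q[:, 0], q[:, 1]

    reps = strong * np.outer(a, u) + weak * np.outer(xor, v)
    reps += noise * rng.normal(size=(n, dim))
    return reps, a, xor


def variance_explained(reps):
    """Fraction of total variance captured by each principal component."""
    centered = reps - reps.mean(axis=0)
    singular_values = np.linalg.svd(centered, compute_uv=False)
    variances = singular_values ** 2
    return variances / variances.sum()


def linear_decoding_accuracy(reps, target):
    """In-sample accuracy of a least-squares linear readout of a +/-1 target."""
    design = np.column_stack([reps, np.ones(len(reps))])
    weights, *_ = np.linalg.lstsq(design, target, rcond=None)
    return float(np.mean(np.sign(design @ weights) == target))


reps, a, xor = simulate()
ratios = variance_explained(reps)
acc_linear = linear_decoding_accuracy(reps, a)
acc_xor = linear_decoding_accuracy(reps, xor)

print("variance explained by first two PCs:", ratios[:2])
print("linear readout accuracy, linear feature:", acc_linear)
print("linear readout accuracy, XOR feature:", acc_xor)
```

Under these assumed settings, the first principal component should capture almost all of the variance even though both features remain readable from the representation, so an analysis that keeps only the top component would discard the XOR-like feature entirely. Changing `weak` or `noise` changes how extreme the effect is.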