Understanding multisensory perception requires models that operate directly on the sensory stream. The Multisensory Correlation Detector (MCD) framework provides the first stimulus-computable, biologically plausible account of audiovisual integration, predicting behaviour from raw video and audio with high accuracy. Building on early evidence that humans integrate cues only when their temporal structures are correlated, successive studies established the MCD’s mechanistic validity, neural plausibility, and dependence on unimodal transients. In this article, I introduce the current population model and outline the key steps in its development, providing a concise overview of how the MCD framework has evolved into a general theory of multisensory integration.
Cesare V. Parise studied this question.