Reconstructing visual perception from brain signals has emerged as a promising research topic. Electrocorticography (ECoG) is a high-quality intracranial signal with good spatiotemporal resolution, which offers new opportunities for this task. However, to our knowledge, no study to date has reconstructed perceived images from human ECoG signals. We present pioneering work on this problem and develop a novel pipeline that integrates Talairach coordinate alignment masked autoencoders (TA-MAE) with denoising diffusion probabilistic models (DDPM). Our approach exploits the spatiotemporal dynamics of human ECoG signals, enabling the restoration of details at high resolution. Experiments show that our method outperforms current state-of-the-art methods in terms of appearance, structure, signal-to-noise ratio, and semantic consistency. Additionally, our study indicates that unsupervised learning-based signal reconstruction outperforms feature recognition guided by manually annotated labels in capturing the low-dimensional representation of brain signals, potentially facilitating the exploration of the intrinsic mechanisms of vision. These results highlight the advantages of unsupervised decoding and provide a generalisable framework for human ECoG-based visual reconstruction.
Deng et al. studied this question.