Motivation: Extracting features from brain regions is crucial for effective brain analysis. Recently, fMRI foundational models have introduced brain-region embeddings learned by pre-training on large unlabeled datasets. However, many depend on a specific ROI parcellation, which can lead to information loss. Goal(s): We propose a voxel-based fMRI model that captures spatio-temporal dependencies and extracts features adaptable to various datasets. Approach: We use self-supervised learning to train an encoder on large-scale unlabeled data, then validate its performance on labeled data to demonstrate its superiority. Results: Experimental results confirm that our model generates high-quality feature representations of fMRI data. Impact: This project introduces a foundational model with voxel-level inputs and spatio-temporal attention, improving the accuracy and generalization of fMRI representations and offering insights into brain networks.
Wang et al. (Tue,) studied this question.