We present two complementary extensions to Brain-JEPA, a self-supervised learning framework based on the I-JEPA architecture [1] and originally designed for resting-state functional MRI (rs-fMRI) [2]. Our contributions are: (1) multi-scale temporal masking, which replaces the original single-scale patch masking strategy with a hierarchical scheme operating simultaneously at short, medium, and long temporal horizons, and (2) a VICReg covariance regularization term [3], which penalizes off-diagonal correlations and variance deviations in the embedding space to avoid representational collapse. The regularizer is theoretically motivated by LeJEPA [4], which establishes isotropic Gaussian embeddings as optimal for downstream task performance. Both modifications are evaluated on the UCLA Consortium for Neuropsychiatric Phenomics dataset [5] (ds000030, N=261) via a linear probe predicting biological sex from frozen encoder representations. VICReg alone raises AUC from 0.542 ± 0.097 (baseline) to 0.556 ± 0.053, multi-scale masking alone achieves 0.543 ± 0.094, and their combination yields 0.567 ± 0.068, the highest AUC across conditions. We discuss the theoretical motivations, implementation details, and limitations of this preliminary study, which provides a methodological proof-of-concept for integrating distributional regularization and multi-scale masking into the Brain-JEPA framework.
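To make the regularization term concrete, the following is a minimal NumPy sketch of the variance and covariance terms as they are commonly formulated for VICReg. It is an illustration under stated assumptions, not the implementation used in this study: the function name `vicreg_var_cov_penalty`, the target standard deviation `gamma`, and the stabilizer `eps` are hypothetical choices, and the variance term here is a one-sided hinge that only penalizes dimensions whose standard deviation falls below `gamma`.

```python
import numpy as np


def vicreg_var_cov_penalty(z, gamma=1.0, eps=1e-4):
    """Variance and covariance terms of a VICReg-style regularizer.

    z: array of shape (batch, dim) holding one batch of embeddings.
    gamma, eps: illustrative defaults (assumptions, not taken from the study).
    Returns (variance_loss, covariance_loss) as Python floats.
    """
    z = np.asarray(z, dtype=np.float64)
    n, d = z.shape

    # Center each embedding dimension over the batch.
    z = z - z.mean(axis=0, keepdims=True)

    # Variance term: hinge that pushes each dimension's std up to gamma.
    std = np.sqrt(z.var(axis=0, ddof=1) + eps)
    var_loss = np.mean(np.maximum(0.0, gamma - std))

    # Covariance term: squared off-diagonal covariances, scaled by 1/dim.
    cov = (z.T @ z) / (n - 1)
    off_diag = cov - np.diag(np.diag(cov))
    cov_loss = np.sum(off_diag ** 2) / d

    return float(var_loss), float(cov_loss)
```

In this sketch, a fully collapsed batch (all embeddings identical) gives a large variance loss and zero covariance loss, while a batch whose dimensions are copies of one another gives a nonzero covariance loss. In training, the two terms would be weighted and added to the predictive JEPA objective; the weights are not specified here.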
LOUAN BARDOU (Mon,) studied this question.