With the rapid growth of Earth-observation datasets, geospatial foundation models (FMs) offer a scalable way to learn features that transfer across diverse satellite sensors. However, their ability to adapt across sensors remains underexplored. We present a benchmarking study of SatVision-TOA, an FM pre-trained on over 20 years of MODIS data, adapted to the NOAA GOES ABI sensor for four downstream cloud-property tasks: cloud mask and cloud phase (segmentation), and cloud optical depth (COD) and cloud particle size (CPS) (regression). We propose a multi-task fine-tuning pipeline with a U-Net-based decoder and a lightweight preprocessor that handles the band mismatch (14 MODIS bands in pre-training vs. 16 ABI bands in fine-tuning). To evaluate the pipeline, we benchmark fine-tuned models against from-scratch baselines, compare full fine-tuning (FFT) with parameter-efficient fine-tuning (PEFT) methods (LoRA, VPT), and compare 14-band with 16-band inputs. Our experiments show that multi-task learning improves efficiency and predictive quality in both fine-tuned and from-scratch settings. For the other four comparisons (fine-tuned vs. from-scratch, FFT vs. PEFT, 14-band vs. 16-band inputs, and choice of loss function), the results are mixed: no single setup performs best across all segmentation and regression tasks.
Murphy et al. (Sat,) studied this question.