What question did this study set out to answer?

This research aims to assess the adaptability of geospatial foundation models for retrieving cloud properties across different satellite sensors.

May 2, 2026 | Open Access

Deep Learning Approaches for Cloud Property Retrieval: Comparing Foundation Model Fine-Tuning with Training From Scratch

Key Points

Assessed how well geospatial foundation models adapt to cloud property retrieval across different satellite sensors.
Benchmarked the SatVision-TOA model fine-tuned on GOES NOAA ABI sensor data.
Used a U-Net-based decoder and a lightweight preprocessor for multi-task learning.
Compared full fine-tuning with parameter-efficient fine-tuning approaches.
Multi-task learning enhanced efficiency and prediction quality in both fine-tuned and from-scratch models.
Comparisons between fine-tuning methods and between input band sets gave mixed results, with no single setup superior for all tasks.
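
To make the full versus parameter-efficient fine-tuning comparison concrete, the sketch below counts trainable weights for one linear layer under full fine-tuning and under LoRA, where the frozen weight matrix receives a learned low-rank update B @ A. The layer sizes and rank are hypothetical values chosen for illustration; they are not taken from SatVision-TOA or from the paper's configuration.

```python
def full_finetune_params(d_in: int, d_out: int) -> int:
    """Trainable weights when a (d_out x d_in) linear layer is fully fine-tuned.

    Bias terms are ignored to keep the comparison simple.
    """
    return d_in * d_out


def lora_params(d_in: int, d_out: int, rank: int) -> int:
    """Trainable weights when the layer is frozen and a rank-r update B @ A is learned.

    A has shape (rank, d_in) and B has shape (d_out, rank).
    """
    return rank * (d_in + d_out)


# Hypothetical sizes, for illustration only (not the paper's settings).
d_in, d_out, rank = 1024, 1024, 8

full = full_finetune_params(d_in, d_out)
lora = lora_params(d_in, d_out, rank)

print(f"full fine-tuning: {full} trainable weights")
print(f"LoRA (rank {rank}): {lora} trainable weights")
print(f"fraction trained with LoRA: {lora / full:.4f}")
```

Under these assumed sizes, LoRA trains under 2% of the layer's weights, which is the kind of saving that motivates comparing PEFT against full fine-tuning.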

Abstract

With the rapid growth of Earth-observation datasets, geospatial foundation models (FMs) offer a scalable way to learn features that transfer across diverse satellite sensor data. However, their ability to adapt across sensors remains underexplored. To study this, we benchmark SatVision-TOA, an FM pre-trained on over 20 years of MODIS data, when adapted to the GOES NOAA ABI sensor for four downstream cloud properties: cloud mask and cloud phase (segmentation tasks), and cloud optical depth (COD) and cloud particle size (CPS) (regression tasks). We propose a multi-task fine-tuning pipeline with a U-Net-based decoder and a lightweight preprocessor that handles the band mismatch (14 MODIS bands for pre-training vs. 16 ABI bands for fine-tuning). To evaluate the pipeline, we benchmark fine-tuned models against from-scratch baselines, compare full fine-tuning (FFT) with parameter-efficient fine-tuning (PEFT) methods (LoRA, VPT), and compare 14-band with 16-band inputs. Our experiments show that multi-task learning improves efficiency and predictive quality in both fine-tuned and from-scratch settings. For the other four comparisons (fine-tuning vs. from-scratch, FFT vs. PEFT, 14 bands vs. 16 bands, and choice of loss function), the results are mixed: no single setup performs best across all segmentation and regression tasks.
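
As a rough illustration of two ideas in the abstract, the sketch below shows (1) a per-pixel linear map from 16 input bands to 14 encoder bands, which is one plausible form a lightweight band preprocessor could take, and (2) a multi-task loss that sums segmentation and regression terms. The abstract does not specify either design, so the linear projection, the band-selecting initial weights, the choice of cross-entropy and squared error, and the task weights are all assumptions, and every function name here is hypothetical.

```python
import math
from typing import List, Sequence


def project_bands(pixel: Sequence[float],
                  weights: Sequence[Sequence[float]]) -> List[float]:
    """Map one pixel's input bands to the encoder's band count.

    This is a per-pixel linear map (equivalent to a 1x1 convolution).
    `weights` has one row per output band and one column per input band.
    """
    return [sum(w * x for w, x in zip(row, pixel)) for row in weights]


def cross_entropy(logits: Sequence[float], target: int) -> float:
    """Softmax cross-entropy for one pixel, using the log-sum-exp trick."""
    m = max(logits)
    log_z = m + math.log(sum(math.exp(v - m) for v in logits))
    return log_z - logits[target]


def squared_error(pred: float, target: float) -> float:
    """Squared error for one regression target."""
    return (pred - target) ** 2


def multitask_loss(seg_losses: Sequence[float],
                   reg_losses: Sequence[float],
                   w_seg: float = 1.0,
                   w_reg: float = 1.0) -> float:
    """Weighted sum of per-task losses (the weights are assumed, not from the paper)."""
    return w_seg * sum(seg_losses) + w_reg * sum(reg_losses)


# Assumed initial weights: pass the first 14 of 16 bands through unchanged.
N_IN, N_OUT = 16, 14
weights = [[1.0 if i == j else 0.0 for j in range(N_IN)] for i in range(N_OUT)]

abi_pixel = [float(b) for b in range(N_IN)]       # made-up 16-band pixel
encoder_input = project_bands(abi_pixel, weights)  # 14 values for the encoder

# Made-up decoder outputs for one pixel: mask logits, phase logits, COD, CPS.
seg = [cross_entropy([2.0, 0.5], 0), cross_entropy([0.1, 1.2, 0.3], 1)]
reg = [squared_error(3.0, 2.5), squared_error(10.0, 12.0)]
total = multitask_loss(seg, reg)

print(len(encoder_input), round(total, 4))
```

In a trainable version the projection weights would be learned along with the decoder, so the two extra ABI bands could contribute instead of being dropped as they are by this initialization.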
