What type of study is this?

This is a quantitative study.

September 18, 2025

Pre-Training Hyperspectral Image Encoder via Synthetic Data

Key Points

Using synthetic fractal-based hyperspectral images improved model performance on semantic segmentation tasks.
Comparing pre-training on synthetic data with pre-training on real RGB or hyperspectral datasets shows how variability in the pre-training data affects model performance.
The study utilizes multiple datasets to analyze the relationship between dataset type and model performance.
Findings suggest transformer-based models can adapt to synthetic data, which may mitigate dataset limitations.

Abstract

Computer vision is being revolutionized by the use of transformer-based machine learning architectures. However, these models need large datasets to enable pre-training through self-supervised learning, and for hyperspectral imagery there is a lack of open-source datasets of the same magnitude as those available for standard RGB color images. This work compares pre-training on randomly generated fractal-based hyperspectral images with pre-training on real data, to understand how the pre-training dataset affects the performance of a Swin image encoder during supervised training on a hyperspectral semantic segmentation dataset. Two real datasets are compared against the synthetic dataset, one RGB-based and one hyperspectral-based, to understand how variability in spectral resolution during pre-training affects model performance on semantic segmentation.
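To make the idea of a randomly generated fractal-based hyperspectral image concrete, here is a minimal Python sketch. The abstract does not describe the generation procedure, so everything below is an assumption chosen for illustration: the fractal is drawn with a random iterated function system (the "chaos game"), and the hypothetical helpers `random_ifs`, `render_fractal`, and `synthetic_cube`, the Gaussian spectral signature, and all sizes and parameter ranges are illustrative rather than the authors' method.

```python
import numpy as np


def random_ifs(rng, n_maps=4):
    """Sample n_maps random contractive affine maps (A, b). Illustrative only."""
    maps = []
    for _ in range(n_maps):
        A = rng.uniform(-1.0, 1.0, size=(2, 2))
        # Rescale so the largest singular value is below 1 (contractive map).
        A *= rng.uniform(0.3, 0.8) / np.linalg.norm(A, 2)
        b = rng.uniform(-1.0, 1.0, size=2)
        maps.append((A, b))
    return maps


def render_fractal(maps, rng, size=64, n_points=20000):
    """Chaos game: apply randomly chosen maps and accumulate a density image."""
    pts = np.empty((n_points, 2))
    p = np.zeros(2)
    for i in range(n_points):
        A, b = maps[rng.integers(len(maps))]
        p = A @ p + b
        pts[i] = p
    pts = pts[100:]  # discard burn-in points before the orbit settles
    lo, hi = pts.min(axis=0), pts.max(axis=0)
    idx = ((pts - lo) / (hi - lo + 1e-12) * (size - 1)).astype(int)
    img = np.zeros((size, size))
    np.add.at(img, (idx[:, 1], idx[:, 0]), 1.0)  # count hits per pixel
    return img / img.max()


def synthetic_cube(seed=0, size=64, n_bands=32):
    """Build a (size, size, n_bands) cube from one fractal and one spectrum."""
    rng = np.random.default_rng(seed)
    density = render_fractal(random_ifs(rng), rng, size=size)
    # Assumed spectral signature: a Gaussian bump over the band index.
    bands = np.arange(n_bands)
    center = rng.uniform(0, n_bands)
    width = rng.uniform(2, n_bands / 2)
    spectrum = np.exp(-0.5 * ((bands - center) / width) ** 2)
    # Outer product: every pixel carries the same spectrum, scaled by density.
    return density[:, :, None] * spectrum[None, None, :]


cube = synthetic_cube()
print(cube.shape)
```

In this sketch each image contains a single spectral signature. A more realistic generator would probably mix several fractals with different signatures per image, but that goes beyond what the abstract states.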
