Machine learning relies on the shape of data for tasks such as classification, where supervised methods map high-dimensional inputs to lower-dimensional label spaces. In practice, however, data limitations pose challenges: additional training data may not be available, the quality of the data may be unknown, and the impact of using simulated data may be uncertain. This study examines the shape of simulated and real Synthetic Aperture Radar (SAR) image embeddings through topological data analysis. We demonstrate that meta-characteristics of SAR data, such as azimuth angle, intrinsically create structure among images. We compare topological structures from simulated and real SAR data using UMAP and Mapper diagrams. By examining particular pixels in SAR images, we find a link between general features of the images and the separability of images from different classes. Our findings reveal that simulated backgrounds in training data can negatively impact classification accuracy on real data, suggesting that classifiers trained on SAR data are influenced by image characteristics such as the background.
Bauer et al. studied this question.