While machine learning models generally benefit from more training data, aggregating data, in particular medical data, can be difficult due to data privacy regulations. In response, federated learning has emerged in recent years as a way to train models on distributed data without exchanging sensitive information between locations. However, federated learning demands large computational resources, especially in AI-based image analysis. Our empirical study demonstrates that using the embeddings of foundation models can reduce the trainable model size by more than 90%, and numerical estimations show that this in turn reduces the communication overhead during model training. This could enable institutions with limited compute resources to participate in federated model training and ensure the inclusion of more diverse data sources, potentially leading to more robust model performance. To additionally enhance data privacy during model transfer, secure multiparty computation was integrated. Furthermore, we propose a method that uses the HSV color space to assess inter-cohort imaging biases. With our work, we demonstrate that federated learning remains relevant in the era of general-purpose foundation models and that using foundation-model image embeddings during preprocessing reduces the model size during federated training.
Lohmann et al. (Sun,) studied this question.
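The link between trainable model size and communication overhead stated in the abstract can be illustrated with a back-of-the-envelope estimate. The sketch below is only an illustration under stated assumptions: each client downloads and uploads the full set of trainable parameters once per round, parameters are stored as 32-bit floats, and the parameter counts, client count, and round count are hypothetical values that are not taken from the study. The function name `communication_bytes` is likewise introduced here for illustration.

```python
def communication_bytes(num_params, num_clients, num_rounds, bytes_per_param=4):
    """Estimate total bytes exchanged during federated training.

    Assumes every client downloads and uploads all trainable parameters
    once per round (hence the factor 2) and that parameters are float32.
    """
    return 2 * num_params * bytes_per_param * num_clients * num_rounds


# Hypothetical parameter counts, NOT taken from the study:
full_model_params = 25_000_000  # end-to-end image model trained federatedly
head_params = 2_000_000         # small model trained on precomputed embeddings

# Hypothetical federation setup: 3 clients, 100 communication rounds.
full = communication_bytes(full_model_params, num_clients=3, num_rounds=100)
head = communication_bytes(head_params, num_clients=3, num_rounds=100)

# Relative saving in transferred data when only the small model is exchanged.
reduction = 1 - head / full

print(f"full model:  {full / 1e9:.1f} GB")
print(f"small model: {head / 1e9:.1f} GB")
print(f"reduction:   {reduction:.0%}")
```

Under these assumptions the transferred volume scales linearly with the number of trainable parameters, so a reduction in trainable model size translates into the same relative reduction in communication volume. Any overhead added by secure multiparty computation is not modeled here.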