With growing concerns about privacy and data regulation, federated learning (FL) has emerged as a way to train machine learning models collaboratively on data that multiple clients cannot exchange. Because data stays local, it is usually not independently and identically distributed (non-IID) across clients, and this non-IID property has long been the key challenge in FL. Furthermore, in real-world cross-silo scenarios, clients are commonly organizations that hold private data from multiple domains internally, which exacerbates the non-IID issue. For example, in healthcare applications, each client (hospital) gathers data from patients with heterogeneous demographics. Previous works address the non-IID challenge across clients by assuming various relations among client-level data distributions and by enabling personalized models at the client level, but they either ignore the data heterogeneity within each client or require explicit data domain indicators, which are rarely available in real-world data. Here, we propose SL-PFL to bridge this gap. SL-PFL incorporates prototypical learning into the FL framework and provides a fine-grained personalized model for each data sample instead of learning one uniform model for all samples of each client. Moreover, it can be trained on data without ground-truth domain indicators. Experimental results demonstrate that our proposed method with sample-level personalized models outperforms existing FL methods that use a global model or client-level personalized models on various real-world regression and classification tasks from weather, computer vision, and healthcare applications.
Meng et al. studied this question.