What question did this study set out to answer?

The study aims to improve federated learning performance by quantifying the importance of client data in heterogeneous environments.

March 4, 2026 | Open Access

Federated Learning Method Based on Data Distribution Heterogeneity Grading and Marginal Contribution Calculation

Key Points

The study aims to improve federated learning performance by quantifying the importance of client data in heterogeneous environments.
Proposed a federated learning framework based on data heterogeneity grading.
Graded and quantified the differences in client data distributions.
Developed a dynamic weighted aggregation mechanism combining marginal contributions and data importance.
Conducted multi-dataset comparative experiments under non-IID and noisy-label conditions.
Achieved consistent increases in model accuracy during training.
Demonstrated improved convergence rates in heterogeneous data environments.
Successfully reduced the computational complexity of Shapley value calculations.

Abstract

As federated learning scales up in distributed scenarios, training instability and performance degradation caused by data quality issues, such as statistical heterogeneity and noise, have become major bottlenecks for practical deployment. Existing aggregation algorithms have been shown not to account adequately for differences in data importance, which can exacerbate client selection bias and incentive misalignment. As a result, global convergence can slow down and performance can deteriorate. To address this issue, this paper proposes a robust federated learning framework based on data heterogeneity grading and marginal contribution calculation. The objective is to enhance the overall performance of federated learning systems in heterogeneous environments by quantifying data importance. The framework first grades and quantifies the heterogeneity of client data distributions, which characterizes data importance precisely and shrinks the computational space for Shapley value calculations, mitigating their exponential complexity. It then integrates client marginal contributions with data distribution heterogeneity to establish a dynamic weighted aggregation mechanism that balances fairness, robustness, and differentiated data quality requirements. Multi-dataset comparative experiments demonstrate that the proposed method achieves consistent gains in model accuracy and convergence under non-IID splits and noisy-label settings.
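
The pipeline described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not give the grading rule, the utility function, or the weighting formula, so everything below is an assumption made for illustration. Clients are graded by the total variation distance between their label distribution and the global one (hypothetical thresholds), exact Shapley values are computed over the grades rather than over individual clients (so the number of coalitions grows with the number of grades, not the number of clients), and each client's aggregation weight is its grade's Shapley value split evenly among that grade's members. The function names and the toy additive utility are likewise hypothetical; in practice the utility would be something like the validation accuracy of a model aggregated from the chosen grades.

```python
from itertools import combinations
from math import factorial


def total_variation(p, q):
    """Total variation distance between two discrete distributions."""
    return 0.5 * sum(abs(a - b) for a, b in zip(p, q))


def grade_clients(label_dists, global_dist, thresholds=(0.2, 0.45)):
    """Grade each client by distance from the global label distribution.

    Grade 0 is closest to the global distribution. The distance measure
    and thresholds are illustrative assumptions, not taken from the paper.
    """
    grades = {}
    for cid, dist in label_dists.items():
        d = total_variation(dist, global_dist)
        grades[cid] = sum(d > t for t in thresholds)
    return grades


def shapley_values(players, utility):
    """Exact Shapley values; cost is exponential in len(players)."""
    n = len(players)
    values = {}
    for p in players:
        others = [q for q in players if q != p]
        total = 0.0
        for k in range(n):
            # Standard Shapley weight for coalitions of size k.
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for subset in combinations(others, k):
                s = frozenset(subset)
                total += weight * (utility(s | {p}) - utility(s))
        values[p] = total
    return values


def aggregation_weights(grades, grade_values):
    """Split each grade's (non-negative) value evenly among its clients."""
    counts = {}
    for g in grades.values():
        counts[g] = counts.get(g, 0) + 1
    raw = {c: max(grade_values[g], 0.0) / counts[g] for c, g in grades.items()}
    total = sum(raw.values())
    if total == 0:
        return {c: 1.0 / len(raw) for c in raw}  # fall back to uniform
    return {c: w / total for c, w in raw.items()}


# Toy data: four clients, two classes (all numbers are made up).
label_dists = {"a": [0.5, 0.5], "b": [0.6, 0.4], "c": [0.9, 0.1], "d": [1.0, 0.0]}
global_dist = [0.5, 0.5]
grades = grade_clients(label_dists, global_dist)

# Toy additive utility standing in for a real validation metric.
quality = {0: 0.6, 1: 0.3, 2: 0.1}
def utility(coalition):
    return sum(quality[g] for g in coalition)

values = shapley_values(sorted(set(grades.values())), utility)
weights = aggregation_weights(grades, values)
print(grades, values, weights)
```

With four clients and three grades, the exact computation enumerates coalitions of three players instead of four; the saving becomes significant as the client count grows while the number of grades stays fixed. The resulting weights would then scale each client's model update in the server-side weighted average.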

Authors

Jianhua Liu

Weiqing Zhang

Yanglin Zeng

Journals

Applied Sciences

Institutions

Hunan University of Technology
