Gradient Boosted Decision Trees (GBDTs) are popular for their strong predictive performance. However, in domains such as finance and healthcare, data are often distributed across organizations, and privacy concerns make collaborative model training difficult. Vertical federated learning (VFL) enables such collaboration when data are split by features, but many existing methods protect only the raw data while exposing sensitive model information, such as gradients and Hessians, particularly to the label-owning party. Techniques like Homomorphic Encryption and Secret Sharing help, but they often rely on trusted or privileged parties and may still leak intermediate statistics. To address these limitations, we propose MPC-XGB, a privacy-preserving framework for training XGBoost under VFL with an honest-but-curious threat model. It uses secure three-party computation with Replicated Secret Sharing: data are distributed as shares across non-colluding servers, and all computations are performed on those shares. As a result, raw data, labels, and model statistics remain hidden during both secure training and secure prediction. Experiments show that MPC-XGB achieves strong performance (0.93 accuracy, 0.82 AUC), comparable to that of existing methods, with improved privacy guarantees.
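To make the three-party Replicated Secret Sharing idea concrete, the following is a minimal single-process sketch, not the MPC-XGB implementation. It assumes a ring of integers modulo 2^64 and integer-valued inputs (real-valued gradients would need a fixed-point encoding, which is omitted here). The names `MOD`, `share`, `reconstruct`, `add`, and `mul` are hypothetical and chosen for illustration only; the three "servers" are simulated as list entries rather than separate networked parties.

```python
import secrets

MOD = 2 ** 64  # assumed ring size; the abstract does not specify one


def share(x):
    """Split x into three additive shares; server i holds the pair (x_i, x_{i+1})."""
    x1 = secrets.randbelow(MOD)
    x2 = secrets.randbelow(MOD)
    x3 = (x - x1 - x2) % MOD
    return [(x1, x2), (x2, x3), (x3, x1)]


def reconstruct(shares):
    """Recover the secret by summing the first component held by each server."""
    return sum(pair[0] for pair in shares) % MOD


def add(a, b):
    """Addition is local: each server adds its own share pairs component-wise."""
    return [((a0 + b0) % MOD, (a1 + b1) % MOD)
            for (a0, a1), (b0, b1) in zip(a, b)]


def mul(a, b):
    """Multiplication: local cross terms, masking with a zero-sharing, then resharing."""
    # Zero-sharing: masks alpha_i = r_i - r_{i+1} sum to zero modulo MOD.
    # Here the randomness is sampled centrally purely for simulation.
    r = [secrets.randbelow(MOD) for _ in range(3)]
    alpha = [(r[i] - r[(i + 1) % 3]) % MOD for i in range(3)]
    # Server i computes x_i*y_i + x_i*y_{i+1} + x_{i+1}*y_i plus its mask.
    z = [(a0 * b0 + a0 * b1 + a1 * b0 + alpha[i]) % MOD
         for i, ((a0, a1), (b0, b1)) in enumerate(zip(a, b))]
    # Resharing: server i obtains z_{i+1} from its neighbour to restore the pair format.
    return [(z[i], z[(i + 1) % 3]) for i in range(3)]


if __name__ == "__main__":
    g = share(6)   # e.g. an integer-encoded gradient
    h = share(7)   # e.g. an integer-encoded Hessian
    print(reconstruct(add(g, h)))
    print(reconstruct(mul(g, h)))
```

In this sketch, any single server sees only two of the three additive shares, which are uniformly random on their own, so the secret stays hidden unless two servers collude. That matches the non-colluding, honest-but-curious setting described above.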
Ramay et al. (Fri,) studied this question.