Abstract

Performative prediction refers to scenarios where model predictions influence the underlying data distribution they aim to predict. A desirable property in this context is performative stability, where model predictions are already optimal for the distribution they induce, indicating converged model parameters and no need for further retraining. Achieving performative stability requires characterizing the data distribution map D(θ), i.e., the relationship between predictions and the resulting distribution shifts. Current studies typically quantify distribution differences using metrics like the W₁ distance or the χ² divergence, which may not provide isometric embeddings or maintain metric equivalence in practical scenarios, limiting their applicability across various data distribution maps. Moreover, the crucial smoothness parameter in existing work is often unobtainable in performative scenarios, constraining the real-world utility of current theoretical results and methods. To address these challenges, we develop an algorithm that learns a performatively stable model for arbitrary data distribution maps without requiring the joint smoothness parameter. Specifically, we introduce a new ε-sensitivity measure for D(θ), quantified by the gradient of the loss function, which naturally and directly characterizes how distribution shifts affect the optimization of the objective function. Based on this sensitivity, we formulate a γ-strongly convex loss function and optimize the deployed model accordingly, where γ is derived from the defined ε, eliminating the need for the β-joint smoothness assumption. Our theoretical results guarantee the convergence of the deployed model to performative stability. Extensive experiments on synthetic and real-world datasets with diverse data distribution maps demonstrate the superiority of our method over state-of-the-art techniques in two key aspects: prediction accuracy and performative stability.
Zhong et al. (Thu,) studied this question.