In many applications, when building linear regression models, it is important to account for the presence of outliers, that is, corrupted input data points. Such problems can be formulated as mixed-integer optimization problems involving cubic terms, each given by the product of a binary variable and a quadratic term of the continuous variables. Existing approaches in the literature, typically relying on the linearization of the cubic terms using big-M constraints, suffer from weak relaxation and poor performance in practice. In this work we derive stronger second-order conic relaxations that do not involve big-M constraints. Our computational experiments indicate that the proposed formulations are several orders of magnitude faster than existing big-M formulations in the literature for this problem. In addition, we verify that exact solution of the mixed-integer optimization problems can lead to substantially better-quality solutions than simpler relaxations or heuristics commonly used in practice. History: Accepted by Andrea Lodi, Area Editor for Design & Analysis of Algorithms—Discrete. Funding: This work was supported by a public grant as part of the Investissement d’avenir project Grant ANR-11-LABX-0056-LMH, LabEx LMH. A. Gómez is supported by the US Air Force Office of Scientific Research Grant FA9550-22-1-0369. Supplemental Material: The software that supports the findings of this study is available within the paper and its Supplemental Information ( https://pubsonline.informs.org/doi/suppl/10.1287/ijoc.2025.1215 ) as well as from the IJOC GitHub software repository ( https://github.com/INFORMSJoC/2025.1215 ). The complete IJOC Software and Data Repository is available at https://informsjoc.github.io/ .
Gomez et al. (Tue,) studied this question.