November 19, 2024 | Open Access

Gradient-based bilevel optimization for multi-penalty Ridge regression through matrix differential calculus

Abstract

Common regularization algorithms for linear regression, such as LASSO and Ridge regression, rely on a regularization hyperparameter that balances the trade-off between minimizing the fitting error and the norm of the learned model coefficients. As this hyperparameter is scalar, it can be easily selected via random or grid search optimizing a cross-validation criterion. However, using a scalar hyperparameter limits the algorithm’s flexibility and potential for better generalization. In this paper, we address the problem of linear regression with ℓ₂-regularization, where a different regularization hyperparameter is associated with each input variable. We optimize these hyperparameters using a gradient-based approach, wherein the gradient of a cross-validation criterion with respect to the regularization hyperparameters is computed analytically through matrix differential calculus. Additionally, we introduce two strategies tailored for sparse model learning problems aiming at reducing the risk of overfitting to the validation data. Numerical examples demonstrate that the proposed multi-hyperparameter regularization approach outperforms LASSO, Ridge, and Elastic Net regression in terms of R² score both in a static regression and in a system identification problem. Moreover, the analytical computation of the gradient proves to be more efficient in terms of computational time compared to automatic differentiation, especially when handling a large number of input variables, with an improvement of more than an order of magnitude. Application to the identification of over-parameterized Linear Parameter-Varying models is also presented.

• Enhanced flexibility and generalization through variable specific regularization.
• Efficient bilevel gradient-based optimization via matrix differential calculus.
• Overfitting mitigation with two strategies for sparse model learning.
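The idea of differentiating a validation criterion with respect to per-variable ridge penalties can be illustrated with a minimal sketch. It is not the paper's implementation: it assumes a single train/validation split with a squared-error validation loss (rather than a full cross-validation criterion), strictly positive penalties, and synthetic data, and the name `val_loss_and_grad` is hypothetical. With A = XᵀX + diag(λ) and θ = A⁻¹Xᵀy, the differential dθ = −A⁻¹ diag(dλ) θ gives the gradient −θ ⊙ (A⁻¹ X_valᵀ r), where r is the validation residual.

```python
import numpy as np


def val_loss_and_grad(X_tr, y_tr, X_val, y_val, lam):
    """Validation loss of multi-penalty ridge and its gradient w.r.t. lam.

    Assumed setting (illustrative, not taken from the paper):
      theta(lam) = argmin 0.5*||X_tr theta - y_tr||^2 + 0.5*theta^T diag(lam) theta
      loss(lam)  = 0.5*||X_val theta(lam) - y_val||^2
    """
    # Regularized normal-equation matrix; symmetric positive definite for lam > 0.
    A = X_tr.T @ X_tr + np.diag(lam)
    theta = np.linalg.solve(A, X_tr.T @ y_tr)

    # Validation residual and loss.
    r = X_val @ theta - y_val
    loss = 0.5 * float(r @ r)

    # d(theta) = -A^{-1} diag(d lam) theta, hence
    # d(loss)/d(lam_j) = -theta_j * [A^{-1} X_val^T r]_j.
    z = np.linalg.solve(A, X_val.T @ r)
    grad = -theta * z
    return loss, grad, theta


# Synthetic sparse regression problem (made-up data for illustration only).
rng = np.random.default_rng(0)
n_tr, n_val, p = 40, 20, 4
theta_true = np.array([1.5, 0.0, -2.0, 0.0])
X_tr = rng.normal(size=(n_tr, p))
X_val = rng.normal(size=(n_val, p))
y_tr = X_tr @ theta_true + 0.1 * rng.normal(size=n_tr)
y_val = X_val @ theta_true + 0.1 * rng.normal(size=n_val)

lam = np.ones(p)  # one penalty per input variable
loss, grad, theta = val_loss_and_grad(X_tr, y_tr, X_val, y_val, lam)
print("validation loss:", loss)
print("gradient w.r.t. lam:", grad)
```

The gradient needs only one extra linear solve with the same matrix A, which is one plausible reason a closed-form expression can be cheaper than differentiating through the solver automatically. In an outer loop, the gradient could drive any first-order optimizer over λ (for example on log λ to keep the penalties positive).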
