What question did this study set out to answer?

The study aims to optimize credit decisions at transaction authorization by balancing fraud risk against customer experience.

February 20, 2026 · Open Access

Offline Conservative RL for Transaction Authorization: Smartly Balancing Fraud Risk and Customer Friction

Key Points

Optimizes credit strategy at the transaction authorization layer, where each transaction is approved, reviewed, or declined, balancing fraud risk against customer experience
Uses an offline conservative reinforcement learning framework, Conservative Q-Learning (CQL)
Applies a unified multi-objective cost function covering fraud loss, operational burden from manual reviews, and customer friction from false positives and delays
Formulates the problem as a Markov Decision Process (MDP), specifying state featurization, action space, and cost weights
Evaluates on a public credit-card transaction dataset with severe class imbalance
The learned policy lowers total cost relative to cost-sensitive supervised baselines
The policy offers favorable trade-offs along a Pareto frontier between risk, operations, and customer friction
CQL mitigates out-of-distribution overestimation in the offline setting
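
To make the multi-objective cost in the key points concrete, here is a minimal sketch of a per-transaction cost over the three actions. Everything numeric is a hypothetical placeholder: the weights, the review cost, and the friction values are not taken from the paper, and the sketch assumes that a manual review always catches fraud.

```python
from dataclasses import dataclass

ACTIONS = ("approve", "review", "decline")


@dataclass(frozen=True)
class CostWeights:
    # Hypothetical weights; the paper's actual values are not given here.
    fraud: float = 1.0     # weight on fraud loss
    ops: float = 1.0       # weight on manual-review burden
    friction: float = 1.0  # weight on customer friction


def transaction_cost(action, is_fraud, amount, w=CostWeights(),
                     review_cost=2.0, review_friction=1.0,
                     decline_friction=5.0):
    """Scalar cost of taking `action` on one transaction (illustrative only)."""
    if action not in ACTIONS:
        raise ValueError(f"unknown action: {action}")

    # Fraud loss: incurred only when a fraudulent transaction is approved.
    # Assumption: review and decline both stop the fraud.
    fraud_loss = amount if (action == "approve" and is_fraud) else 0.0

    # Operational burden: every review consumes analyst time.
    ops_cost = review_cost if action == "review" else 0.0

    # Customer friction: legitimate customers who are delayed or declined.
    if is_fraud or action == "approve":
        friction = 0.0
    elif action == "review":
        friction = review_friction
    else:
        friction = decline_friction

    return w.fraud * fraud_loss + w.ops * ops_cost + w.friction * friction
```

Sweeping the three weights and re-evaluating a policy at each setting is one simple way to trace out trade-offs of the kind described along the Pareto frontier.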

Abstract

This study instantiates credit strategy optimization at the transaction authorization layer, with actions approve, review, and decline. Within an Offline Conservative RL (CQL) framework, we co-optimize fraud loss, operational burden from manual reviews, and customer friction from false positives and delays via a unified multi-objective cost function. Using a public credit-card transaction dataset with severe class imbalance, the learned policy improves total cost relative to cost-sensitive supervised baselines, while offering favorable trade-offs along a Pareto frontier between risk, operations, and friction. We detail the MDP design (state featurization, action space, and cost weights) and show that CQL mitigates out-of-distribution overestimation in offline settings. The results indicate that conservative RL is a practical path for transaction-level credit decision-making that balances fraud risk with operational efficiency and user impact.
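
The conservative term that lets CQL curb out-of-distribution overestimation can be sketched for a discrete action space such as approve, review, and decline. The code below is the standard reward-maximizing form of the CQL regularizer, not the authors' implementation; the function names and the `alpha` default are hypothetical, and a cost-minimizing formulation like the one in this study may apply the penalty with the opposite sign.

```python
import math


def logsumexp(values):
    """Numerically stable log(sum(exp(v))) over a list of floats."""
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values))


def cql_penalty(q_rows, logged_actions):
    """Mean of logsumexp_a Q(s, a) - Q(s, a_logged) over a batch.

    q_rows: one list of Q-values per state, one entry per action.
    logged_actions: index of the action actually taken in the logged data.
    The penalty pushes down Q-values overall while pushing up the Q-value
    of the logged action, which discourages optimism about unseen actions.
    """
    gaps = [logsumexp(q) - q[a] for q, a in zip(q_rows, logged_actions)]
    return sum(gaps) / len(gaps)


def cql_loss(q_rows, logged_actions, td_targets, alpha=1.0):
    """Squared TD error on logged actions plus the weighted CQL penalty."""
    td_error = sum((q[a] - y) ** 2
                   for q, a, y in zip(q_rows, logged_actions, td_targets))
    td_error /= 2 * len(td_targets)
    return td_error + alpha * cql_penalty(q_rows, logged_actions)
```

The penalty is never negative, since the log-sum-exp over all actions is at least as large as any single Q-value, and it shrinks as the Q-function concentrates its value on actions that appear in the logged data.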

Authors

Yang Ximeng

Zhang Yiming
