The insurance industry confronts two analytically critical and financially consequential challenges: accurate prediction of claim settlement amounts and timely detection of fraudulent claims. Conventional approaches (rule-based heuristics, logistic regression scorecards, and manual adjuster assessments) are demonstrably inadequate for capturing the nonlinear, high-dimensional interactions that characterise modern insurance claim data. This paper presents ClaimSmart AI, a modular, end-to-end machine learning pipeline that addresses both challenges within a unified analytical framework. The system operates on a synthetically generated dataset of 15,000 insurance claim records with 19 attributes spanning policyholder demographics, policy characteristics, vehicle parameters, claim specifics, and behavioural indicators. A dual-model architecture employs a Random Forest Regressor (150 estimators) for claim amount prediction and a Random Forest Classifier (150 estimators, balanced class weights) for binary fraud risk detection, both trained on a stratified 80/20 holdout split with StandardScaler feature normalisation and LabelEncoder categorical transformation. The regression model achieves a Mean Absolute Error below INR 15,000 and a coefficient of determination (R-squared) exceeding 0.70, while the classification model delivers accuracy above 0.80, fraud-class recall exceeding 0.74, and an F1-score above 0.76, surpassing logistic regression and rule-based baselines under equivalent evaluation protocols. Prediction outputs are enriched with four derived business metrics (predicted claim amount, claim variance, fraud risk probability, and a three-tier fraud risk category) and persisted to a MySQL relational database for direct consumption by Power BI and enterprise analytics platforms. Eight publication-quality visualisation charts provide analytical coverage ranging from fraud distribution and regional heatmaps to actual-versus-predicted scatter analysis. A mysqldump-format SQL export module ensures enterprise portability and regulatory archival compliance. The complete pipeline executes through a single orchestration script, establishing ClaimSmart AI as both a rigorous academic contribution and a practical template for production insurance analytics deployment.
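The enrichment step that produces the four derived business metrics could look roughly like the sketch below. The abstract does not define claim variance or the tier boundaries, so the sketch assumes that claim variance is the actual amount minus the predicted amount and that the three tiers are split at the hypothetical cut-offs 0.30 and 0.70. The names `EnrichedClaim`, `risk_category`, and `enrich` are illustrative and do not come from the paper.

```python
from dataclasses import dataclass

# Hypothetical cut-offs: the abstract names three tiers but not their boundaries.
LOW_MAX = 0.30
MEDIUM_MAX = 0.70


@dataclass
class EnrichedClaim:
    """The four derived metrics attached to each scored claim."""
    predicted_claim_amount: float
    claim_variance: float
    fraud_risk_probability: float
    fraud_risk_category: str


def risk_category(probability: float) -> str:
    """Map a fraud probability in [0, 1] to one of three tiers."""
    if not 0.0 <= probability <= 1.0:
        raise ValueError("probability must lie in [0, 1]")
    if probability < LOW_MAX:
        return "Low"
    if probability < MEDIUM_MAX:
        return "Medium"
    return "High"


def enrich(actual_amount: float, predicted_amount: float,
           fraud_probability: float) -> EnrichedClaim:
    """Combine regressor and classifier outputs into one record.

    Assumption: claim variance = actual amount - predicted amount.
    """
    return EnrichedClaim(
        predicted_claim_amount=round(predicted_amount, 2),
        claim_variance=round(actual_amount - predicted_amount, 2),
        fraud_risk_probability=round(fraud_probability, 4),
        fraud_risk_category=risk_category(fraud_probability),
    )
```

Under these assumptions, `enrich(120000.0, 112500.0, 0.82)` gives a claim variance of 7500.0 and the category "High". A record of this shape maps naturally onto one row of a relational table, which is one plausible reason to compute the tier before persistence rather than in the reporting layer.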
C. Nawaz Basha
S. Usharani
Basha et al. (Thu,) studied this question.
synapsesocial.com/papers/69ec59c688ba6daa22dab775 — DOI: https://doi.org/10.64672/ijifr/26.04.13.08.048