Abstract

Background
Type III hyperlipoproteinemia (dysbetalipoproteinemia) is a rare lipoprotein disorder that increases the risk of coronary heart disease and peripheral vascular disease. It often goes undiagnosed because it is difficult to identify with a standard lipid panel. A definitive diagnosis usually requires genotyping, so a rule-in test that identifies candidates for genotyping would be useful. We investigate whether the ratio of very low-density lipoprotein cholesterol (esVLDL-C) to apolipoprotein B (ApoB) concentration could serve as such a rule-in test. We also evaluate whether adding other easily obtainable biomarkers could improve this prediction.

Methods
We assessed 413,998 patients from the UK Biobank, of whom 909 were classified as Type III, defined as having an E2/E2 genotype (determined from exome and genome data) and presenting with mixed dyslipidemia (total cholesterol ≥200 mg/dL and triglycerides ≥175 mg/dL). The data were randomly split, with 80% reserved for model training and 20% for testing. VLDL-C was calculated using the enhanced Sampson (es) equation, which uses ApoB as an independent variable. A logistic regression model was fit to the esVLDL-C/ApoB ratio and assessed for its performance. A ROC curve was produced for the model, and the optimal threshold was identified using the Youden index. To further enhance the model, additional biomarkers were assessed: age, BMI, sex, ApoB, esVLDL-C, non-HDL-C, type 2 diabetes, HDL-C, total cholesterol, and triglycerides. Using recursive feature elimination (RFE) with 10-fold cross-validation, the optimal features were selected based on balanced accuracy. With the resulting biomarkers, a new logistic regression model was built with balanced class weights, so that the loss function was adjusted for the imbalance between positive and negative cases.

Results
The logistic model of esVLDL-C/ApoB had an area under the curve (AUC) of 0.99.
The performance characteristics were 97.6% sensitivity, 95.3% specificity, 3.2% PPV, and 99.60% NPV for the identification of those with Type III. In the RFE, there was less than a 1% difference in accuracy between the model using 5 biomarkers and the model using all 11 biomarkers. However, when only 4 biomarkers were used, there was a noticeable drop in accuracy. Therefore, 5 biomarkers (esVLDL-C/ApoB, sex, non-HDL-C, ApoB, and diabetes diagnosis) were selected for training the final model. The resulting model had an AUC of 1.00 with 98.94%/98.90% sensitivity, 99.03%/98.97% specificity, 14.01%/12.82% PPV, and 99.998%/99.998% NPV in the training/testing sets for detecting Type III.

Conclusion
We have described two models: a simple model of esVLDL-C/ApoB with 95% accuracy, and a second model with 99% accuracy that harnesses additional readily obtainable biomarkers and clinical features. Although the PPV of the second model is only 12.82%, mostly due to the rarity of the disease, it is considerably higher than that of previous models. The new algorithm could help identify individuals who would benefit most from APOE genotyping to confirm a Type III diagnosis, enabling earlier clinical intervention and potentially reducing the cardiovascular disease risk associated with the disorder.
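The threshold selection mentioned in the Methods can be illustrated with a minimal sketch. It assumes the usual definition of the Youden index, J = sensitivity + specificity - 1, maximized over candidate cutoffs. The function name `youden_threshold` and the scores below are hypothetical and made up for illustration; they are not study data, and this is not the authors' code.

```python
def youden_threshold(scores, labels):
    """Return (threshold, J) maximizing J = sensitivity + specificity - 1.

    A sample is called positive when its score is >= threshold.
    """
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("need at least one positive and one negative label")

    best_t, best_j = None, -1.0
    # Every distinct observed score is a candidate cutoff.
    for t in sorted(set(scores)):
        tp = sum(1 for s, y in zip(scores, labels) if y == 1 and s >= t)
        tn = sum(1 for s, y in zip(scores, labels) if y == 0 and s < t)
        j = tp / n_pos + tn / n_neg - 1.0
        if j > best_j:
            best_t, best_j = t, j
    return best_t, best_j


# Made-up scores (e.g. a ratio or a predicted probability), illustration only.
scores = [0.10, 0.12, 0.15, 0.18, 0.20, 0.22, 0.30, 0.35, 0.40]
is_type3 = [0, 0, 0, 0, 0, 1, 0, 1, 1]

threshold, j = youden_threshold(scores, is_type3)
print(f"threshold={threshold}, J={j:.3f}")
```

Because a single-feature logistic model is monotone in its input, a cutoff chosen on the predicted probability corresponds to a cutoff on the underlying ratio, which is why the sketch works on generic scores.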
Auger et al. (Wed,) studied this question.
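The class weighting described in the Methods can be sketched as follows. This assumes the common "balanced" heuristic, in which each class receives the weight n_samples / (n_classes * n_samples_in_class); the abstract does not state the exact formula or library used. The helper name `balanced_class_weights` is hypothetical, and the counts are the cohort-wide figures from the abstract, since the counts in the training split are not reported.

```python
def balanced_class_weights(counts):
    """Map each class to n_samples / (n_classes * n_samples_in_class)."""
    n_samples = sum(counts.values())
    n_classes = len(counts)
    return {label: n_samples / (n_classes * n) for label, n in counts.items()}


# Cohort-wide counts from the abstract (assumption: used here only for
# illustration; the 80% training split would have its own counts).
counts = {"type_iii": 909, "other": 413_998 - 909}

weights = balanced_class_weights(counts)
for label, w in weights.items():
    print(f"{label}: weight {w:.3f}")
```

Under this assumption a Type III case is weighted at roughly 228 and a non-case at roughly 0.5, so both classes contribute equally to the loss in total despite the rarity of the disorder.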