Performance of the Various Machine Learning Models in the Validation Set Using All 38 Variables*
Model | Training AUC (95% CI) | Validation AUC (95% CI) | Validation Sensitivity | Validation Specificity | Validation PPV | Validation NPV | Validation Overall Accuracy | Validation Savings | Test AUC (95% CI) | Test Sensitivity | Test Specificity | Test PPV | Test NPV | Test Overall Accuracy | Test Savings |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
Universal Screening (No rule) | — | — | 1.0 | 1.0 | 0.26 | 0.74 | 1.0 | 0% | — | 1.0 | 1.0 | 0.26 | 0.74 | 1.0 | 0% |
Random Forest | 0.85 (0.84–0.86) | 0.80 (0.79–0.81) | 0.45 | 0.90 | 0.58 | 0.82 | 0.79 | 85% | 0.78 (0.77–0.79) | 0.50 | 0.88 | 0.55 | 0.83 | 0.76 | 75% |
Support Vector Machines | 0.81 (0.80–0.82) | 0.77 (0.76–0.78) | 0.34 | 0.89 | 0.50 | 0.79 | 0.74 | 82% | — | — | — | — | — | — | — |
Neural Networks | 0.79 (0.78–0.80) | 0.78 (0.77–0.78) | 0.36 | 0.90 | 0.58 | 0.80 | 0.76 | 82% | — | — | — | — | — | — | — |
K-nearest Neighbors | 0.78 (0.78–0.79) | 0.75 (0.74–0.76) | 0.35 | 0.84 | 0.45 | 0.78 | 0.71 | 79% | — | — | — | — | — | — | — |
Decision Trees | 0.77 (0.76–0.78) | 0.75 (0.73–0.76) | 0.34 | 0.90 | 0.56 | 0.79 | 0.75 | 83% | — | — | — | — | — | — | — |
Logistic Regression | 0.76 (0.75–0.77) | 0.71 (0.70–0.73) | 0.48 | 0.85 | 0.55 | 0.81 | 0.74 | 76% | — | — | — | — | — | — | — |
* Sensitivity, specificity, PPV, NPV, Overall Accuracy, and Savings are all calculated at the selected optimum operating point in each case.
PPV, positive predictive value; NPV, negative predictive value; AUC, area under the receiver operating characteristic curve; CI, confidence interval.
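
The threshold-dependent metrics named in the footnote all follow from the confusion matrix at the chosen operating point. The sketch below assumes the standard textbook definitions. The function name `operating_point_metrics` and the counts are hypothetical and are not taken from the study. Savings is left out because the table does not state how it is computed.

```python
def operating_point_metrics(tp, fp, tn, fn):
    """Compute threshold-dependent metrics from confusion-matrix counts.

    tp, fp, tn, fn are the true-positive, false-positive, true-negative,
    and false-negative counts at one fixed decision threshold.
    """
    total = tp + fp + tn + fn
    return {
        "sensitivity": tp / (tp + fn),   # share of actual positives flagged
        "specificity": tn / (tn + fp),   # share of actual negatives cleared
        "ppv": tp / (tp + fp),           # share of flagged cases that are positive
        "npv": tn / (tn + fn),           # share of cleared cases that are negative
        "accuracy": (tp + tn) / total,   # share of all cases classified correctly
    }


# Hypothetical counts for illustration only (not data from the table).
metrics = operating_point_metrics(tp=45, fp=30, tn=270, fn=55)
for name, value in metrics.items():
    print(f"{name}: {value:.2f}")
```

Unlike these five metrics, AUC summarizes performance across all thresholds, which is why the table can report it without reference to an operating point.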