| Method | R package | Description |
|---|---|---|
| Elastic net (regularization)²² | glmnet | Ridge regression shrinks coefficients for collinear independent variables toward zero but does not fully eliminate any of them. Lasso regression shrinks such coefficients all the way to zero, eliminating their contributions to the predicted probability. Elastic net mixes the two penalties, so coefficients are shrunk toward zero (retaining a contribution to the predicted probability) and/or exactly to zero (eliminating that contribution); the mixing parameter (alpha) is set between 0.01 and 0.99. A minimal fitting sketch follows this table. |
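As a minimal sketch (not the article's analysis code), an elastic net logistic fit with glmnet might look as follows; the predictor matrix x, outcome y, and alpha = 0.5 are illustrative assumptions:

```r
# Elastic net logistic regression with glmnet (illustrative sketch).
library(glmnet)

set.seed(1)
x <- matrix(rnorm(200 * 10), nrow = 200)            # simulated predictors
y <- rbinom(200, 1, plogis(x[, 1] - 0.5 * x[, 2]))  # simulated binary outcome

# alpha mixes the penalties: 0 = ridge, 1 = lasso; values between
# 0.01 and 0.99 give the elastic net behavior described in the table.
cv_fit <- cv.glmnet(x, y, family = "binomial", alpha = 0.5)

coef(cv_fit, s = "lambda.min")  # some coefficients shrunk exactly to zero
p_hat <- predict(cv_fit, newx = x, s = "lambda.min", type = "response")
```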
| Method | R package | Description |
|---|---|---|
| **Splines** | | |
| Adaptive splines²³ | earth | The final fit is built using a stepwise procedure that selects the optimal combination of basis functions. |
| Adaptive polynomial splines²⁴ | polspline | earth and polymars are generally similar but differ in the order in which basis functions (eg, linear vs nonlinear) are added to build the final model. A sketch of both fits follows this table. |
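A minimal sketch fitting both spline methods on the same simulated data; x, y, and all settings are illustrative assumptions rather than the article's configuration:

```r
# Adaptive splines (earth) and adaptive polynomial splines (polspline).
library(earth)
library(polspline)

set.seed(2)
x <- matrix(runif(300 * 3), nrow = 300)                         # simulated predictors
y <- rbinom(300, 1, plogis(2 * sin(pi * x[, 1]) + x[, 2] - 1))  # nonlinear signal

# earth adds hinge basis functions in a forward pass, then prunes stepwise.
fit_earth <- earth(x, y, glm = list(family = binomial))
p_earth <- predict(fit_earth, newdata = x, type = "response")

# polymars builds its basis functions in a different order (see table).
# The 0/1 response is fit by least squares here, so predictions only
# approximate probabilities.
fit_pm <- polymars(y, x)
p_pm <- predict(fit_pm, x = x)
```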
| Method | R package | Description |
|---|---|---|
| **Decision trees** | | |
| Random forest²⁵ | ranger | Decision tree methods capture interactions and nonlinear associations: independent variables are partitioned (based on their values), and the splits are stacked to build decision trees that are assembled into an aggregate "forest." Random forest builds numerous trees in bootstrapped samples and generates the aggregate by averaging across trees, which reduces overfit. Suitable for large data sets, although individual trees can be unstable and prone to overfitting. A sketch follows this table. |
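A minimal random forest sketch with ranger; the data frame, number of trees, and other settings are illustrative assumptions:

```r
# Probability random forest with ranger (illustrative sketch).
library(ranger)

set.seed(3)
df <- data.frame(matrix(rnorm(500 * 5), ncol = 5))    # columns X1..X5
df$y <- factor(rbinom(500, 1, plogis(df$X1 - df$X2))) # simulated binary outcome

# 500 trees grown on bootstrap samples; probability = TRUE averages
# class probabilities across trees (the aggregate "forest").
fit <- ranger(y ~ ., data = df, num.trees = 500, probability = TRUE)

p_hat <- predict(fit, data = df)$predictions[, "1"]
fit$prediction.error  # out-of-bag error (Brier score for probability forests)
```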
| Method | R package | Description |
|---|---|---|
| Gradient boosting²⁶,²⁷ | xgboost | Extreme gradient boosting decision tree algorithm. Final predictions are formed by models built sequentially (using a gradient descent algorithm to minimize loss), with each new model correcting the residual error of the existing models. A sketch follows this table. |
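A minimal gradient boosting sketch with xgboost; the learning rate, tree depth, and number of rounds are illustrative, not tuned values from the article:

```r
# Extreme gradient boosting with xgboost (illustrative sketch).
library(xgboost)

set.seed(4)
x <- matrix(rnorm(400 * 6), ncol = 6)                # simulated predictors
y <- rbinom(400, 1, plogis(x[, 1] + x[, 2]^2 - 1))   # simulated binary outcome
dtrain <- xgb.DMatrix(data = x, label = y)

# Trees are added one at a time; each new tree fits the gradient of the
# logistic loss, ie, the residual error of the current ensemble.
fit <- xgb.train(
  params = list(objective = "binary:logistic", eta = 0.1, max_depth = 3),
  data = dtrain,
  nrounds = 100
)
p_hat <- predict(fit, dtrain)  # predicted probabilities
```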
| Method | R package | Description |
|---|---|---|
| Neural networks²⁸ | nnet | Connections between predictors and the outcome are modeled as a network: predictors affect the outcome through intermediate layers, and weights are assigned to the connections. Captures interactions and nonlinear associations but has low interpretability. A sketch follows this table. |
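A minimal single-hidden-layer sketch with nnet; the hidden-layer size, weight decay, and simulated interaction are illustrative assumptions:

```r
# Feed-forward neural network with one hidden layer (nnet).
library(nnet)

set.seed(5)
df <- data.frame(matrix(rnorm(300 * 4), ncol = 4))   # columns X1..X4
df$y <- rbinom(300, 1, plogis(df$X1 * df$X2))        # interaction effect

# size = number of hidden units (the intermediate layer); decay penalizes
# large connection weights. Predictors reach the outcome only through
# the weighted hidden layer.
fit <- nnet(y ~ ., data = df, size = 5, decay = 0.01, maxit = 500, trace = FALSE)
p_hat <- predict(fit, newdata = df, type = "raw")    # fitted values in (0, 1)
```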