Appendix 3.

Brief Description of Machine Learning Algorithms Included in the SuperLearner Library

Regularization (R package: glmnet) [22]
  • Penalized regression reduces overfitting due to collinear independent variables

Elastic net
  • Ridge regression shrinks coefficients for collinear independent variables toward zero but does not fully eliminate any independent variable

  • Lasso regression shrinks coefficients for collinear independent variables all the way to zero, eliminating their contributions to the predicted probability

  • Elastic net regression mixes the two penalties, so coefficients for collinear independent variables can be shrunk toward zero (retaining a contribution to the predicted probability) and/or all the way to zero (eliminating their contributions to the predicted probability)

  • The mixing parameter (alpha), which balances the ridge and lasso penalties, is set somewhere between 0.01 and 0.99 (a minimal sketch follows this list)
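
A minimal, hypothetical sketch of an elastic net fit with glmnet follows; the simulated data, variable names, and the choice of alpha = 0.5 are illustrative assumptions, not values from the source analysis.

library(glmnet)

# Illustrative simulated data (assumption, not from the source analysis)
set.seed(1)
n <- 500
x <- matrix(rnorm(n * 10), ncol = 10)             # 10 predictors
x[, 2] <- x[, 1] + rnorm(n, sd = 0.1)             # make x1 and x2 collinear
y <- rbinom(n, 1, plogis(x[, 1] - 0.5 * x[, 3]))  # binary outcome

# alpha = 0 gives ridge, alpha = 1 gives lasso; intermediate values mix the two
fit <- cv.glmnet(x, y, family = "binomial", alpha = 0.5)
coef(fit, s = "lambda.min")                       # some coefficients shrunk exactly to 0
head(predict(fit, newx = x, s = "lambda.min", type = "response"))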

Spline

Adaptive splines (R package: earth) [23]

Adaptive polynomial splines (R package: polspline) [24]
  • Adaptive spline regression flexibly captures interactions and linear and nonlinear associations

  • Linear segments (splines) of varying slopes are connected and smoothed to create piecewise curves (basis functions)

  • Final fit is built using a stepwise procedure that selects the optimal combination of basis functions

  • earth and polymars are generally similar but differ in the order in which basis functions (eg, linear vs nonlinear) are added to build the final model (a minimal sketch of both follows this list)
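
A minimal, hypothetical sketch of both spline learners follows; the simulated data and the settings (eg, degree = 2) are illustrative assumptions, not values from the source analysis.

library(earth)
library(polspline)

# Illustrative simulated data with a nonlinear association (assumption)
set.seed(2)
n <- 500
x <- data.frame(x1 = runif(n), x2 = runif(n))
y <- sin(4 * x$x1) + x$x1 * x$x2 + rnorm(n, sd = 0.2)

# earth: the forward pass adds hinge basis functions (and their interactions
# when degree > 1); the backward pass prunes them
fit_earth <- earth(y ~ x1 + x2, data = cbind(y = y, x), degree = 2)
summary(fit_earth)                     # selected basis functions

# polymars: stepwise addition and deletion of piecewise-linear basis functions
fit_pm <- polymars(y, as.matrix(x))
head(predict(fit_pm, x = as.matrix(x)))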

Decision trees

Random forest (R package: ranger) [25]
  • Decision tree methods capture interactions and nonlinear associations

  • Independent variables are partitioned (based on their values), and the splits are stacked to build decision trees and assemble an aggregate “forest”

  • Random forest builds numerous trees in bootstrapped samples and generates an aggregate prediction by averaging across trees (reducing overfitting)

  • Suitable for large data sets but may be unstable and prone to overfitting (a minimal sketch follows this list)
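
A minimal, hypothetical sketch with ranger follows; the simulated data and num.trees = 500 are illustrative assumptions, not values from the source analysis.

library(ranger)

# Illustrative simulated data with an interaction (assumption)
set.seed(3)
df <- data.frame(x1 = rnorm(500), x2 = rnorm(500), x3 = rnorm(500))
df$y <- factor(rbinom(500, 1, plogis(df$x1 * df$x2)))

# Each tree is grown on a bootstrap sample, with a random subset of predictors
# considered at each split; class probabilities are averaged across trees
fit <- ranger(y ~ ., data = df, num.trees = 500, probability = TRUE)
head(predict(fit, data = df)$predictions)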

Gradient boosting (R package: xgboost) [26,27]
  • xgboost implements an extreme gradient boosting decision tree algorithm

  • Final predictions are formed from models built sequentially (using a gradient descent algorithm to minimize loss), with each new model fit to resolve the residual error of the existing models (a minimal sketch follows this list)
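
A minimal, hypothetical sketch with xgboost follows; the simulated data and tuning values (nrounds, max_depth, eta) are illustrative assumptions, not values from the source analysis.

library(xgboost)

# Illustrative simulated data with a nonlinear association (assumption)
set.seed(4)
x <- matrix(rnorm(500 * 5), ncol = 5)
y <- rbinom(500, 1, plogis(x[, 1] - x[, 2]^2))

# Each round fits a small tree to the gradient of the loss (the residual
# error of the current ensemble) and adds it with a shrunken weight (eta)
fit <- xgboost(data = x, label = y, objective = "binary:logistic",
               nrounds = 100, max_depth = 3, eta = 0.1, verbose = 0)
head(predict(fit, x))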

Neural networks (R package: nnet) [28]
  • Connections between predictors and the outcome are modeled as a network

  • Predictors affect the outcome through intermediate layers

  • Weights are assigned to connections

  • Neural networks capture interactions and nonlinear associations (a minimal sketch follows this list)

  • Low interpretability
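
A minimal, hypothetical sketch with nnet follows, together with a call combining all of the above learners via the SuperLearner package; the simulated data and tuning values (size, decay) are illustrative assumptions, although the SL.* names are the package's built-in wrappers for these algorithms.

library(nnet)

# Illustrative simulated data (assumption, not from the source analysis)
set.seed(5)
df <- data.frame(x1 = rnorm(500), x2 = rnorm(500))
df$y <- rbinom(500, 1, plogis(df$x1 * df$x2))

# One hidden layer of 5 units sits between the predictors and the outcome;
# a weight is estimated for each connection, and decay penalizes large weights
fit <- nnet(y ~ x1 + x2, data = df, size = 5, decay = 0.1,
            maxit = 200, trace = FALSE)
head(predict(fit, df))

# Hypothetical ensemble call: SuperLearner cross-validates each candidate
# learner and weights them to form the final prediction
library(SuperLearner)
sl <- SuperLearner(Y = df$y, X = df[, c("x1", "x2")], family = binomial(),
                   SL.library = c("SL.glmnet", "SL.earth", "SL.polymars",
                                  "SL.ranger", "SL.xgboost", "SL.nnet"))
sl$coef   # ensemble weights assigned to each candidate learner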