8 Dimensionality Reduction
Complex models increase the chance of overfitting to the training sample. This leads to:
- Poor prediction
- Burdensome prediction models for implementation (need to measure lots of predictors)
- Low power to test hypotheses about predictor effects
Complex models are also difficult to interpret
Complexity generally increases with:
- Non-parametric models
- Unconstrained effects of predictors in parametric models (e.g., allowing a large coefficient for X5)
- Inclusion of predictors with small or no (noise) effects
- Increased number of predictors
- Increased ratio of p to n
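The last two points can be made concrete with a small simulation (a sketch, not from the source): fit ordinary least squares on one true signal plus an increasing number of pure-noise predictors, and watch test error grow even while all models use the same training data.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 40

# One real predictor; y depends only on it.
x_signal = rng.standard_normal(n)
y = x_signal + 0.5 * rng.standard_normal(n)

# Independent test set from the same process.
x_test_signal = rng.standard_normal(n)
y_test = x_test_signal + 0.5 * rng.standard_normal(n)

mses = []
for p_noise in (0, 10, 30):
    # Append p_noise predictors with no true effect.
    X = np.column_stack([x_signal, rng.standard_normal((n, p_noise))])
    X_test = np.column_stack([x_test_signal, rng.standard_normal((n, p_noise))])

    # OLS happily assigns coefficients to the noise columns.
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    mse = float(np.mean((y_test - X_test @ beta) ** 2))
    mses.append(mse)
    print(f"noise predictors: {p_noise:2d}  test MSE: {mse:.2f}")
```

As the ratio of p to n rises, the fitted coefficients chase noise in the training sample and out-of-sample error deteriorates.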
Many parametric models (linear model, generalized linear model, linear discriminant analysis):
- Become very overfit as p approaches n
- Cannot be used when p >= n
- Yet today we often have p >> n (NLP studies, genetics, precision medicine)
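A quick illustration of why p >= n breaks the linear model (my own sketch, not from the notes): with more predictors than observations, least squares can interpolate the training data exactly even when y is pure noise, so the "fit" carries no information about new data.

```python
import numpy as np

rng = np.random.default_rng(0)
n, p = 30, 40  # more predictors than observations

# Pure noise: y has no real relationship to X.
X = rng.standard_normal((n, p))
y = rng.standard_normal(n)

# With p >= n the system X @ beta = y is (generically) solvable
# exactly, so training error is essentially zero.
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
train_rmse = float(np.sqrt(np.mean((y - X @ beta) ** 2)))
print(f"training RMSE: {train_rmse:.2e}")

# Fresh data from the same noise process: the fit does not generalize.
X_new = rng.standard_normal((n, p))
y_new = rng.standard_normal(n)
test_rmse = float(np.sqrt(np.mean((y_new - X_new @ beta) ** 2)))
print(f"test RMSE: {test_rmse:.2f}")
```

Perfect training fit on noise is the extreme form of overfitting: the in-sample error is uninformative, and without selection, shrinkage, or dimension reduction the model is unusable.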
To reduce overfitting and/or allow p >> n, we need methods that can:
- Reduce effective p (select or combine)
- Constrain coefficients
We will consider three methods to accomplish this:
- Subset selection (and filtering more generally)
- Regularization (Shrinkage)
- Dimensionality Reduction