In medical research, we generally formulate our regression model around the research question we want to answer, and the variables included as predictors are chosen for physical or biological reasons. At other times, we have a large set of candidate variables from which we try to identify the most appropriate predictors to include in the regression model.

In this lesson, we will learn about variable selection methods, that is, how to choose which variables to include in the regression model. We want our final regression model to be simple, containing as few predictors as possible, while still being useful. That is, the final model should

The first methods we will discuss are stepwise and best-subset selection. Although we will discuss them in terms of linear regression, they also apply to other regression models, such as the logistic and Poisson models discussed later.

We will also discuss regularized regression models such as the LASSO as variable selection tools.