The multiple linear regression model is a flexible model that often performs well with real data, especially when we have more observations than predictors. When using the linear regression model to answer a research question, there are four basic steps:
- Formulate a multiple regression model
- Determine how the model helps answer the research question
- Check the model assumptions
- Perform a hypothesis test or calculate a confidence interval to answer the question
The multiple regression model with \(p\) predictors is
\[y_i=\beta_0+\beta_1x_{i1}+\beta_2x_{i2}+\cdots+\beta_px_{ip}+\varepsilon_i\]
where the \(\varepsilon_i\) are independent and have a normal distribution with mean 0 and constant variance. The assumptions for this model are summarized by the acronym LINE:
- Linear relationship between the mean response and the predictors
- Independent errors
- Normally distributed errors with mean 0
- Equal variance for all the errors
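To make the model concrete, here is a minimal sketch that simulates data satisfying the LINE assumptions and estimates the coefficients by least squares. The sample size, the "true" coefficients, the error standard deviation, and all variable names are illustrative assumptions, not values from the text.

```python
import numpy as np

# Simulated data (all numbers here are made up for illustration).
rng = np.random.default_rng(0)
n, p = 200, 2                                  # observations, predictors
X = rng.normal(size=(n, p))
beta_true = np.array([1.0, 2.0, -0.5])         # beta_0, beta_1, beta_2
eps = rng.normal(scale=1.0, size=n)            # independent N(0, sigma^2) errors
y = beta_true[0] + X @ beta_true[1:] + eps     # linear mean plus error

# Design matrix with a leading column of ones for the intercept beta_0.
X_design = np.column_stack([np.ones(n), X])

# Least-squares estimates of (beta_0, beta_1, beta_2).
beta_hat, *_ = np.linalg.lstsq(X_design, y, rcond=None)

fitted = X_design @ beta_hat                   # predicted mean responses
residuals = y - fitted                         # estimates of the errors
print("estimated coefficients:", np.round(beta_hat, 3))
```

With more observations than predictors, as here, the least-squares estimates should land close to the coefficients used in the simulation.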
We can use a plot of the residuals versus the predicted (fitted) values to check the L and E assumptions, and a histogram and QQ plot of the residuals to check the N assumption. The independence assumption requires careful consideration of how the data were collected.
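The quantities behind these diagnostic plots can be computed directly. The sketch below, on simulated data (an assumption, as are all variable names), builds the coordinates for a residuals-versus-predicted plot and for a normal QQ plot; the arrays can then be passed to any plotting library. The QQ correlation at the end is only a rough numerical summary, not a formal test.

```python
import numpy as np
from statistics import NormalDist

# Simulated data for illustration only.
rng = np.random.default_rng(1)
n = 200
X = rng.normal(size=(n, 2))
y = 1.0 + 2.0 * X[:, 0] - 0.5 * X[:, 1] + rng.normal(size=n)

X_design = np.column_stack([np.ones(n), X])
beta_hat, *_ = np.linalg.lstsq(X_design, y, rcond=None)
fitted = X_design @ beta_hat
residuals = y - fitted

# Residuals vs predicted: x = fitted, y = residuals.
# A patternless horizontal band around 0 supports the L and E assumptions.
resid_vs_fitted = np.column_stack([fitted, residuals])

# QQ plot: sorted residuals against standard normal quantiles.
probs = (np.arange(1, n + 1) - 0.5) / n
theoretical = np.array([NormalDist().inv_cdf(q) for q in probs])
sample = np.sort(residuals)
qq_points = np.column_stack([theoretical, sample])

# Points close to a straight line support the N assumption.
qq_corr = np.corrcoef(theoretical, sample)[0, 1]
print("QQ correlation:", round(qq_corr, 4))
```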
Some general questions that we can answer with the linear regression model are:
- Is there a (linear) relationship between the response and any of the predictors?
- How strong is the relationship between the response and the predictors?
  - \(R^2\) provides a measure of the strength of the linear relationship.
- Which individual predictors are useful in predicting the response?
  - Use the t-test for an individual slope parameter, or the general linear F-test for a subset of the slope parameters.
- What is the effect of each of the predictors on the response?
  - Interpret the confidence intervals for the slope parameters.
- What is the value of the response for particular values of the predictors?
  - Use a confidence interval for the mean response if you are interested in the average response.
  - Use a prediction interval if you want to know the response for an individual.
- Does the effect of a predictor on the response depend on the value of another predictor?
  - Consider including interaction terms in the model.
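The tools named in the questions above can be sketched in a few lines. The example below is a minimal illustration on simulated data, assuming NumPy and SciPy are available; the data, the helper `fit_ols`, the choice of which slopes to test, and the point `x0` are all hypothetical. It fits a model with an interaction term, then computes \(R^2\), t-tests and 95% confidence intervals for the coefficients, a general linear F-test for a subset of slopes, and both a confidence interval for the mean response and a prediction interval.

```python
import numpy as np
from scipy import stats

def fit_ols(X, y):
    """Return least-squares coefficients, SSE, and (X'X)^{-1}."""
    XtX_inv = np.linalg.inv(X.T @ X)
    beta = XtX_inv @ X.T @ y
    resid = y - X @ beta
    return beta, resid @ resid, XtX_inv

# Simulated data for illustration only (no true interaction effect).
rng = np.random.default_rng(2)
n = 200
x1, x2 = rng.normal(size=n), rng.normal(size=n)
y = 1.0 + 2.0 * x1 - 0.5 * x2 + rng.normal(size=n)

# Full model: intercept, x1, x2, and the interaction x1 * x2.
X_full = np.column_stack([np.ones(n), x1, x2, x1 * x2])
beta_hat, sse_full, XtX_inv = fit_ols(X_full, y)
df_full = n - X_full.shape[1]
mse = sse_full / df_full

# Strength of the linear relationship.
sst = np.sum((y - y.mean()) ** 2)
r_squared = 1 - sse_full / sst

# t-tests and 95% confidence intervals for each coefficient.
se = np.sqrt(mse * np.diag(XtX_inv))
t_stats = beta_hat / se
t_pvalues = 2 * stats.t.sf(np.abs(t_stats), df_full)
t_crit = stats.t.ppf(0.975, df_full)
coef_ci = np.column_stack([beta_hat - t_crit * se, beta_hat + t_crit * se])

# General linear F-test: can x2 and x1*x2 both be dropped?
X_red = np.column_stack([np.ones(n), x1])
_, sse_red, _ = fit_ols(X_red, y)
df_red = n - X_red.shape[1]
f_stat = ((sse_red - sse_full) / (df_red - df_full)) / mse
f_pvalue = stats.f.sf(f_stat, df_red - df_full, df_full)

# Intervals at x1 = 0.5, x2 = -1 (so the interaction column is -0.5).
x0 = np.array([1.0, 0.5, -1.0, -0.5])
y0_hat = x0 @ beta_hat
h0 = x0 @ XtX_inv @ x0
se_mean = np.sqrt(mse * h0)          # uncertainty in the mean response
se_pred = np.sqrt(mse * (1 + h0))    # adds the variance of a new error
mean_ci = (y0_hat - t_crit * se_mean, y0_hat + t_crit * se_mean)
pred_int = (y0_hat - t_crit * se_pred, y0_hat + t_crit * se_pred)

print("R^2:", round(r_squared, 3))
print("t p-values:", np.round(t_pvalues, 4))
print("subset F-test p-value:", f_pvalue)
print("95% CI for mean response:", np.round(mean_ci, 3))
print("95% prediction interval:", np.round(pred_int, 3))
```

The prediction interval is always wider than the confidence interval for the mean at the same point, because it accounts for the variability of an individual response in addition to the uncertainty in the estimated mean.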