Multiple linear regression in Genstat
Which variables should I include in the model?
The general aim of regression is to model the relationship between a response variable (y) and one or more explanatory variables (x variables). However, when we have several explanatory variables, this creates an extra challenge – we need to decide which ones to include in our model. That is, we need to be able to explore the different possible models by comparing alternative variables or sets of variables. Genstat can help us do this.
Method 1: Manually fitting a sequence of regression models
The Change Model menu lets us change our current model by adding or removing explanatory variables. To open it, click the Change model button on Genstat’s Linear Regression menu:
Stats | Regression Analysis | Linear Models… (General linear regression)
- Add adds the selected terms (i.e. explanatory variables) to the model
- Drop removes the selected terms from the model
- Switch removes the selected terms that are in the current model and adds the selected terms that are not
- Try allows you to assess the effect of switching (i.e. adding or dropping) each of the terms selected without actually changing your current model
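The idea behind Try can be sketched outside Genstat in a few lines of Python. This is only an illustration under stated assumptions, not Genstat's implementation: the data set, the names `DATA`, `Y`, `residual_ss`, `switch` and `try_terms` are all hypothetical, and the residual sum of squares (RSS) is used as a simple stand-in for whatever summary Genstat reports. Switching a term here means dropping it if it is in the model and adding it if it is not.

```python
# Hypothetical data: three explanatory variables and one response.
DATA = {
    "x1": [1, 2, 3, 4, 5, 6, 7, 8],
    "x2": [4, 8, 5, 1, 2, 7, 3, 6],
    "x3": [5, 3, 8, 1, 7, 2, 6, 4],
}
Y = [8.1, 7.8, 15.1, 10.0, 17.9, 15.2, 20.9, 21.0]


def residual_ss(terms):
    """Residual sum of squares for Y regressed on a constant plus `terms`."""
    # Design matrix: a column of ones followed by the chosen variables.
    X = [[1.0] + [float(DATA[t][i]) for t in terms] for i in range(len(Y))]
    k = len(X[0])
    # Normal equations (X'X) b = X'y held as an augmented matrix.
    A = [[sum(r[i] * r[j] for r in X) for j in range(k)]
         + [sum(r[i] * yi for r, yi in zip(X, Y))] for i in range(k)]
    # Gauss-Jordan elimination with partial pivoting.
    for i in range(k):
        p = max(range(i, k), key=lambda r: abs(A[r][i]))
        A[i], A[p] = A[p], A[i]
        for r in range(k):
            if r != i:
                f = A[r][i] / A[i][i]
                A[r] = [a - f * b for a, b in zip(A[r], A[i])]
    beta = [A[i][k] / A[i][i] for i in range(k)]
    return sum((yi - sum(b * x for b, x in zip(beta, r))) ** 2
               for r, yi in zip(X, Y))


def switch(current, selected):
    """Return a new term list: selected terms in the model are dropped, the rest added."""
    kept = [t for t in current if t not in selected]
    added = [t for t in selected if t not in current]
    return kept + added


def try_terms(current, selected):
    """RSS obtained by switching each selected term on its own; `current` is not modified."""
    return {t: residual_ss(switch(current, [t])) for t in selected}


current = ["x1"]
effects = try_terms(current, ["x1", "x2", "x3"])
print(f"current model {current}: RSS = {residual_ss(current):.3f}")
for term, rss in effects.items():
    action = "drop" if term in current else "add"
    print(f"{action} {term}: RSS = {rss:.3f}")
```

Because `try_terms` builds a new term list for each trial, the current model is left untouched, which mirrors the point of Try: compare the alternatives first, then commit with Add, Drop or Switch.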
Method 2: Stepwise regression
The stepwise facilities in the Change Model menu can be used to build up the regression model automatically.
- Forward selection builds a regression model by adding explanatory variables sequentially. In each forward step, the variable that gives the biggest improvement to the model is added.
- Backward elimination builds a regression model by removing explanatory variables sequentially, starting from a model that contains them. In each elimination step, the variable whose removal makes the model worse by the smallest amount is dropped.
- Stepwise regression builds a regression model by sequentially dropping and adding explanatory variables, so a variable added at one step can be dropped again at a later step.
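Forward selection, as described above, can be sketched in Python as follows. This is a minimal illustration, not Genstat's algorithm: the data and the names `DATA`, `Y`, `residual_ss` and `forward_selection` are hypothetical, "improvement" is measured by the reduction in the residual sum of squares, and the stopping rule (a variance ratio that must reach a hypothetical threshold `f_to_enter`) is an assumption made for this sketch.

```python
# Hypothetical data: three explanatory variables and one response.
DATA = {
    "x1": [1, 2, 3, 4, 5, 6, 7, 8],
    "x2": [4, 8, 5, 1, 2, 7, 3, 6],
    "x3": [5, 3, 8, 1, 7, 2, 6, 4],
}
Y = [8.1, 7.8, 15.1, 10.0, 17.9, 15.2, 20.9, 21.0]


def residual_ss(terms):
    """Residual sum of squares for Y regressed on a constant plus `terms`."""
    X = [[1.0] + [float(DATA[t][i]) for t in terms] for i in range(len(Y))]
    k = len(X[0])
    # Normal equations (X'X) b = X'y held as an augmented matrix.
    A = [[sum(r[i] * r[j] for r in X) for j in range(k)]
         + [sum(r[i] * yi for r, yi in zip(X, Y))] for i in range(k)]
    # Gauss-Jordan elimination with partial pivoting.
    for i in range(k):
        p = max(range(i, k), key=lambda r: abs(A[r][i]))
        A[i], A[p] = A[p], A[i]
        for r in range(k):
            if r != i:
                f = A[r][i] / A[i][i]
                A[r] = [a - f * b for a, b in zip(A[r], A[i])]
    beta = [A[i][k] / A[i][i] for i in range(k)]
    return sum((yi - sum(b * x for b, x in zip(beta, r))) ** 2
               for r, yi in zip(X, Y))


def forward_selection(candidates, f_to_enter=4.0):
    """Add one variable per step until no candidate reaches the assumed threshold."""
    selected, remaining = [], list(candidates)
    rss_current = residual_ss(selected)  # start from the constant-only model
    n = len(Y)
    while remaining:
        # The candidate whose addition leaves the smallest RSS.
        best = min(remaining, key=lambda t: residual_ss(selected + [t]))
        rss_new = residual_ss(selected + [best])
        resid_df = n - (len(selected) + 1) - 1
        if resid_df <= 0:
            break
        # Variance ratio for the added term (assumes the fit is not exact).
        f_ratio = (rss_current - rss_new) / (rss_new / resid_df)
        if f_ratio < f_to_enter:
            break
        selected.append(best)
        remaining.remove(best)
        rss_current = rss_new
    return selected


chosen = forward_selection(["x1", "x2", "x3"])
print("selected terms:", chosen)
```

Backward elimination follows the same pattern in reverse: start with every candidate in the model and, at each step, drop the term whose removal increases the RSS least, stopping when every remaining term is worth keeping.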
Method 3: All subsets regression
All subsets regression can be used to search through all possible linear regression models and to evaluate these according to some criterion. However, the number of possible models doubles with every extra candidate variable (10 candidate variables already give 1,024 models), so fitting them all can be very computationally intensive, and the method should be used with caution!
Stats | Regression Analysis | All Subsets Regression | Linear Models…
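To make the search concrete, here is a small Python sketch that fits every subset of three candidate variables and ranks the models. It is an illustration under assumptions rather than a description of what Genstat does: the data and the names `DATA`, `Y`, `residual_ss` and `adjusted_r2` are hypothetical, and adjusted R-squared is chosen here simply as one possible criterion.

```python
from itertools import combinations

# Hypothetical data: three explanatory variables and one response.
DATA = {
    "x1": [1, 2, 3, 4, 5, 6, 7, 8],
    "x2": [4, 8, 5, 1, 2, 7, 3, 6],
    "x3": [5, 3, 8, 1, 7, 2, 6, 4],
}
Y = [8.1, 7.8, 15.1, 10.0, 17.9, 15.2, 20.9, 21.0]


def residual_ss(terms):
    """Residual sum of squares for Y regressed on a constant plus `terms`."""
    X = [[1.0] + [float(DATA[t][i]) for t in terms] for i in range(len(Y))]
    k = len(X[0])
    # Normal equations (X'X) b = X'y held as an augmented matrix.
    A = [[sum(r[i] * r[j] for r in X) for j in range(k)]
         + [sum(r[i] * yi for r, yi in zip(X, Y))] for i in range(k)]
    # Gauss-Jordan elimination with partial pivoting.
    for i in range(k):
        p = max(range(i, k), key=lambda r: abs(A[r][i]))
        A[i], A[p] = A[p], A[i]
        for r in range(k):
            if r != i:
                f = A[r][i] / A[i][i]
                A[r] = [a - f * b for a, b in zip(A[r], A[i])]
    beta = [A[i][k] / A[i][i] for i in range(k)]
    return sum((yi - sum(b * x for b, x in zip(beta, r))) ** 2
               for r, yi in zip(X, Y))


def adjusted_r2(terms):
    """Adjusted R-squared: 1 - (residual mean square / total mean square)."""
    n = len(Y)
    total_ms = residual_ss([]) / (n - 1)
    resid_ms = residual_ss(terms) / (n - len(terms) - 1)
    return 1.0 - resid_ms / total_ms


names = sorted(DATA)
# Every subset of the candidates, from the constant-only model to the full model.
subsets = [c for size in range(len(names) + 1)
           for c in combinations(names, size)]
results = {s: adjusted_r2(list(s)) for s in subsets}
best = max(results, key=results.get)

print(f"{len(subsets)} models fitted")
for s in sorted(results, key=results.get, reverse=True):
    print(f"{', '.join(s) or '(constant only)'}: adjusted R2 = {results[s]:.4f}")
```

With p candidates the list `subsets` has 2 to the power p entries, which is why the approach becomes expensive quickly as the number of candidate variables grows.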
With all of these approaches, Genstat provides concise output so that you can easily compare the models.
To learn more about multiple linear regression in Genstat, please click on Genstat Help → Genstat Guides → Regression → Multiple linear regression (pp. 17-29).