**FAQ: Is it a fixed or random effect?**

*C. Supakorn*

Designed experiments frequently give rise to several different sources of random variation, so the data are often analysed using a mixed model. For example[1]:

- a split-plot experiment requires error terms for the main plots and subplots
- an experiment with repeated measurements over time (or space) requires the correlation between observations to be modelled
- a meta-analysis of experiments replicated at several sites and/or over several years requires the data to be combined appropriately

A key decision when formulating a mixed model is determining which explanatory variables should be fixed and which random.

- A **fixed** term typically represents the effect of specific conditions chosen for the study.
- A **random** term typically represents the effect of a sample of conditions observed from some wider population, and it is the variability of the population that is of interest. The structural (or randomized) components of an experimental design, such as blocks and plots, can usually be argued to fall into this category (see below for more discussion).

When deciding whether an explanatory factor should be fixed or random, there are 5 basic questions you should consider:

Let’s look at an example: A randomized complete block design (RCBD) experiment was conducted to compare the mean yield of three different irrigation methods (drip, surface, sprinkler) in tangerine tree orchards. Twelve orchards (i.e. blocks) were randomly selected from the population of interest. Each orchard was divided into 3 plots, and each irrigation method was randomly assigned to one plot, so that every method appears in all twelve orchards. At harvest, the fruit from each plot was weighed. The conventional linear mixed model for this RCBD experiment is
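One common way to write such a model is sketched here. The symbols are assumed for illustration and are not taken from the source: $y_{ij}$ is the yield of the plot receiving irrigation method $i$ in orchard $j$, $\mu$ is the overall mean, $\tau_i$ is the fixed effect of irrigation method $i$, $b_j$ is the random effect of orchard $j$, and $e_{ij}$ is the residual plot error.

$$
y_{ij} = \mu + \tau_i + b_j + e_{ij}, \qquad i = 1, 2, 3, \quad j = 1, \dots, 12,
$$

$$
b_j \sim N(0, \sigma_b^2), \qquad e_{ij} \sim N(0, \sigma_e^2), \qquad b_j \text{ and } e_{ij} \text{ independent.}
$$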

As the objective is to estimate the irrigation treatment means and/or to test for differences between them, the “*Irrigation*” factor should be modelled as fixed. As the aim is to make inferences about the wider population of orchards, rather than just the 12 observed, the “*Orchard*” factor should be modelled as random.
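As a rough illustration of what the two choices mean in practice, the sketch below simulates data from this layout and computes the balanced-design ANOVA quantities by hand, using only the Python standard library. All numeric values (true means, standard deviations, the seed) are hypothetical and chosen only for illustration. For balanced data like this, the method-of-moments estimates shown coincide with the REML estimates whenever the orchard mean square exceeds the residual mean square.

```python
import random
from statistics import mean

random.seed(1)

methods = ["drip", "surface", "sprinkler"]
n_orchards = 12

# Hypothetical "true" values, used only to simulate example data.
true_means = {"drip": 50.0, "surface": 45.0, "sprinkler": 48.0}
sd_orchard, sd_error = 4.0, 2.0

# One random effect per orchard, plus independent plot error.
orchard_effects = [random.gauss(0, sd_orchard) for _ in range(n_orchards)]
y = {
    (m, j): true_means[m] + orchard_effects[j] + random.gauss(0, sd_error)
    for m in methods
    for j in range(n_orchards)
}

t, b = len(methods), n_orchards
grand = mean(list(y.values()))
trt_mean = {m: mean([y[m, j] for j in range(b)]) for m in methods}
blk_mean = [mean([y[m, j] for m in methods]) for j in range(b)]

# Sums of squares for the balanced RCBD.
ss_block = t * sum((bm - grand) ** 2 for bm in blk_mean)
ss_error = sum(
    (y[m, j] - trt_mean[m] - blk_mean[j] + grand) ** 2
    for m in methods
    for j in range(b)
)
ms_block = ss_block / (b - 1)
ms_error = ss_error / ((t - 1) * (b - 1))

# Method-of-moments estimate of the orchard variance component,
# truncated at zero.
var_orchard = max(0.0, (ms_block - ms_error) / t)

# Standard error of a difference between two irrigation means:
# identical whether orchards are treated as fixed or random.
se_diff = (2 * ms_error / b) ** 0.5

# Standard error of a single irrigation mean.
se_mean_fixed = (ms_error / b) ** 0.5                   # orchards fixed
se_mean_random = ((var_orchard + ms_error) / b) ** 0.5  # orchards random

print("irrigation means:", {m: round(v, 2) for m, v in trt_mean.items()})
print("orchard variance component:", round(var_orchard, 2))
print("residual variance:", round(ms_error, 2))
print("SE of a difference:", round(se_diff, 3))
print("SE of a mean, orchards fixed: ", round(se_mean_fixed, 3))
print("SE of a mean, orchards random:", round(se_mean_random, 3))
```

The standard error of a difference between two methods is the same under either choice. It is the standard error of a single method mean that grows when orchards are treated as random, because it then also reflects orchard-to-orchard variability in the wider population.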

In the above example, the blocking variable (*Orchard*) was treated as random. However, it is not always appropriate for the *block* effect to be random. Indeed, there is some diversity of opinion between American and European statisticians! Dixon (2016)[2] stated that the blocks in an experiment are rarely a random sample from a larger population, yet the assumption of random sampling is crucial in the computation of the standard error in the random block model.

Let’s illustrate. Imagine a study of pigs, in which blocks containing 6 pigs are formed on the basis of initial weight: block 1: 50.1-55 kg, block 2: 55.1-60 kg, block 3: 60.1-65 kg, and so on until block 8: 85.1-90 kg. Although the pigs in each block may be randomly sampled from an available population, the blocks themselves are not randomly sampled. Consequently, in some cases, the variance between the blocks will not match the variance of initial weight in the population. Indeed, if very light pigs (50.1-55 kg) and very heavy pigs (85.1-90 kg) are less common in the population, the variance component for blocks will be larger than the population variance. In this case, treating blocks as a fixed effect may be a more appropriate choice.
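The pig example can be checked with a small calculation. The sketch below compares the variance of the eight block midpoints with the variance of initial weight in a population. The population is assumed, purely for illustration, to be normal with mean 70 kg and standard deviation 8 kg; neither figure comes from the source.

```python
from statistics import NormalDist, pvariance

# Midpoints of the eight weight classes (50-55 kg up to 85-90 kg).
# Each class supplies one block, so every midpoint gets equal weight.
midpoints = [52.5 + 5 * k for k in range(8)]
var_blocks = pvariance(midpoints)  # 131.25

# Hypothetical population of initial weights (assumed, not from the source).
population = NormalDist(mu=70, sigma=8)
var_population = population.variance  # 64.0

# Share of the assumed population that falls in each weight class.
shares = [
    population.cdf(55 + 5 * k) - population.cdf(50 + 5 * k) for k in range(8)
]

print("variance of block midpoints:", var_blocks)
print("assumed population variance:", var_population)
print("share in lightest class:", round(shares[0], 3))
print("share in a middle class:", round(shares[3], 3))
```

Because the design takes one block from every weight class, the rare extreme classes are represented as heavily as the common middle ones, so the spread between blocks overstates the spread of initial weight in the assumed population.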