FAQ: Using blocking to improve precision and to avoid bias
When conducting an experiment, an important consideration is how to even out the variability among the experimental units to make comparisons between the treatments fair and precise. Ideally, we should try to minimize the variability by carefully controlling the conditions under which we conduct the experiment. However, there are many situations where the experimental units are non-uniform. For example:
- in a field experiment laid out on a slope, the plots at the bottom of the slope may be more fertile than the plots at the top
- in a medical trial, the gender and age of the subjects may vary
When you know there are differences between the experimental units, you can improve precision and avoid bias by “blocking”. Blocking involves grouping the experimental units into homogeneous groups, so that the units within each group are as alike as possible. The treatments are then randomized to the experimental units within each group (or block). For example, in the field experiment above, plots would be blocked (i.e. grouped) according to their position on the slope, and in the medical trial, subjects would be blocked by gender and age group.
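The randomization step can be sketched in a few lines of Python. This is a minimal illustration, not Genstat's own procedure: the function name `randomize_rcbd`, the treatment labels, and the choice of four treatments in five blocks are assumptions made for the example.

```python
import random


def randomize_rcbd(treatments, n_blocks, seed=None):
    """Return an independent random ordering of the treatments for each block."""
    rng = random.Random(seed)  # a fixed seed makes the layout reproducible
    layout = {}
    for block in range(1, n_blocks + 1):
        order = list(treatments)  # every treatment appears once per block
        rng.shuffle(order)        # randomize plot order within this block only
        layout[block] = order
    return layout


# Hypothetical treatment labels and block count, for illustration only.
treatments = ["T1", "T2", "T3", "T4"]
layout = randomize_rcbd(treatments, n_blocks=5, seed=1)
for block, order in layout.items():
    print(f"Block {block}: {order}")
```

Because each block receives every treatment exactly once, any difference between blocks affects all treatments equally and so does not bias the treatment comparisons.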
Let’s look at an example to see how blocking improves the precision of an experiment by reducing the unexplained variation. In this field trial, the yields (pounds per plot) of four strains of Gallipoli wheat were studied. During the design phase, the 20 experimental plots were grouped into five blocks, each containing four plots. Within each block, the four wheat strains were randomly assigned to the plots. This is an example of a randomized complete block design (RCBD).
This data set can be accessed from within Genstat. Click on File in the menu bar, then Open Example Data Sets… and select “Wheatstrains.gsh”.
To demonstrate the advantage of blocking, we will analyse the data both as a completely randomized design (CRD), which ignores the blocking, and as an RCBD, which takes the blocking into account. Note that one of the assumptions behind a CRD is that the set of experimental units to which the treatments are applied is effectively homogeneous.
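To see what “ignoring the blocking” means arithmetically, here is a rough Python sketch of the two residual calculations. It assumes one observation per treatment in each block. The function name `residual_mean_squares` and the toy numbers are made up for illustration. They are not the Wheatstrains data, and this is not how Genstat computes its output internally.

```python
def residual_mean_squares(table):
    """Compare CRD and RCBD residuals for table[block][treatment].

    Assumes one observation per treatment in every block.
    """
    n_blocks, n_trts = len(table), len(table[0])
    n_units = n_blocks * n_trts
    grand_mean = sum(sum(row) for row in table) / n_units

    # Total variation of the observations about the grand mean.
    total_ss = sum((y - grand_mean) ** 2 for row in table for y in row)

    # Variation between treatment means (each mean averages n_blocks units).
    trt_means = [sum(row[t] for row in table) / n_blocks for t in range(n_trts)]
    trt_ss = n_blocks * sum((m - grand_mean) ** 2 for m in trt_means)

    # Variation between block means (each mean averages n_trts units).
    block_means = [sum(row) / n_trts for row in table]
    block_ss = n_trts * sum((m - grand_mean) ** 2 for m in block_means)

    # CRD: the block-to-block variation stays in the residual.
    crd_resid_ss = total_ss - trt_ss
    crd_df = n_units - n_trts
    # RCBD: block variation is removed, at the cost of (n_blocks - 1) d.f.
    rcbd_resid_ss = crd_resid_ss - block_ss
    rcbd_df = (n_blocks - 1) * (n_trts - 1)

    return {
        "block_ss": block_ss,
        "crd_resid_ms": crd_resid_ss / crd_df,
        "rcbd_resid_ms": rcbd_resid_ss / rcbd_df,
    }


# Made-up toy responses (3 blocks x 3 treatments), NOT the Wheatstrains data.
toy = [
    [10, 12, 14],
    [13, 15, 17],
    [16, 19, 19],
]
result = residual_mean_squares(toy)
print(result)
```

The RCBD residual is the CRD residual with the between-block sum of squares taken out. The residual mean square falls only when the blocks really do differ, because the RCBD also gives up degrees of freedom to the blocks.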
To analyse the data in Genstat, click Stats | Analysis of Variance | One- and Two-way…
Notice that the ANOVA table for the RCBD has an additional line, “Blocks stratum”. This records the variation between blocks. The strains are now estimated in the Blocks.*Units* stratum, which represents the variation within blocks. As a result:
A: the residual mean square (i.e. the unexplained variation) has decreased from 2.983 to 2.188
B: the standard error of the difference (s.e.d.) has decreased from 1.092 to 0.936
So, blocking has improved the precision of the experiment. This increase in precision means that we have a better chance of detecting differences between the wheat strains.
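The two s.e.d. values follow directly from the residual mean squares. The sketch below assumes the standard formula for comparing two equally replicated treatment means, s.e.d. = sqrt(2 × residual mean square / r), with r = 5 because each strain appears once in each of the five blocks. The function name `sed` is made up for this example.

```python
import math


def sed(residual_ms, replicates):
    """Standard error of the difference between two treatment means,
    each based on `replicates` experimental units."""
    return math.sqrt(2 * residual_ms / replicates)


# Residual mean squares quoted above; 5 replicates (one plot per block).
sed_crd = sed(2.983, 5)
sed_rcbd = sed(2.188, 5)
print(round(sed_crd, 3))   # 1.092
print(round(sed_rcbd, 3))  # 0.936
```

Since the replication is the same in both analyses, the smaller s.e.d. under the RCBD comes entirely from the smaller residual mean square.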
If you suspect that certain groups of experimental units may differ from each other, you can use those groups as a blocking factor. If the groups do turn out to differ, your estimated treatment effects will be more precise than if you had not included blocking in the design and the statistical model. On the other hand, if no obvious groups of similar units exist, a CRD may be the best solution.