We have considered dependent and independent variables, but have neglected the nuisance variables. This is the most challenging part of experimental design. Only the general idea is given here; see [154,197] for exhaustive presentations. Suppose that when looking through the data it is noted that the dependent variable depends heavily on an identifiable property of the subjects, such as gender. This property would become a nuisance variable,
. We could imagine designing an experiment just to determine whether and how much
affects
, but the interest is in some independent variable
, not
.
The dependency on drives the variance high across the subjects; however, if they are divided into groups that have the same
value inside of each group, then the variance could be considerably lower. For example, if gender is the nuisance variable, then we would divide the subjects into groups of men and women and discover that the variance is smaller in each group. This technique is called blocking, and each group is called a block. Inside of a block, the variance of
should be low if the independent variable
is held fixed.
The next problem is to determine which treatment should be applied to which subjects. Continuing with the example, it would be a horrible idea to give treatment to women and treatment
to men. This completely confounds the nuisance variable
and independent variable
dependencies on the dependent variable
. The opposite of this would be to apply
to half of the women and men, and
to the other half, which is significantly better.
A simple alternative is to use a randomized design, in which the subjects are assigned
or
at random. This safely eliminates accidental bias and is easy for an experimenter to implement.
If there is more than one nuisance variable, then the assignment process becomes more complicated, which tends to cause a greater preference for randomization. If the subjects participate in a multiple-stage experiment where the different treatments are applied at various times, then the treatments must be carefully assigned. One way to handle it is by assigning the treatments according to a Latin square, which is an -by-
matrix in which every row and column is a permutation of
labels (in this case, treatments).
Steven M LaValle 2020-11-11