As David Murray points out in his book *Group-Randomized Trials*
(Oxford University Press, 1998), there's plenty of opportunity for
confusion about **units**. There are *units*, *observational
units*, *assignment units*, and *units of analysis*, among
other terms. Often, these terms are used interchangeably (I do so
myself), but not always.

In any study involving people, the individual is commonly thought of as the unit of analysis because we study people. However, the unit of analysis and the corresponding sample size are determined by the way the study is conducted.

Determining units of analysis and their number recalls the discussion
of why measuring a single mouse 100 times is different from measuring 100
mice once each. Measurements on the same mouse are likely to be more
similar than measurements made on different mice. If there is something
about the way an experiment is conducted that makes it likely that some
observations will be more similar than others, this must be reflected in
the analysis. This is true whether the study involves diets, drugs,
nutritional supplements, methods of planting, social policies, or ways of
delivering service. (However, if two measurements on the same mouse were
**not** likely to be more similar than two measurements made on
different mice, then measuring a single mouse 100 times is **no**
different from measuring 100 mice once each!)
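
To make the mouse comparison concrete, here is a minimal Python sketch. It assumes a simple variance-components model that is not spelled out in the text: each mouse has its own true level (variance `between_var` across mice), and each measurement adds independent noise (variance `within_var`). The function names and the numbers are hypothetical, chosen only for illustration.

```python
def var_mean_one_mouse(n_measurements, between_var, within_var):
    """Variance of the mean of repeated measurements on ONE randomly chosen mouse.

    The mouse's own level is shared by every measurement, so averaging
    shrinks only the measurement noise, not the mouse-to-mouse variation.
    """
    return between_var + within_var / n_measurements


def var_mean_many_mice(n_mice, between_var, within_var):
    """Variance of the mean of ONE measurement on each of n independent mice."""
    return (between_var + within_var) / n_mice


if __name__ == "__main__":
    # Hypothetical variances: mice differ (1.0) versus mice are identical (0.0).
    for between_var in (1.0, 0.0):
        one = var_mean_one_mouse(100, between_var, within_var=1.0)
        many = var_mean_many_mice(100, between_var, within_var=1.0)
        print(f"between_var={between_var}: one mouse x 100 -> {one:.4f}, "
              f"100 mice x 1 -> {many:.4f}")
```

Under this model, the two designs agree only when `between_var` is zero, which is the point of the parenthetical remark: if mice do not differ from one another, 100 measurements on one mouse carry the same information as one measurement on each of 100 mice.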

Consider a study of 800 10th grade high school students receiving one of two treatments, A & B. Experience has shown that two students selected at random from the same class are likely to be more similar than two students selected at random from the entire school who, in turn, are likely to be more similar than two students selected at random from the entire city, who are likely to be more similar than two students selected at random from the entire state, and so on.

Here are three of the many ways to carry out the study in a particular state.

- Take a random sample of 800 10th grade students from all 10th grade students in the state. Randomize 400 to A and 400 to B.
- Take a random sample of 40 10th grade classes of 20 students each from the set of all 10th grade classes. Randomize 20 classes to A and 20 classes to B.
- Take a random sample of 20 schools. From each school, randomly select two classes of 20 students each. Randomize the schools into two groups with classes from the same school receiving the same treatment.

Each study involves 800 students--400 receive A and 400 receive B. However, the units of analysis are different. In the first study, the unit of analysis is the individual student. The sample size is 800. In the second study, the unit of analysis is the class. The sample size is 40. In the third study, the unit of analysis is the school. The sample size is 20.

Murray (p. 105) has an elegant way to identify the units of analysis.
It is not a definition a novice can use, but it is rigorous and is the
way a trained statistician decides what the units should be. *A unit is
the unit of analysis for an effect if and only if that effect is
assessed against the variation among those units.*

It's not easy to come up with a less technical definition, but most of
the time the units of analysis are *the smallest units that are
independent of each other* or *the smallest units for which all
possible sets are equally likely to be in the sample*. In the examples
presented above:

- A random sample of students is studied. The students are independent of each other, so the student is the unit of analysis.
- Here, students are not independent. Students in the same class are likely to be more similar than students from different classes. Classes *are* independent of each other since we have a simple random sample of them, so class is the unit of analysis.
- Here, neither students nor classes are independent. Classes from the same school are likely to be more similar than classes from different schools. Schools are selected at random, so school is the unit of analysis.

You might think of causing trouble by asking what the unit of analysis would be in study 3 if, in each school, one class received A and the other received B, with the treatments assigned at random. The unit of analysis would still be the school, but the analysis is now effectively one of paired data because both treatments are observed in each school. In similar fashion:

- In a twins study, where the members of each twin pair are purposefully randomized so that the two twins receive different treatments, the unit of analysis is the twin pair.
- In a study of two types of exercise, where each subject uses a different form of exercise for each arm with treatments assigned at random, the unit of analysis is the pair of arms, that is, the individual subject, not the individual arm.
- In an agricultural study, where each farm has plots devoted to all of the methods under investigation, the unit of analysis is the farm, not the plot.
- In a study of husbands and wives, the unit of analysis is the couple.

Pairing cuts the sample size to half of what it would have been otherwise. However, each unit has to be measured twice (or for twice as long), and the analysis becomes complicated if one of the two measurements ends up missing.

- Group-randomization (without pairing) estimates effects with less precision than if individuals had been randomized, because similar individuals receive the same treatment and tend to behave similarly.
- Pairing usually leads to greater precision because comparisons within pairs are generally more precise than comparisons between unpaired units. In theory, pairing buys back some of the precision lost through group randomization. It would be interesting to do some calculations to find out how much precision is recovered through pairing.

Returning to study 3: If the two classes within each school were randomized to different treatments, the unit of analysis would be the school, not the class. However, the treatment effect would be compared to the variability in differences between classes within each school. Therefore, this version of the study would probably be better able to detect a real difference than study 2, which was based on a (simple) random sample of classes.
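
One rough way to compare the designs is to write down the variance of the estimated A-minus-B difference under each. The Python sketch below does this under assumptions that go beyond the text: an additive model with independent school, class, and student effects (variances `s2_school`, `s2_class`, `s2_student`), populations large enough that randomly sampled students or classes in studies 1 and 2 can be treated as coming from different schools, and equal class sizes. The function names and the variance values are hypothetical.

```python
def var_study1(s2_school, s2_class, s2_student, n_per_arm=400):
    """Individually randomized students, each treated as independent."""
    total = s2_school + s2_class + s2_student
    return 2 * total / n_per_arm


def var_study2(s2_school, s2_class, s2_student,
               classes_per_arm=20, class_size=20):
    """Randomly sampled classes, randomized as whole classes."""
    class_mean_var = s2_school + s2_class + s2_student / class_size
    return 2 * class_mean_var / classes_per_arm


def var_study3(s2_school, s2_class, s2_student,
               schools_per_arm=10, classes_per_school=2, class_size=20):
    """Schools randomized; both classes in a school get the same treatment."""
    students_per_school = classes_per_school * class_size
    school_mean_var = (s2_school
                       + s2_class / classes_per_school
                       + s2_student / students_per_school)
    return 2 * school_mean_var / schools_per_arm


def var_study3_paired(s2_class, s2_student, n_schools=20, class_size=20):
    """One class per treatment within each school.

    The school effect is common to both classes, so it cancels out of
    the within-school difference and does not appear here.
    """
    diff_var = 2 * (s2_class + s2_student / class_size)
    return diff_var / n_schools


if __name__ == "__main__":
    # Hypothetical variance components that sum to 1.
    comps = dict(s2_school=0.10, s2_class=0.05, s2_student=0.85)
    print("study 1        :", var_study1(**comps))
    print("study 2        :", var_study2(**comps))
    print("study 3        :", var_study3(**comps))
    print("study 3, paired:", var_study3_paired(comps["s2_class"],
                                                comps["s2_student"]))
```

With these made-up numbers, the paired version of study 3 has a smaller variance than study 2, which in turn has a smaller variance than the unpaired study 3. If the school and class variances are set to zero, all four functions return the same value. Variance is not the whole story, since the designs also differ in the degrees of freedom available for estimating it.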

At first glance, it seems unfair that the third study, involving
hundreds of students, has a sample size equal to the number of schools.
However, if there were no differences between schools or classes, the two
analyses--individuals (incorrect) and schools (correct)--would give
essentially the same result. (This is the same thing as saying 100
measurements on one mouse **are** the same as 1 measurement on each of
100 mice if all mice react the same way.) The sample size may be small
but the school means will show much less variability than class means or
individual students, so the small sample size is made up for by the
increase in precision.

Groups rarely respond exactly the same way, so treating the group as the unit of analysis saves us from making claims that are not justified by the data. The precision of a properly analyzed group-randomized study involving a grand total of N subjects spread out among K groups will be equal to that of a simple random sample of somewhere between K and N individuals.

- If the groups respond differently but there is no variation within each group, we've essentially got K measurements, which we see over and over again.
- At the other extreme, if there's no difference between groups beyond the fact that they are composed of different individuals, we've essentially got N observations.
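
The two extremes above can be reproduced with the standard design-effect formula, which is not given in the text and is brought in here as an assumption: with equal group sizes m = N/K and intraclass correlation `icc` (the share of total variance that lies between groups), the equivalent simple-random-sample size is N / (1 + (m - 1) * icc). The function name below is hypothetical.

```python
def effective_sample_size(n_subjects, n_groups, icc):
    """Equivalent simple-random-sample size for N subjects in K equal groups.

    icc is the intraclass correlation, assumed to lie between 0 and 1.
    """
    if not 0.0 <= icc <= 1.0:
        raise ValueError("icc is assumed to lie between 0 and 1")
    group_size = n_subjects / n_groups
    design_effect = 1.0 + (group_size - 1.0) * icc
    return n_subjects / design_effect


if __name__ == "__main__":
    # 800 students in 20 schools, as in study 3; the icc values are illustrative.
    for icc in (0.0, 0.05, 1.0):
        print(f"icc={icc}: effective n = {effective_sample_size(800, 20, icc):.1f}")
```

With `icc = 1` (no variation within groups) the function returns K, and with `icc = 0` (no group effect) it returns N. Any value in between gives an effective sample size between K and N.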