Which fit is better?

Sometimes the same model is fitted to two different populations. For example, a researcher might wish to investigate whether weight predicts blood pressure in smokers and nonsmokers and, if so, whether the regression model fits one group better than the other. The problem with questions like this is that the answer depends on what we mean by "better".

It is common to hear investigators speak of the model with the larger coefficient of determination, R², as though it fits better because it accounts for more of the variability in the response. However, it is possible for the model with the smaller R² to have the smaller standard error of the estimate and make more precise predictions. Here is a small dataset to illustrate this behavior.

X (black)	Y (black)	X (red)	Y (red)
158.2	157.8	140.4	153.2
214.9	146.6	211.9	157.4
153.2	147.5	152.4	149.6
196.0	153.1	124.7	154.9
88.5	143.7	103.9	145.2
55.5	132.3	128.5	141.7
86.4	144.3	187.1	159.7
223.6	169.1	168.5	145.3
256.9	160.9	138.3	151.7
252.4	157.1	137.9	141.7
20.9	141.6	203.3	153.3
92.9	145.4	102.5	145.8

The two data sets need not have the same regression line, but they have been constructed with the same regression line in this example to remove any suspicion that these results might have something to do with the slopes of the regression lines. They don't!

	Y = 134.9 + 0.100 X
	R²	s_e
Red	0.36	5.04
Black	0.64	6.28

The black data set, with open circles and outer prediction bands, has the larger R². The red data set, with filled circles and inner prediction bands, has the smaller s_e.
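The summary values can be checked directly from the data. The following is a minimal sketch in plain Python, not the author's own computation. The function name `fit_summary` and the variable names are illustrative. Assigning the first pair of columns to the black group and the second pair to the red group is an inference: it is the assignment under which the fits match the table.

```python
from math import sqrt

# Data copied from the table above. Which column pair is "black" and which is
# "red" is an inference: this assignment is the one that matches the summary table.
black_x = [158.2, 214.9, 153.2, 196.0, 88.5, 55.5, 86.4, 223.6, 256.9, 252.4, 20.9, 92.9]
black_y = [157.8, 146.6, 147.5, 153.1, 143.7, 132.3, 144.3, 169.1, 160.9, 157.1, 141.6, 145.4]
red_x = [140.4, 211.9, 152.4, 124.7, 103.9, 128.5, 187.1, 168.5, 138.3, 137.9, 203.3, 102.5]
red_y = [153.2, 157.4, 149.6, 154.9, 145.2, 141.7, 159.7, 145.3, 151.7, 141.7, 153.3, 145.8]


def fit_summary(x, y):
    """Least-squares fit of y on x; returns slope, intercept, R^2 and s_e."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    total_ss = sum((yi - mean_y) ** 2 for yi in y)

    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    residual_ss = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))

    return {
        "slope": slope,
        "intercept": intercept,
        "r_squared": 1 - residual_ss / total_ss,   # proportion of variability explained
        "s_e": sqrt(residual_ss / (n - 2)),        # standard error of the estimate
    }


black = fit_summary(black_x, black_y)
red = fit_summary(red_x, red_y)

for name, fit in (("Red", red), ("Black", black)):
    print(f"{name:5s}  slope = {fit['slope']:.3f}  "
          f"R^2 = {fit['r_squared']:.2f}  s_e = {fit['s_e']:.2f}")
```

Up to rounding, the printed values should agree with the table: the black group has the larger R², while the red group has the smaller s_e.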

Does the model fit one group better than the other? I try to avoid questions demanding one-word answers where the answer depends on the choice of summary measure. However, if pressed, I would argue that the answer is red. R² is just a disguised correlation coefficient (the square of the correlation between the observed and predicted values). I have yet to encounter a real research question for which the answer is "correlation coefficient". If I were to use "better" in connection with linear regression, it would almost certainly have something to do with prediction. The standard error of the estimate (s_e) estimates the precision of the predictions. The precision of the predictions typically determines whether the regression equation will be useful. While the regression equation may account for more variability in the black group, the predictions are more precise in the red group.

Mathematical Details

R² can be written as

	R² = 1 - Residual SS / Total SS ,

while s²_e can be written as

	s²_e = Residual SS / (n-2) .

The fit with the larger R² is the one that accounts for the greater proportion of the variability in the response, that is, the one for which Residual SS/Total SS is smaller. The fit with the smaller s_e is the one that leaves the smaller amount of variability unaccounted for, that is, the one for which Residual SS/(n-2) is smaller. If the sample sizes are equal the model with the smaller s_e is the one for which Residual SS is smaller. The model for which the ratio (Residual SS/Total SS) is smaller need not be the same model for which the numerator (Residual SS) is smaller.
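To make the distinction between the ratio and its numerator concrete, the two sums of squares can be recovered from the rounded summary values by inverting the two formulas. This is only a sketch: the helper name `sums_of_squares` is illustrative, n = 12 is the number of rows in the data table, and the results are approximate because the inputs are rounded.

```python
# Summary values copied from the table in the text; n = 12 rows per group.
n = 12
summary = {
    "Red": {"r_squared": 0.36, "s_e": 5.04},
    "Black": {"r_squared": 0.64, "s_e": 6.28},
}


def sums_of_squares(r_squared, s_e, n):
    """Invert s_e^2 = Residual SS/(n-2) and R^2 = 1 - Residual SS/Total SS."""
    residual_ss = s_e ** 2 * (n - 2)
    total_ss = residual_ss / (1 - r_squared)
    return residual_ss, total_ss


ss = {name: sums_of_squares(v["r_squared"], v["s_e"], n) for name, v in summary.items()}

for name, (residual_ss, total_ss) in ss.items():
    print(f"{name:5s}  Residual SS = {residual_ss:7.1f}  "
          f"Total SS = {total_ss:7.1f}  ratio = {residual_ss / total_ss:.2f}")
```

The red group has the smaller Residual SS (hence the smaller s_e), but its Total SS is smaller still, so its ratio Residual SS/Total SS is the larger of the two and its R² is the smaller.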

Comment

These results apply when the same model is fitted to two different sets of observations. If two models were fitted to the same set of responses--for example, if weight and amount of exercise were used separately to predict blood pressure in the same set of individuals--then the model for which R² is larger would necessarily be the model for which s_e is smaller. That's because Total SS would be the same for both, so the model for which Residual SS/Total SS is smaller must also be the one for which Residual SS is smaller.
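The argument can be written out in a line or two. The following is a sketch that assumes both models, labelled A and B here purely for illustration, are fitted to the same n responses and have the same residual degrees of freedom (n-2, as for two simple linear regressions).

```latex
\begin{align*}
s_e^2 &= \frac{\text{Residual SS}}{n-2}
       = \frac{(1 - R^2)\,\text{Total SS}}{n-2}, \\[4pt]
\text{so, with Total SS and } n \text{ common to both models,}\quad
R_A^2 > R_B^2 &\iff s_{e,A} < s_{e,B}.
\end{align*}
```

When the two fits come from different sets of observations, Total SS differs between them and this equivalence no longer holds, which is what happens with the red and black groups.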
