What Does Multiple Linear Regression Look Like?

Consider once again the regression of homocysteine on B12 and folate (all logged to base 10). It is common to picture such data, and to display them, as a set of pairwise scatterplots. The regression equation

LHCY = 1.570602 - 0.082103 LCLC - 0.136784 LB12

is often mistakenly thought of as a line. It is not a line but a surface: with two predictors, the equation gives one predicted LHCY for every pair of (LCLC, LB12) values.
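One way to see the surface is to evaluate the equation over a small grid of predictor values. The sketch below uses the coefficients from the equation above; the grid values are hypothetical and chosen only for illustration, not taken from the data set. Each row holds LB12 fixed and varies LCLC, so each row is a line with the same slope, and the stack of parallel lines makes up a plane.

```python
# Coefficients copied from the fitted equation in the text.
INTERCEPT = 1.570602
B_LCLC = -0.082103
B_LB12 = -0.136784


def predict_lhcy(lclc, lb12):
    """Height of the regression surface above the point (lclc, lb12)."""
    return INTERCEPT + B_LCLC * lclc + B_LB12 * lb12


# Hypothetical grid of log10 predictor values (an assumption, not real data).
lclc_grid = [2.0, 2.5, 3.0]
lb12_grid = [2.0, 2.5, 3.0]

# One row per LB12 value; within a row only LCLC changes.
surface = [[predict_lhcy(c, b) for c in lclc_grid] for b in lb12_grid]

for b, row in zip(lb12_grid, surface):
    print(f"LB12={b}: " + ", ".join(f"{z:.4f}" for z in row))
```

Moving along a row changes the prediction by the LCLC coefficient per unit; moving down a column changes it by the LB12 coefficient per unit. Neither slope depends on where you start, which is what makes the surface flat.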

Each observation is a point in 3-dimensional space, {(x_i, y_i, z_i), i = 1, ..., n} [here, (LCLC_i, LB12_i, LHCY_i)]. When plotted, the data look like the picture to the left.

It can be difficult to appreciate a two-dimensional representation of three-dimensional data. The picture is redrawn with spikes from each observation to the plane defined by LCLC and LB12 to give a better sense of where the data lie.

The final display shows the regression surface. It is a flat plane. Predicted values are obtained by starting at the point (LCLC, LB12) on the LB12-LCLC plane and travelling parallel to the LHCY axis until the regression plane is reached (in the manner of the spike, but to the regression plane instead of the observation). Residuals are calculated as the signed distance from the regression plane to the observation (observed minus predicted), again travelling parallel to the LHCY axis.
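The geometry of predicted values and residuals can be sketched in a few lines. The coefficients are those of the equation in the text; the observation is hypothetical, made up only to show the arithmetic. The residual is measured along the LHCY axis, not perpendicular to the plane.

```python
def predicted(lclc, lb12):
    """Travel parallel to the LHCY axis from (lclc, lb12) up to the plane."""
    return 1.570602 - 0.082103 * lclc - 0.136784 * lb12


def residual(lclc, lb12, lhcy):
    """Signed vertical gap: observed LHCY minus the plane's height."""
    return lhcy - predicted(lclc, lb12)


# Hypothetical observation (LCLC, LB12, LHCY); not a value from the data set.
obs = (2.5, 2.6, 1.05)

fit = predicted(obs[0], obs[1])
res = residual(*obs)

print(f"predicted LHCY = {fit:.4f}")
print(f"residual       = {res:.4f}")
```

A positive residual means the observation sits above the plane; a negative one means it sits below.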

The same thing happens with more than 2 predictors, but it's hard to draw a two-dimensional representation of it. With p predictors, the regression surface is a p-dimensional hyperplane in a (p+1)-dimensional space.
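Although the picture cannot be drawn, the arithmetic carries over unchanged. The following is a minimal sketch of the height of a hyperplane for any number of predictors; the function name and the third coefficient in the usage example are hypothetical, introduced only for illustration.

```python
def hyperplane_height(intercept, coefs, x):
    """Predicted response at predictor values x for a p-predictor fit."""
    if len(coefs) != len(x):
        raise ValueError("need exactly one predictor value per coefficient")
    return intercept + sum(b * v for b, v in zip(coefs, x))


# p = 2: the fitted equation from the text, at a hypothetical point.
two = hyperplane_height(1.570602, [-0.082103, -0.136784], [2.5, 2.6])

# p = 3: the same call with one more coefficient. The value 0.05 is a
# made-up coefficient, not an estimate from any data.
three = hyperplane_height(1.570602, [-0.082103, -0.136784, 0.05], [2.5, 2.6, 1.0])

print(f"p=2 prediction: {two:.4f}")
print(f"p=3 prediction: {three:.4f}")
```

The residual is still the observed response minus this height, measured along the response axis.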