The regression coefficient for the i-th predictor is the expected
difference in response per unit difference in the i-th predictor, all other
things being equal. That is, if the i-th predictor is changed 1 unit
while all of the other predictors are held constant, the response is
expected to change b_{i} units. As always, it is important that
cross-sectional data not be interpreted as though they were longitudinal.
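
The "one unit, all else held constant" reading can be checked by direct arithmetic. The sketch below plugs in the model (1) coefficients tabulated later in this note. The predictor values (60, 24, 105) and the function name `predict` are hypothetical, chosen only for illustration; the source does not state the measurement units.

```python
# Coefficients of model (1), copied from the table in this note.
COEF = {"intercept": 0.77555, "WEIGHT": 0.00642,
        "BMI": -0.00610, "PCTIDEAL": 0.00026}


def predict(weight, bmi, pctideal):
    """Predicted bone density under model (1)."""
    return (COEF["intercept"]
            + COEF["WEIGHT"] * weight
            + COEF["BMI"] * bmi
            + COEF["PCTIDEAL"] * pctideal)


# Hypothetical predictor values (assumed, not taken from the data set).
base = predict(weight=60.0, bmi=24.0, pctideal=105.0)
# Same subject profile, but WEIGHT is one unit larger.
plus_one = predict(weight=61.0, bmi=24.0, pctideal=105.0)

difference = plus_one - base
print(round(difference, 5))  # 0.00642, the WEIGHT coefficient
```

The difference does not depend on the starting values chosen, because the model is linear in each predictor.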

The regression coefficient and its statistical significance can change
according to the other variables in the model. Among postmenopausal
women, it has been noted that bone density is related to weight. In this
cross-sectional data set, density is regressed on weight, body mass
index, and percent ideal body weight^{*}. These are the
regression coefficients for the 7 possible regression models predicting
bone density from the weight measures.

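| | (1) | (2) | (3) | (4) | (5) | (6) | (7) |
|---|---|---|---|---|---|---|---|
| Intercept | 0.77555 | 0.77264 | 0.77542 | 0.77065 | 0.74361 | 0.77411 | 0.75635 |
| WEIGHT | 0.00642 | . | 0.00723 | 0.00682 | 0.00499 | . | . |
| BMI | -0.00610 | -0.04410 | . | -0.00579 | . | 0.01175 | . |
| PCTIDEAL | 0.00026 | 0.01241 | -0.00155 | . | . | . | 0.00277 |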

Not only do the magnitudes of the coefficients change from model to
model, but for some variables the sign changes, too^{**}.
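
A sign change of this kind is easy to reproduce with a small made-up data set. The sketch below is not the bone density data: `x1`, `x2`, and `y` are invented, noise-free values in which `y` is built as `10 + 2*x1 - 1*x2` and `x2` tracks `x1` closely. The helper `ols` is a hypothetical pure-Python least-squares routine written for this illustration; in practice a statistics library would be used.

```python
def ols(columns, y):
    """Least-squares coefficients (intercept first) via the normal equations."""
    n = len(y)
    X = [[1.0] + [col[i] for col in columns] for i in range(n)]
    p = len(X[0])
    # Augmented system [X'X | X'y].
    A = [[sum(X[k][i] * X[k][j] for k in range(n)) for j in range(p)]
         + [sum(X[k][i] * y[k] for k in range(n))]
         for i in range(p)]
    # Gauss-Jordan elimination with partial pivoting.
    for c in range(p):
        pivot = max(range(c, p), key=lambda r: abs(A[r][c]))
        A[c], A[pivot] = A[pivot], A[c]
        for r in range(p):
            if r != c:
                f = A[r][c] / A[c][c]
                A[r] = [a - f * b for a, b in zip(A[r], A[c])]
    return [A[i][p] / A[i][i] for i in range(p)]


# Invented data: x2 follows x1 closely, so the two are highly correlated.
x1 = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
x2 = [1.5, 1.5, 3.5, 3.5, 5.5, 5.5]
y = [10.0 + 2.0 * a - 1.0 * b for a, b in zip(x1, x2)]

alone = ols([x2], y)         # [intercept, slope for x2]
together = ols([x1, x2], y)  # [intercept, slope for x1, slope for x2]

print("x2 alone:  ", round(alone[1], 3))     # positive slope
print("x2 with x1:", round(together[2], 3))  # negative slope
```

Used alone, `x2` stands in for `x1` and gets a positive coefficient. Once `x1` is in the model, the coefficient for `x2` describes only what `x2` adds beyond `x1`, and here that is negative.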

For each regression coefficient, there is a t statistic. The corresponding P value tells us whether the variable has statistically significant predictive capability in the presence of the other predictors. A common mistake is to assume that when many variables have nonsignificant P values they are all unnecessary and can be removed from the regression equation. This is not necessarily true. When one variable is removed from the equation, the others may become statistically significant. Continuing the bone density example, the P values for the predictors in each model are

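| | (1) | (2) | (3) | (4) | (5) | (6) | (7) |
|---|---|---|---|---|---|---|---|
| WEIGHT | 0.1733 | . | 0.0011 | <.0001 | <.0001 | . | . |
| BMI | 0.8466 | 0.0031 | . | 0.1960 | . | <.0001 | . |
| PCTIDEAL | 0.9779 | 0.0002 | 0.2619 | . | . | . | <.0001 |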

All three predictors are related, so it is not surprising that model (1) shows that all of them are nonsignificant in the presence of the others. Given WEIGHT and BMI, we don't need PCTIDEAL, and so on. Any one of them is superfluous. However, as models (5), (6), and (7) demonstrate, all of them are highly statistically significant when used alone.

The P value from the ANOVA table tells us whether there is predictive capability in the model as a whole. All four combinations in the following table are possible.

| | | Overall F | |
|---|---|---|---|
| | | Significant | NS |
| Individual t | Significant | - | - |
| | NS | - | - |

- Cases where the t statistic for every predictor and the F statistic for the overall model are statistically significant are those where every predictor has something to contribute.
- Cases where nothing reaches statistical significance are those where none of the predictors are of any value.
- This note has shown that it is possible to have the overall F ratio statistically significant and all of the t statistics nonsignificant.
- It is also possible to have the overall F ratio nonsignificant and some of the t statistics significant. There are two ways this can happen.
  - First, there may be no predictive capability in the model. However, if there are many predictors, statistical theory guarantees that on average 5% of them will appear to have statistically significant predictive capability when tested individually.
  - Second, the investigator may have chosen the predictors poorly. If one useful predictor is added to many that are unrelated to the outcome, its contribution may not be large enough for the overall model to appear to have statistically significant predictive capability. A contribution that might have reached statistical significance when viewed individually might not make it out of the noise when viewed as part of the whole.
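
The "5% on average" figure in the first case can be illustrated with a short simulation. This is a sketch under two stated assumptions: that P values are uniformly distributed when a predictor has no real effect, and that the tests are independent of one another, which is a simplification because t tests for correlated predictors in one model are not independent. The function names and the choice of 20 predictors are hypothetical.

```python
import random

ALPHA = 0.05  # conventional significance level


def chance_of_any_false_positive(k, alpha=ALPHA):
    """P(at least one of k independent null tests has P < alpha)."""
    return 1.0 - (1.0 - alpha) ** k


def simulate_false_positive_rate(n_tests, seed=0):
    """Fraction of simulated null P values that fall below ALPHA."""
    rng = random.Random(seed)
    # Assumption: under the null hypothesis a P value is uniform on [0, 1).
    hits = sum(rng.random() < ALPHA for _ in range(n_tests))
    return hits / n_tests


rate = simulate_false_positive_rate(200_000)
any_of_20 = chance_of_any_false_positive(20)

print(round(rate, 3))       # close to 0.05
print(round(any_of_20, 3))  # 0.642
```

Under these assumptions, a model with 20 useless predictors has roughly a 64% chance of showing at least one "significant" t statistic, even though the overall F test would usually remain nonsignificant.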

-------------------

^{*}In general, great care must be used when using a predictor
such as body mass index or percent ideal body weight that is a ratio of
other variables. This will be discussed in detail later.

^{**}This touches on another point, too important to be left buried
here: **It is not always easy to guess/know what the sign of a
regression coefficient will be when a predictor is correlated with other
variables in the model.**

Consider model (2), for example. Both predictors are statistically significant. On their own, bone density goes up and down as they go up and down [models (6) and (7)]. Yet, when they appear in a model together, bone density goes