Tuesday, December 7, 2010

Linear Regression Analysis - When NOT to Center a Continuous Predictor Variable


There are two reasons to center predictor variables in any type of regression analysis (linear, logistic, multilevel, etc.):

1. To lessen the correlation between a multiplicative term (interaction or polynomial term) and its component variables (the ones that were multiplied).
2. To make interpretation of parameter estimates easier.
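
Reason #1 can be checked numerically. The sketch below is only an illustration: the simulated predictor `x` and its range of 20 to 60 are assumptions, not data from any study. It compares the correlation between a variable and its square before and after mean-centering.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical predictor whose values sit far from zero (assumed range 20 to 60).
x = rng.uniform(20, 60, size=1000)

# Correlation between the raw variable and its polynomial (squared) term.
r_raw = np.corrcoef(x, x ** 2)[0, 1]

# Same comparison after centering the variable at its sample mean.
xc = x - x.mean()
r_centered = np.corrcoef(xc, xc ** 2)[0, 1]

print(f"raw:      corr(x, x^2)   = {r_raw:.3f}")
print(f"centered: corr(xc, xc^2) = {r_centered:.3f}")
```

With a roughly symmetric predictor like this one, the centered correlation should land near zero while the raw one stays close to 1. For a strongly skewed predictor the drop would be smaller.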

But when is centering NOT a good idea?

Well, basically when it doesn't help.

For reason #1, it will only help if you have multiplicative terms in a model. If you don't have any multiplicative terms - no interactions or polynomials - centering isn't going to help.
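
A quick way to see this: in a model with no multiplicative terms, centering a predictor leaves its slope unchanged and only moves the intercept. The sketch below uses simulated data and a hypothetical helper `fit_line`; the generating values (3.0 and 0.5) are made up for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)

# Simulated data; the coefficients 3.0 and 0.5 are arbitrary assumptions.
x = rng.uniform(20, 60, size=200)
y = 3.0 + 0.5 * x + rng.normal(0, 1, size=200)

def fit_line(pred, resp):
    """Ordinary least squares for resp = b0 + b1 * pred; returns (b0, b1)."""
    X = np.column_stack([np.ones_like(pred), pred])
    coef, *_ = np.linalg.lstsq(X, resp, rcond=None)
    return coef

b0_raw, b1_raw = fit_line(x, y)
b0_cen, b1_cen = fit_line(x - x.mean(), y)

print(f"slope     raw: {b1_raw:.4f}   centered: {b1_cen:.4f}")
print(f"intercept raw: {b0_raw:.4f}   centered: {b0_cen:.4f}")
```

The two slopes agree; the centered intercept is simply the mean of `y`, that is, the predicted value at the average `x`.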

For reason #2, centering especially helps interpretation of parameter estimates (coefficients) when:

a) you have an interaction in the model,
b) particularly if that interaction includes a continuous and a dummy-coded categorical variable, and
c) the continuous variable does not contain a meaningful value of 0, or
d) 0 is a real value, but another value is more meaningful, such as a threshold point. (For example, if you're doing a study on the amount of time parents work, with a predictor of Age of Youngest Child, an Age of 0 is meaningful and will be in the data set, but centering at 5, when kids enter school, might be more meaningful.)
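
To make the Age of Youngest Child example concrete, here is a minimal sketch on simulated data. The 0/1 dummy `group`, the outcome `hours`, and every number in the generating model are hypothetical. The point is only that, with an interaction in the model, the dummy's coefficient is the group difference at whatever value of age is coded as 0.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 400

age = rng.uniform(0, 18, size=n)     # age of youngest child, in years
group = rng.integers(0, 2, size=n)   # hypothetical dummy-coded predictor (0 or 1)

# Made-up generating model, for illustration only.
hours = (20 + 1.0 * age + 6 * group - 0.5 * age * group
         + rng.normal(0, 2, size=n))

def fit(age_var):
    """OLS with an age-by-group interaction.

    Returns [intercept, age, group, age:group].
    """
    X = np.column_stack([np.ones(n), age_var, group, age_var * group])
    coef, *_ = np.linalg.lstsq(X, hours, rcond=None)
    return coef

raw = fit(age)       # group coefficient = estimated group gap at age 0
cen = fit(age - 5)   # group coefficient = estimated group gap at age 5

print("group coefficient, raw age:       ", round(raw[2], 3))
print("group coefficient, age centered 5:", round(cen[2], 3))
print("raw group + 5 * raw interaction:  ", round(raw[2] + 5 * raw[3], 3))
```

The age and interaction coefficients are the same in both fits. Only the group coefficient (and the intercept) change, because they now describe the model at age 5 instead of age 0.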

So when NOT to center:

1. If all continuous predictors have a meaningful value of 0.
2. If you have no interaction terms between continuous and categorical predictors.
3. And if no other value of a continuous predictor is more meaningful than 0.

All three of these criteria should apply before you choose not to center. If any one of them is false, centering will help you interpret your coefficients.
