
Is the Akaike information criterion (AIC) unit-dependent?


One formula for AIC is:

AIC = 2k + n*Log(RSS/n)

Intuitively, adding a parameter to your model decreases the AIC (and hence you should keep the parameter) if the increase in the 2k term due to the new parameter is more than offset by the decrease in the n*Log(RSS/n) term due to the smaller residual sum of squares. But isn't this RSS value unit-specific? If I'm modeling money and my units are millions of dollars, the change in RSS from adding a parameter might be very small and won't offset the increase in the 2k term. Conversely, if my units are pennies, the change in RSS would be very large and could greatly offset the increase in the 2k term. This arbitrary change in units would then change my decision about whether to keep the extra parameter.

So: does the RSS have to be in standardized units for AIC to be a useful criterion? I don't see how it could be otherwise.


Solution

  • No, I don't think so (partially rowing back from what I said in my earlier comment). For the simplest possible case (least squares regression for y = ax + b), from Wikipedia, RSS = Syy - a * Sxy.

    From their definitions given in that article, both a and Sxy grow by a factor of 100 and Syy grows by a factor of 100^2 if you change the unit for y from dollars to cents. So, after rescaling, the new RSS for that model will be 100^2 times the old one. I'm quite sure that the same result holds for models with any other number of parameters k.

    Hence nothing changes for the AIC difference, where the key part is n*log(RSS_B/RSS_A). After rescaling y by a factor c, both RSS values will have grown by the same factor c^2, so each model's AIC shifts by the same additive constant, n*log(c^2), and you'll get the exact same AIC difference between models A and B as before.

    Edit:

    I've just found this one:

    "It is correct that the choice of units introduces a multiplicative constant into the likelihood. Thence the log likelihood has an additive constant which contributes (after doubling) to the AIC. The difference of AICs is unchanged."

    Note that this comment even talks about the general case where the exact log-likelihood is used.
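
The argument above can be checked numerically. The following is a minimal sketch, assuming the RSS-based formula from the question and two nested least-squares models (an intercept-only model with k = 1 and the line y = ax + b with k = 2). The data values and the helper names `rss_constant`, `rss_linear`, `aic` and `aic_difference` are made up for illustration and are not from the original post.

```python
import math

def rss_constant(y):
    """RSS of the intercept-only model y = b (one parameter)."""
    mean_y = sum(y) / len(y)
    return sum((yi - mean_y) ** 2 for yi in y)

def rss_linear(x, y):
    """RSS of the least-squares line y = a*x + b, via RSS = Syy - a * Sxy."""
    n = len(y)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    syy = sum((yi - mean_y) ** 2 for yi in y)
    a = sxy / sxx  # least-squares slope
    return syy - a * sxy

def aic(k, n, rss):
    """AIC as given in the question: 2k + n*log(RSS/n)."""
    return 2 * k + n * math.log(rss / n)

def aic_difference(x, y):
    """AIC(linear model) minus AIC(intercept-only model)."""
    n = len(y)
    return aic(2, n, rss_linear(x, y)) - aic(1, n, rss_constant(y))

# Hypothetical data: the same amounts expressed in dollars and in cents.
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
y_dollars = [2.1, 2.9, 4.2, 4.8, 6.1, 6.9]
y_cents = [100 * v for v in y_dollars]
n = len(x)

# The AIC difference between the two models is the same in both units.
diff_dollars = aic_difference(x, y_dollars)
diff_cents = aic_difference(x, y_cents)

# Each individual AIC, however, shifts by the constant n*log(100^2).
shift = aic(2, n, rss_linear(x, y_cents)) - aic(2, n, rss_linear(x, y_dollars))
expected_shift = n * math.log(100 ** 2)

print("AIC difference (dollars):", diff_dollars)
print("AIC difference (cents):  ", diff_cents)
print("per-model AIC shift:     ", shift, "vs n*log(100^2) =", expected_shift)
```

The two differences should agree up to floating-point rounding, while the individual AIC values differ between units. That is why only AIC differences computed on the same response variable in the same units are meaningful for comparing models.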