Tags: statistics, h2o, glm, gbm, tweedie

Mean Residual Deviance Formula in H2O


I'm trying to find out the exact formula used in H2O for the Mean Residual Deviance loss function for a Tweedie distribution.

Or even, in general, what would be the mean residual deviance for a Tweedie distributed dependent variable?

So far, I've found this page (http://docs.h2o.ai/h2o/latest-stable/h2o-docs/data-science/glm.html#tweedie-models), where the deviance formula for a Tweedie distribution is given as:

[Image: Tweedie deviance formula as shown in the H2O documentation]

However, in the H2O source code on GitHub (line 103 of Distribution.java, https://github.com/h2oai/h2o-3/blob/master/h2o-core/src/main/java/hex/Distribution.java#L103), the formula is specified differently (ignoring the omega, written w in the code, which is just the weight, and the lack of summation):

2 * w * (Math.pow(y, 2 - tweediePower) / ((1 - tweediePower) * (2 - tweediePower)) - y * exp(f * (1 - tweediePower)) / (1 - tweediePower) + exp(f * (2 - tweediePower)) / (2 - tweediePower))

which in equation form is:

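$$D = 2\,w\left[\frac{y^{2-p}}{(1-p)(2-p)} - \frac{y\,e^{f(1-p)}}{1-p} + \frac{e^{f(2-p)}}{2-p}\right], \qquad p = \text{tweediePower}$$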
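To make the comparison concrete, here is a minimal Python sketch that transcribes the Java expression quoted above term by term. The function names are hypothetical, and treating the "mean" as the sum of per-row deviances divided by the sum of weights is an assumption, not something confirmed from the H2O source. The formula is only valid when the Tweedie power is neither 1 nor 2.

```python
import math


def tweedie_deviance(w, y, f, p):
    """Per-row Tweedie deviance, transcribed from the quoted Java expression.

    w: observation weight
    y: observed response
    f: prediction on the log scale (the code uses exp(f) as the mean)
    p: Tweedie variance power (tweediePower); must not be 1 or 2
    """
    return 2 * w * (
        math.pow(y, 2 - p) / ((1 - p) * (2 - p))
        - y * math.exp(f * (1 - p)) / (1 - p)
        + math.exp(f * (2 - p)) / (2 - p)
    )


def mean_residual_deviance(ys, fs, p, ws=None):
    """Assumption: weighted mean = sum of per-row deviances / sum of weights."""
    if ws is None:
        ws = [1.0] * len(ys)
    total = sum(tweedie_deviance(w, y, f, p) for w, y, f in zip(ws, ys, fs))
    return total / sum(ws)
```

A quick sanity check on any candidate formula is that the deviance should vanish when the prediction matches the observation, i.e. when y equals exp(f), and be positive otherwise. The expression from the code passes that check.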

So, is the documentation wrong or the implementation? I would appreciate any help!

Thank you!


Solution

  • Thank you for pointing this out. The backend equation is correct (so the implementation is correct), while the equation in the documentation appears to be incorrect. I have created a Jira ticket to update the equation in the documentation. The ticket contains the correct equation along with helpful information to derive it.
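For readers who want to check the code formula independently, one standard route (not necessarily the derivation given in the ticket) starts from the generic unit deviance of an exponential dispersion model with variance function V(t) = t^p. The sketch below assumes p is neither 1 nor 2, and assumes a log link so that the mean is mu = exp(f), where f is the prediction used in the code.

```latex
\begin{aligned}
d(y,\mu) &= 2\int_{\mu}^{y} \frac{y - t}{t^{p}}\,dt
          = 2\left[\frac{y\,t^{1-p}}{1-p} - \frac{t^{2-p}}{2-p}\right]_{t=\mu}^{t=y} \\
         &= 2\left[\frac{y^{2-p}}{1-p} - \frac{y^{2-p}}{2-p}
            - \frac{y\,\mu^{1-p}}{1-p} + \frac{\mu^{2-p}}{2-p}\right] \\
         &= 2\left[\frac{y^{2-p}}{(1-p)(2-p)}
            - \frac{y\,\mu^{1-p}}{1-p} + \frac{\mu^{2-p}}{2-p}\right].
\end{aligned}
```

Substituting mu = exp(f) and multiplying by the weight w gives the same three terms as the expression in the code, which is consistent with the answer that the implementation is the correct one.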