
"Normalizing" BigDecimal's hash code: howto?


I have a JSON Schema implementation written in Java which depends on Jackson (version 2.1.x). For accuracy reasons, I tell Jackson to use BigDecimal for floating point numbers.

For the needs of JSON Schema, there is one particular requirement: equality of numeric JSON values is defined by the equality of their mathematical values. I need this kind of check since, for instance, this is not a legal schema (values in an enum must be unique):

{ "enum": [ 1, 1.0 ] }

But the JsonNodes for 1 and 1.0 are not equal. I have therefore coded an implementation of Guava's Equivalence, and use Set<Equivalence.Wrapper<JsonNode>> where appropriate. This implementation should work for all types of nodes, not just numeric ones.
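The root cause is visible with plain BigDecimal, no Jackson required: equals() compares both value and scale, while compareTo() compares only the mathematical value (the class name below is just for illustration):

```java
import java.math.BigDecimal;

public class ScaleDemo {
    public static void main(String[] args) {
        BigDecimal one = new BigDecimal("1");
        BigDecimal onePointZero = new BigDecimal("1.0");

        // equals() takes the scale into account, so 1 != 1.0 ...
        System.out.println(one.equals(onePointZero));                  // false
        // ... but compareTo() only looks at the mathematical value
        System.out.println(one.compareTo(onePointZero) == 0);          // true
        // and the hash codes differ along with equals()
        System.out.println(one.hashCode() == onePointZero.hashCode()); // false
    }
}
```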

And the most difficult part of this implementation turns out to be doHash() for numeric nodes :/ I need the same hash code for mathematically equivalent values, whether they are represented as integers or as floating-point numbers.

The best I could come up with at the moment is this:

@Override
protected int doHash(final JsonNode t)
{
    /*
     * If this is a numeric node, we want a unique hashcode for all possible
     * number nodes.
     */
    if (t.isNumber()) {
        final BigDecimal decimal = t.decimalValue();
        try {
            return decimal.toBigIntegerExact().hashCode();
        } catch (ArithmeticException ignored) {
            return decimal.stripTrailingZeros().hashCode();
        }
    }

    // etc etc -- the rest works fine

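A quick standalone check that both branches of this logic behave as intended (the class and method names below are mine, with the numeric case lifted out of the doHash() snippet above):

```java
import java.math.BigDecimal;

public class HashCheck {
    // The numeric case of doHash(), extracted for testing.
    static int numericHash(BigDecimal decimal) {
        try {
            // Integer-valued decimals (1, 1.00, 2E+3, ...) collapse
            // to the same BigInteger, hence the same hash code.
            return decimal.toBigIntegerExact().hashCode();
        } catch (ArithmeticException ignored) {
            // Non-integers are normalized by stripping trailing zeros,
            // so 2.5 and 2.50 end up with the same representation.
            return decimal.stripTrailingZeros().hashCode();
        }
    }

    public static void main(String[] args) {
        System.out.println(numericHash(new BigDecimal("1"))
                == numericHash(new BigDecimal("1.00")));   // true
        System.out.println(numericHash(new BigDecimal("2.5"))
                == numericHash(new BigDecimal("2.50")));   // true
    }
}
```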

Is there a better way for calculating such a hashcode?

(edit: full code of the Equivalence implementation here)


Solution

  • Convert to double and use Double's hashCode, but base equality on BigDecimal's compareTo ordering.

    Two numerically equivalent BigDecimals will map to the same double, and thus get the same hash code. Some BigDecimal values that differ only slightly will collide because of double rounding, but that is allowed by the hashCode contract: distinct values may share a hash code, as long as equal values never get different ones. Most distinct values will still get different hash codes, which is all you need.
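A minimal sketch of that suggestion (the class and method names are mine; equality stays exact via compareTo, and only the hash goes through double):

```java
import java.math.BigDecimal;

public class DoubleHash {
    // Numerically equal BigDecimals convert to the same double,
    // so their hash codes agree. Distinct values may collide after
    // rounding, which the hashCode contract permits.
    static int numericHash(BigDecimal decimal) {
        return Double.valueOf(decimal.doubleValue()).hashCode();
    }

    // Equality remains the exact mathematical comparison.
    static boolean numericEquals(BigDecimal a, BigDecimal b) {
        return a.compareTo(b) == 0;
    }

    public static void main(String[] args) {
        BigDecimal one = new BigDecimal("1");
        BigDecimal onePointZero = new BigDecimal("1.0");
        System.out.println(numericEquals(one, onePointZero));              // true
        System.out.println(numericHash(one) == numericHash(onePointZero)); // true
    }
}
```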