Search code examples
javaperformancedictionarymultikey

Fast primitive int multi-key map?


For a project I'm working on, I try to create a multi-dimensional pivot on large data sets. I have all the keys I want to use as ints, so basically, I want to return a set of

( int1, int2, int3, .. intN ) -> (Aggregate1, Aggregate2, ... , AggregateM)

I cannot use a N-dimensional array, as it might get huge and probably will be sparse. I've looked a Trove, but they do not have a Multi-key map. Apache commons has a multi-key map, but that is for Objects; that would probably work, but seems less interesting as the ints will get auto-boxed to Integers and vice versa.

Does anyone know of a primitive multi-key map implementation? (That maps to objects?)

Or, does anyone have great hints, maybe there is a better approach to my problem?

[edit] Insertion time is less interesting, I need the performance on the lookup, as the map will be heavily used to lookup values.

[edit2] Thanks for all the answers. My implemenation choice is a custom class containing an int[], immutable so the hashcode can be calculated on construction time.

private static class MultiIntKey
{
    int[] ints;

    private int hashCode;

    MultiIntKey( int[] ints )
    {
        this.ints = ints;
        this.hashCode = Arrays.hashCode( this.ints );
    }

    @Override
    public int hashCode()
    {
        return this.hashCode;
    }

    @Override
    public boolean equals( Object obj )
    {
        if ( this == obj )
        {
            return true;
        }
        if ( obj == null )
        {
            return false;
        }
        if ( this.getClass() != obj.getClass() )
        {
            return false;
        }
        MultiIntKey other = (MultiIntKey) obj;
        if ( this.hashCode != other.hashCode )
        {
            return false;
        }
        if ( !Arrays.equals( this.ints, other.ints ) )
        {
            return false;
        }
        return true;
    }
}

Solution

  • Apache commons has a multi-key map, but that is for Objects; that would probably work, but seems less interesting as the ints will get auto-boxed to Integers and vice versa.

    Sure, it makes no sense to use N objects while trying to avoid one.

    • If you keys are small, consider packing them into a single int or long.
    • If they repeat a lot, consider a TIntObjectMap<TIntObjectMap<Value>> using trove4j, possibly with more nesting.
    • Otherwise, simply create a trivial object encapsulating all the ints. An object overhead is a few bytes, which is not that bad when compared to 4*N. A map hash a big overhead anyway...

    If your map is immutable, go for Guava's ImmutableMap. Look at Guava Table, it's 2D only, but it may help to save a bit.


    Only if you're sure you need to optimize a lot (have you done some benchmarking or profiling?) and you don't need a fully fledged map, think about your own implementation based on some int[], where you place all the keys in sequence. Most probably you'll find out that it wasn't worth it, but it's a good exercise. :D