Search code examples
javasetbenchmarkinghashset

Why writing map entries to a HashSet is slower than to a CopyOnWriteArraySet in Java


I think writing to a HashSet will be faster than to a CopyOnWriteArraySet; I'm not doing multi threading here. However I surpisingly got benchmark results indicate writing map entries to a CopyOnWriteArraySet is faster.

I did benchmarking on writing 1000 of Map.Entry<Integer, Integer> into a HashSet vs CopyOnWriteArraySet.

Benchmark          (n)   Mode  Cnt     Score    Error  Units
A.writeToCOWAS    1000  thrpt    4  1269.214 ± 40.635  ops/s
A.writeToHashSet  1000  thrpt    4   223.118 ± 34.955  ops/s

In addition to that, I got benchmark results of equals() and hashCode() of Map.Entry reveal that the former is more expensive.

Benchmark           Mode  Cnt          Score          Error  Units
MapEntry.equals    thrpt    4  177773394.054 ± 75436098.388  ops/s
MapEntry.hashCode  thrpt    4  272240403.923 ± 38794396.671  ops/s

I believe writing to a HashSet calls to hashCode() while CopyOnWriteArraySet calls to equals().

In the case of writing Integer or String,HashSet is way faster. Then I'm wondering what happens with Map.Entry type and why CopyOnWriteArraySet is faster according to my analysis?

My perf test:

@State(Scope.Benchmark)
@Fork(value = 2)
@Warmup(iterations = 2, time = 3)
@Measurement(iterations = 2, time = 3)
public class A {
    public Set<Map.Entry<Integer,Integer>> set;

    @Param({"1000"})
    public int n;

    @Setup
    public void setup() {
        set = new HashSet<>((int) (n / 0.75f + 1f), 0.75f);
        for (int i = 0; i < n; i++)
            set.add(Map.entry(i, i));
    }

    private void eliminateDeadCode(Set<Map.Entry<Integer,Integer>> out, Blackhole hole) {
        int hash = 0;
        for (Map.Entry<Integer,Integer> o : out)
            hash += o.hashCode();
        hole.consume(hash);
        if (out.size() != set.size())
            throw new RuntimeException(out.size() + " != " + set.size());
    }

    @Benchmark
    public void writeToCOWAS(Blackhole hole) {
        Set<Map.Entry<Integer,Integer>> out = new CopyOnWriteArraySet<>(set);
        eliminateDeadCode(out, hole);
    }

    @Benchmark
    public void writeToHashSet(Blackhole hole) {
        Set<Map.Entry<Integer,Integer>> out = new HashSet<>(set);
        eliminateDeadCode(out, hole);
    }

    public static void main(String[] args) throws RunnerException {
        Options opt = new OptionsBuilder()
                .include(A.class.getSimpleName())
                .build();
        new Runner(opt).run();
    }
}

Solution

  • Hulk's answer is very instructive. However the problem is not necessarily the Map.entry() hashCode implementation, which is this, at least in Java 11:

    public int hashCode() {
        return key.hashCode() ^ value.hashCode();
    }
    

    The problem is that the hash codes of key and value are always the same, both in the OP's test, and in Hulk's test, hence the hash codes combined via XOR will always end up as 0. Change it so that key and value are different, and performance will change.