Search code examples
javamultithreadingcollectionsnullpointerexception

Java Collections: NPE even though no null value was ever added


I recently got an NPE in a context where I was absolutely sure it shouldn't be possible. The class is used in a very multi-threaded program, so I'm aware that you should expect basically anything to be possible, but still.

So, the class defines the following field:

private final Set<TcpIpConnection> _connections = new LinkedHashSet<>();

In the whole class, this set is only manipulated in two places:

// some method
  TcpIpConnection tcpipConnection = new ServerConnection(clientSocket, _channels, MyClass.this);
  _connections.add(tcpipConnection);
// some other method
  _connections.remove(connection);

So I think you will agree that there's no way a null could be added to the set. And yes, the set stays in the class and is never proliferated outside.

But now I have a test case which sometimes fails with an NPE in the following statement, which is the only other statement in the class that uses _connections:

new ArrayList<>(_connections).stream().forEach(c -> c.close("Server down"));

As you can see, I already prevented a ConcurrentModificationException by first creating a local copy of the set as ArrayList.

Now the NPE appears for c, which must be a value previously added to _connections - but how can that become null?

To be clear, I'm not looking for a solution - I added filter(Objects::nonNull) in the stream (or I could've used Collections.synchronizedSet() in the initializer), and it's guaranteed to work now.

How can that even happen? Yes, multi-threaded access can screw almost everything up, but putting a null in a collection which wasn't there?


Solution

  • Just to explain why null is possible

    Below are referencing jdk-21+35

    In new ArrayList<>(_connections), which will call
    _connections.toArray()
    -> HashSet#toArray()(as LinkedHashSet extends HashSet)
    -> LinkedHashMap#keysToArray()(as LinkedHashSet is backed by LinkedHashMap)

        // HashSet#toArray()
        @Override
        public Object[] toArray() {
            return map.keysToArray(new Object[map.size()]);
        }
    

    as here we can see an Object[] is constructed with length map.size(), and this Object[] will be copied to the ArrayList.

        // LinkedHashMap#keysToArray()
        final <T> T[] keysToArray(T[] a) {
            return keysToArray(a, false);
        }
    
        final <T> T[] keysToArray(T[] a, boolean reversed) {
            Object[] r = a;
            int idx = 0;
            if (reversed) {
                for (LinkedHashMap.Entry<K,V> e = tail; e != null; e = e.before) {
                    r[idx++] = e.key;
                }
            } else {
                for (LinkedHashMap.Entry<K,V> e = head; e != null; e = e.after) {
                    r[idx++] = e.key;
                }
            }
            return a;
        }
    

    And above is how we set value to the Object[]. So in a concurrent environment, when we call HashSet#toArray(), the map size might be 2, but when we go to LinkedHashMap#keysToArray(), the map size could be 1(some thread removed element in between), hence there could be some null element in the Object[]. And vice versa ArrayIndexOutOfBound could be thrown.

    Conclusion

    Never assume any behaviour of non thread safe class when working in multi-thread environment, there is 0 guarantee.