Search code examples
javagarbage-collectionweak-references

Adding WeakReferences into HashMap via #compute - may I get null or not?


Let's say I have following implementation of cache, purposed to associate some data (connection pool in my case) with the latest state/version of another object:

public class Demo<V> {

    private final Map<Integer, VAR<V>> cache = new ConcurrentHashMap<>();

    private final ReferenceQueue<V> refq = new ReferenceQueue<>();

    private final Function<Integer, V> factory;

    public Demo(Function<Integer, V> factory) {
        this.factory = factory;
    }

    public static void main(String[] args) throws Exception {
        Demo<Object> demo = new Demo<>(id -> new Object());

        Object data = demo.getCached(1, 0);
        assert data != null;

        WeakReference<Object> keyReference = new WeakReference<>(data);
        data = null;
        while (keyReference.get() != null) {
            System.gc();
            Thread.sleep(1_000);
        }

    }

    public V getCached(int id, int version) {
        assert version >= 0;
        VAR<V> ref = cache.get(id);
        V storedData = ref == null ? null : ref.get();
        int storedVersion = storedData == null ? -1 : ref.getVersion();

        if (storedData == null) {
            cache.remove(id, ref);
        }

        if (storedVersion >= version) {
            return storedData;
        }

        Supplier<VAR<V>> varFactory = () -> {
            V data = factory.apply(id);
            return new WeakVAR(data, id, version);
        };

        BiFunction<Integer, VAR<V>, VAR<V>> replaceFunction = (key, existing) -> {
            V data = existing == null ? null : existing.get();
            if (data == null) {
                return varFactory.get();
            }
            if (existing.getVersion() >= version) {
                return existing;
            }
            return varFactory.get();
        };

        Supplier<V> vFactory = () -> cache.compute(id, replaceFunction).get();

/*
        V result;
        while ((result = vFactory.get()) == null) {
            // nop
        }
        return result;
*/

        return vFactory.get();
    }


    interface VAR<V> {

        int getVersion();

        int getId();

        V get();

    }

    class WeakVAR extends WeakReference<V>
            implements VAR<V> {

        private final int id;

        private final int version;

        public WeakVAR(V referent, int id, int version) {
            super(referent, refq);
            this.id = id;
            this.version = version;
        }

        @Override
        public int getVersion() {
            return version;
        }

        @Override
        public int getId() {
            return id;
        }
    }

}

The question is: is it possible that public V getCached(int id, int version) method may return null when dereferencing WeakReference returned from Map#compute? If so, what is the best way to overcome this issue? Is busy loop a felicitous solution?


Solution

  • Yes, it’s possible that your method returns null. In principle, when you have code like this

    WeakReference<Object> ref = new WeakReference<>(new Object());
    Object o = ref.get();
    

    it’s already possible that o is null.

    Instead of creating a loop, you should ensure that there is no phase where the result object is only weakly reachable between the check for an existing object or the creation of a new object and the final return statement.

    Or, in other words, enforce that the object is strongly reachable throughout the operation.

    For example

    public V getCached(int id, int version) {
        // do not abuse assertions for argument checking
        if(version < 0) throw new IllegalArgumentException();
    
        VAR<V> ref = cache.get(id);
        V storedData = ref == null? null: ref.get();
    
        if(storedData != null && ref.getVersion() >= version) {
            return storedData;
        }
    
        List<V> strongReference = new ArrayList<>(1);
    
        cache.compute(id, (key, existing) -> {
            V data = existing == null? null: existing.get();
            if(data != null && existing.getVersion() >= version) {
                strongReference.add(data);
                return existing;
            }
            data = factory.apply(key);
            strongReference.add(data);
            return new WeakVAR(data, key, version);
        });
    
        return strongReference.get(0);
    }
    

    The fast path at the beginning does not need additional effort, as the storedData variable is already a strong reference, which is consistently used for checking the presence of an old object and returning it.

    The function passed to the compute method must return the weak reference to conform to the map’s type, hence, it needs an additional storage to keep a strong reference while returning through the compute method. The example uses an ArrayList but any local object with a reference variable would do.

    Often, a single element array is used for such a purpose, but Java doesn’t support creating a V[] array. In the end, the overhead doesn’t matter much, as long as the code succeeds in the fast path with the optimistic get() often enough.