java multithreading chronicle

String array - needless synchronization?


I'm studying the ChronicleHFT libraries and found the class StringInterner, posted below:

public class StringInterner {
    @NotNull
    protected final String[] interner;
    protected final int mask;
    protected final int shift;
    protected boolean toggle = false;

    public StringInterner(int capacity) throws IllegalArgumentException {
        int n = Maths.nextPower2(capacity, 128);
        this.shift = Maths.intLog2((long)n);
        this.interner = new String[n];
        this.mask = n - 1;
    }

    @Nullable
    public String intern(@Nullable CharSequence cs) {
        if (cs == null) {
            return null;
        } else if (cs.length() > this.interner.length) {
            return cs.toString();
        } else {
            int hash = Maths.hash32(cs);
            int h = hash & this.mask;
            String s = this.interner[h];
            if (StringUtils.isEqual(cs, s)) {
                return s;
            } else {
                int h2 = hash >> this.shift & this.mask;
                String s2 = this.interner[h2];
                if (StringUtils.isEqual(cs, s2)) {
                    return s2;
                } else {
                    String s3 = cs.toString();
                    this.interner[s != null && (s2 == null || !this.toggle()) ? h2 : h] = s3;
                    return s3;
                }
            }
        }
    }

    // The post cuts off here; the rest of the class (including toggle()) is not shown.
}

I found a YouTube video by Peter Lawrey in which he explains (or, to be more precise, just states) that this class is thread-safe and doesn't need any additional synchronization to work in a multithreaded environment. Video link: https://www.youtube.com/watch?v=sNSD6AUG5a0&t=1200

My question is: why doesn't this class need any synchronization?

  1. What about visibility? If one thread puts something into interner[n], are other threads guaranteed to see it?
  2. What happens if the scheduler preempts a thread in the middle of the method? Could that lead to the same value being written to the same index twice?

Solution

  • The Javadoc for StringInterner explains that it's not technically thread-safe:

    StringInterner only guarantees it will behave in a correct manner. When you ask it for a String for a given input, it must return a String which matches the toString() of that CharSequence.

    It doesn't guarantee that all threads see the same data, nor that multiple threads will return the same String object for the same string. It is designed to be a best-effort basis so it can be as lightweight as possible.

    So while technically not thread safe, it doesn't prevent it operating correctly when used from multiple threads, but it is faster than added explicit locking or thread safety. NOTE: It does rely on String being thread safe, something which was guaranteed from Java 5.0 onwards.

    Incidentally, I'm curious about the claim that String was not thread-safe prior to Java 5; I'd love to see a citation.
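
    To make that contract concrete, here is a minimal sketch of a racy, best-effort cache. It is not the Chronicle code: TinyInterner, its single-probe lookup, and the use of String.hashCode() are all assumptions made for illustration, and unlike the real class it calls toString() up front for simplicity. The point it tries to show is that each call returns either a freshly created String or a cached one that already passed an equals() check, so the result always matches the input even when threads overwrite each other's slots or read stale entries. What can differ between threads is only which String object comes back.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

public class RacyInternerSketch {

    // Hypothetical, simplified cache; NOT the Chronicle implementation.
    static final class TinyInterner {
        private final String[] slots;
        private final int mask;

        TinyInterner(int powerOfTwoCapacity) {
            this.slots = new String[powerOfTwoCapacity];
            this.mask = powerOfTwoCapacity - 1;
        }

        String intern(CharSequence cs) {
            if (cs == null) {
                return null;
            }
            String candidate = cs.toString();
            int h = candidate.hashCode() & mask;
            String cached = slots[h];       // unsynchronized read: may be null or stale
            if (candidate.equals(cached)) {
                return cached;              // verified equal, so safe to return
            }
            slots[h] = candidate;           // unsynchronized write: last writer wins
            return candidate;
        }
    }

    // Hammers one interner from several threads and counts results that
    // do not equal the input. The contract says this count must stay 0.
    static int countContractViolations(int threads, int iterations) throws InterruptedException {
        TinyInterner interner = new TinyInterner(64);
        AtomicInteger violations = new AtomicInteger();
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        for (int t = 0; t < threads; t++) {
            pool.execute(() -> {
                for (int i = 0; i < iterations; i++) {
                    // 200 distinct keys in 64 slots forces constant overwriting.
                    String input = "key-" + (i % 200);
                    String result = interner.intern(new StringBuilder(input));
                    if (!input.equals(result)) {
                        violations.incrementAndGet();
                    }
                }
            });
        }
        pool.shutdown();
        if (!pool.awaitTermination(30, TimeUnit.SECONDS)) {
            throw new IllegalStateException("worker threads did not finish in time");
        }
        return violations.get();
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("violations = " + countContractViolations(4, 100_000));
    }
}
```

    This also speaks to both questions: a thread that does not see another thread's write simply misses the cache and creates its own String, and two threads writing the same slot just means one entry is overwritten. Both outcomes cost an allocation but never produce a wrong answer. If you need every thread to get the identical object, this design does not give you that, and you would need a synchronized or concurrent structure instead.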