Search code examples
javalinkedhashset

LinkedHashSet Documentation Ambiguity: Re-insert


The java documentation for LinkedHashSet states this:

Note that insertion order is not affected if an element is re-inserted into the set. (An element e is reinserted into a set s if s.add(e) is invoked when s.contains(e) would return true immediately prior to the invocation.)

For a recent project I decided to use one to hold a set of data tokens in a client-server communication for display to users in some list view widget. The idea being that I'd be able to cheaply re-insert elements with updated data and the user wouldn't be surprised because the order wouldn't change.

Obviously that's not the case, using Oracle JRE 1.7.0_55-b13 it operates like any other Set, as this short test program shows:

import java.util.LinkedHashSet;

import static java.util.Arrays.deepToString;

public class LhsStack 
{
    public static class T 
    {
        public T ( int id ) { this.id = id; }

        public final Integer id;
        public String value;

        @Override
        public int hashCode () { return id.hashCode (); }

        @Override
        public boolean equals ( Object obj ) 
        { 
            return obj instanceof T && id.equals ( ((T)obj).id ); 
        }

        @Override
        public String toString () { return id + " => " + value; }
    }

    public static void main ( String [] args )
    {
        LinkedHashSet < T > set = new LinkedHashSet <> ();

        T a = new T ( 1 ),
          b = new T ( 1 ),
          c = new T ( 2 ),
          d = new T ( 3 );

        a.value = "Hello, World";
        b.value = "World, Hello";
        c.value = "Foo";
        d.value = "Bar"; 

        System.out.println ( "a == b: " + a.equals ( b ) );

        if ( set.add ( a ) ) {
            System.out.println ( "Inserted: " + a.value );
        }

        System.out.println ( "set.contains ( a ): " + set.contains ( a ) );
        System.out.println ( "set.contains ( b ): " + set.contains ( b ) );

        set.add ( c ); set.add ( d );

        System.out.println ( "Elements: " + set.size () );
        System.out.println ( deepToString ( set.toArray () ) );

        if ( set.add ( b ) ) {
            System.out.println ( "Re-Inserted: " + b.value );
        }
        else
        {
            System.out.println ( "Removing and Adding: " + b.value );
            set.remove ( b );
            set.add ( b );
        }

        System.out.println ( "Elements: " + set.size () );
        System.out.println ( deepToString ( set.toArray () ) );
    }
}

Output

a == b: true
Inserted: Hello, World
set.contains ( a ): true
set.contains ( b ): true
Elements: 3
[1 => Hello, World, 2 => Foo, 3 => Bar]
Removing and Adding: World, Hello
Elements: 3
[2 => Foo, 3 => Bar, 1 => World, Hello]

My question therefore is, since element b is not re-inserted into the set (i.e. it has to be removed, then re-added to update its value), what is the significance of the comment in the java documentation?

Thankyou!


Solution

  • Usually, linkedHashSet.add(elementToAdd) makes elementToAdd be the last element of linkedHashSet. The significance of the comment in the Javadoc is that if elementToAdd already appears inside linkedHashSet, then linkedHashSet.add(elementToAdd) will just leave it in place (and will not move it to the end).

    For what you're trying to do, you're better off with a LinkedHashMap<Integer, T>. You can then iterate over its values() to get your T instances in iteration order, with the ability to update mappings. (If you want, you can then wrap LinkedHashMap<Integer, T> in some sort of container object that, instead of offering put(Integer, T), instead offers an add(T) that handles the key-mapping behind the scenes. In fact, it should be very straightforward to extend AbstractSet<T> to create a LinkedHashMap<Integer, T>-based implementation of Set<T>.)


    Edit for updated question: Ah, O.K., sorry, I now understand your confusion better. The above is an explanation of the purpose of the first sentence ("Note that insertion order is not affected if an element is re-inserted into the set"); I did not realize that you were misunderstanding the second sentence ("An element e is reinserted into a set s if s.add(e) is invoked when s.contains(e) would return true immediately prior to the invocation.")

    So, let me explain. The second sentence is merely a definition of the term "reinserted"; it's not describing any behavior. The sentence is not saying that a LinkedHashSet will do anything called "reinsertion" if you call its add method with an element it already contains; rather, the sentence saying that if you call its add method with an element it already contains, then this call is called "reinsertion". The (non-)effect of reinsertion is what's explained in the first sentence, namely, that it won't move the element to the end.

    LinkedHashSet.add still obeys the requirements of Set.add, which specifies that "Adds the specified element to this set if it is not already present (optional operation). [...] If this set already contains the element, the call leaves the set unchanged and returns false."

    There are some cases of JDK classes that do not obey the requirements of the interfaces they claim to implement, but when that happens, it will be called out as a boldface warning, not just tucked away inside a parenthetical note and never mentioned again. See the Javadoc of IdentityHashMap for an example of this.