Large size of HashSet throwing StackOverflow Error


I have 81K Long values that I am trying to store in a HashSet. My code snippet looks like this:

private static HashSet<Long> hashSet = new HashSet<>(Arrays.asList(*81K records*));

Compiling this gives me a StackOverflowError. I don't understand why only 81K records are a problem here. Solutions are appreciated.

Java version:

openjdk version "1.8.0_322"
OpenJDK Runtime Environment Corretto-8.322.06.1 (build 1.8.0_322-b06)
OpenJDK 64-Bit Server VM Corretto-8.322.06.1 (build 25.322-b06, mixed mode)

Stack Trace:

[javac] 
    [javac] 
    [javac] The system is out of resources.
    [javac] Consult the following stack trace for details.
    [javac] java.lang.StackOverflowError
    [javac]     at com.sun.tools.javac.code.Type.map(Type.java:220)
    [javac]     at com.sun.tools.javac.code.Type.map(Type.java:220)
    [javac]     at com.sun.tools.javac.code.Type.map(Type.java:220)
    [javac]     at com.sun.tools.javac.code.Type.map(Type.java:220)
    [javac]     at com.sun.tools.javac.code.Type.map(Type.java:220)
    [javac]     at com.sun.tools.javac.code.Type.map(Type.java:220)
    [javac]     at com.sun.tools.javac.code.Type.map(Type.java:220)
    [javac]     at com.sun.tools.javac.code.Type.map(Type.java:220)
    [javac]     at com.sun.tools.javac.code.Type.map(Type.java:220)
    [javac]     at com.sun.tools.javac.code.Type.map(Type.java:220)
    [javac]     at com.sun.tools.javac.code.Type.map(Type.java:220)
    [javac]     at com.sun.tools.javac.code.Type.map(Type.java:220)
    [javac]     at com.sun.tools.javac.code.Type.map(Type.java:220)
    [javac]     at com.sun.tools.javac.code.Type.map(Type.java:220)
    [javac]     at com.sun.tools.javac.code.Type.map(Type.java:220)
    [javac]     at com.sun.tools.javac.code.Type.map(Type.java:220)
    [javac]     at com.sun.tools.javac.code.Type.map(Type.java:220)
    [javac]     at com.sun.tools.javac.code.Type.map(Type.java:220)
    [javac]     at com.sun.tools.javac.code.Type.map(Type.java:220)
    [javac]     at com.sun.tools.javac.code.Type.map(Type.java:220)
    [javac]     at com.sun.tools.javac.code.Type.map(Type.java:220)
    [javac]     at com.sun.tools.javac.code.Type.map(Type.java:220)

Line 220 of Type:

 208     /**
 209      * Return the least specific subtype of t that starts with symbol
 210      * sym.  If none exists, return null.  The least specific subtype
 211      * is determined as follows:
 212      *
 213      * <p>If there is exactly one parameterized instance of sym that is a
 214      * subtype of t, that parameterized instance is returned.<br>
 215      * Otherwise, if the plain type or raw type `sym' is a subtype of
 216      * type t, the type `sym' itself is returned.  Otherwise, null is
 217      * returned.
 218      */
 219     public Type asSub(Type t, Symbol sym) {
 220         return asSub.visit(t, sym);
 221     }
 222     // where
 223         private final SimpleVisitor<Type,Symbol> asSub = new SimpleVisitor<Type,Symbol>() {

Solution

  • The HashSet is irrelevant here. The problematic part is the varargs invocation of Arrays.asList with 81,000 elements.

    To reproduce the issue, we can use the following code, which generates a source file with an 81,000-argument Arrays.asList call and compiles it:

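    import java.io.IOException;
    import java.nio.file.Files;
    import java.nio.file.Path;
    import java.util.Arrays;
    import java.util.List;
    import java.util.stream.Collectors;
    import java.util.stream.IntStream;
    import java.util.stream.Stream;
    import javax.tools.JavaCompiler;
    import javax.tools.ToolProvider;

    class Tmp {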
      static final String ARGUMENTS = "<<INSERT ARGUMENTS HERE>>";
    
      static final List<String> TEMPLATE = Arrays.asList(
          "import java.util.Arrays;",
          "import java.util.List;",
          "",
          "class Tmp {",
          "  static final List<Integer> L = Arrays.asList(",
               ARGUMENTS,
          "  );",
          "}");
    
      public static void main(String[] args) throws IOException {
        Path p = Files.createTempFile("Test", ".java");
        Files.write(p, () -> TEMPLATE.stream()
            .flatMap(line -> line.equals(ARGUMENTS)? varargsArgument(): Stream.of(line))
            .iterator());
        JavaCompiler c = ToolProvider.getSystemJavaCompiler();
        c.run(System.in, System.out, System.err, p.toString());
      }
    
      static Stream<CharSequence> varargsArgument() {
        return IntStream.range(0, 8100).mapToObj(i -> IntStream.range(0, 10)
                .mapToObj(j -> i * 10 + j + (i < 8099 || j < 9? ", ": ""))
                .collect(Collectors.joining()));
      }
    }
    

    With OpenJDK 8, it produces

    java.lang.StackOverflowError
        at com.sun.tools.javac.code.Type.map(Type.java:220)
       …
    

    On recent JDKs, e.g. JDK 12, it produces

    /tmp/Test14992292170362927520.java:6: error: code too large
      static final List<Integer> L = Arrays.asList(
                                 ^
    

    showing that even with the compiler bug fixed, such code cannot be compiled: the initializer would exceed the JVM's 64 KB limit on the bytecode of a single method.

    This much data should be shipped as an embedded resource that you read once at startup.
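
    A minimal sketch of that approach, assuming the values are stored one per line in a classpath resource. The resource name `ids.txt` and the class and method names are hypothetical and chosen only for illustration; the code sticks to Java 8 APIs to match the JDK in the question.

    ```java
    import java.io.BufferedReader;
    import java.io.IOException;
    import java.io.InputStream;
    import java.io.InputStreamReader;
    import java.io.UncheckedIOException;
    import java.nio.charset.StandardCharsets;
    import java.util.Collections;
    import java.util.HashSet;
    import java.util.Set;

    final class IdSet {
        // Hypothetical resource: one decimal long per line, packaged next to this class.
        private static final String RESOURCE = "ids.txt";

        // Holder idiom: the resource is read once, the first time ids() is called.
        private static final class Holder {
            static final Set<Long> IDS = loadResource(RESOURCE);
        }

        static Set<Long> ids() {
            return Holder.IDS;
        }

        // Opens the named resource relative to this class and parses it.
        static Set<Long> loadResource(String name) {
            try (InputStream in = IdSet.class.getResourceAsStream(name)) {
                if (in == null) {
                    throw new IllegalStateException("Resource not found: " + name);
                }
                return readIds(in);
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
        }

        // Parses one long per line, skipping blank lines; the caller closes the stream.
        static Set<Long> readIds(InputStream in) {
            Set<Long> ids = new HashSet<>();
            BufferedReader reader =
                    new BufferedReader(new InputStreamReader(in, StandardCharsets.UTF_8));
            try {
                String line;
                while ((line = reader.readLine()) != null) {
                    line = line.trim();
                    if (!line.isEmpty()) {
                        ids.add(Long.parseLong(line));
                    }
                }
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
            return Collections.unmodifiableSet(ids);
        }
    }
    ```

    Callers would then use `IdSet.ids().contains(someId)`. Because the values live in a data file rather than in source code, neither the compiler's recursion depth nor the method size limit is involved.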