Tags: java, java-7, variadic-functions

Java SafeVarargs annotation, does a standard or best practice exist?


I've recently come across the Java @SafeVarargs annotation. Googling for what makes a variadic function in Java unsafe left me rather confused (heap pollution? erased types?), so I'd like to know a few things:

  1. What makes a variadic Java function unsafe in the @SafeVarargs sense (preferably explained in the form of an in-depth example)?

  2. Why is this annotation left to the discretion of the programmer? Isn't this something the compiler should be able to check?

  3. Is there some standard one must adhere to in order to ensure a function is indeed varargs-safe? If not, what are the best practices to ensure it?


Solution

    1. There are many examples on the Internet and on Stack Overflow of this particular issue with generics and varargs. Basically, it arises when you have a variable number of arguments of a type-parameter type:

      void foo(T... args);

    In Java, varargs are syntactic sugar that undergoes a simple rewriting at compile time: a varargs parameter of type X... is converted into a parameter of type X[], and every time a call is made to this varargs method, the compiler collects all of the "variable arguments" that go into the varargs parameter and creates an array, just like new X[] { ...(arguments go here)... }.
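
    The rewriting described above can be sketched with a concrete type. This is only an illustration: the class VarargsDesugaring and the helper describe are hypothetical names, not part of the question. Both calls should behave identically, because the compiler turns the first into roughly the second.

```java
import java.util.Arrays;

final class VarargsDesugaring {
    // Hypothetical helper: reports the runtime array type and its contents.
    static String describe(String... parts) {
        return parts.getClass().getSimpleName() + " " + Arrays.toString(parts);
    }

    public static void main(String[] args) {
        // Varargs call: the compiler builds the array for us.
        String viaVarargs = describe("a", "b");
        // Roughly what the compiler generates for the call above.
        String viaArray = describe(new String[] { "a", "b" });

        System.out.println(viaVarargs);                  // String[] [a, b]
        System.out.println(viaVarargs.equals(viaArray)); // true
    }
}
```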

    This works well for a concrete varargs type such as String... and it also works for a type variable such as T... when T is known to be a concrete type for that call. For example, if the method above were part of a class Foo<T>, and you have a Foo<String> reference, then calling foo on it would be okay, because we know T is String at that point in the code.

    However, it does not work when the "value" of T is another type parameter. In Java, it is impossible to create an array whose component type is a type parameter (new T[] { ... }). So Java instead uses new Object[] { ... } (here Object is the upper bound of T; if the upper bound were something different, it would be that instead of Object), and then gives you an unchecked compiler warning.

    So what is wrong with creating new Object[] instead of new T[] or whatever? Well, arrays in Java know their component type at runtime. Thus, the passed array object will have the wrong component type at runtime.
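
    A small sketch can make the runtime difference visible. The names ComponentTypeDemo, arrayClassName and viaTypeParameter are made up for this illustration, and the compiler is expected to emit unchecked warnings for it. The same two strings end up in a String[] or an Object[] depending on whether T is concrete at the call site.

```java
final class ComponentTypeDemo {
    // Hypothetical helper: reports the runtime class of the varargs array.
    static <T> String arrayClassName(T... args) {
        return args.getClass().getSimpleName();
    }

    // Inside this method T is still a type parameter, so the compiler
    // cannot create a T[] and falls back to an Object[].
    static <T> String viaTypeParameter(T a, T b) {
        return arrayClassName(a, b);
    }

    public static void main(String[] args) {
        // T is inferred as String at this call site.
        System.out.println(arrayClassName("hi", "mom"));   // String[]
        // T is unknown at the inner call site.
        System.out.println(viaTypeParameter("hi", "mom")); // Object[]
    }
}
```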

    For probably the most common use of varargs, simply to iterate over the elements, this is no problem (you don't care about the runtime type of the array), so this is safe:

    @SafeVarargs
    final <T> void foo(T... args) {
        for (T x : args) {
            // do stuff with x
        }
    }
    

    However, anything that depends on the runtime component type of the passed array will not be safe. Here is a simple example of something that is unsafe and fails at runtime with a ClassCastException:

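    class UnSafeVarargs
    {
      // Returns the varargs array itself, so the result's runtime type is
      // whatever array the compiler created at the call site.
      static <T> T[] asArray(T... args) {
        return args;
      }
    
      // T is a type parameter here, so the compiler creates an Object[].
      static <T> T[] arrayOfTwo(T a, T b) {
        return asArray(a, b);
      }
    
      public static void main(String[] args) {
        // Throws ClassCastException: Object[] cannot be cast to String[].
        String[] bar = arrayOfTwo("hi", "mom");
      }
    }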
    

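    The problem here is that asArray relies on args actually being a T[] in order to return it as T[]. But as explained earlier, when arrayOfTwo makes the call, T is still a type parameter, so the array created at runtime is an Object[], not a String[]. The assignment in main then fails with a ClassCastException, because the compiler inserts a cast to String[] at that point.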
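
    To see where the failure happens, the example can be wrapped so that the exception is caught and reported. This is a sketch rather than anything from the original answer: HeapPollutionDemo, directCall and viaGenericCaller are hypothetical names. Calling asArray directly with strings is fine, while going through the generic arrayOfTwo is not.

```java
final class HeapPollutionDemo {
    static <T> T[] asArray(T... args) {
        return args;
    }

    static <T> T[] arrayOfTwo(T a, T b) {
        return asArray(a, b); // an Object[] is created here
    }

    // Direct call: T is String at the call site, so a real String[] is created.
    static String directCall() {
        String[] ok = asArray("hi", "mom");
        return ok.getClass().getSimpleName();
    }

    // Indirect call: the Object[] reaches a String[] variable and the
    // compiler-inserted cast fails.
    static String viaGenericCaller() {
        try {
            String[] bar = arrayOfTwo("hi", "mom");
            return bar.getClass().getSimpleName();
        } catch (ClassCastException e) {
            return "ClassCastException";
        }
    }

    public static void main(String[] args) {
        System.out.println(directCall());       // String[]
        System.out.println(viaGenericCaller()); // ClassCastException
    }
}
```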

    3. As a rule of thumb, if your method has a parameter of type T... (where T is any type parameter), then it is:
    • Safe: if your method only depends on the fact that the elements of the array are instances of T.
    • Unsafe: if it depends on the fact that the array is an instance of T[].

    Things that depend on the runtime type of the array include: returning it as type T[], passing it as an argument to a parameter of type T[], getting the array type using .getClass(), and passing it to methods that depend on the runtime type of the array, like List.toArray(T[]) and Arrays.copyOf().
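
    One way to stay on the safe side of this distinction is to read only the elements and never expose the array itself. The following is a minimal sketch under that assumption. SafeVarargsSketch, listOf and listOfTwo are hypothetical names, and returning a List<T> is just one possible alternative to returning T[].

```java
import java.util.ArrayList;
import java.util.List;

final class SafeVarargsSketch {
    // Only the elements are read; the array's runtime type is never used,
    // so the annotation's promise holds.
    @SafeVarargs
    static <T> List<T> listOf(T... args) {
        List<T> result = new ArrayList<>(args.length);
        for (T x : args) {
            result.add(x);
        }
        return result;
    }

    // Works even though T is a type parameter here: the Object[] created
    // for the call never leaves listOf.
    static <T> List<T> listOfTwo(T a, T b) {
        return listOf(a, b);
    }

    public static void main(String[] args) {
        List<String> words = listOfTwo("hi", "mom");
        String first = words.get(0); // no ClassCastException
        System.out.println(first + " " + words.size());
    }
}
```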

    2. As for why the compiler does not check this itself: the distinction described above is too complicated to be determined automatically in the general case.