Creating generic array in Java via unchecked type-cast


If I have a generic class Foo<Bar>, I am not allowed to create an array as follows:

Bar[] bars = new Bar[5];

(This will cause the error "Cannot create a generic array of Bar").

But, as suggested by dimo414 in an answer to this question (Java how to: Generic Array creation), I can do the following:

Bar[] bars = (Bar[]) new Object[5];

(This will "only" generate a warning: "Type safety: Unchecked cast from Object[] to Bar[]").

In the comments responding to dimo414's answer, some people claim that using this construct can cause problems in certain situations and others say it's fine, as the only reference to the array is bars, which is of the desired type already.

I'm a little confused about when this is OK and when it can get me into trouble. The comments by newacct and Aaron McDaid, for example, seem to directly contradict each other. Unfortunately the comment stream in the original question simply ends with the unanswered "Why is this 'no longer correct'?", so I decided to make a new question for it:

If the bars array only ever contains entries of type Bar, could there still be any run-time issues when using the array or its entries? Or is the only danger that, at run-time, I could technically cast the array to something else (like String[]), which would then allow me to fill it with values of a type other than Bar?

I know I can use Array.newInstance(...) instead, but I am specifically interested in the type-casting construct above, since, for example, in GWT the newInstance(...)-option isn't available.


Solution

  • Since I was mentioned in the question, I will chime in.

    Basically, it will not cause any problems if you don't expose this array variable to the outside of the class. (kinda like, What happens in Vegas stays in Vegas.)

    The actual runtime type of the array is Object[]. So putting it into a variable of type Bar[] is effectively a "lie", since Object[] is not a subtype of Bar[] (unless Object is Bar). However, this lie is okay if it stays inside the class, since Bar is erased to Object inside the class. (The upper bound of Bar is Object in this question. In a case where the upper bound of Bar is something else, replace all occurrences of Object in this discussion with whatever that bound is.) However, if this lie somehow gets exposed to the outside (the simplest example is returning the bars variable directly as type Bar[]), then it will cause problems.
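
    To make the remark about bounds concrete, here is a minimal sketch. BoundedFoo, its boolean constructor flag, and capacity() are hypothetical names made up for this illustration. Bar is bounded by Comparable<Bar>, so inside the class Bar[] erases to Comparable[], and the cast from Object[] is actually checked at run-time and fails right away:

    ```java
    // Hypothetical class for illustration only: Bar has an upper bound other than Object.
    class BoundedFoo<Bar extends Comparable<Bar>> {
        private final Bar[] bars;

        @SuppressWarnings({"unchecked", "rawtypes"})
        BoundedFoo(boolean useErasedBound) {
            if (useErasedBound) {
                // Bar[] erases to Comparable[], so an array of the bound keeps the "lie" consistent.
                bars = (Bar[]) new Comparable[5];
            } else {
                // Compiled as a cast to Comparable[]; an Object[] is not a Comparable[],
                // so this line throws ClassCastException inside the class.
                bars = (Bar[]) new Object[5];
            }
        }

        int capacity() {
            return bars.length;
        }
    }

    class BoundedDemo {
        public static void main(String[] args) {
            System.out.println(new BoundedFoo<String>(true).capacity());
            try {
                new BoundedFoo<String>(false);
            } catch (ClassCastException e) {
                System.out.println("cast failed inside the class");
            }
        }
    }
    ```

    So with a bounded type parameter, the array created for the cast should have the bound as its element type rather than Object.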

    To understand what is really going on, it is instructive to look at the code with and without generics. Any generics program can be re-written into an equivalent non-generics program, simply by removing generics and inserting casts in the right place. This transformation is called type erasure.

    We consider a simple implementation of Foo<Bar>, with methods for getting and setting particular elements in the array, as well as a method for getting the whole array:

    class Foo<Bar> {
        Bar[] bars = (Bar[])new Object[5];
        public Bar get(int i) {
            return bars[i];
        }
        public void set(int i, Bar x) {
            bars[i] = x;
        }
        public Bar[] getArray() {
            return bars;
        }
    }
    
    // in some method somewhere:
    Foo<String> foo = new Foo<String>();
    foo.set(2, "hello");
    String other = foo.get(3);
    String[] allStrings = foo.getArray();
    

    After type erasure, this becomes:

    class Foo {
        Object[] bars = new Object[5];
        public Object get(int i) {
            return bars[i];
        }
        public void set(int i, Object x) {
            bars[i] = x;
        }
        public Object[] getArray() {
            return bars;
        }
    }
    
    // in some method somewhere:
    Foo foo = new Foo();
    foo.set(2, "hello");
    String other = (String)foo.get(3);
    String[] allStrings = (String[])foo.getArray();
    

    So there are no casts inside the class anymore. However, there are casts in the calling code: one when getting a single element, and one when getting the entire array. The cast on a single element should not fail, because the only things we can put into the array are Bar (or null), so the only things we can get out are also Bar. The cast on the entire array, however, will fail with a ClassCastException, since the array's actual runtime type is Object[], not String[].
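
    The failure can be reproduced with a small, self-contained sketch. Foo is the class from above (with the warning suppressed); CallerDemo and its two helper methods are hypothetical names added only for this demonstration:

    ```java
    class Foo<Bar> {
        @SuppressWarnings("unchecked")
        private final Bar[] bars = (Bar[]) new Object[5];

        public Bar get(int i) {
            return bars[i];
        }

        public void set(int i, Bar x) {
            bars[i] = x;
        }

        public Bar[] getArray() {
            return bars;
        }
    }

    class CallerDemo {
        // The compiler inserts a (String) cast on the result of get(); it succeeds.
        static String elementRoundTrip() {
            Foo<String> foo = new Foo<>();
            foo.set(2, "hello");
            return foo.get(2);
        }

        // The compiler inserts a (String[]) cast on the result of getArray();
        // the array is really an Object[], so the cast throws in the caller.
        static boolean arrayCastFails() {
            Foo<String> foo = new Foo<>();
            try {
                String[] allStrings = foo.getArray();
                return false;
            } catch (ClassCastException e) {
                return true;
            }
        }

        public static void main(String[] args) {
            System.out.println(elementRoundTrip()); // prints "hello"
            System.out.println(arrayCastFails());   // prints "true"
        }
    }
    ```

    Note that the exception is raised on the assignment in the calling code, not on the line in Foo that contains the unchecked cast.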

    Written without generics, it becomes much clearer what is happening and where the problem lies. What is especially troubling is that the cast failure does not happen in the class where we wrote the cast in generics; it happens in someone else's code that uses our class. And that other person's code is completely safe and innocent. It also does not happen at the time where we did our cast in the generics code; it happens later, without warning, when someone calls getArray() and assigns the result to a String[].

    If we didn't have this getArray() method, then this class would be safe. With this method, it is unsafe. What characteristic makes it unsafe? It returns bars as type Bar[], which depends on the "lie" we made earlier. Since the lie is not true, it causes problems. If the method had instead returned the array as type Object[], then it would be safe, since it does not depend on the "lie".
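
    As a sketch of that safe variant (SafeFoo is a hypothetical name, otherwise the same class as above), the accessor is declared with the array's real runtime type, so the compiler has no reason to insert a cast at the call site:

    ```java
    class SafeFoo<Bar> {
        @SuppressWarnings("unchecked")
        private final Bar[] bars = (Bar[]) new Object[5];

        public Bar get(int i) {
            return bars[i];
        }

        public void set(int i, Bar x) {
            bars[i] = x;
        }

        // Declared as Object[], which is what the array actually is at run-time,
        // so callers never get a hidden cast to String[] or similar.
        public Object[] getArray() {
            return bars;
        }
    }

    class SafeDemo {
        public static void main(String[] args) {
            SafeFoo<String> foo = new SafeFoo<>();
            foo.set(2, "hello");
            Object[] all = foo.getArray(); // no cast inserted here
            System.out.println(all[2]);    // prints "hello"
        }
    }
    ```

    This version still hands out the internal array itself, so callers can modify it; it is only meant to show that the declared return type is what makes the difference for the cast.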

    People will tell you not to do a cast like this, because it causes ClassCastExceptions in unexpected places, as seen above, rather than at the place where the unchecked cast was written. The compiler will not warn you that getArray() is unsafe (because from its point of view, given the types you told it, it is safe). So it is up to the programmer to be aware of this pitfall and not to use the array in an unsafe way.

    However, I would argue that this is not a big concern in practice. A well-designed API will not expose internal instance variables to the outside. (Even if there is a method to return the contents as an array, it would not return the internal variable directly; it would copy it, to prevent outside code from modifying the array directly.) So a well-designed class would not implement a method the way getArray() is implemented here anyway.
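
    One possible shape for such a copying accessor is sketched below. CopyingFoo and the toArray(IntFunction) signature are assumptions chosen for this illustration, not something from the question: the caller supplies an array constructor such as String[]::new, so the returned copy has the right runtime type without any reflection or unchecked cast at that point.

    ```java
    import java.util.function.IntFunction;

    class CopyingFoo<Bar> {
        @SuppressWarnings("unchecked")
        private final Bar[] bars = (Bar[]) new Object[5];

        public Bar get(int i) {
            return bars[i];
        }

        public void set(int i, Bar x) {
            bars[i] = x;
        }

        // The caller provides the array constructor, so the copy really is a
        // String[] (or whatever the caller asked for); the internal Object[] never escapes.
        public Bar[] toArray(IntFunction<Bar[]> generator) {
            Bar[] copy = generator.apply(bars.length);
            System.arraycopy(bars, 0, copy, 0, bars.length);
            return copy;
        }
    }

    class CopyingDemo {
        public static void main(String[] args) {
            CopyingFoo<String> foo = new CopyingFoo<>();
            foo.set(2, "hello");
            String[] allStrings = foo.toArray(String[]::new); // cast to String[] succeeds
            allStrings[2] = "changed";                        // does not touch foo's own array
            System.out.println(foo.get(2));                   // prints "hello"
        }
    }
    ```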