java casting jvm runtime

How does the JVM actually cast objects and issue a ClassCastException?


What happens under the hood when you cast an object to a specific class, as in casted = (NewClass) obj;? I'm guessing the JVM somehow checks whether the actual class of obj is a subclass of NewClass, but would there be a way for an object instance to know when it is being cast?

Pointers to the documentation/FAQ of some JVM implementations are also welcome, as I haven't been able to find any...

EDIT as to "why should an object know when it is being cast?":

I was recently thinking about implementing a kind of Pipe that would be both an InputStream and an OutputStream. As these are classes and not interfaces, it cannot be both (a Java class cannot extend more than one class), so I was wondering whether there might be a way for an object to show a different view of itself through a somehow-interceptable casting operation, like C++ cast operators.

Not that I wanted to implement it anyway (well, I would have, for testing and fun hacking purposes ;)), because it would be way too dangerous and would allow for all kinds of crazy abuses and misuses.


Solution

  • The JVM has a bytecode instruction, checkcast, which checks whether a cast can validly be performed. The cast check semantics are described in JLS §5.5.3, and the details of the checkcast instruction are described in JVM spec §6.5. As an example,

    public static void main(String args[]) {
       Number n = Integer.valueOf(66); // Explicit boxing; autoboxing compiles to this same call
    
       incr((Integer) n);
    
       System.out.println(n);
    }
    

    produces

     public static void main(java.lang.String[]);
        Code:
           0: bipush        66
           2: invokestatic  #3                  // Method java/lang/Integer.valueOf:(I)Ljava/lang/Integer;
           5: astore_1
           6: aload_1
           7: checkcast     #4                  // class java/lang/Integer
          10: invokestatic  #5                  // Method incr:(Ljava/lang/Integer;)V
          13: getstatic     #6                  // Field java/lang/System.out:Ljava/io/PrintStream;
          16: aload_1
          17: invokevirtual #7                  // Method java/io/PrintStream.println:(Ljava/lang/Object;)V
          20: return
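
    To see the effect of checkcast from the Java side, here is a minimal sketch (the class and method names are made up for illustration). It shows that a cast to a compatible type succeeds, that casting null always succeeds, and that an incompatible cast throws a ClassCastException:

```java
final class CastCheckDemo {
    // Returns true if casting obj to Integer succeeds,
    // false if the JVM throws a ClassCastException.
    static boolean castsToInteger(Object obj) {
        try {
            Integer ignored = (Integer) obj; // javac emits a checkcast here
            return true;
        } catch (ClassCastException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(castsToInteger(Integer.valueOf(66))); // true
        System.out.println(castsToInteger(null));                // true
        System.out.println(castsToInteger(Double.valueOf(6.6))); // false
    }
}
```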
    

    Additionally, by delving into HotSpot's source code, we can see two implementations of checkcast: one used in production and another used for simple testing and early ports.

    First shown is the production template-based interpreter (thanks to apangin for making me aware of it), which generates code that performs a null check on the reference being cast, loads the class information, calls a subtype check, and possibly jumps to code that throws a ClassCastException:

    void TemplateTable::checkcast() {
      transition(atos, atos);
      Label done, is_null, ok_is_subtype, quicked, resolved;
      __ testptr(rax, rax); // object is in rax
      __ jcc(Assembler::zero, is_null);
    
      // Get cpool & tags index
      __ get_cpool_and_tags(rcx, rdx); // rcx=cpool, rdx=tags array
      __ get_unsigned_2_byte_index_at_bcp(rbx, 1); // rbx=index
      // See if bytecode has already been quicked
      __ cmpb(Address(rdx, rbx,
                      Address::times_1,
                      Array<u1>::base_offset_in_bytes()),
              JVM_CONSTANT_Class);
      __ jcc(Assembler::equal, quicked);
      __ push(atos); // save receiver for result, and for GC
      call_VM(noreg, CAST_FROM_FN_PTR(address, InterpreterRuntime::quicken_io_cc));
      // vm_result_2 has metadata result
      __ get_vm_result_2(rax, r15_thread);
      __ pop_ptr(rdx); // restore receiver
      __ jmpb(resolved);
    
      // Get superklass in rax and subklass in rbx
      __ bind(quicked);
      __ mov(rdx, rax); // Save object in rdx; rax needed for subtype check
      __ movptr(rax, Address(rcx, rbx,
                           Address::times_8, sizeof(ConstantPool)));
    
      __ bind(resolved);
      __ load_klass(rbx, rdx);
    
      // Generate subtype check.  Blows rcx, rdi.  Object in rdx.
      // Superklass in rax.  Subklass in rbx.
      __ gen_subtype_check(rbx, ok_is_subtype);
    
      // Come here on failure
      __ push_ptr(rdx);
      // object is at TOS
      __ jump(ExternalAddress(Interpreter::_throw_ClassCastException_entry));
    
      // Come here on success
      __ bind(ok_is_subtype);
      __ mov(rax, rdx); // Restore object in rdx
    
      // Collect counts on whether this check-cast sees NULLs a lot or not.
      if (ProfileInterpreter) {
        __ jmp(done);
        __ bind(is_null);
        __ profile_null_seen(rcx);
      } else {
        __ bind(is_null);   // same as 'done'
      }
      __ bind(done);
    }
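
    The subtype check at the heart of that template can be pictured, very roughly, as a walk over the object's class hierarchy. The following is only a conceptual sketch in Java with hypothetical names; HotSpot's real gen_subtype_check is considerably more optimized and works on VM-internal metadata rather than java.lang.Class. The sketch assumes class and interface types only and does not handle primitives or array covariance:

```java
final class SubtypeWalk {
    // Returns true if sub is sup, or if sup appears among sub's
    // superclasses or (transitively) implemented interfaces.
    // Assumption: both arguments are class or interface types.
    static boolean isSubtypeOf(Class<?> sub, Class<?> sup) {
        if (sub == null) {
            return false; // walked past the top of the hierarchy
        }
        if (sub == sup || sup == Object.class) {
            return true;  // same type, or every reference type is an Object
        }
        for (Class<?> iface : sub.getInterfaces()) {
            if (isSubtypeOf(iface, sup)) {
                return true;
            }
        }
        return isSubtypeOf(sub.getSuperclass(), sup);
    }

    public static void main(String[] args) {
        System.out.println(isSubtypeOf(Integer.class, Number.class)); // true
        System.out.println(isSubtypeOf(Integer.class, String.class)); // false
    }
}
```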
    

    The simple non-production interpreter gives us another example, at bytecodeInterpreter.cpp line 2048. Here we can see what a sample compliant bytecode interpreter does when it reaches a checkcast:

      CASE(_checkcast):
          if (STACK_OBJECT(-1) != NULL) {
            VERIFY_OOP(STACK_OBJECT(-1));
            u2 index = Bytes::get_Java_u2(pc+1);
            if (ProfileInterpreter) {
              // needs Profile_checkcast QQQ
              ShouldNotReachHere();
            }
            // Constant pool may have actual klass or unresolved klass. If it is
            // unresolved we must resolve it
            if (METHOD->constants()->tag_at(index).is_unresolved_klass()) {
              CALL_VM(InterpreterRuntime::quicken_io_cc(THREAD), handle_exception);
            }
            Klass* klassOf = (Klass*) METHOD->constants()->slot_at(index).get_klass();
            Klass* objKlassOop = STACK_OBJECT(-1)->klass(); //ebx
            //
            // Check for compatibilty. This check must not GC!!
            // Seems way more expensive now that we must dispatch
            //
            if (objKlassOop != klassOf &&
                !objKlassOop->is_subtype_of(klassOf)) {
              ResourceMark rm(THREAD);
              const char* objName = objKlassOop->external_name();
              const char* klassName = klassOf->external_name();
              char* message = SharedRuntime::generate_class_cast_message(
                objName, klassName);
              VM_JAVA_ERROR(vmSymbols::java_lang_ClassCastException(), message);
            }
          } else {
            if (UncommonNullCast) {
                //              istate->method()->set_null_cast_seen();
                // [RGV] Not sure what to do here!
    
            }
          }
          UPDATE_PC_AND_CONTINUE(3);
    

    In a nutshell, it looks at the reference on top of the stack and, if that reference is non-null, gets the target class from the constant pool (resolving it if necessary) and checks whether the object is assignable to that class. If not, it gets the names of the object's type and of the class to which the cast was attempted, constructs an exception message, and throws a ClassCastException with that message. A null reference skips the check entirely, so casting null always succeeds. Oddly enough, the mechanism for throwing a ClassCastException is not the same as that used for the athrow bytecode (using VM_JAVA_ERROR instead of set_pending_exception).
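
    The same steps can be sketched in plain Java. This is only an illustration of the logic, not how the JVM implements it: the class and method names are hypothetical, and the message format is made up rather than HotSpot's actual wording.

```java
final class CheckcastSketch {
    // Mimics the checkcast steps: null passes, an assignable object is
    // returned unchanged, anything else raises ClassCastException.
    static Object checkcast(Object obj, Class<?> target) {
        if (obj == null) {
            return null; // a null reference passes any reference cast
        }
        if (!target.isInstance(obj)) {
            // Illustrative message only; not HotSpot's actual text.
            throw new ClassCastException(
                obj.getClass().getName() + " cannot be cast to " + target.getName());
        }
        return obj; // same reference; the object itself is never changed
    }

    public static void main(String[] args) {
        Object s = "hello";
        System.out.println(checkcast(s, CharSequence.class) == s); // true
        try {
            checkcast(s, Integer.class);
        } catch (ClassCastException e) {
            System.out.println("caught: " + e.getMessage());
        }
    }
}
```

    Note that nothing in these steps calls into the object being cast, which is why an instance has no way to notice or intercept the operation. The standard library's Class.cast(Object) performs a comparable check if you need one at run time.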

    Response to edit: it would be better to just use the type system and OOP principles instead of odd Java internals. Just have a Pipe class (extending Object) with getInputStream and getOutputStream methods, each of which returns an instance of a corresponding inner class (i.e. Pipe$PipeInputStream and Pipe$PipeOutputStream, both of which can access the private/protected state of Pipe).
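
    A minimal sketch of that design might look like the following. The buffering strategy is an assumption made for brevity: an unbounded in-memory queue, with read() returning -1 when the buffer is empty instead of blocking. A real pipe would need blocking reads and close handling.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.util.ArrayDeque;

final class Pipe {
    // Shared private state, accessible from both inner classes.
    private final ArrayDeque<Byte> buffer = new ArrayDeque<>();
    private final InputStream in = new PipeInputStream();
    private final OutputStream out = new PipeOutputStream();

    InputStream getInputStream() { return in; }
    OutputStream getOutputStream() { return out; }

    // Compiled as Pipe$PipeInputStream.
    private final class PipeInputStream extends InputStream {
        @Override
        public int read() {
            synchronized (buffer) {
                Byte b = buffer.pollFirst();
                // Simplification: an empty buffer is reported as end of stream.
                return b == null ? -1 : (b & 0xFF);
            }
        }
    }

    // Compiled as Pipe$PipeOutputStream.
    private final class PipeOutputStream extends OutputStream {
        @Override
        public void write(int b) {
            synchronized (buffer) {
                buffer.addLast((byte) b);
            }
        }
    }
}

final class PipeDemo {
    public static void main(String[] args) throws IOException {
        Pipe pipe = new Pipe();
        pipe.getOutputStream().write(new byte[] {1, 2, 3});
        InputStream in = pipe.getInputStream();
        System.out.println(in.read()); // 1
        System.out.println(in.read()); // 2
    }
}
```

    Callers get a genuine InputStream and a genuine OutputStream, so no cast is needed at all; the two views simply share the state of the enclosing Pipe.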