What happens under the hood when you cast an object to a specific class, as in casted = (NewClass)obj;? I'm guessing the JVM somehow checks if the actual class of obj is a subclass of NewClass, but would there be a way for an object instance to know when it is being "casted"?
Pointers to the documentation/FAQ of some JVM implementations are also welcome, as I haven't been able to find any...
EDIT as to "why an object should know when it is being casted?":
I was recently thinking about implementing a kind of Pipe that would be both an InputStream and an OutputStream. As these are classes and not interfaces, it cannot be both (a Java class cannot extend multiple classes), so I was wondering if there might be a way for an object to show a different view of itself through a somehow-interceptable casting operation, like C++ cast operators.
Not that I wanted to implement it anyway (well, I would have, for testing and as a fun hack ;)), because it would be way too dangerous and would allow for all kinds of crazy abuses and misuses.
The JVM has a bytecode, checkcast, which is used to check if a cast can be validly performed. The actual cast check semantics are described in the JLS §5.5.3, and the details of the checkcast bytecode are described in the JVM spec §6.5. As an example,
    public static void main(String args[]) {
        Number n = Integer.valueOf(66); // Explicit boxing (autoboxing compiles to this same call)
        incr((Integer) n);
        System.out.println(n);
    }
produces
    public static void main(java.lang.String[]);
      Code:
         0: bipush 66
         2: invokestatic #3 // Method java/lang/Integer.valueOf:(I)Ljava/lang/Integer;
         5: astore_1
         6: aload_1
         7: checkcast #4 // class java/lang/Integer
        10: invokestatic #5 // Method incr:(Ljava/lang/Integer;)V
        13: getstatic #6 // Field java/lang/System.out:Ljava/io/PrintStream;
        16: aload_1
        17: invokevirtual #7 // Method java/io/PrintStream.println:(Ljava/lang/Object;)V
        20: return
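To see the observable behavior of that checkcast instruction from the Java side, here is a minimal sketch. The class name CastDemo and the helper castsToInteger are made up for illustration; the only thing being shown is that a cast of a compatible reference or of null succeeds, while an incompatible reference triggers a ClassCastException.

```java
class CastDemo {
    // Hypothetical helper: returns true if the cast succeeds,
    // false if the checkcast emitted for it rejects the reference.
    static boolean castsToInteger(Object o) {
        try {
            Integer i = (Integer) o; // javac emits a checkcast here
            return true;
        } catch (ClassCastException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(castsToInteger(Integer.valueOf(66))); // true
        System.out.println(castsToInteger(null));                // true: null always passes
        System.out.println(castsToInteger("66"));                // false: String is not an Integer
    }
}
```

Note that nothing in the object itself runs during the cast: there is no method on Integer or String that gets invoked, which is why an instance has no way to observe that it is being cast.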
Additionally, by delving into Hotspot's source code we can see two implementations of checkcast, one used in production and another used for simple testing and early ports.
First shown is the production template-based interpreter (thanks to apangin for making me aware of it) which generates code that corresponds to a null check of the reference to be cast-checked, the loading of the class information, a call to a subtype check, and a possible jump to code that throws a ClassCastException:
    void TemplateTable::checkcast() {
      transition(atos, atos);
      Label done, is_null, ok_is_subtype, quicked, resolved;
      __ testptr(rax, rax); // object is in rax
      __ jcc(Assembler::zero, is_null);
      // Get cpool & tags index
      __ get_cpool_and_tags(rcx, rdx); // rcx=cpool, rdx=tags array
      __ get_unsigned_2_byte_index_at_bcp(rbx, 1); // rbx=index
      // See if bytecode has already been quicked
      __ cmpb(Address(rdx, rbx,
                      Address::times_1,
                      Array<u1>::base_offset_in_bytes()),
              JVM_CONSTANT_Class);
      __ jcc(Assembler::equal, quicked);
      __ push(atos); // save receiver for result, and for GC
      call_VM(noreg, CAST_FROM_FN_PTR(address, InterpreterRuntime::quicken_io_cc));
      // vm_result_2 has metadata result
      __ get_vm_result_2(rax, r15_thread);
      __ pop_ptr(rdx); // restore receiver
      __ jmpb(resolved);
      // Get superklass in rax and subklass in rbx
      __ bind(quicked);
      __ mov(rdx, rax); // Save object in rdx; rax needed for subtype check
      __ movptr(rax, Address(rcx, rbx,
                             Address::times_8, sizeof(ConstantPool)));
      __ bind(resolved);
      __ load_klass(rbx, rdx);
      // Generate subtype check. Blows rcx, rdi. Object in rdx.
      // Superklass in rax. Subklass in rbx.
      __ gen_subtype_check(rbx, ok_is_subtype);
      // Come here on failure
      __ push_ptr(rdx);
      // object is at TOS
      __ jump(ExternalAddress(Interpreter::_throw_ClassCastException_entry));
      // Come here on success
      __ bind(ok_is_subtype);
      __ mov(rax, rdx); // Restore object in rdx
      // Collect counts on whether this check-cast sees NULLs a lot or not.
      if (ProfileInterpreter) {
        __ jmp(done);
        __ bind(is_null);
        __ profile_null_seen(rcx);
      } else {
        __ bind(is_null); // same as 'done'
      }
      __ bind(done);
    }
The simple non-production interpreter gives us another example, at bytecodeInterpreter.cpp line 2048. We can actually see what happens in a sample compliant bytecode interpreter when a checkcast is reached:
    CASE(_checkcast):
        if (STACK_OBJECT(-1) != NULL) {
          VERIFY_OOP(STACK_OBJECT(-1));
          u2 index = Bytes::get_Java_u2(pc+1);
          if (ProfileInterpreter) {
            // needs Profile_checkcast QQQ
            ShouldNotReachHere();
          }
          // Constant pool may have actual klass or unresolved klass. If it is
          // unresolved we must resolve it
          if (METHOD->constants()->tag_at(index).is_unresolved_klass()) {
            CALL_VM(InterpreterRuntime::quicken_io_cc(THREAD), handle_exception);
          }
          Klass* klassOf = (Klass*) METHOD->constants()->slot_at(index).get_klass();
          Klass* objKlassOop = STACK_OBJECT(-1)->klass(); //ebx
          //
          // Check for compatibilty. This check must not GC!!
          // Seems way more expensive now that we must dispatch
          //
          if (objKlassOop != klassOf &&
              !objKlassOop->is_subtype_of(klassOf)) {
            ResourceMark rm(THREAD);
            const char* objName = objKlassOop->external_name();
            const char* klassName = klassOf->external_name();
            char* message = SharedRuntime::generate_class_cast_message(
              objName, klassName);
            VM_JAVA_ERROR(vmSymbols::java_lang_ClassCastException(), message);
          }
        } else {
          if (UncommonNullCast) {
            // istate->method()->set_null_cast_seen();
            // [RGV] Not sure what to do here!
          }
        }
        UPDATE_PC_AND_CONTINUE(3);
In a nutshell, it grabs the argument off the stack, gets the target class (a Klass) from the constant pool (resolving it if necessary), and checks if the argument is assignable to that class. A null reference skips the check entirely and always passes. If the check fails, it gets the names of the object's type and of the class to which the cast was attempted, constructs an exception message, and throws a ClassCastException with that message. Oddly enough, the mechanism for throwing a ClassCastException is not the same as that used for the athrow bytecode (using VM_JAVA_ERROR instead of set_pending_exception).
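The same steps can be mirrored in plain Java with reflection. This is only a rough sketch of the logic, not how the JVM implements it: the class CheckcastSketch and its checkcast method are hypothetical names, and the exception message format is illustrative rather than the one HotSpot produces.

```java
final class CheckcastSketch {
    // Hypothetical helper that follows the interpreter's steps:
    // null passes, identical class passes, subtypes pass, anything else throws.
    static Object checkcast(Object ref, Class<?> target) {
        if (ref == null) {
            return null; // a null reference is never rejected
        }
        Class<?> actual = ref.getClass();
        if (actual != target && !target.isAssignableFrom(actual)) {
            // Illustrative message only; the real text varies by JVM version.
            throw new ClassCastException(
                actual.getName() + " cannot be cast to " + target.getName());
        }
        return ref; // the reference itself is returned unchanged
    }

    public static void main(String[] args) {
        Object s = "hello";
        System.out.println(checkcast(s, CharSequence.class) == s); // true
        System.out.println(checkcast(null, Integer.class));        // null
        try {
            checkcast(s, Integer.class);
        } catch (ClassCastException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

The standard library's Class.cast(Object) performs an equivalent check, so in real code that is the method to reach for. In both cases the object is returned as-is: a cast never converts or wraps the instance, it only verifies its type.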
Response to edit: it would be better to just use the type system and OOP principles instead of odd Java internals. Just have a Pipe class (extending Object) that has a getInputStream and a getOutputStream method, each of which returns an instance of a corresponding inner class (i.e. Pipe$PipeInputStream and Pipe$PipeOutputStream, both of which access the private/protected state of Pipe).
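A minimal sketch of that design might look like the following. It makes several simplifying assumptions that are not part of the suggestion above: the buffer is an unbounded in-memory queue, nothing blocks, it is not thread-safe, and an empty buffer is reported as end of stream (-1). The inner class names follow the ones mentioned above; everything else is illustrative.

```java
import java.io.InputStream;
import java.io.OutputStream;
import java.util.ArrayDeque;

class Pipe {
    // Private state shared by both inner classes (assumption: unbounded, single-threaded).
    private final ArrayDeque<Byte> buffer = new ArrayDeque<>();

    private final InputStream in = new PipeInputStream();
    private final OutputStream out = new PipeOutputStream();

    public InputStream getInputStream() { return in; }
    public OutputStream getOutputStream() { return out; }

    // Compiles to Pipe$PipeInputStream; reads what the output side wrote.
    private class PipeInputStream extends InputStream {
        @Override
        public int read() {
            Byte b = buffer.poll();
            // Simplification: an empty buffer is treated as end of stream.
            return b == null ? -1 : (b & 0xFF);
        }
    }

    // Compiles to Pipe$PipeOutputStream; appends bytes to the shared buffer.
    private class PipeOutputStream extends OutputStream {
        @Override
        public void write(int b) {
            buffer.add((byte) b); // only the low 8 bits are kept, as OutputStream specifies
        }
    }
}
```

Callers then pick the view they need with pipe.getInputStream() or pipe.getOutputStream(), so no cast is involved at all. For a blocking, thread-safe pipe, the JDK already ships java.io.PipedInputStream and java.io.PipedOutputStream, which are separate connected objects rather than one class playing both roles.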