java memory-management compiler-construction bytecode relative-addressing

Compiler-generated relative addresses and how they are represented in (preferably Java) bytecode?


When address binding is not possible at compile time, it is done at load/link time or at run time, to associate relative addresses (or perhaps we can call them relocatable addresses) with actual physical ones. In addition, the CPU converts those relative addresses to logical ones before they are bound to physical addresses.

Converting from logical to physical addresses is a concept I know. But I am confused about relative addressing (AFAIK, these addresses are called relative because the compiler assigns them relative to zero). I'm not sure what relative addresses are used for in bytecode, whether they are really needed, or whether they are even identical to logical addresses.


Solution

  • You are mixing up a lot of concepts. A relative address is just an address that needs a base address to be converted to an absolute address. That conversion can happen in several ways. One way is to convert the addresses at load time, but they may also be used with CPU instructions that intrinsically support relative addressing and do the math right when the memory location is accessed.

    If an operating system supports virtual memory, all addresses used within an ordinary process are logical ones, whether they are used as relative or absolute addresses. The conversion from logical to physical addresses is outside the application’s scope and independent of any other concept you are referring to in your question.

    The class file format does not operate in terms of memory locations.

    If you want to apply the terms “absolute” and “relative” at that higher level, constant pool indices are absolute, as they don’t require a base index to identify the actual item. Still, to find the memory location of an item within the loaded file, you not only need the address at which the class file was loaded, you also have to parse the entire constant pool up to the desired item, because constant pool items have different byte sizes. For that reason, items are usually not looked up in the file at all. Instead, the entire pool is converted in a single pass to a JVM-specific representation with constant item sizes, and later the JVM looks up entries in that table, which is independent of the class file’s memory location.

    Within bytecode instructions, relative offsets are used for branch targets, which require adding the current instruction’s position to get an absolute position, but note how this doesn’t fit the concepts named in your question. The absolute positions are still positions within an instruction sequence and hence, in terms of addresses, relative to the memory location of the code. Further, relative offsets are not used because “binding is not possible at compile time”; the resulting absolute positions are known at compile time. The Java bytecode instruction set is simply defined to use relative offsets to allow more compact code. From the instruction set’s perspective, we could say that it intrinsically supports relative addressing. How the JVM actually implements execution is up to the JVM.

    Since you mentioned the JVM’s native code generation: when a JVM generates native code, it knows the target address of the code and can freely decide to use relative or absolute addresses, whichever fits.

    As already mentioned, everything described above happens within one process, so if the operating system uses virtual memory, it’s all in terms of logical addresses, which the operating system may translate, e.g. via the MMU. These concepts are unrelated.
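To illustrate the point about constant pool items having different byte sizes, here is a minimal sketch that walks the constant pool of a class file and records the byte position of each item. The class and method names (`ConstantPoolOffsets`, `entryOffsets`) are made up for this example, and the input is a tiny hand-built header rather than a real class file. It only skips over items using the sizes from the class file format; it is not how any particular JVM is implemented.

```java
import java.nio.ByteBuffer;
import java.util.Arrays;

class ConstantPoolOffsets {
    /**
     * Returns offsets[i] = byte position of constant pool item i within the
     * class file. Index 0 and the second slot of a long/double stay 0.
     */
    static int[] entryOffsets(byte[] classFile) {
        ByteBuffer buf = ByteBuffer.wrap(classFile); // big-endian by default, like class files
        if (buf.getInt() != 0xCAFEBABE) {
            throw new IllegalArgumentException("not a class file");
        }
        buf.getShort();                      // minor_version
        buf.getShort();                      // major_version
        int count = buf.getShort() & 0xffff; // constant_pool_count
        int[] offsets = new int[count];
        for (int i = 1; i < count; i++) {
            offsets[i] = buf.position();
            int tag = buf.get() & 0xff;
            int size; // number of bytes following the tag byte
            switch (tag) {
                case 1: // Utf8: u2 length, then that many bytes
                    size = 2 + (buf.getShort(buf.position()) & 0xffff);
                    break;
                case 3: case 4: // Integer, Float
                    size = 4;
                    break;
                case 5: case 6: // Long, Double
                    size = 8;
                    break;
                case 7: case 8: case 16: case 19: case 20: // Class, String, MethodType, Module, Package
                    size = 2;
                    break;
                case 9: case 10: case 11: case 12: case 17: case 18: // member refs, NameAndType, (Invoke)Dynamic
                    size = 4;
                    break;
                case 15: // MethodHandle
                    size = 3;
                    break;
                default:
                    throw new IllegalArgumentException("unknown tag " + tag);
            }
            buf.position(buf.position() + size);
            if (tag == 5 || tag == 6) {
                i++; // long and double items occupy two index slots
            }
        }
        return offsets;
    }

    public static void main(String[] args) {
        // Hand-built fragment: header plus three items (Utf8 "Hi", Class #1, Integer 42).
        byte[] fragment = {
            (byte) 0xCA, (byte) 0xFE, (byte) 0xBA, (byte) 0xBE, // magic
            0, 0, 0, 52,                                        // minor, major
            0, 4,                                               // constant_pool_count
            1, 0, 2, 'H', 'i',                                  // #1 Utf8
            7, 0, 1,                                            // #2 Class
            3, 0, 0, 0, 42                                      // #3 Integer
        };
        System.out.println(Arrays.toString(entryOffsets(fragment))); // [0, 10, 15, 18]
    }
}
```

The index `2` is absolute in the sense used above, yet its byte position (15 here) can only be found by stepping over every item before it, which is why a one-pass conversion to a fixed-size table is the natural choice.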
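The relative offsets inside bytecode instructions can be sketched in the same way. The example below assumes a hand-assembled method body for something like `static boolean isZero(int x) { return x == 0; }` and resolves the targets of its `ifne` and `goto` instructions, each of which carries a signed 16-bit offset relative to the position of its own opcode. The class name `BranchTargets` and the helper `branchTarget` are hypothetical names for this illustration.

```java
class BranchTargets {
    /**
     * Resolves the target of a branch instruction with a 2-byte operand
     * (such as ifne or goto): position of the opcode plus the signed offset.
     */
    static int branchTarget(byte[] code, int pc) {
        int offset = (short) (((code[pc + 1] & 0xff) << 8) | (code[pc + 2] & 0xff));
        return pc + offset;
    }

    public static void main(String[] args) {
        byte[] code = {
            0x1a,                   // 0: iload_0
            (byte) 0x9a, 0x00, 0x07, // 1: ifne +7
            0x04,                   // 4: iconst_1
            (byte) 0xa7, 0x00, 0x04, // 5: goto +4
            0x03,                   // 8: iconst_0
            (byte) 0xac             // 9: ireturn
        };
        System.out.println("ifne at 1 -> " + branchTarget(code, 1)); // ifne at 1 -> 8
        System.out.println("goto at 5 -> " + branchTarget(code, 5)); // goto at 5 -> 9

        // A backward branch uses a negative offset: nop, nop, goto -2.
        byte[] loop = {0x00, 0x00, (byte) 0xa7, (byte) 0xff, (byte) 0xfe};
        System.out.println("goto at 2 -> " + branchTarget(loop, 2)); // goto at 2 -> 0
    }
}
```

Note that the resolved targets (8, 9, 0) are positions within the method's code array, all of which the compiler already knows. Where that array ends up in memory plays no role in the calculation.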