Is caching of boxed Byte objects not required by Java 13 SE spec?


Reading the Java SE 13 specification, I found the following guarantee in chapter 5, section 5.1.7 (Boxing Conversion):

If the value p being boxed is the result of evaluating a constant expression (§15.28) of type boolean, char, short, int, or long, and the result is true, false, a character in the range '\u0000' to '\u007f' inclusive, or an integer in the range -128 to 127 inclusive, then let a and b be the results of any two boxing conversions of p. It is always the case that a == b

I find it odd that values of type byte are left out from that wording.

For example, in code such as:

Byte b1=(byte)4;
Byte b2=(byte)4;
System.out.println(b1==b2);

Here we have a constant expression of type byte, and after boxing, b1 and b2 may or may not refer to the same object.

It actually works the same way without the cast:

Byte b1=4;

Here, we have a constant expression of type int in an assignment context. According to the spec:

A narrowing primitive conversion followed by a boxing conversion may be used if the variable is of type Byte, Short, or Character, and the value of the constant expression is representable in the type byte, short, or char respectively.

So the expression will be converted to byte, and that byte type value will be boxed, so there is no guarantee that the value is interned.
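
To make the two cases concrete, here is a minimal sketch that boxes the same constant twice in each form and compares the references. The class and method names are made up for illustration. The Java 13 wording guarantees neither result; on an implementation that happens to box through Byte.valueOf, both comparisons should come out true, but that is implementation behavior and not something the quoted text promises.

```java
public class ByteBoxingIdentity {

    // Boxes the constant expression (byte) 4 twice and compares the references.
    public static boolean castConstantsShareIdentity() {
        Byte b1 = (byte) 4;
        Byte b2 = (byte) 4;
        return b1 == b2; // reference comparison, not a value comparison
    }

    // Boxes the int constant 4 twice; in an assignment context it is
    // narrowed to byte first and then boxed.
    public static boolean intConstantsShareIdentity() {
        Byte b1 = 4;
        Byte b2 = 4;
        return b1 == b2;
    }

    public static void main(String[] args) {
        System.out.println("with cast:    " + castConstantsShareIdentity());
        System.out.println("without cast: " + intConstantsShareIdentity());
    }
}
```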

My question is: am I interpreting the spec correctly, or am I missing something? I checked whether the spec requires boxing to go through Byte.valueOf() (in which case the caching would be guaranteed), but it does not.


Solution

  • TL;DR: this has been fixed in JDK 14, whose specification now includes byte.

    I consider this a specification bug, the result of multiple rewrites.

    Note the text of the JLS 6 counterpart:

    If the value p being boxed is true, false, a byte, a char in the range \u0000 to \u007f, or an int or short number between -128 and 127, then let r1 and r2 be the results of any two boxing conversions of p. It is always the case that r1 == r2.

    Here, byte is explicitly mentioned as being boxed to an object with canonical identity, unconditionally. Since all byte values are in the -128..127 range, there was no need to add such a restriction.

    But note that long has not been mentioned.

    Then, meet JDK-7190924, 5.1.7: JLS does not mention caching of autoboxed longs

    In the comments, you can see how it happened.

    In his first comment, Alex Buckley criticizes that "byte is a type, not a value", not considering that "byte" could mean "all values in the byte range", but since he also assumes that "number" originally meant "literal" (instead of, e.g. "numeric value"), he focuses on the point that all integer literals are either int or long.

    His first draft uses the term "integer literal" and removes the types completely. A slightly modified version of it made it into the Java 8 JLS:

    If the value p being boxed is an integer literal of type int between -128 and 127 inclusive (§3.10.1), or the boolean literal true or false (§3.10.3), or a character literal between '\u0000' and '\u007f' inclusive (§3.10.4), then let a and b be the results of any two boxing conversions of p. It is always the case that a == b.

    So in Java 8, the type doesn't matter at all, but the guarantee is limited to literals.

    So this would imply that

    Byte b1 = 4;
    

    does evaluate to a canonical object due to the integer literal, whereas

    Byte b1 = (byte)4;
    

    may not, as (byte)4 is a constant expression but not a literal.

    In his next comment, years later, he considers "constant expressions", which can indeed be typed, and reformulates the phrase, bringing back the types, "boolean, char, short, int, or long", having added long, but forgotten about "byte".

    This resulting phrase is what you've cited, and it has been in the specification since Java 9.

    The omission of byte surely isn't intentional, as there is no plausible reason to omit it, especially since it was there before, so this would be a breaking change when taken literally.

    Though, restricting the caching to compile-time constants, when JLS 6 specified it for all values in the range without such a restriction, is already a breaking change (which doesn't matter in practice, as long as it is implemented via valueOf, which has no way of knowing whether the value originated from a compile-time constant or not).

    As a side note, the documentation of Byte.valueOf(byte) explicitly says:

    ...all byte values are cached

    at least since Java 7.
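
The point about valueOf can be checked with a small sketch. It boxes every byte value from a non-constant expression and compares the result with Byte.valueOf(byte); the class and method names are hypothetical. It relies only on the documented caching of Byte.valueOf and on the assumption, not required by the spec text discussed here, that the compiler implements autoboxing through that method.

```java
public class ByteCacheCheck {

    // Boxes a runtime (non-constant) value once via autoboxing and once via
    // Byte.valueOf, then compares the two references.
    public static boolean boxesToSameObject(byte value) {
        Byte viaAutoboxing = value;
        Byte viaValueOf = Byte.valueOf(value);
        return viaAutoboxing == viaValueOf;
    }

    // Runs the comparison for all 256 byte values.
    public static boolean allValuesCanonical() {
        for (int i = Byte.MIN_VALUE; i <= Byte.MAX_VALUE; i++) {
            if (!boxesToSameObject((byte) i)) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println("all byte values canonical: " + allValuesCanonical());
    }
}
```

Because the argument is a method parameter, none of the boxed values here are compile-time constants, which is exactly the case the Java 9 to 13 wording leaves unspecified.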