java, numbers, compiler-construction

Why do we use f and L for float and long numbers in Java, but not b and s for byte and short?


Why do we use f and L for float and long while we don't use b and s for byte and short?

I asked ChatGPT and it said that since byte and short can fit in an int there is no ambiguity. But if we look at it that way, a float can also fit in a double.
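
For example (a minimal sketch; the class and variable names are arbitrary):

    class SuffixDemo {
        public static void main(String[] args) {
            long big = 10_000_000_000L;  // needs the L suffix: without it the literal is an out-of-range int
            float f  = 1.5f;             // needs the f suffix: without it 1.5 is a double literal
            byte b   = 5;                // no suffix exists or is needed: the int literal 5 simply fits
            short s  = 300;              // same for short
            System.out.println(big + " " + f + " " + b + " " + s);
        }
    }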

Note: maybe the designers just wanted the compiler to work this way. If there is no other answer, I will take this as one.


Solution

  • It's one of those endless games of 'why?'.

    So, in rough order of 'why':

    1. Because the Java Language Specification says so.

    (why does the spec say so?)

    1. Because there are such things as 'long', 'float', 'int', and 'double' literals, but there are no short and byte literals. Instead, the language spec states that if you attempt to assign an int literal directly to a byte variable and it fits, then it's okay, no cast needed, i.e. byte b = 5; compiles. Only because the spec says so. Note that if you have a method println(byte b) as well as println(int b) and you call println(5), you call the int variant. You can't call the byte variant other than with a cast (println((byte) 5)) or by using a variable (byte b = 5; println(b)). That's what '5 is an int literal' means in this context.
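
    Here is a minimal sketch of that overload behaviour; the two println overloads are hypothetical methods defined just for this example (java.io.PrintStream has no println(byte)):

        class IntLiteralDemo {
            static void println(int x)  { System.out.println("int variant: " + x); }
            static void println(byte x) { System.out.println("byte variant: " + x); }

            public static void main(String[] args) {
                byte b = 5;         // fine: the int literal 5 fits in a byte, so no cast is needed

                println(5);         // prints "int variant: 5"  -- the literal 5 is an int
                println((byte) 5);  // prints "byte variant: 5" -- only a cast selects the byte overload
                println(b);         // prints "byte variant: 5" -- so does a byte variable
            }
        }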

    (but why are there no byte and short literals?)

    1. Because at the class file/JVM level, there are the major primitives (long, float, double, and int), vs the minor primitives (byte, short, char, and boolean): At the bytecode level, there are almost no instructions for the minors, only for the majors. There is FADD to add 2 floats together. Similarly there's IADD, LADD, and DADD (for int/long/double). However, BADD (add 2 bytes together, I guess) does not exist (there is also no CADD or SADD to add chars or shorts together). In fact, byte b = 5, c = 10; byte d = b + c; does not compile; the compiler complains that this is trying to assign an 'int' to a byte - evidently, the result of adding 2 bytes together is, somehow, an int. This is because at the bytecode level it really does work like that: it would load in the 2 bytes, upconvert them both to an int, add the two ints together, and if you really want to assign the result back to a byte, it would then downcast the int back to a byte. A big song and dance number for something so simple. Given that byte/short/char/boolean are the 'minor primitives' and almost all operations you can do on them require this complexity (with the minors, everything other than declaring them and reading/writing them out of/into an array requires this int round trip), presumably the language designers decided to take the footgun away from you and remove the concept of a 'byte' and 'short' literal entirely. Just to be sure.
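
    A small sketch of that compile-time behaviour (the class name is just for the example):

        class MinorPrimitiveDemo {
            public static void main(String[] args) {
                byte b = 5, c = 10;

                // byte d = b + c;        // does not compile: the result of b + c is an int, not a byte

                int sum = b + c;          // fine: b and c are widened to int, so the sum really is an int
                byte d = (byte) (b + c);  // compiles only with an explicit cast back down to byte

                System.out.println(sum + " " + d);  // prints "15 15"
            }
        }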

    (But hold on - there are char and boolean literals!)

    1. Yeah, there sure are; 'c' is a char, never an int (it will silently upgrade itself to an int if needed, but if you have println(int x) and println(char x), calling println('c') will call the char version, thus proving it's a char literal) - same goes for boolean (which doesn't even upgrade itself to an int under any circumstance, not even with a cast; a boolean cannot be turned into any other kind of value). Even though char and boolean are minor primitive types. I guess the designers felt that, whilst 5-the-int and 5-the-byte are similar enough that equating them is fine, decreeing by language specification that 'c' is simply a different way to write 99 (99 is the Unicode value of 'c') was too convoluted. So, on the basis of a cost v benefit analysis, adding the concept of short and byte literals didn't make the cut, but char and boolean literals did. If char and boolean literals didn't exist (and 'c' was just a weird way to write 99), that would mean println('c') prints 99, or println(99) prints 'c' - one of the two would have to be true. They are both ridiculous, so, to avoid ridiculousness, char literals were required. Similarly, println(true) printing a '1' would be confusing (for example, in most linux scripting systems, 0 is true and non-zero is false; trying to equate 1 to true and 0 to false leads to confusion, so the lang designers presumably did not want to do that, which therefore means boolean literals are required). Also, the 'cost' of adding char and boolean literals is cheap - nobody is going to be confused about what true means. In contrast, 5B to try to indicate 'that is 5 as a byte value' is confusing, given that B is also a hex digit.
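
    And a sketch of the char/boolean side; again, the println overloads are hypothetical methods defined only for this example:

        class CharLiteralDemo {
            static void println(int x)  { System.out.println("int variant: " + x); }
            static void println(char x) { System.out.println("char variant: " + x); }

            public static void main(String[] args) {
                println('c');           // prints "char variant: c" -- 'c' is a char literal, not an int
                println((int) 'c');     // prints "int variant: 99" -- 99 is the Unicode value of 'c'

                int i = 'c';            // fine: a char silently widens to int when needed
                // int j = true;        // does not compile: a boolean never converts to a number,
                // int k = (int) true;  // not even with a cast
                System.out.println(i);  // prints 99
            }
        }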

    (But why are there minor primitives in the first place?)

    1. Because computer systems like operating on some minimum-sized batch of data. This is called the 'word size' and it depends on your system's architecture. However, when java was designed, most systems had a word size of 32-bit or higher, and the few systems around with less than that were unlikely to even be able to run java (e.g. they lacked what's needed for multi-threading, and it was decided that java would have threading baked into the core libraries from the get-go, meaning java could not run on systems that don't support that - and the overlap of 16-bit-word-size archs and cannot-do-threading archs as a venn diagram? Pretty much a circle). In other words, java was overwhelmingly unlikely to run on a sub-32-bit word-size arch, so there is no point to a hypothetical BADD (add 2 bytes) opcode; it would not be faster (in fact, it would be quite a bit slower: modern CPU design generally optimizes the common paths, and 'add 2 bytes' is not common). These days most arch is 64-bit, so in many ways one could say java is 'misdesigned' and should relegate ints and floats to the minors. But that would be backwards incompatible.

    (But why though?)

    1. Wellll, it all started with the big ban... oh, perhaps we can end it here.