Tags: c++, c, variables, types, int

Why is it bad to use short?


It is very common, even in code where the developer has guarantees that a variable will never exceed one byte (or at most two), for people to use int for every variable that represents a number, even one only ever in the range 0-1.

Why does it hurt so much to use char or short instead?

I think I heard someone say that int is a "more standard" kind of type. What does this mean? My question is: does the data type int have defined advantages over short (or other smaller data types) that explain why people almost always resort to int?


Solution

  • As a general rule, most arithmetic in C is performed using type int (that is, plain int, not short or long). This is because (a) the definition of C says so, which is related to the fact that (b) that's the way many processors (at least, the ones C's designers had in mind) prefer to work.

    So if you try to "save space" by using short ints instead, and you write something like

    short a = 1, b = 2;
    short c = a + b;
    

    the compiler may have to emit code to, in effect, convert a from short to int, convert b from short to int, do the addition, and convert the sum back to short. You may have saved a little bit of space on the storage for a, b, and c, but your code may end up being bigger (and slower).
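
    Put differently, the short version behaves as if the promotions had been written out by hand. The casts in the following sketch are redundant (the compiler performs these conversions implicitly); they are only there to make the steps visible:

    short a = 1, b = 2;
    /* the usual arithmetic conversions: both operands are promoted
       to int, the addition is done in int, and the sum is converted
       back to short for the assignment */
    short c = (short)((int)a + (int)b);
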

    If you instead write

    int a = 1, b = 2;
    int c = a + b;
    

    you might spend a little more storage space on a, b, and c, but the code might be smaller and quicker.

    This is somewhat of an oversimplified argument, but it's behind your observation that usage of type short is rare, and plain int is generally recommended. Basically, since it's the machine's "natural" size, it's presumed to be the most straightforward type to do arithmetic in, without extra conversions to and from less-natural types. It's sort of a "When in Rome, do as the Romans do" argument, but it generally does make using plain int advantageous.

    If you have lots of not-so-large integers to store, on the other hand (a large array of them, or a large array of structures containing not-so-large integers), the storage savings for the data might be large, and worth it as traded off against the (relatively smaller) increase in code size and any potential speed penalty.
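
    To put a rough number on the storage side, here is a small self-contained check (the struct names are made up for illustration; the exact sizes depend on your platform's short and int widths and on padding):

    #include <stdio.h>

    /* the same three not-so-large fields, stored as short vs. as int */
    struct rec_short { short x, y, z; };
    struct rec_int   { int   x, y, z; };

    int main(void)
    {
        /* with a typical 2-byte short and 4-byte int, a million
           records cost roughly 6 MB vs. 12 MB */
        printf("rec_short: %zu bytes each, %zu for a million\n",
               sizeof(struct rec_short), sizeof(struct rec_short) * 1000000);
        printf("rec_int:   %zu bytes each, %zu for a million\n",
               sizeof(struct rec_int), sizeof(struct rec_int) * 1000000);
        return 0;
    }
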

    See also this previous SO question and this C FAQ list entry.


    Addendum: like any optimization problem, if you really care about data space usage, code space usage, and code speed, you'll want to perform careful measurements using your exact machine and processor. Your processor might not end up requiring any "extra conversion instructions" to convert to/from the smaller types, after all, so using them might not be so much of a disadvantage. But at the same time you can probably confirm that, for isolated variables, using them might not yield any measurable advantage, either.


    Addendum 2. Here's a data point. I experimented with the code

    extern short a, b, c;
    
    void f()
    {
        c = a + b;
    }
    

    I compiled with two compilers, gcc and clang (compiling for an Intel processor on a Mac). I then changed short to int and compiled again. The int-using code was 7 bytes smaller under gcc, and 10 bytes smaller under clang. Inspection of the assembly language output suggests that the difference was in truncating the result so as to store it in c; fetching short as opposed to int doesn't seem to change the instruction count.
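
    For comparison, the int version of the experiment is the same function with only the declarations changed:

    extern int a, b, c;

    void f()
    {
        c = a + b;
    }
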

    However, I then tried calling the two different versions, and discovered that it made virtually no difference in the run time, even after 10000000000 calls. So the "using short might make the code bigger" part of the answer is confirmed, but maybe not "and also make it slower".
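
    If you want to reproduce the timing part, a minimal harness along these lines will do. The names f_short and f_int are hypothetical stand-ins for the two versions above, and for a meaningful measurement you'd want to verify that the compiler doesn't collapse or hoist the repeated calls:

    #include <stdio.h>
    #include <time.h>

    short sa = 1, sb = 2, sc;
    int   ia = 1, ib = 2, ic;

    /* hypothetical stand-ins for the two compiled versions of f() */
    void f_short(void) { sc = sa + sb; }
    void f_int(void)   { ic = ia + ib; }

    int main(void)
    {
        const long N = 1000000000L;   /* adjust the call count to taste */
        clock_t t0 = clock();
        for (long i = 0; i < N; i++)
            f_short();
        clock_t t1 = clock();
        for (long i = 0; i < N; i++)
            f_int();
        clock_t t2 = clock();

        printf("short: %.2fs  int: %.2fs\n",
               (double)(t1 - t0) / CLOCKS_PER_SEC,
               (double)(t2 - t1) / CLOCKS_PER_SEC);
        return 0;
    }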