c · types · x86 · language-design · low-level

Why do we use explicit data types? (from a low-level point of view)


When we look at some fundamental data types, such as char and int, we know that a char is simply a byte (signed or unsigned depending on the language), an int is just a signed dword, a bool is just a char that can only be 1 or 0, etc. My question is: why do we use these types in compiled languages instead of just declaring a variable of type byte, dword, etc., since the operations applied to the types mentioned above are pretty much all the same, once you differentiate signed from unsigned data, and floating-point data?

To extend the context of the question: in the C language, if and while statements can take a boolean value as their condition, which is usually stored as a char, eliminating the need for an explicit boolean type.
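For example, this small C sketch (my own illustration; nothing in it relies on a dedicated boolean type) shows that the conditions of if and while are simply scalar values, where nonzero means true:

#include <stdio.h>

int main()
{
    char flag = 1;          /* a plain char used as a truth value     */
    int n = 42;             /* any nonzero scalar counts as true      */

    if (flag)               /* no dedicated boolean type required     */
        printf("flag is true\n");

    while (n)               /* loops until n becomes 0 (i.e. "false") */
        n -= 7;

    printf("n = %d\n", n);
    return 0;
}

//outputs: flag is true
//         n = 0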

In practice, the two pieces of code should be equivalent at the binary level:

#include <stdio.h>

int main()
{
    int x = 5;
    char y = 'c';
    printf("%d %c\n", x - 8, y + 1);
    return 0;
}

//outputs: -3 d

-

signed dword main()
{
    signed dword x = 5;
    byte y = 'c';
    printf("%d %c\n", x - 8, y + 1);
    return 0;
}

//outputs: -3 d

Solution

  • A programming language defines an "abstract" data model that a computer designer is free to implement as they see fit. For instance, nothing mandates storing a Boolean in a whole byte; it could be "packed" as a single bit alongside others (the bit-field sketch at the end of this answer illustrates the idea). And if you read the C standard carefully, you will notice that the number of bits in a char is not fixed; only a minimum is required.

    [Anecdotally, I recall a time long ago when FORTRAN variables, including integers and floats but also booleans, were stored in 72 bits on IBM machines.]

    Language designers should place as few constraints as possible on the machine architecture, to leave room for clever designs. In fact, languages have no "low level": they implicitly describe a virtual machine that is not tied to any particular hardware (it could be implemented with cogwheels and ropes).

    As far as I know, only the Ada language went as far as specifying in detail all the characteristics of the arithmetic, but even it stops short of enforcing a particular number of bits per word.


    Ignoring the boolean type was one of the saddest design decisions in the C language. It took until C99 to finally add one :-(

    Another sad decision was to stop treating int as the type that naturally fits in a machine word (it should have become 64 bits on current PCs); the sizes printed by the second sketch below show the current situation.
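    To illustrate the first point, here is a minimal sketch of my own (standard C only; the struct and field names are just for illustration): boolean flags can be packed into single bits with bit-fields, and CHAR_BIT from <limits.h> reports how many bits a char actually has on the current platform. The exact struct size is implementation-defined.

    #include <stdio.h>
    #include <limits.h>

    struct flags {
        unsigned ready : 1;   /* each "boolean" occupies a single bit... */
        unsigned error : 1;
        unsigned done  : 1;   /* ...packed together by the compiler      */
    };

    int main()
    {
        struct flags f = { 1, 0, 1 };

        printf("CHAR_BIT = %d\n", CHAR_BIT);              /* at least 8, not necessarily exactly 8 */
        printf("sizeof(struct flags) = %zu\n",
               sizeof(struct flags));                     /* commonly 4; three plain ints would be 12 */
        printf("ready=%u error=%u done=%u\n",
               (unsigned)f.ready, (unsigned)f.error, (unsigned)f.done);
        return 0;
    }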
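    And a sketch of the last two points (again my own illustration; the printed sizes are implementation-defined): C99's <stdbool.h> finally provides a real boolean type, and on a typical LP64 system (Linux or macOS on x86-64) int is still 4 bytes even though long and pointers are 8, i.e. int no longer matches the machine word.

    #include <stdio.h>
    #include <stdbool.h>   /* C99: bool, true, false */

    int main()
    {
        bool b = true;     /* a genuine boolean type at last */

        printf("b = %d\n", b);
        printf("sizeof(bool)   = %zu\n", sizeof(bool));
        printf("sizeof(int)    = %zu\n", sizeof(int));    /* 4 on common 64-bit PC ABIs    */
        printf("sizeof(long)   = %zu\n", sizeof(long));   /* 8 on LP64, 4 on Windows LLP64 */
        printf("sizeof(void *) = %zu\n", sizeof(void *)); /* 8 on a 64-bit machine         */
        return 0;
    }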