c, char, gdb, non-ascii-characters

ASCII character 14 (and others) when assigned in string in C


Debugging the following code in GDB:

char* input = malloc(2);
input[0] = 14;
input[1] = 0;

The value of the string in memory according to GDB is:

input = "\016"

Likewise,

char* input = malloc(2);
input[0] = 16;
input[1] = 0;

input in memory is "\020".

Why is this the case? Why doesn't ASCII value 14 map to \014? And why does ASCII value 16 map to \020 in memory?

EDIT: To add further confusion, using the following code:

char* input = malloc(2);
input[0] = 20;
input[1] = 0;

By running the above code segment in gdb and using the following command:

p input

the resulting value that is printed is:

$1 = 0x604010 "\020"

This leads me to believe that the value of the string input is "\020".

Two ASCII numbers map to the same contents "\020" (namely 16 and 20).


Solution

  • 14 is written 016 in octal (base eight), since 1×8 + 6 = 14. GDB displays non-printable characters using C's octal escape syntax, which is why you see "\016". The '\016' syntax uses octal for historical reasons: computers from the 1960s often packed 6-bit characters into 12-bit, 18-bit or even 36-bit words, and octal digits, each standing for a group of 3 bits, were a natural fit for those word sizes.

    Traces of this can still be found in the C syntax for character and string constants (borrowed from C by many languages) and in the permission flags of Unix file systems (e.g. the arguments to chmod and umask).

    16 is '\020', 20 is '\024' and 32 (ASCII space) is '\040' or '\x20'.
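
To check the conversions above without a debugger, here is a minimal sketch in C. The names format_octal_escape and check_examples are hypothetical helpers written for this illustration; they are not part of GDB or the C library, and the sketch assumes an ASCII execution character set. It formats a byte with printf's %03o conversion, which should produce the same three-digit octal escape that GDB shows for non-printable characters.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper (not a GDB or libc function): writes `value` as a
   backslash followed by three octal digits, e.g. 14 -> \016. */
void format_octal_escape(unsigned char value, char *out, size_t out_size)
{
    snprintf(out, out_size, "\\%03o", (unsigned int)value);
}

/* Checks the decimal/octal pairs quoted in the answer. */
void check_examples(void)
{
    char buf[8];

    format_octal_escape(14, buf, sizeof buf);
    assert(strcmp(buf, "\\016") == 0);   /* 14 = 1*8 + 6 */

    format_octal_escape(16, buf, sizeof buf);
    assert(strcmp(buf, "\\020") == 0);   /* 16 = 2*8 + 0 */

    format_octal_escape(20, buf, sizeof buf);
    assert(strcmp(buf, "\\024") == 0);   /* 20 = 2*8 + 4 */

    /* Octal and hex escapes are two spellings of the same byte.
       Assumes ASCII, where space is 32. */
    assert('\040' == '\x20');
    assert('\x20' == 32);
}
```

Calling check_examples() from main should pass silently. In particular, the third check suggests that a byte holding decimal 20 would be displayed as "\024", not "\020".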