C: Is a char possible between the square brackets of an array?


Until today my answer would have been: "No, there has to be an integer in it that determines the position in the array."

But now I got this code snippet (for base64 decoding) from our professor, and I also found it here on Stack Overflow and on other websites:

#include <stdlib.h>  /* malloc, NULL */

static char encoding_table[] = {'A', 'B', 'C', 'D', 'E', 'F', 'G', 'H',
                                'I', 'J', 'K', 'L', 'M', 'N', 'O', 'P',
                                'Q', 'R', 'S', 'T', 'U', 'V', 'W', 'X',
                                'Y', 'Z', 'a', 'b', 'c', 'd', 'e', 'f',
                                'g', 'h', 'i', 'j', 'k', 'l', 'm', 'n',
                                'o', 'p', 'q', 'r', 's', 't', 'u', 'v',
                                'w', 'x', 'y', 'z', '0', '1', '2', '3',
                                '4', '5', '6', '7', '8', '9', '+', '/'};
static char *decoding_table = NULL;

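void build_decoding_table1(void) {

    int i;
    /* One entry per possible unsigned char value (0..255). */
    decoding_table = malloc(256);
    if (decoding_table == NULL)
        return;

    /* Map each base64 character to its 6-bit value (its index in encoding_table). */
    for (i = 0; i < 64; i++)
        decoding_table[(unsigned char) encoding_table[i]] = i;
}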

The line that surprises me is:

decoding_table[(unsigned char) encoding_table[i]] = i;

What happens here (at least I think that's what happens) is that when, for example, i == 0, we get the first element of the encoding_table array, so encoding_table[0] == 'A'. That gets cast to unsigned char, so it's still 'A'. So we have: decoding_table['A'] = 0;

A char that determines an array position is new to me. How does this work? Is the integer equivalent from the ASCII table used instead (65 instead of 'A')? Or do I misunderstand what this code does, and am I outing myself as a complete noob?


Solution

  • The literal 'A' is represented as an integral value according to your system's character set, e.g. 65 in ASCII. By the way, in C the type of a character literal is int, not char, but that does not matter much here.

    Your encoding table is an array of char, and if plain char is signed on your system, the integral value 65 is stored as a signed char, i.e. typically an 8-bit signed integral value.

    Conversely, when you write decoding_table[(unsigned char) encoding_table[i]], the signed 8-bit value 65 from encoding_table[i] is cast to an unsigned 8-bit value, which is still 65. Casting to unsigned char is a good idea, since an 8-bit signed char can be negative, which would give something like decoding_table[-10]. That would be undefined behaviour, since it accesses the array out of bounds.

    So your assumption is right: character literals are integral values, and hence you can use them as array indices.
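
To make the point above concrete, here is a minimal sketch of a character-indexed lookup table. The names alphabet, decode_lut, build_decode_lut and decode_value are hypothetical and not part of the original snippet. The sketch also assumes a static array instead of malloc, and uses -1 as a marker for "not a base64 character", which the original code does not do.

```c
#include <assert.h>
#include <limits.h>
#include <string.h>

/* Hypothetical names for illustration; not from the original snippet. */
static const char alphabet[] =
    "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";

/* One slot for every possible unsigned char value (0..UCHAR_MAX). */
static signed char decode_lut[UCHAR_MAX + 1];

/* Fill the lookup table; -1 marks "not a base64 character" (an assumption
   of this sketch, the original table leaves those entries uninitialized). */
void build_decode_lut(void)
{
    /* 64 alphabet characters plus the terminating '\0'. */
    assert(sizeof alphabet == 65);

    memset(decode_lut, -1, sizeof decode_lut);
    for (int i = 0; i < 64; i++) {
        /* The char is used as an integer index, e.g. 'A' is 65 in ASCII. */
        decode_lut[(unsigned char) alphabet[i]] = (signed char) i;
    }
}

/* Look up a character. The cast keeps the index within 0..UCHAR_MAX even
   if plain char is signed and c holds a negative value. */
int decode_value(char c)
{
    return decode_lut[(unsigned char) c];
}
```

After calling build_decode_lut() once, decode_value('A') yields 0 and decode_value('/') yields 63, while a character outside the alphabet such as '*' yields -1. Without the cast, a negative char would index before the start of the array, which is the out-of-bounds case described above.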