I recently learnt about typecasting pointers. So I played around, and made this bad code:
#include <stdio.h>

int main()
{
    int* a;
    char b[]= "Hye";
    a = (int*)&b;
    *a='B';
    printf("%s\n", b);
    return 0;
}
This code gives the output 'B'.
Upon further playing around, I wrote this code:
#include <stdio.h>

int main()
{
    int* a;
    char b[]= "Hye";
    a = (int*)&b;
    *a='B';
    b[1]='y';
    b[2]='e';
    printf("%s\n", b);
    return 0;
}
The output is "Bye".
I am unable to understand what the statement *a='B'; does, and why reassigning the second and third elements of the array changes the output.
I have already tried tools like GPT-4o, searched Google for things like "typecasting pointer", read articles, and watched videos on the topic, but none of them answered my question. So here I am.
int* a and char b[] have different types, so writing to b through *a runs afoul of the strict aliasing rule, which makes this access pattern undefined behavior (C11 6.5, paragraph 7).
As @TomKarzes points out, you may also run into problems if your platform requires int to be aligned on, say, a 4-byte boundary, while char b[] may need only 1-byte alignment. On such a platform your program may fail with a "bus error" or in other ways.
The C standard only guarantees that an int is at least 16 bits wide. On a 64-bit platform it is usually 4 bytes, so *a='B'; probably writes 4 bytes to the array b, overwriting 'y', 'e' and the terminator as well as 'H'. On a little-endian platform (amd64) the first program prints "B" because b becomes {'B', 0, 0, 0}, and the second program prints "Bye" because reassigning b[1] and b[2] turns it into {'B', 'y', 'e', 0}. On a big-endian platform both programs would print "", because b becomes {0, 0, 0, 'B'} and {0, 'y', 'e', 'B'} respectively and the leading 0 byte ends the string immediately.
You will also run into problems if sizeof b < sizeof(int), because the write then goes past the end of the array. sizeof(char) is defined to be 1. stdint.h gives you access to integer types of known size.