Why does this simple C program crash (array vs. pointer)?

I have two files:

In file 1.c I have the following array:

char p[] = "abcdefg";

In file 0.c I have the following code:

#include <stdio.h>

extern char *p; /* defined as 'char p[] = "abcdefg";' in 1.c file */

int main()
{
    printf("%c\n", p[3]);   /* crash */
    return 0;
}

And this is the command line:

gcc  -Wall -Wextra     0.c  1.c

I know that extern char *p should have been extern char p[], but I just want an explanation of why it doesn't work in this particular case, while it works here:

#include <stdio.h>

int main()
{
    char a[] = "abcdefg";
    char *p = a;

    printf("%c\n", p[3]);   /* d */
    return 0;
}

Solution

Your two examples are not comparable.

In your second example, you have

char a[] = "abcdefg";
char *p = a;

So a is an array, and p is a pointer. Drawing that in pictures, it looks like

      +---+---+---+---+---+---+---+---+
   a: | a | b | c | d | e | f | g | \0|
      +---+---+---+---+---+---+---+---+
        ^
        |
   +----|----+
p: |    *    |
   +---------+

And this is all fine; no problems with that code.

But in your first example, in file 1.c you define an array named p:

   +---+---+---+---+---+---+---+---+
p: | a | b | c | d | e | f | g | \0|
   +---+---+---+---+---+---+---+---+

You can name an array "p" if you want (the compiler certainly doesn't care), but then, over in file 0.c, you change your mind and declare that p is a pointer. You also declare (with the "extern" keyword) that p is defined somewhere else. So the compiler takes your word for it, and emits code that goes to location p and expects to find a pointer there -- or, in pictures, it expects to find a box, containing an arrow, that points somewhere else. But what it actually finds there is your string "abcdefg", only it doesn't realize it. It will probably end up trying to interpret the bytes 0x61 0x62 0x63 0x64 (that is, the bytes making up the first part of the string "abcdefg") as a pointer. Obviously that doesn't work.

You can see this clearly if you change the printf call in 0.c to

printf("%p\n", p);

This prints the value of the pointer p as a pointer. (Well, of course, p isn't really a pointer, but you lied to the compiler and told it that it was, so what you'll see is the result when the compiler treats it as a pointer, which is what we're trying to understand here.) On my system this prints

0x67666564636261

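That's all 8 bytes of the string "abcdefg\0", in reverse order. The terminating \0 ends up as the most significant byte, so it appears only as a leading zero that %p doesn't print. (From this we can infer that I'm on a machine which (a) uses 64-bit pointers and (b) is little-endian.) So if I tried to print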

printf("%c\n", p[3]);

it would try to fetch a character from location 0x67666564636264 (that is, 0x67666564636261 + 3) and print it. Now, my machine has a fair amount of memory, but it doesn't have that much, so location 0x67666564636264 doesn't exist, and therefore the program crashes when it tries to fetch from there.

Two more things.

If arrays are not the same as pointers, how did you get away with saying

char *p = a;

in your second example, the one I said was "all fine; no problems"? How can you assign an array on the right-hand side to a pointer on the left? The answer is the famous (infamous?) "equivalence between arrays and pointers in C": what actually happens is just as if you had said

char *p = &a[0];

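Whenever you use an array in an expression, what you get is actually a pointer to the array's first element, just as I showed in the first picture in this answer. (There are a few exceptions, such as when the array is the operand of sizeof or of the unary & operator.)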

And when you asked, "why it doesn't work, while it works here?", there were two other ways you could have asked it. Suppose we have the two functions

void print_char_pointer(char *p)
{
    printf("%s\n", p);
}

void print_char_array(char a[])
{
    printf("%s\n", a);
}

And then suppose we go back to your second example, with

char a[] = "abcdefg";
char *p = a;

and suppose that we call

print_char_pointer(a);

print_char_array(p);

If you try it, you'll find that there are no problems with either of them. But how can this be? How can we pass an array to a function that expects a pointer, when we call print_char_pointer(a)? And how can we pass a pointer to a function that expects an array, when we call print_char_array(p)?

Well, remember, whenever we mention an array in an expression, what we get is a pointer to the array's first element. So when we call

print_char_pointer(a);

what we get is just as if we had written

print_char_pointer(&a[0]);

What actually gets passed to the function is a pointer, which is what the function expects, so we're fine.

But what about the other case, where we pass a pointer to a function that's declared as if it accepts an array? Well, there's actually another tenet to the "equivalence between arrays and pointers in C". When we wrote

void print_char_array(char a[])

the compiler treated it just as if we had written

void print_char_array(char *a)

Why would the compiler do such a thing? Why, because it knows that no array will ever be passed to a function, so it knows that no function will actually ever receive an array, so it knows that the function will receive a pointer instead. So that's the way the compiler treats it.

(And, to be very clear, when we talk about the "equivalence between arrays and pointers in C", we are not saying that pointers and arrays are equivalent, just that there is this special equivalence relationship between them. I've mentioned two of the tenets of that equivalence already. Here are all three of them, for reference: (1) Whenever you mention the name of an array in an expression, what you automatically get is a pointer to the array's first element. (2) Whenever you declare a function that seems to accept an array, what it actually accepts is a pointer. (3) Whenever you use the "array" subscripting operator, [], on a pointer, as in p[i], what you actually get is just as if you had written *(p + i). And, in fact, if you think about it carefully, due to tenet (1), even when you use the array subscripting operator on something that looks like an array, you're actually using it on a pointer. But that's a pretty strange notion, which you don't have to worry about if you don't want to, because it just works.)