Tags: arrays, c, pointers, pointer-arithmetic

Trouble understanding advanced pointer arithmetic syntax


Let's say we are given the following declaration:

int (*p)[9];
  1. Is p a regular pointer, or some kind of special pointer to a block of memory that is 9*sizeof(int) bytes big?

  2. What do I call this kind of syntax?

  3. Let's say I have a given matrix:

int mat[200][9];
int (*p)[9] = mat;

How would pointer arithmetic work with it, for example if we were to increment p?

  4. How do I cast to such a type?

  5. The following code's output is 2 5, and I think it is connected to the syntax shown above. Can someone explain why the output isn't 2 1?

#include <stdio.h>

int main(void)
{
    int a[5] = {1,2,3,4,5};
    int *ptr = (int*)(&a+1);
    printf("%d %d", *(a+1), *(ptr-1));
    return 0;
}

Solution

  • Here are the basic rules for pointer declarations:

    T *p;              // p is a pointer to T
    T *ap[N];          // ap is an array of pointers to T
    T *fp();           // fp is a function returning a pointer to T
    
    T (*pa)[N];        // pa is a pointer to an array of T
    T (*pf)();         // pf is a pointer to a function returning T
    
    T * const p;       // p is a const pointer to T - *p is writable, but p is not
    const T *p;        // p is a non-const pointer to const T - p is writable, but *p is not
    T const *p;        // same as above
    const T * const p; // p is a const pointer to const T - neither p nor *p are writable
    T const * const p; // same as above
    

    To read a hairy declaration, find the leftmost identifier and work your way out per the rules above, applying them recursively to any function parameters. For example, here's how the declaration of the signal function in the C standard library breaks down:

           signal                                       -- signal
           signal(                          )           -- is a function taking
           signal(    sig                   )           --   parameter sig
           signal(int sig                   )           --   is an int
           signal(int sig,        func      )           --   parameter func
           signal(int sig,      (*func)     )           --   is a pointer to
           signal(int sig,      (*func)(   ))           --     a function taking
           signal(int sig,      (*func)(   ))           --       unnamed parameter
           signal(int sig,      (*func)(int))           --       is an int
           signal(int sig, void (*func)(int))           --     returning void
         (*signal(int sig, void (*func)(int)))          -- returning a pointer to
         (*signal(int sig, void (*func)(int)))(   )     --   a function taking
         (*signal(int sig, void (*func)(int)))(   )     --     unnamed parameter
         (*signal(int sig, void (*func)(int)))(int)     --     is an int
    void (*signal(int sig, void (*func)(int)))(int);    --   returning void
    

    You can build more complex pointer types through substitution. For example, if you want to declare a function that returns a pointer to an array, you can build it as

    T     a     [N];    // a is an array of T
          |
      +---+----+
      |        |
    T (*  pa   )[N];    // pa is a pointer to an array of T 
          |
         ++--+
         |   |
    T (* fpa() )[N];    // fpa is a function returning a pointer to an array of T
    

    If you want an array of pointers to functions that return pointers to T, then you can build it as

     T *     p           ;       // p is a pointer to T
             |
             +----------+
             |          |
     T *     fp        ();       // fp is a function returning pointer to T
             |
         +---+-----+
         |         |
     T * (*  pf    )   ();       // pf is a pointer to a function returning pointer to T
             |
            ++---+
            |    |
     T * (* apf[N] )   ();       // apf is an array of pointers to functions returning pointer to T
    

    Pointer arithmetic is done in terms of objects, not bytes. If p stores the address of an object of type T, then p+1 yields the address of the next object of that type. If pa stores the address of an N-element array of T, then pa+1 yields the address of the next N-element array of T.

    In your code

    #include <stdio.h>

    int main(void)
    {
        int a[5] = {1,2,3,4,5};
        int *ptr = (int*)(&a+1);
        printf("%d %d", *(a+1), *(ptr-1));
        return 0;
    }
    

    the expression &a + 1 yields the address of the next 5-element array of int following a. This address value is cast to int *, so it's treated as the address of the first int following the last element of a. A diagram may help:

     int[5]        int           int *
     ------        ---           -----
            +---+        +---+        +---+
         a: |   |  a[0]: | 1 |   ptr: |   |
            + - +        +---+        +---+
            |   |  a[1]: | 2 |          |
            + - +        +---+          |
            |   |  a[2]: | 3 |          |
            + - +        +---+          |
            |   |  a[3]: | 4 |          |
            + - +        +---+          |
            |   |  a[4]: | 5 |          |
            +---+        +---+          |
     a + 1: |   |        | ? | <--------+
            + - +        +---+
            |   |        | ? |
            + - +        +---+
             ...          ...
    

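    The objects labeled a and a + 1 in the diagram have type int [5] (the second is the hypothetical array at address &a + 1; the expression a + 1 itself would have type int *), each a[i] has type int, and ptr has type int *.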

    Thus, ptr-1 yields the address of a[4].


    Remember that C declaration syntax mirrors expression syntax - if you have a pointer to an array named pa and you want to access the i'th element of the pointed-to array, you have to dereference pa and then subscript the result (assume T is int for this example):

    printf( "indexed value = %d\n", (*pa)[i] );
    

    In both expressions and declarations, postfix operators [] and () have higher precedence than unary *, so *pa[i] would be parsed as *(pa[i]), which isn't what we want in this case. We need to explicitly group the * operator with pa so we dereference the right thing.

    The type of the expression (*pa)[i] is int, so the declaration of pa is written as

    int (*pa)[N];
    

    Thus, the shape of the declaration tells you the shape of the expression in the code. From there it's just remembering how the various subexpressions are typed:

    Expression        Type
    ----------        ----
            pa        int (*)[N]
           *pa        int [N]
      (*pa)[i]        int
    

    The array subscript operation a[i] is defined as *(a + i) - given a starting address a, offset i elements (not bytes - remember the discussion of pointer arithmetic above) from that address and dereference the result. This means that if p is a pointer, then

    *p == *(p + 0) == p[0]
    

    meaning you can subscript a pointer expression as though it were an array [1].

    So given your declarations

    int mat[200][9];
    int (*p)[9] = mat;
    

    then you can index into p the same way you can index into mat:

    (*p)[j] == (*(p + 0))[j] == p[0][j]
    

    Thus p[i][j] yields the same value as mat[i][j].


    1. Arrays are not pointers, but in most contexts an array expression is converted ("decays") to a pointer to its first element; the exceptions include the operands of sizeof and unary &.