
Finding the size of a string in argv using sizeof


This is more of a conceptual question at this point rather than a practical one but it is really bothering me.

Let us say I have a C program called "test.c" and I want to find how many array elements are needed to hold a word the user types in as an argument. For example, "./test test_run" should print 9, because there are 8 characters plus one for the null terminator. When I try to use sizeof on argv, though, I run into trouble.

#include <stdio.h>

int main(int argc, char *argv[]) {
    char buf10[10];
    /* sizeof yields a size_t, which is printed with %zu */
    printf("The size of buf10 is: %zu.\n", sizeof(buf10));
    return 0;
}

This prints "The size of buf10 is: 10.", which makes sense because I chose a char array and, in C, the size of a char is 1 byte. If I had chosen an int array instead, this number would be 40 (with 4-byte ints).

Now my question is why can't I do this with argv?

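#include <stdio.h>
#include <string.h>

int main(int argc, char *argv[]) {
    printf("argv[1] has the value: %s\n", argv[1]);
    /* strlen and sizeof both yield a size_t, which is printed with %zu */
    printf("strlen of argv[1] is: %zu\n", strlen(argv[1]));
    printf("sizeof of argv[1] is: %zu\n", sizeof(argv[1]));
    return 0;
}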

Running it with "./test Hello_SO" gives the output:

argv[1] has the value: Hello_SO
strlen of argv[1] is: 8
sizeof of argv[1] is: 4

The strlen result makes sense: the string occupies 9 bytes, and strlen does not count the terminating "\0", which leaves 8.

However, I do not understand why sizeof is returning 4 (the size of the pointer). I understand that char *argv[] can be thought of as char **argv, but I accounted for this already: in my first example I print the size of "buf10", but here I print the size of "argv[1]". I know I could easily get the answer by using strlen, but as I said earlier this is just conceptual at this point.


Solution

  • Pointers and arrays are not the same thing, though they are quite similar in many situations. sizeof is a key difference.

    int arr[10];
    assert(sizeof arr == (sizeof(int) * 10));
    int *ip;
    assert(sizeof ip == sizeof(int*));
    

    The type of arr above is int[10]. Another way to see the difference between array types and pointers is by trying to assign to them.

    int i;
    ip = &i; // sure, fine
    arr = &i; // fails, can't assign to an int[10]
    

    Arrays cannot be assigned to.

    What is most confusing is that when you declare an array as a function parameter, it is actually the same as declaring a pointer.

    void f(int arr[10]) {
        int x;
        arr = &x; // fine, because arr is actually an int*
        assert(sizeof arr == sizeof(int*)); // the [10] is ignored here
    }
    

    To address your question of why sizeof argv[1] does not give the size of the string (plus 1 for the \0): argv[1] is not an array, it is a char *. argv is effectively a ragged array, an array of pointers to strings whose lengths differ and are not known until run time, so neither dimension has a size the compiler can see. sizeof is evaluated at compile time here (variable-length arrays are the only exception), so all it can report is the size of the pointer.

    Consider the following program:

    #include <stdio.h>
    
    int main(int argc, char *argv[]) {
        printf("%zu\n", sizeof argv[1]);
    }
    

    The assembly generated for this is:

    .LC0:
        .string "%zu\n"
        .text
        .globl  main
        .type   main, @function
    main:
    .LFB3:
        .cfi_startproc
        subq    $8, %rsp
        .cfi_def_cfa_offset 16
        movl    $8, %esi        # this 8 is the result of sizeof
        movl    $.LC0, %edi     # the format string
        movl    $0, %eax
        call    printf          # calling printf
        movl    $0, %eax
        addq    $8, %rsp
        .cfi_def_cfa_offset 8
        ret
        .cfi_endproc
    

    As you can see, sizeof argv[1] is evaluated at compile time: the constant 8 is loaded directly, and nothing above computes the length of the string. I'm on 64-bit, so my pointers are 8 bytes; you got 4, presumably because your build uses 4-byte (32-bit) pointers.