
How does (*++argv)[0] work in this code from K&R2?


This question is about the meaning of some syntax from Chapter 5, section 5.10 of "The C Programming Language" K&R2, pg. 117.

Earlier in this chapter, K&R2 explains that the name of an array acts as a pointer to its first element; therefore, I know that argv points to the first command-line argument string, e.g. ./Exercise5-10.

How does the syntax work in the while loops of this program? I know that (*++argv)[0] gets the first character of a string: argv initially points to index 0, which is the program name; it is incremented to the next char * (the next string), then dereferenced with *, and its first char is accessed with the [0] notation. What I do not understand is how *++argv[0] in the inner loop steps through the string.

I even looked ahead in the book, and the explanation is still not clear. As I read it, *++argv[0] first goes to the 0th char * in argv, the program name. Then it is incremented to the next string, -x or whatever the case may be. Then it is dereferenced again, which leads to the first char.

I do not understand this and would like more clarification.

#include <stdio.h>
#include <string.h>
#define MAXLINE 1000

int getinput(char s[], int lim);

/* find: print lines that match pattern from 1st arg */
int main(int argc, char *argv[])
{
    char line[MAXLINE];
    long lineno = 0;
    int c, except = 0, number = 0, found = 0;

    while (--argc > 0 && (*++argv)[0] == '-')
        while ((c = *++argv[0]))
            switch (c)
            {
                case 'x':
                    except = 1;
                    break;
                case 'n':
                    number = 1;
                    break;
                default:
                    printf("find: illegal option %c\n", c);
                    argc = 0;
                    found = -1;
                    break;
            }
    if (argc != 1)
        printf("Usage: find -x -n pattern\n");
    else
        while (getinput(line, MAXLINE) > 0)
        {
            lineno++;
            if ((strstr(line, *argv) != NULL) != except)
            {
                printf("%ld:", lineno);
                printf("%s", line);
                found++;
            }
        }
    return found;
}

/* getinput: read a line into s, return length */
int getinput(char s[], int lim)
{
    int c = 0, i;   /* c is initialized so the test after the loop is defined even if lim < 2 */

    for (i = 0; i < lim-1 && (c = getchar()) != EOF && c != '\n'; ++i)
        s[i] = c;
    if (c == '\n')
    {
        s[i] = c;
        ++i;
    }
    s[i] = '\0';
    return i;
}

Solution

  •  while (--argc > 0 && (*++argv)[0] == '-')

    "While there are some arguments left, and the first character of the next argument is a dash (and by the way, increment argv to point to that argument)." The parentheses force ++ and * to apply to argv itself before [0] does. This implicitly skips argv[0], the program name, since the first thing it does is increment argv to point to (what was formerly) argv[1].

  •  while ((c = *++argv[0]))

    "While there are some characters left in the argument pointed to by argv[0] (i.e. we haven't reached the null terminator), and by the way, increment that pointer, but also save the pointed-to character as c." Without parentheses, [] binds more tightly than the prefix operators, so the expression parses as *(++(argv[0])): it is the char pointer stored in argv[0] that is incremented, not argv. That is why this loop walks along the characters of one string instead of moving to the next string. Again, it skips the first character of the argument (which we already know is a '-') by incrementing before dereferencing.

  •  switch (c)

    Do something with each flag character. This uses the rather old Unix flag style where flags have one-character names, and -abc is the same thing as -a -b -c.

    This code modifies argc, argv, and every element of argv that points to a flag argument in place, which would be considered fairly sloppy and confusing today. In the 1970s, however, space was at a premium (a typical machine that C was targeting would probably have a few tens of kilobytes of memory, and sometimes even less), and making a copy of something when you would never need the original again would have been considered a bit sinful. Unless an illegal option forces argc to 0, the loops leave the program in a state where argc is the number of non-flag arguments and argv[0] points to the first non-flag argument.