How to use sscanf with loops correctly with ill-formatted input?


I'm trying to read commands for a stack data structure from stdin. The valid commands are "push [number]" and "pop". Here's my code:

while(getline(&input, &len, stdin) > 0){
while ((n = sscanf(input,"%64s%d%n",cmd,&num,&offset)) > 0){
    if (n == 1){
        if (!strcmp("pop",cmd)){
            pop(&head);
        } else if (!strcmp("push",cmd)){
            //Doing something
        } else if (isNumeric(cmd) && need_num){
            push(&head, num);
        } else{
            //error
        }
    }
    else if (n == 2){
        if (!strcmp("push",cmd)){
            push(&head, num);
        } else {
            //error
        }
    }
    else {
        //error
    }
    input += offset;
}
}

This program has many flaws, of course, since I'm not familiar with using sscanf in loops. The first problem is that if I read a series of commands on one line, such as:

push 1 push 2 pop  push 3

it fails, because a number follows push but no number follows pop, so this line of code is wrong for the "pop" command:

input += offset;

But I don't know how to fix that. The other problem: suppose that splitting a "push [number]" command across two lines is acceptable:

push 5 pop push
4

I don't know if there's an easier way than what I did to detect that a line ends with push and the following line starts with a number:

else if (!strcmp("push",cmd)){
        //Doing something
    } else if (isNumeric(cmd) && need_num){
        push(&head, num);
    }

Any help would be appreciated!


Solution

  • There were no glaring errors in your use of sscanf, but a few subtle ones caused your problems.

    The primary problem was that you took only a single offset, after both the string and the integer had been converted. In the case of "pop", a matching failure occurs at %d before sscanf ever reaches %n in your format string, so offset is never written. It keeps the value from the previous call (or an indeterminate value if it was never set), and input += offset then advances by the wrong amount.
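
    To see the failure in isolation, here is a minimal sketch. The helper name scan_single_offset and the -1 sentinel are assumptions made for illustration; they are not part of your code. It scans with one trailing %n and lets the caller check whether %n was ever reached:

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: scan one command using a single trailing %n.
 * *offset is preset to -1 so the caller can tell whether sscanf
 * reached %n. Returns the sscanf conversion count. */
int scan_single_offset (const char *input, int *offset)
{
    char cmd[64] = "";  /* command word (discarded here) */
    int num = 0;        /* number after the command (discarded here) */

    assert (input != NULL && offset != NULL);

    *offset = -1;       /* sentinel: stays -1 if %n is never reached */
    return sscanf (input, "%63s%d%n", cmd, &num, offset);
}
```

    Called on "push 1 pop" it returns 2 and stores 6 in *offset. Called on "pop push 3" it returns 1 and leaves *offset at -1, because %d fails on the 'p' of "push" and scanning stops before %n.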

    Instead, use two %n conversions in your format string, one after each value, e.g.:

        while ((rtn = sscanf (input, "%63s %n%d %n", 
                                cmd, &off1, &num, &off2)) > 0) {
    

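    That way, if "pop" is read you have the proper offset in off1, and if "push num" is read you have the proper offset in off2. The whitespace before each %n is optional as far as the conversions go, since %s and %d both skip leading whitespace, but it makes each offset land past any trailing whitespace. It is also a good habit, because %[...] and %c do not skip leading whitespace. Note the field width of 63 rather than 64 as well: %s stores a terminating nul character in addition to the characters it reads, so the width must be at most one less than the size of cmd.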
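
    The two-offset idea can also be wrapped in a small function that tells the caller how far to advance. This is only a sketch: next_command, has_num and the return convention are names chosen here for illustration.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define MAXCMD 64   /* max characters for command, including the nul */

/* Hypothetical helper: parse one command from input.
 * Stores the command word in cmd, sets *has_num to 1 and fills *num
 * when an integer follows it. Returns the number of characters to
 * advance, or 0 when nothing is left to parse. */
int next_command (const char *input, char cmd[MAXCMD], int *num, int *has_num)
{
    int off1 = 0,   /* offset after the command word */
        off2 = 0,   /* offset after the command word and number */
        rtn;

    assert (input != NULL && cmd != NULL && num != NULL && has_num != NULL);

    *has_num = 0;
    rtn = sscanf (input, "%63s %n%d %n", cmd, &off1, num, &off2);
    if (rtn < 1)            /* EOF or nothing converted */
        return 0;

    *has_num = (rtn == 2);  /* %d converted, so off2 was written too */
    return *has_num ? off2 : off1;
}
```

    A caller would loop with something like while ((adv = next_command (p, cmd, &num, &has_num)) > 0) { ...; p += adv; } and decide inside the loop whether the command and number combination is valid.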

    Putting it together in a short example, you could do something similar to the following:

    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>
    
    #define MAXCMD 64   /* max characters for command */
    
    int main (void) {
    
        char *buf = NULL;   /* buffer for getline */
        size_t n = 0;       /* alloc size (0 getline decides) */
        ssize_t nchr = 0;   /* getline return (no. of chars read) */
    
        while ((nchr = getline (&buf, &n, stdin)) > 0) {    /* read each line */
            char cmd[MAXCMD] = "",  /* buffer for command */
                *input = buf;       /* pointer to advance */
            int num = 0,            /* number to push */
                off1 = 0,           /* offset if single conversion */
                off2 = 0,           /* offset if double conversion */
                rtn = 0;            /* sscanf return */
            while ((rtn = sscanf (input, "%63s %n%d %n", 
                                    cmd, &off1, &num, &off2)) > 0) {
                switch (rtn) {      /* switch on sscanf return */
                    case 1:         /* handle "pop" case */
                        if (strcmp (cmd, "pop") == 0)
                            printf ("pop\n");
                        else
                            fprintf (stderr, "error: invalid single cmd '%s'.\n",
                                            cmd);
                        input += off1;  /* always advance, otherwise an
                                         * invalid cmd loops forever */
                        break;
                    case 2:         /* handle "push num" case */
                        if (strcmp (cmd, "push") == 0)
                            printf ("push %d\n", num);
                        else
                            fprintf (stderr, "error: invalid cmd '%s %d'.\n",
                                    cmd, num);
                        input += off2;  /* always advance past cmd and num */
                        break;
                    default:        /* not reached: rtn > 0 means 1 or 2 */
                        fprintf (stderr, "error: invalid input.\n");
                        break;
                }
            }
        }
        free (buf);     /* free memory allocated by getline */
    
        return 0;
    }
    

    (Note: I used a switch statement, but an if / else if / else chain works just as well.)

    Example Use/Output

    Validation cases:

    $ echo "pop pop push 1 push 2 pop push 3 pop" | ./bin/sscanfpushpop
    pop
    pop
    push 1
    push 2
    pop
    push 3
    pop
    
    $ echo "push 5 pop push 6 push 7 pop push 8" | ./bin/sscanfpushpop
    push 5
    pop
    push 6
    push 7
    pop
    push 8
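
    For the second part of your question (a "push" at the end of one line with its number on the next), one possible approach is to carry a flag from line to line. The sketch below is an assumption-laden illustration, not a drop-in replacement: the toy Stack type, MAXSTACK, process_line and the pending flag are all made up here, and your own push()/pop() would take the place of the array operations.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define MAXCMD   64     /* max characters for command, including nul */
#define MAXSTACK 16     /* capacity of the toy stack below */

typedef struct { int data[MAXSTACK]; int top; } Stack;  /* toy stack */

/* Process one line of commands. *pending is nonzero on entry when the
 * previous line ended with a bare "push" that still needs its number. */
void process_line (const char *line, Stack *s, int *pending)
{
    char cmd[MAXCMD] = "";
    int num = 0, off1 = 0, off2 = 0, rtn;

    assert (line != NULL && s != NULL && pending != NULL);

    if (*pending) {     /* the line must start with the missing number */
        if (sscanf (line, "%d %n", &num, &off1) == 1) {
            if (s->top < MAXSTACK)
                s->data[s->top++] = num;
            line += off1;
        }
        else
            fprintf (stderr, "error: 'push' without a number.\n");
        *pending = 0;
    }

    while ((rtn = sscanf (line, "%63s %n%d %n",
                            cmd, &off1, &num, &off2)) > 0) {
        if (rtn == 2 && strcmp (cmd, "push") == 0) {    /* "push num" */
            if (s->top < MAXSTACK)
                s->data[s->top++] = num;
            line += off2;
            continue;
        }
        if (strcmp (cmd, "pop") == 0) {
            if (s->top > 0)
                s->top--;
        }
        else if (strcmp (cmd, "push") == 0 && line[off1] == '\0')
            *pending = 1;   /* bare "push" at the end of the line */
        else
            fprintf (stderr, "error: invalid cmd '%s'.\n", cmd);
        line += off1;       /* skip only the command word */
    }
}
```

    In the getline loop you would call process_line (buf, &stack, &pending) once per line, with pending initialized to 0 before the loop. Feeding it "push 5 pop push" and then "4" leaves the single value 4 on the stack.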
    

    Look things over and let me know if you have any further questions.