Tags: c, parsing, user-input, scanf

Parsing Input in C revisited


I have input of the form "a b.c d.e" (quotes excluded) that I want to parse into integer values such as a1=a, a2=b, a3=c, and so on. The trick is that the value just after a "." can also be missing, so "a b d.e", "a b c", and "a b.c d" are valid inputs. But "a b. d.e" isn't valid, because after "b" there's a "." with no number following it. Assume the numbers (i.e. "a", "b", etc.) are integers greater than 0 and less than 1000.

I can only think of the hard way of doing it. I tried this:

scanf("%d %2d[^. ].%2d[^. ] %2d[^. ].%2d[^\n]",&a,&b,&c,&d,&e);

but it didn't work. I would appreciate any help parsing this with a scanf() format-string trick.


Solution

  • If I understand your answer to my comment, I don't think you can shoehorn this input into a scanf format string and still handle all the variations you want. You can, however, fall back on the tried-and-true way to parse anything: walking a pointer down the string with help from strtol.

    strtol provides an end-pointer that is set to the next character after a valid conversion to long in the input string. So if you are working your way down a complicated input string, you start with strtol (p, &ep, 10), where p is a pointer to your string; ep will be set to the next character after the first number converted (in your case either a '.' or a ' '). If the next character is a '.', you can perform further tests on the character that follows and determine whether your input string is valid as you collect the values one by one.

    After you test the needed conditions and advance ep to point to the beginning of the next number to convert, you simply set p = ep; and repeat until you have filled a, b and c or hit an error condition.

    When using strtol there are several validations available beyond checking errno (for example, confirming that digits were actually converted and that the value fits the destination type); the example below keeps validation minimal. See man strtol for the details.
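
    As a rough sketch of what those fuller validations can look like, the helper below wraps a single strtol call. The name parse_int and its return convention are hypothetical and not part of the program in this answer; the sketch assumes you want an int result and treat "no digits" and "out of range" as failures.

```c
#include <assert.h>
#include <errno.h>
#include <limits.h>
#include <stdlib.h>

/* Hypothetical helper (illustration only): convert the integer at the
 * start of s, store it in *out, and set *end to the first character
 * after it. Returns 0 on success, -1 on failure. */
static int parse_int (const char *s, char **end, int *out)
{
    long v;

    assert (s && end && out);   /* catches caller bugs, not bad input */

    errno = 0;                  /* strtol sets errno on error but never clears it */
    v = strtol (s, end, 10);

    if (*end == s)              /* no digits were consumed */
        return -1;
    if (errno == ERANGE)        /* value does not fit in a long */
        return -1;
    if (v < INT_MIN || v > INT_MAX)     /* fits in a long but not in an int */
        return -1;

    *out = (int) v;
    return 0;
}
```

    With this sketch, parse_int ("42.7", &end, &v) stores 42 in v and leaves end pointing at the '.', while parse_int (". 3", &end, &v) fails because no digits precede the '.'.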

    If I have understood your logic correctly, you can do something similar to the following:

    #include <stdio.h>
    #include <stdlib.h>
    #include <errno.h>
    
    #define MAXC 32
    
    int main (void) {
    
        char buf[MAXC] = "", *p = buf, *ep = NULL;
        int a = 0, b = 0, c = 0, tmp = 0;
    
        printf ("enter values: ");
        if (!fgets (buf, MAXC, stdin))
            return 1;
    
        for (;;) {
            errno = 0;      /* strtol never clears errno, so reset it first */
            tmp = strtol (p, &ep, 10);
            if (errno || ep == p) {     /* out of range, or no digits converted */
                fprintf (stderr, "error: failed conversion.\n");
                return 1;
            }
            if (!a)         /* handle a */
                a = tmp;
            else if (!b) {  /* handle b */
                if (*ep && *ep == '.' && *(ep + 1) &&  *(ep + 1) == ' ') {
                    fprintf (stderr, "error: invalid input = '.' alone.\n");
                    return 1;
                }
                b = tmp;
                if (*ep && *ep == '.' && *(ep + 1) && 
                    '0' <= *(ep + 1) && *(ep + 1) <= '9')
                    ep++;
                else
                    while (*ep && (*ep < '0' || '9' < *ep))
                        ep++;
                if (!*ep) {
                    fprintf (stderr, "error: invalid input - no 'c'.\n");
                    return 1;
                }
            }
            else if (!c) {  /* handle c */
                if (*ep && *ep == '.' && *(ep + 1) && 
                    (*(ep + 1) == ' ' || *(ep + 1) == '\n')) {
                    fprintf (stderr, "error: invalid input = '.' alone.\n");
                    return 1;
                }
                c = tmp;
                break;
            }
            p = ep;
        }
    
        printf ("a : %d\nb : %d\nc : %d\n", a, b, c);
    
        return 0;
    }
    

    Example Use/Output

    $ ./bin/triplet
    enter values: 1 2.3 4.5
    a : 1
    b : 2
    c : 3
    
    $ ./bin/triplet
    enter values: 1 2 3.4
    a : 1
    b : 2
    c : 3
    
    $ ./bin/triplet
    enter values: 1 2 3
    a : 1
    b : 2
    c : 3
    
    $ ./bin/triplet
    enter values: 1 2.3 4
    a : 1
    b : 2
    c : 3
    
    $ ./bin/triplet
    enter values: 1 2. 3.4
    error: invalid input = '.' alone.
    

    Look things over and let me know if I misunderstood your comment. If not, always remember you can parse anything by simply walking a pointer (or pair of pointers) down the string.
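
    If you do need all five values from the question's "a b.c d.e" form, the same pointer-walking idea extends naturally. The following is only a sketch under assumptions of mine: the helper names (read_num, read_pair, parse_line) are hypothetical, a missing ".c" or ".e" is recorded as 0 (safe as a marker only because valid numbers are greater than 0), and "a", "b" and "d" are all treated as required.

```c
#include <assert.h>
#include <ctype.h>
#include <errno.h>
#include <stdlib.h>

/* Hypothetical helper: read one integer in 1..999 at *pp and advance
 * *pp past it. Returns 0 on success, -1 on failure. */
static int read_num (const char **pp, int *out)
{
    char *ep;
    long v;

    if (!isdigit ((unsigned char) **pp))    /* reject signs, blanks, '.' */
        return -1;
    errno = 0;
    v = strtol (*pp, &ep, 10);
    if (errno || v < 1 || v > 999)          /* range stated in the question */
        return -1;
    *out = (int) v;
    *pp = ep;
    return 0;
}

/* Hypothetical helper: read "x" or "x.y" after optional leading blanks.
 * *frac is set to 0 when the ".y" part is absent. */
static int read_pair (const char **pp, int *whole, int *frac)
{
    *frac = 0;
    while (isspace ((unsigned char) **pp))  /* skip the separating blanks */
        (*pp)++;
    if (read_num (pp, whole) < 0)
        return -1;
    if (**pp == '.') {                      /* a '.' must be followed by a number */
        (*pp)++;
        if (read_num (pp, frac) < 0)
            return -1;
    }
    /* the pair must end at a blank or at the end of the string */
    return (**pp == '\0' || isspace ((unsigned char) **pp)) ? 0 : -1;
}

/* Parse "a b[.c] d[.e]" into v[0..4]; returns 0 on success, -1 on error. */
static int parse_line (const char *s, int v[5])
{
    const char *p = s;
    int none = 0;

    assert (s && v);                        /* catches caller bugs, not bad input */

    if (read_pair (&p, &v[0], &none) < 0 || none != 0)  /* a takes no ".x" */
        return -1;
    if (read_pair (&p, &v[1], &v[2]) < 0)               /* b[.c] */
        return -1;
    if (read_pair (&p, &v[3], &v[4]) < 0)               /* d[.e] */
        return -1;
    while (isspace ((unsigned char) *p))                /* allow a trailing '\n' */
        p++;
    return *p == '\0' ? 0 : -1;                         /* reject extra input */
}
```

    Under these assumptions, parse_line ("1 2 4.5", v) succeeds with v holding 1, 2, 0, 4, 5, and parse_line ("1 2. 4.5", v) fails because the '.' after 2 is not followed by a digit. You would pass it the buffer filled by fgets, as in the program above.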