Tags: c, specifications, strptime

Is the specified behaviour of `strptime` defined if date is underspecified, overspecified or inconsistent?


I have found little or no indication of the appropriate behaviour for strptime if the date is:

  1. underspecified: does not contain enough data to uniquely fill out tm (e.g. tail = strptime("2015 p.m", "%Y %p", &tm);)
  2. inconsistent: contains possibly more than enough data to fill the tm, but the data is inconsistent (e.g. tail = strptime("2015-09-15 07:48:29 2016", "%Y-%m-%d %T %Y", &tm);, noting that two different years are given)
  3. overspecified: contains more than enough data to fill the tm, but the data is consistent (e.g. tail = strptime("2015-09-15 07:48:29 2015", "%Y-%m-%d %T %Y", &tm);, noting that the year is given twice)
  4. invalid: the given data is out of range (e.g. tail = strptime("2015-09-32 07:48:29", "%Y-%m-%d %T", &tm);, noting the 32nd in a month of 30 days).

I suppose that cases 2 and 4 should/could be considered errors, and that strptime should return "2016" and "32 07:48:29" respectively as the error indication, but does it need to fill in the tm struct in any way?

I also suppose that case 3 would be considered success and NULL should be returned and tm filled.

What about case 1? Should/could it be considered successful? I suppose that case 3 has a little bit of case 1 in it (since it doesn't get the milliseconds). It sounds reasonable to assume that there are "default" values for fields that are not given in the input data (e.g. a missing seconds specification could be interpreted as the seconds being zero).

Or is strptime only supposed to blindly fill in the tm structure with whatever junk is received, in that order? That is, in case 2 strptime would set the year to 2016 because that's the last one seen, and in case 1 it wouldn't alter any fields other than tm_year and tm_hour (to force it to be p.m.), leaving the rest of the fields unmodified?


Solution

  • After reading the reference given by Jonathan Leffler, I noted: "Any other conversion specification [other than white space or an ordinary character] is executed by scanning characters until a character matching the next directive is scanned, or until no more characters can be scanned. These characters, except the one matching the next directive, are then compared to the locale values associated with the conversion specifier. If a match is found, values for the appropriate tm structure members are set to values corresponding to the locale information. Case is ignored when matching items in buf such as month or weekday names. If no match is found, strptime() fails and no more characters are scanned." (emphasis mine).

    My understanding of it is that strptime should blindly fill in the tm structure with whatever junk is received in that order.

    And indeed:

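    #define _XOPEN_SOURCE 700  /* needed on glibc to declare strptime() */
    #include <stdio.h>
    #include <time.h>
    
    int main(void) {
        /* tm is zeroed once and deliberately NOT reset between calls,
           so fields a call does not touch keep their previous values. */
        struct tm tm = {0};
        char *tail;
    
        /* Case 1: underspecified (year and AM/PM only) */
        tail = strptime("2015 pm", "%Y %p", &tm);
        printf("%s", asctime(&tm));
        if (tail != NULL) printf(" [%s]", tail);
        puts("\n");
        /* Case 2: inconsistent (two different years) */
        tail = strptime("2015-09-15 07:48:29 2016", "%Y-%m-%d %T %Y", &tm);
        printf("%s", asctime(&tm));
        if (tail != NULL) printf(" [%s]", tail);
        puts("\n");
        /* Case 3: overspecified (same year given twice) */
        tail = strptime("2015-09-15 07:48:29 2015", "%Y-%m-%d %T %Y", &tm);
        printf("%s", asctime(&tm));
        if (tail != NULL) printf(" [%s]", tail);
        puts("\n");
        /* Case 4: invalid (day 32); tail is NULL on failure */
        tail = strptime("2015-09-32 07:48:29", "%Y-%m-%d %T", &tm);
        printf("%s", asctime(&tm));
        if (tail != NULL) printf(" [%s]", tail);
        puts("\n");
    
        return 0;
    }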
    

    outputs:

    Sun Jan  0 12:00:00 2015
     []
    
    Sun Sep 15 07:48:29 2016
     []
    
    Sun Sep 15 07:48:29 2015
     []
    
    Sun Sep 15 07:48:29 2015
    

    meaning that cases 1, 2, and 3 are processed without errors: only the relevant fields of the tm struct are modified, and when a field is given more than once the last occurrence wins. Case 4 gives an error (strptime returns NULL).
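
    If stricter behaviour is wanted than what is observed above, one possible approach is sketched below. It is only a sketch under stated assumptions, not something the specification mandates: start from a freshly initialised struct tm on every call, treat a NULL return or leftover input as an error, and let mktime() normalise the result so that a day such as September 31st (which %d may well accept) can be detected. The name parse_strict is hypothetical, the default of tm_mday = 1 for missing fields is an arbitrary choice, and the _XOPEN_SOURCE define is what glibc needs to declare strptime().

```c
#define _XOPEN_SOURCE 700   /* assumption: glibc-style feature test macro */
#include <assert.h>
#include <string.h>
#include <time.h>

/*
 * parse_strict() is a hypothetical helper, not a library function.
 * Returns 0 and fills *out on success, -1 if the input is rejected.
 */
int parse_strict(const char *buf, const char *fmt, struct tm *out)
{
    assert(buf != NULL && fmt != NULL && out != NULL);

    /* Fresh struct on every call, so nothing stale survives from an
       earlier parse. These defaults are a choice made for this sketch. */
    struct tm parsed;
    memset(&parsed, 0, sizeof parsed);
    parsed.tm_mday = 1;     /* first day of the month if no day is given */
    parsed.tm_isdst = -1;   /* let mktime() work out daylight saving */

    const char *tail = strptime(buf, fmt, &parsed);
    if (tail == NULL)
        return -1;          /* a directive did not match */
    if (*tail != '\0')
        return -1;          /* input left over after the format ended */

    /* mktime() normalises out-of-range fields (Sep 31 becomes Oct 1).
       If the calendar date changed, the input date did not exist.
       Note: (time_t)-1 is also a valid time one second before the epoch;
       this sketch simply rejects it. */
    struct tm check = parsed;
    if (mktime(&check) == (time_t)-1)
        return -1;
    if (check.tm_year != parsed.tm_year ||
        check.tm_mon  != parsed.tm_mon  ||
        check.tm_mday != parsed.tm_mday)
        return -1;

    *out = check;           /* tm_wday and tm_yday are now filled in too */
    return 0;
}
```

    With this helper, "2015-09-15 07:48:29" parsed with "%Y-%m-%d %T" should succeed, while "2015-09-32 07:48:29", "2015-09-31 07:48:29" and input with trailing characters should all be rejected. Inconsistent input such as case 2 is still accepted, because strptime() itself gives no indication that a field was set twice.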