Tags: windows, datetime, parsing, datetime-format, string-parsing

Win32: How to convert string to a date?


In Windows, I want to parse a string as a date using an exact format string.

For example, given the string

"6/12/2010"

and the format:

"M/d/yyyy"

I want to convert the string to a date, while ensuring that the string matches the format.

I also need to be able to specify the Y2K sliding-window pivot: how many years into the future a (correctly entered) two-digit year may be taken to mean. For example, if the current year is 2010:

Two-digit Year    Pivot    Four-digit year
==============    =====    ===============
30                +0       1930
30                +18      1930
30                +19      1930
30                +20      2030
30                +21      2030
30                +100     2030
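
The sliding window can be sketched in a few lines. This is only an illustration in Python, not Win32 code: `resolve_two_digit_year`, `pivot_offset` and `current_year` are names made up for this sketch, and the current year is passed in explicitly (2010 here) instead of being read from the clock.

```python
def resolve_two_digit_year(yy: int, pivot_offset: int, current_year: int) -> int:
    """Map a two-digit year into the 100 years ending at current_year + pivot_offset."""
    if not 0 <= yy <= 99:
        raise ValueError("expected a two-digit year")
    latest = current_year + pivot_offset   # last year inside the window
    year = latest - latest % 100 + yy      # try the century that contains `latest`
    if year > latest:                      # beyond the window: previous century
        year -= 100
    return year


for offset in (0, 19, 20, 100):
    print(f"+{offset}: {resolve_two_digit_year(30, offset, current_year=2010)}")
# +0: 1930
# +19: 1930
# +20: 2030
# +100: 2030
```

The window always spans exactly 100 years, so every two-digit year maps to exactly one four-digit year.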

.NET already provides a DateTime.ParseExact function, which does almost exactly what I need:

date = DateTime.ParseExact("6/12/2010",
         DateTimeFormatInfo.CurrentInfo.ShortDatePattern,
         Thread.CurrentThread.CurrentCulture);

Except that I can't tell it the 100-year pivot value.

More examples:

String         Format Specifier    Date
============   ================    ========
"6/7/2029"     "M/d/yyyy"          6/7/2029
"6/7/29"       "M/d/yyyy"          (invalid, year too short)
"6/7/29"       "M/d/yy"            6/7/1929  (+0 pivot)
"6/7/29"       "M/d/yy"            6/7/2029  (+100 pivot)
"6/7/29"       "M/d/yy"            6/7/2029  (+50 pivot)
"6/7/29"       "M/d/yy"            6/7/2029
"6/7/2029"     "M.d.yyyy"          (invalid, incorrect separators)
"6.7.2029"     "M.d.yyyy"          6/7/2029
"6.7.2029"     "M-d-yyyy"          (invalid, incorrect separators)
"6/7/2029"     "M/dd/yyyy"         (invalid, day requires a leading zero)
"6/07/2029"    "M/dd/yyyy"         6/7/2029
"6/07/2029"    "MM/dd/yyyy"        (invalid, month requires a leading zero)
"06/07/2029"   "MM/dd/yyyy"        6/7/2029
"06/07/2029"   "MM/d/yyyy"         (invalid, day should not have a leading zero)
"06/7/2029"    "MM/d/yyyy"         6/7/2029

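One way to get this kind of strict matching is to translate the format picture into a regular expression and then validate the resulting numbers as a calendar date. The following is a minimal Python sketch of that idea, not an established library: `parse_exact` and its absolute `pivot_year` parameter (the last year a two-digit year may map to) are hypothetical names, and only the numeric pictures `M`, `MM`, `d`, `dd`, `yy` and `yyyy` are handled.

```python
import re
from datetime import date
from typing import Optional

# Regex for each supported picture code. "M" and "d" reject a leading zero;
# "MM" and "dd" demand exactly two digits.
_PICTURES = {
    "yyyy": "([0-9]{4})",
    "yy": "([0-9]{2})",
    "MM": "([0-9]{2})",
    "M": "([1-9][0-9]?)",
    "dd": "([0-9]{2})",
    "d": "([1-9][0-9]?)",
}
# Split a format into runs of one picture letter, or runs of literal text.
_RUNS = re.compile(r"y+|M+|d+|[^yMd]+")


def parse_exact(text: str, fmt: str, pivot_year: int = 2029) -> Optional[date]:
    """Return the date if text matches fmt exactly, otherwise None."""
    fields = []
    pattern = ""
    for run in _RUNS.findall(fmt):
        if run in _PICTURES:
            fields.append(run)
            pattern += _PICTURES[run]
        elif run[0] in "yMd":
            return None                    # e.g. "MMM": not handled in this sketch
        else:
            pattern += re.escape(run)      # separators must match literally
    match = re.fullmatch(pattern, text)
    if match is None:
        return None

    values = {}
    for run, digits in zip(fields, match.groups()):
        number = int(digits)
        if run == "yy":
            # Slide the two-digit year into the 100 years ending at pivot_year.
            number += pivot_year - pivot_year % 100
            if number > pivot_year:
                number -= 100
        values[run[0]] = number
    if set(values) != {"y", "M", "d"}:
        return None                        # format lacked a year, month or day
    try:
        return date(values["y"], values["M"], values["d"])
    except ValueError:                     # e.g. month 13 or February 30
        return None
```

With this sketch, `parse_exact("6/7/2029", "M/d/yyyy")` gives `date(2029, 6, 7)`, while `parse_exact("6/7/2029", "M.d.yyyy")` and `parse_exact("6/7/29", "M/d/yyyy")` both give `None`.
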
I know that Windows doesn't have a native API to convert a string to a date.

Is there any established code out there that will convert a string to a date using a format specifier? Computers have been around for a while now; someone must have solved this problem already.

Here is a list of some sample format specifiers that you could be expected to see in Windows:

  • M/d/yyyy
  • M/d/yy
  • M/dd/yyyy
  • M/dd/yy
  • MM/d/yyyy
  • MM/d/yy
  • MM/dd/yyyy
  • MM/dd/yy
  • d/M/yyyy
  • d/M/yy
  • d/MM/yyyy
  • d/MM/yy
  • dd/M/yyyy
  • dd/M/yy
  • dd/MM/yyyy
  • dd/MM/yy
  • yyyy/M/d
  • yy/M/d
  • yyyy/MM/d
  • yy/MM/d
  • yyyy/MM/dd
  • yy/MM/dd
  • dd MM yyyy
  • dd.MM.yyyy


Solution

  • I eventually had to cave in and write one. Parsing a date is not trivial. Here it is, in Delphi:

    class function TDateTimeUtils.TryStrToDateExact(const S, DateFormat: string; out Value: TDateTime): Boolean;
    begin
    {
        Assume Microsoft's de-facto standard for y2k fixup: 2029
            1930-2029
    }
        Result := TDateTimeUtils.TryStrToDateExact(S, DateFormat, 2029, {out}Value);
    end;
    
    class function TDateTimeUtils.TryStrToDateExact(const S, DateFormat: string; PivotYear: Integer;
            out Value: TDateTime): Boolean;
    var
        Month, Day, Year: Integer;
        Tokens: TStringDynArray;
        CurrentToken: string;
        i, n: Integer;
        Partial: string;
        MaxValue: Integer;
        nCurrentYear: Integer;
    
        function GetCurrentYear: Word;
        var
            y, m, d: Word;
        begin
            DecodeDate(Now, y, m, d);
            Result := y;
        end;
    begin
        Result := False;
    {
        M/dd/yy
    
        Valid pictures codes are
    
            d       Day of the month as digits without leading zeros for single-digit days.
            dd      Day of the month as digits with leading zeros for single-digit days.
            ddd     Abbreviated day of the week as specified by a LOCALE_SABBREVDAYNAME* value, for example, "Mon" in English (United States).
                    Windows Vista and later: If a short version of the day of the week is required, your application should use the LOCALE_SSHORTESTDAYNAME* constants.
            dddd    Day of the week as specified by a LOCALE_SDAYNAME* value.
    
            M       Month as digits without leading zeros for single-digit months.
            MM      Month as digits with leading zeros for single-digit months.
            MMM     Abbreviated month as specified by a LOCALE_SABBREVMONTHNAME* value, for example, "Nov" in English (United States).
            MMMM    Month as specified by a LOCALE_SMONTHNAME* value, for example, "November" for English (United States), and "Noviembre" for Spanish (Spain).
    
            y       Year represented only by the last digit.
            yy      Year represented only by the last two digits. A leading zero is added for single-digit years.
            yyyy    Year represented by a full four or five digits, depending on the calendar used. Thai Buddhist and Korean calendars have five-digit years. The "yyyy" pattern shows five digits for these two calendars, and four digits for all other supported calendars. Calendars that have single-digit or two-digit years, such as for the Japanese Emperor era, are represented differently. A single-digit year is represented with a leading zero, for example, "03". A two-digit year is represented with two digits, for example, "13". No additional leading zeros are displayed.
            yyyyy   Behaves identically to "yyyy".
    
            g, gg   Period/era string formatted as specified by the CAL_SERASTRING value.
                    The "g" and "gg" format pictures in a date string are ignored if there is no associated era or period string.
    
    
            PivotYear
                    The maximum year that a 1 or 2 digit year is assumed to be.
                    The Microsoft de facto standard for y2k is 2029. Any value greater
                    than 29 is assumed to be 1930 or higher.
    
                    e.g. 2029:
                        1930, ..., 2000, 2001,..., 2029
    
                    If the PivotYear is between 0 and 99, then PivotYear is assumed to be
                    a number of years into the future. e.g. (assuming this is currently 2010):
    
                        Pivot   Range
                        0       1911..2010  (no future years)
                        1       1912..2011
                        ...
                        98      2009..2108
                        99      2010..2109  (no past years)
    
                    0 ==> no years in the future
                    99 ==> no years in the past
    }
        if Length(S) = 0 then
            Exit;
        if Length(DateFormat) = 0 then
            Exit;
    
        Month := -1;
        Day := -1;
        Year := -1;
    
        Tokens := TDateTimeUtils.TokenizeFormat(DateFormat);
        n := 1; //input string index
        for i := Low(Tokens) to High(Tokens) do
        begin
            CurrentToken := Tokens[i];
            if CurrentToken = 'MMMM' then
            begin
                //Long month names, we don't support yet (you're free to write it)
                Exit;
            end
            else if CurrentToken = 'MMM' then
            begin
                //Short month names, we don't support yet (you're free to write it)
                Exit;
            end
            else if CurrentToken = 'MM' then
            begin
                //Month, with leading zero if needed
                if not ReadDigitString(S, n, 2{MinDigits}, 2{MaxDigits}, 1{MinValue}, 12{MaxValue}, {var}Month) then Exit;
            end
            else if CurrentToken = 'M' then
            begin
                //months
                if not ReadDigitString(S, n, 1{MinDigits}, 2{MaxDigits}, 1{MinValue}, 12{MaxValue}, {var}Month) then Exit;
            end
            else if CurrentToken = 'dddd' then
            begin
                Exit; //Long day names, we don't support yet (you're free to write it)
            end
            else if CurrentToken = 'ddd' then
            begin
                Exit; //Short day names, we don't support yet (you're free to write it);
            end
            else if CurrentToken = 'dd' then
            begin
                //If we know what month it is, and even better if we know what year it is, limit the number of valid days to that
                if (Month >= 1) and (Month <= 12) then
                begin
                    if Year > 0 then
                        MaxValue := MonthDays[IsLeapYear(Year), Month]
                    else
                        MaxValue := MonthDays[True, Month]; //we don't know the year, assume it's a leap year to be more generous
                end
                else
                    MaxValue := 31; //we don't know the month, so assume it's the largest
    
                if not ReadDigitString(S, n, 2{MinDigits}, 2{MaxDigits}, 1{MinValue}, MaxValue{MaxValue}, {var}Day) then Exit;
            end
            else if CurrentToken = 'd' then
            begin
                //days
                //If we know what month it is, and even better if we know what year it is, limit the number of valid days to that
                if (Month >= 1) and (Month <= 12) then
                begin
                    if Year > 0 then
                        MaxValue := MonthDays[IsLeapYear(Year), Month]
                    else
                        MaxValue := MonthDays[True, Month]; //we don't know the year, assume it's a leap year to be more generous
                end
                else
                    MaxValue := 31; //we don't know the month, so assume it's the largest
    
                if not ReadDigitString(S, n, 1{MinDigits}, 2{MaxDigits}, 1{MinValue}, MaxValue{MaxValue}, {var}Day) then Exit;
            end
            else if (CurrentToken = 'yyyy') or (CurrentToken = 'yyyyy') then
            begin
                //Year represented by a full four or five digits, depending on the calendar used.
                {
                    Thai Buddhist and Korean calendars have five-digit years.
                    The "yyyy" pattern shows five digits for these two calendars,
                        and four digits for all other supported calendars.
                    Calendars that have single-digit or two-digit years, such as for
                        the Japanese Emperor era, are represented differently.
                        A single-digit year is represented with a leading zero, for
                        example, "03". A two-digit year is represented with two digits,
                        for example, "13". No additional leading zeros are displayed.
                }
                if not ReadDigitString(S, n, 4{MinDigits}, 4{MaxDigits}, 0{MinValue}, 9999{MaxValue}, {var}Year) then Exit;
            end
            else if CurrentToken = 'yyy' then
            begin
                //i'm not sure what this would look like, so i'll ignore it
                Exit;
            end
            else if CurrentToken = 'yy' then
            begin
                //Year represented only by the last two digits. A leading zero is added for single-digit years.
                if not ReadDigitString(S, n, 2{MinDigits}, 2{MaxDigits}, 0{MinValue}, 99{MaxValue}, {var}Year) then Exit;
    
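                nCurrentYear := GetCurrentYear;
    
                if (PivotYear < 100) and (PivotYear >= 0) then
                begin
                    //assume pivotyear is a delta from this year, not an absolute value
                    PivotYear := nCurrentYear+PivotYear;
                end;
    
                //Start in the pivot year's century rather than the current one,
                //so that a pivot reaching into the next century still works
                Year := (PivotYear div 100 * 100)+Year;
    
                //Anything past the pivot year belongs to the previous century
                if Year > PivotYear then
                    Year := Year - 100;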
            end
            else if CurrentToken = 'y' then
            begin
                //Year represented only by the last digit.
                if not ReadDigitString(S, n, 1{MinDigits}, 1{MaxDigits}, 0{MinValue}, 9{MaxValue}, {var}Year) then Exit;
    
                nCurrentYear := GetCurrentYear;
                Year := (nCurrentYear div 10 * 10)+Year;
    
                if (PivotYear < 100) and (PivotYear >= 0) then
                begin
                    //assume pivotyear is a delta from this year, not an absolute value
                    PivotYear := nCurrentYear+PivotYear;
                end;
    
                //Check the pivot year value. Only the decade is ambiguous here,
                //so step back ten years (not a hundred)
                if Year > PivotYear then
                    Year := Year - 10;
            end
            else
            begin
                //The input string should contain CurrentToken starting at n
                Partial := Copy(S, n, Length(CurrentToken));
                Inc(n, Length(CurrentToken));
                if Partial <> CurrentToken then
                    Exit;
            end;
        end;
    
        //If there's still stuff left over in the string, then it's not valid
        if n <> Length(S)+1 then
            Exit;
    
        //The format must have supplied all three parts; otherwise the
        //MonthDays lookup below would index out of range
        if (Year < 1) or (Month < 1) or (Day < 1) then
            Exit;
    
        if Day > MonthDays[IsLeapYear(Year), Month] then
            Exit;
    
        //TryEncodeDate reports failure through its result instead of raising
        Result := TryEncodeDate(Year, Month, Day, Value);
    end;
    
    class function TDateTimeUtils.TokenizeFormat(fmt: string): TStringDynArray;
    var
        i: Integer;
        partial: string;
    
        function IsDateFormatPicture(ch: Char): Boolean;
        begin
            case ch of
            'M','d','y': Result := True;
            else Result := False;
            end;
        end;
    begin
        SetLength(Result, 0);
    
        if Length(fmt) = 0 then
            Exit;
    
        //format is only one character long? If so then that's the tokenized entry
        if Length(fmt)=1 then
        begin
            SetLength(Result, 1);
            Result[0] := fmt;
            Exit; //otherwise the code below would add the same token a second time
        end;
    
        partial := fmt[1];
        i := 2;
        while i <= Length(fmt) do
        begin
            //If the characters in partial are a format picture, and the character in fmt is not the same picture code then write partial to result, and reset partial
            if IsDateFormatPicture(partial[1]) then
            begin
                //if the current fmt character is different than the running partial picture
                if (partial[1] <> fmt[i]) then
                begin
                    //Move the current partial to the output
                    //and start a new partial
                    SetLength(Result, Length(Result)+1);
                    Result[High(Result)] := partial;
                    Partial := fmt[i];
                end
                else
                begin
                    //the current fmt character is more of the same format picture in partial
                    //Add it to the partial
                    Partial := Partial + fmt[i];
                end;
            end
            else
            begin
                //The running partial is not a format picture.
                //If the current fmt character is a picture code, then write out the partial and start a new partial
                if IsDateFormatPicture(fmt[i]) then
                begin
                    //Move the current partial to the output
                    //and start a new partial
                    SetLength(Result, Length(Result)+1);
                    Result[High(Result)] := partial;
                    Partial := fmt[i];
                end
                else
                begin
                    //The current fmt character is another non-picture code. Add it to the running partial
                    Partial := Partial + fmt[i];
                end;
            end;
    
            Inc(i);
        end;
    
        //If we have a running partial, then add it to the output
        if partial <> '' then
        begin
            SetLength(Result, Length(Result)+1);
            Result[High(Result)] := partial;
        end;
    end;
    
    class function TDateTimeUtils.ReadDigitString(const S: string; var Pos: Integer;
                MinDigits, MaxDigits: Integer; MinValue, MaxValue: Integer;
                var Number: Integer): Boolean;
    var
        Digits: Integer;
        Value: Integer;
        Partial: string;
        CandidateNumber: Integer;
        CandidateDigits: Integer;
    
        function IsAllDigits(const Candidate: string): Boolean;
        var
            k: Integer;
        begin
            Result := Length(Candidate) > 0;
            for k := 1 to Length(Candidate) do
                if (Candidate[k] < '0') or (Candidate[k] > '9') then
                begin
                    Result := False;
                    Exit;
                end;
        end;
    begin
        Result := False;
        CandidateNumber := -1;
        CandidateDigits := 0;
    
        Digits := MinDigits;
        while Digits <= MaxDigits do
        begin
            Partial := Copy(S, Pos, Digits);
            if Length(Partial) < Digits then
            begin
                //we couldn't get all we wanted. We're done; use whatever we've gotten already
                Break;
            end;
    
            //Check that it's still a number. TryStrToInt on its own also accepts
            //signs, leading blanks and hex prefixes ("$1F", "0x1F"), so require plain digits
            if (not IsAllDigits(Partial)) or (not TryStrToInt(Partial, Value)) then
                Break;
    
            //Check that it's not too big - meaning that getting any more wouldn't work
            if (Value > MaxValue) then
                Break;
    
            if (Value >= MinValue) then
            begin
                //Hmm, looks good. Keep it as our best possibility
                CandidateNumber := Value;
                CandidateDigits := Digits;
            end;
    
            Inc(Digits); //try to be greedy, grabbing even *MORE* digits
        end;
    
        //CandidateDigits is only set once a value has passed every check
        if CandidateDigits > 0 then
        begin
            Inc(Pos, CandidateDigits);
            Number := CandidateNumber;
            Result := True;
        end;
    end;
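
The greedy digit reading done by ReadDigitString can be hard to follow in the middle of the full parser, so here is the same idea in isolation. It is a rough Python sketch, not a translation of the Delphi code: `read_digits` is a made-up name, positions are 0-based, and it returns a `(value, new_pos)` pair instead of using var parameters.

```python
from typing import Optional, Tuple


def read_digits(text: str, pos: int, min_digits: int, max_digits: int,
                min_value: int, max_value: int) -> Optional[Tuple[int, int]]:
    """Return (value, new_pos) for the longest acceptable digit run at pos, or None."""
    best = None
    for count in range(min_digits, max_digits + 1):
        chunk = text[pos:pos + count]
        if len(chunk) < count or not (chunk.isascii() and chunk.isdigit()):
            break                            # ran out of input, or hit a non-digit
        value = int(chunk)
        if value > max_value:
            break                            # more digits would only make it larger
        if value >= min_value:
            best = (value, pos + count)      # longest acceptable run so far
    return best


print(read_digits("12/25", 0, 1, 2, 1, 12))  # (12, 2)
print(read_digits("6/7", 0, 1, 2, 1, 12))    # (6, 1)
print(read_digits("13/1", 0, 1, 2, 1, 12))   # (1, 1)
print(read_digits("x/1", 0, 1, 2, 1, 12))    # None
```

In the "13/1" case only the "1" is consumed; the caller then expects the separator at the next position, finds "3" instead, and rejects the whole string. Note also that a run such as "07" is accepted for a one-or-two-digit field, because "0" on its own is below the minimum but "07" is in range.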