shell, unix, ksh

Check if string is in date format (UNIX)?


I would like to check whether a variable holds a correctly formatted date or is empty. If it is in the correct date format, I will then perform some action.

I have tried:

dada=2015-10-11
if [[ "$dada" = ^[0-9]{4}-[0-9]{2}-[0-9]{2}$ ]]
  then echo "Date $dada is valid  (YYYY-MM-DD)"

  else echo "Date $dada is not in a valid format (YYYY-MM-DD)"
fi

And also

 if [ "`date '+%Y-%m-%d' -d $d 2>/dev/null`" = "$dada" ]

    then echo "Date $dada is valid  (YYYY-MM-DD)"

  else echo "Date $dada is not in a valid format (YYYY-MM-DD)"
fi

But both attempts always report that my format is incorrect.

$dada is a dynamic variable: it can be a number ('444.1'), a date ('2017-11-12'), or a string ('hello this is not valid').


Solution

  • Converting an extensive set of comments into an answer.

    How thorough a check do you want? Should the check reject 2015-02-29, for example?

    Yes, 2015-02-29 should also be rejected.

    If you need to reject 2015-02-29, you're going to need much more checking than a single line — or the single line will be very long and complex and will have many alternatives in it.


    The classic way to validate the data pattern would use the pattern matching from the case statement — maybe using something like this:

    case "$dada" in
    ([12][0189][0-9][0-9]-[01][0-9]-[0-3][0-9]) : OK;;
    (*) : Not OK;;
    esac
    

    but there are probably better modern ways of doing it. That mainly allows years 18xy, 19xy, 20xy, 21xy (though it does also let through 10xy, 11xy, 28xy, 29xy); you'll have to decide whether that's sensible. Similarly, it lets through months 13-19 (and 00), and days 32-39 (and 00); those are unconditionally invalid. Then you're left with "30 days hath September, …" to worry about.
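    The unconditionally invalid months and days can be excluded without much extra work by splitting the string and checking each field with its own case. This is a minimal sketch, assuming a shell with POSIX parameter expansion (ksh, bash, dash; not the original Bourne shell). The function name is_plausible_date and the ipd_ variables are made up for illustration. It still knows nothing about month lengths or leap years.

```shell
# Sketch: shape check plus month 01-12 and day 01-31 range checks.
# is_plausible_date is a hypothetical helper name, not from the original answer.
is_plausible_date() {
    # Overall shape, with the same century restriction as the pattern above.
    case "$1" in
    [12][0189][0-9][0-9]-[0-9][0-9]-[0-9][0-9]) ;;
    *) return 1;;
    esac

    ipd_month=${1#*-}            # strip "YYYY-"    leaving "MM-DD"
    ipd_month=${ipd_month%-*}    # strip "-DD"      leaving "MM"
    ipd_day=${1##*-}             # strip "YYYY-MM-" leaving "DD"

    # Months 01 through 12 only.
    case "$ipd_month" in
    0[1-9]|1[0-2]) ;;
    *) return 1;;
    esac

    # Days 01 through 31 only; month lengths are not checked.
    case "$ipd_day" in
    0[1-9]|[12][0-9]|3[01]) ;;
    *) return 1;;
    esac

    return 0
}

if is_plausible_date "2015-10-11"; then echo OK; else echo "Not OK"; fi
```

    With this sketch, 2015-13-01 and 2015-10-32 are rejected, but 2015-02-29 is still accepted.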

    If you removed the leading ( before each pattern, the case statement at the top of this answer would work in antique and archaic shells such as the Bourne shell. It isn't tied to Korn shell: it is standard notation in POSIX-like shells and in pre-POSIX shells.

    How about just checking if the string format is in place like XXXX-XX-XX?

    The case command I showed does a reasonable job for years in the range 1800 through 2199. But it is 'old school' notation. The merit is it works and I don't have to read the manual. Test it — change the : commands into echo.

    I have tried the case but it seems like the code did not identify my data as a date. Is there any problem with my declaration of dada?

    On my Mac, I was able to run (verbatim — a single line command):

    ksh -c 'dada=2015-10-11; case "$dada" in ([12][0189][0-9][0-9]-[01][0-9]-[0-3][0-9]) echo OK;; (*) echo Not OK;; esac'
    

    and I got OK as the output. For values such as 2215-10-11 and 2015-20-11, I got Not OK. It would be better, but isn't actually crucial, to use dada="2015-10-11"; instead of the unquoted form.

    What if I append a time to the date, as in 2015-20-11 23:21? Can I write it as 'case "$HELLO" in ([0-3][0-9]/[01][0-9]/[0-9][0-9] [0-2][0-9]:[0-6][0-9])'

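    You could certainly add a glob expression that would match the time. I don't see how the one you propose could be correct: it uses slashes and a two-digit year where your sample value uses dashes and a four-digit year, and the space in it is not escaped. Other patterns could be used.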

    For example:

    dada="2015-11-20 23:21"
    case "$dada" in
    ([12][0189][0-9][0-9]-[01][0-9]-[0-3][0-9]\ [012][0-9]:[0-5][0-9])
        echo OK;;
    (*) echo Not OK;;
    esac
    

    Note that the backslash is needed before the space in the pattern. When run with the data shown, the script reports OK. Change 23 to 32 and it reports Not OK. The hour pattern [012][0-9] does still let through 24 to 29.

    There probably is a way to do this with the [[ command instead of writing out the case statement.
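    One possible [[ form is sketched here, assuming ksh or bash, since [[ is not part of plain POSIX sh. Inside [[ ]], the = operator compares the left side against a shell glob pattern, not a regular expression. That is why the attempt in the question, which put ^[0-9]{4}-...$ after =, never matched. The function name check_shape is made up for illustration.

```shell
# Sketch: the same glob pattern as the case statement, used with [[ ]].
# Assumes ksh or bash. check_shape is a hypothetical helper name.
check_shape() {
    # The right-hand side is left unquoted so it is treated as a pattern.
    if [[ $1 = [12][0189][0-9][0-9]-[01][0-9]-[0-3][0-9] ]]
    then echo "OK"
    else echo "Not OK"
    fi
}

check_shape "2015-10-11"
check_shape "2015-20-11"
check_shape "hello this is not valid"
```

    This prints OK for the first value and Not OK for the other two. As far as I know, ksh93 and bash also accept =~ with an extended regular expression, which would allow the {4} and {2} repetition counts from the question.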

    Doing more complex (thorough) validation using case is probably not a good idea. You'd do better to invoke a tool that validates dates properly. You might be able to use the (GNU) date command, or you could use Perl or Python or one of those scripting languages. These would reject 2015-02-29 23:21 but allow 2016-02-29 23:21 without problem.
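    As a minimal sketch of the date route, assuming GNU date (the -d option behaves differently in BSD and macOS date), the string can be parsed and printed back, then compared with the input. The function name is_real_date and the variable ird_out are made up for illustration. The attempt in the question used the same idea but passed $d to date where it meant $dada, so the comparison could never succeed.

```shell
# Sketch: calendar-aware check using GNU date. Assumes GNU coreutils date.
# is_real_date is a hypothetical helper name, not from the original answer.
is_real_date() {
    # Cheap shape check first, so words such as "tomorrow" never reach date.
    case "$1" in
    [0-9][0-9][0-9][0-9]-[0-9][0-9]-[0-9][0-9]) ;;
    *) return 1;;
    esac

    # -u parses in UTC, so a local daylight-saving gap at midnight
    # cannot make an otherwise valid day look invalid.
    ird_out=$(date -u -d "$1" '+%Y-%m-%d' 2>/dev/null) || return 1

    # The date is accepted only if it round-trips unchanged.
    [ "$ird_out" = "$1" ]
}

dada="2016-02-29"
if is_real_date "$dada"
then echo "Date $dada is valid (YYYY-MM-DD)"
else echo "Date $dada is not a valid date (YYYY-MM-DD)"
fi
```

    Handling a trailing time such as 23:21 would need a longer shape pattern and a matching output format; that is left out of this sketch.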