Sorting a vector of date strings in C++


I found this code:

  sort(filenames.begin(), filenames.end(), [](const string& a, const string& b)
    {
        auto getEpochTime = [](const string& s) -> time_t
        {
            std::string a = s.substr(s.find("trades") + 7, 10);
            std::istringstream date_s(a);
            struct tm date_c;
            date_s >> std::get_time( &date_c, "%Y-%m-%d" );

            std::time_t seconds = std::mktime( & date_c );
            return seconds;
        };

        time_t partA = getEpochTime(a), partB = getEpochTime(b);
        //time_t partA = GetDate(a), partB = GetDate(b);
        cout << a << " " << partA << " " << partB << " " << b << endl;
        return partA < partB;
    });

filenames is a vector with these values:

./files/BTCUSDT-trades-2020-01-01.csv
./files/BTCUSDT-trades-2020-01-03.csv
./files/BTCUSDT-trades-2020-01-04.csv
./files/BTCUSDT-trades-2020-01-08.csv
./files/BTCUSDT-trades-2020-01-05.csv
./files/BTCUSDT-trades-2020-01-06.csv
./files/BTCUSDT-trades-2020-01-09.csv
./files/BTCUSDT-trades-2020-01-10.csv

As you can see, the sort clearly did not work. When I print a and partA inside the comparison function, I get odd values:

./files/BTCUSDT-trades-2020-01-09.csv 1578525308 1577834108 ./files/BTCUSDT-trades-2020-01-01.csv
./files/BTCUSDT-trades-2020-01-09.csv 1536583284 1578400884 ./files/BTCUSDT-trades-2020-01-07.csv
./files/BTCUSDT-trades-2020-01-09.csv 1578573684 1578660084 ./files/BTCUSDT-trades-2020-01-10.csv
./files/BTCUSDT-trades-2020-01-09.csv 1578573684 1578314484 ./files/BTCUSDT-trades-2020-01-06.csv
./files/BTCUSDT-trades-2020-01-02.csv -504396806053 1577834108 ./files/BTCUSDT-trades-2020-01-01.csv

  1. Some dates give negative values.
  2. The same date (2020-01-09, for instance) gives different values (1578525308, 1536583284).

Any idea what I did wrong?


Solution

  • Short answer - don't use any time conversion. The dates are in zero-padded YYYY-MM-DD form and the filenames share the same prefix, so you can simply sort the strings lexicographically:

    std::sort(filenames.begin(), filenames.end());
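
If the filenames did not all share one prefix (say, a mix of symbols), a plain std::sort would group them by symbol first. A minimal sketch for that case, assuming every path contains "trades-" followed by a 10-character date; date_key and sort_by_date are hypothetical names, not part of the original code:

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Hypothetical helper: returns the "YYYY-MM-DD" part that follows "trades-".
// Returns an empty string if the marker is missing.
std::string date_key(const std::string& path)
{
    const std::string marker = "trades-";
    const auto pos = path.find(marker);
    if (pos == std::string::npos)
        return {};
    return path.substr(pos + marker.size(), 10);
}

// Sorts paths by their embedded date. Plain string comparison is enough
// because the date is zero-padded and ordered year, month, day.
void sort_by_date(std::vector<std::string>& filenames)
{
    std::sort(filenames.begin(), filenames.end(),
              [](const std::string& a, const std::string& b)
              {
                  return date_key(a) < date_key(b);
              });
}
```

Calling sort_by_date(filenames) still avoids std::tm and std::mktime entirely, so the uninitialised-field problem never comes up.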
    

    Long answer - you did not properly initialise your std::tm object, so it contains garbage values.

    From the documentation of std::mktime:

    If the std::tm object was obtained from std::get_time or the POSIX strptime, the value of tm_isdst is indeterminate, and needs to be set explicitly before calling mktime.

    And from the documentation for std::get_time:

    [...] it's unspecified if this function zero out the fields in *tmb that are not set directly by the conversion specifiers that appear in fmt: portable programs should initialize every field of *tmb to zero before calling std::get_time.

    You have to initialise every field that std::mktime reads and std::get_time does not set (std::mktime ignores tm_wday and tm_yday):

    // ...
    struct tm date_c;
    date_c.tm_sec = 0;
    date_c.tm_min = 0;
    date_c.tm_hour = 0;
    date_c.tm_isdst = 0;
    date_s >> std::get_time( &date_c, "%Y-%m-%d" );
    // ...
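
For completeness, here is a sketch of the whole conversion with the fix applied. It swaps the field-by-field assignments for value-initialisation (std::tm date_c{}), which zeroes every member at once. The names epoch_from_filename and sort_by_epoch are hypothetical, and the code assumes each path contains "trades-" followed by a YYYY-MM-DD date:

```cpp
#include <algorithm>
#include <ctime>
#include <iomanip>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: converts the "YYYY-MM-DD" part after "trades-"
// into a time_t at local midnight; returns -1 if the date cannot be parsed.
std::time_t epoch_from_filename(const std::string& path)
{
    const std::string marker = "trades-";
    const auto pos = path.find(marker);
    if (pos == std::string::npos)
        return static_cast<std::time_t>(-1);

    std::istringstream date_s(path.substr(pos + marker.size(), 10));
    std::tm date_c{};  // value-initialisation zeroes every field, including tm_isdst
    date_s >> std::get_time(&date_c, "%Y-%m-%d");
    if (date_s.fail())
        return static_cast<std::time_t>(-1);

    return std::mktime(&date_c);
}

// Sorts paths by the epoch time of their embedded date.
void sort_by_epoch(std::vector<std::string>& filenames)
{
    std::sort(filenames.begin(), filenames.end(),
              [](const std::string& a, const std::string& b)
              {
                  return epoch_from_filename(a) < epoch_from_filename(b);
              });
}
```

With every field initialised, the same filename always maps to the same time_t, which the comparator passed to std::sort needs in order to be consistent.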