Tags: c++, serialization, deserialization, c++-chrono, c++23

How to serialize/deserialize std::chrono::zoned_time?


The std::chrono::zoned_time is not a POD type (since it's not a trivial type) and so it cannot be written/read to/from a file as a sequence of raw data. It has a time point member (std::chrono::time_point). It also has a pointer member that points at a std::chrono::time_zone object.

The problem is that the pointer member cannot be serialized/deserialized. But I want to store and retrieve the time zone of the zoned_time objects as well (and not just the time point). I guess one approach could be to get the name of the std::chrono::time_zone using std::chrono::time_zone::name and save that string_view and its size in the file.

I need to somehow fetch the member objects and serialize them. But how?

#include <chrono>
#include <fstream>


std::ofstream& operator<<( std::ofstream& ofs, const std::chrono::zoned_seconds& time )
{
    // What should go here?
    return ofs;
}

std::ifstream& operator>>( std::ifstream& ifs, std::chrono::zoned_seconds& time )
{
    // What should go here?
    return ifs;
}

Solution

  • A std::chrono::zoned_seconds is a very simple data structure under the hood: {std::chrono::time_zone const*, std::chrono::sys_seconds}.

    Each of these data members is easily retrievable from an existing zoned_seconds, and a zoned_seconds is constructible from these two pieces of information.

    So you can reduce your problem to two parts:

    1. Serialize/deserialize a std::chrono::time_zone const*.
    2. Serialize/deserialize a std::chrono::sys_seconds.

    Also, big picture, I strongly recommend that you not use operator<< / operator>> as the names of these functions. Doing so leads to inconvenient ADL (Argument Dependent Lookup) issues: both argument types belong to std namespaces, so an operator you declare in your own namespace is not found by ADL, and std::chrono already supplies its own operator<< for zoned_time. I recommend you choose other names and put them into your own namespace. I'll arbitrarily choose the following names, but any descriptive names will do:

    std::ostream&
    put(std::ostream& os, std::chrono::zoned_seconds const& time);
    
    std::istream&
    get(std::istream& is, std::chrono::zoned_seconds& time);
    

    Also note that I chose the more generic ostream and istream as opposed to the file versions ofstream and ifstream. The code is the same either way, and with the more generic versions you can easily test with std::stringstream.

    So something like:

    std::ostream&
    put(std::ostream& os, std::chrono::zoned_seconds const& time)
    {
        auto tz = time.get_time_zone();
        auto tp = time.get_sys_time();
        put(os, tz);
        put(os, tp);
        return os;
    }
    
    std::istream&
    get(std::istream& is, std::chrono::zoned_seconds& time)
    {
        auto tz = get_time_zone(is);
        auto tp = get_sys_seconds(is);
        time = std::chrono::zoned_seconds{tz, tp};
        return is;
    }
    

    The put function extracts the time_zone const* and the sys_time (at seconds precision, so a sys_seconds), and then calls functions to serialize each of those pieces.

    The get function deserializes each piece, then constructs a zoned_seconds from the two pieces of data and assigns that to time.

    Now we have to look at how to implement these lower level functions:

    put first:

    std::ostream&
    put(std::ostream& os, std::chrono::time_zone const* tz)
    {
        return os << tz->name() << ' ';
    }
    

    This serializes the time_zone const* by extracting its name and writing that out. The time_zone names follow the rules laid down by the IANA time zone database: valid characters are ASCII alphanumerics plus a few punctuation characters such as '/', '_', '-' and '+'. You will need to follow the name with a delimiter that is not a valid character in an IANA time zone name. ' ' is a convenient delimiter.

    std::chrono::time_zone const*
    get_time_zone(std::istream& is)
    {
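        std::string tz_name;
        is >> tz_name;  // formatted read: stops at, but does not consume, the delimiter
        auto delimiter = static_cast<char>(is.get());  // unformatted read of the delimiter
        (void)delimiter;  // not checked here; compare against ' ' if desired
        return std::chrono::locate_zone(tz_name);  // throws std::runtime_error if the name is unknown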
    }
    

    To deserialize the time_zone const*, just read in the name and the delimiter. If you would like to error check that the delimiter is ' ', or any other error checking, do that here. Then the string can be turned into a time_zone const* by calling locate_zone.

    Note: The get_time_zone function above is modified from my original answer. It now reads the delimiter with the unformatted function is.get(). I previously read it with the formatted stream operator, but formatted stream functions skip over whitespace before they begin parsing, so it skipped the very character I was attempting to read into delimiter.
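
    The difference between the two kinds of read can be seen in isolation with a small sketch. Both function names are made up for this illustration.

```cpp
#include <sstream>
#include <string>

// Reads a word, then tries to read the delimiter with the formatted
// operator>>. Leading whitespace is skipped first, so this returns the
// first non-space character after the word, not the delimiter.
char delimiter_formatted(std::string const& input)
{
    std::istringstream is{input};
    std::string word;
    char c = '\0';
    is >> word >> c;
    return c;
}

// Same, but with the unformatted get(), which returns the very next
// character, whitespace included.
char delimiter_unformatted(std::string const& input)
{
    std::istringstream is{input};
    std::string word;
    is >> word;
    return static_cast<char>(is.get());
}
```

    For the input "Europe/Paris 1700000000" the formatted version returns '1' and the unformatted version returns ' '. When raw bytes follow the delimiter, the formatted read would consume the first byte of that payload instead.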

    Next we need to serialize the sys_seconds. Under the hood, sys_seconds just holds a std::chrono::seconds. And a std::chrono::seconds holds a signed integral type that has at least 35 bits (so in practice an int64_t).

    std::ostream&
    put(std::ostream& os, std::chrono::sys_seconds tp)
    {
        put(os, tp.time_since_epoch().count());
        return os;
    }
    

    One can extract the internal integer with .time_since_epoch().count(). This first extracts the underlying duration of precision seconds from the time_point sys_seconds, and then extracts the integral value from the seconds duration.

    Now serialize the integral type. I won't go into details about that, as it is covered in good detail elsewhere. For example here. There is also a Boost library for this if desired.
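
    For completeness, here is one way the integer overload of put and the matching get_int64_t could look. This is a sketch under my own assumptions, not the only option: it writes 8 raw bytes in a fixed big-endian order so that the file does not depend on the byte order of the machine that wrote it. The round_trip helper is hypothetical and only shows how to exercise the pair with std::stringstream.

```cpp
#include <cstdint>
#include <istream>
#include <ostream>
#include <sstream>

// Writes x as 8 raw bytes, most significant byte first (big-endian).
std::ostream& put(std::ostream& os, std::int64_t x)
{
    auto u = static_cast<std::uint64_t>(x);  // well-defined modular conversion
    char buf[8];
    for (int i = 0; i < 8; ++i)
        buf[i] = static_cast<char>((u >> (8 * (7 - i))) & 0xFF);
    return os.write(buf, sizeof buf);
}

// Reads the 8 bytes written by put() and rebuilds the integer.
// If the read fails, the stream's failbit is set and 0 is returned.
std::int64_t get_int64_t(std::istream& is)
{
    char buf[8] = {};
    if (!is.read(buf, sizeof buf))
        return 0;
    std::uint64_t u = 0;
    for (char c : buf)
        u = (u << 8) | static_cast<unsigned char>(c);
    return static_cast<std::int64_t>(u);
}

// Hypothetical test helper: round trip through an in-memory stream.
std::int64_t round_trip(std::int64_t x)
{
    std::stringstream ss;
    put(ss, x);
    return get_int64_t(ss);
}
```

    When writing to a file this way, open the stream with std::ios::binary so that no newline translation touches the bytes.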

    std::chrono::sys_seconds
    get_sys_seconds(std::istream& is)
    {
        return std::chrono::sys_seconds{std::chrono::seconds{get_int64_t(is)}};
    }
    

    To deserialize the sys_seconds, first deserialize the int64_t, convert that to seconds, and then convert that to sys_seconds.

    These simple steps will give you the most compact representation possible in your database. The only way to make it more compact is to use a smaller integral type than int64_t, which of course is a design choice for you, not me.

    If you choose to use int32_t (for example), your range will be limited to approximately the years 1902 to 2038. And 2038 is coming up quickly. So I don't recommend that.

    If you choose uint32_t your range will be the years 1970 to about 2106. This means you won't be able to store my birthday. ;-)

    You might also choose to serialize a signed 6-byte integer, saving 2 bytes per entry. This would give you plenty of range (about +/- 4 million years). I will leave it as an exercise how to modify this code to serialize 6 bytes instead of 4 or 8.
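
    For readers who want a starting point for that exercise, here is one possible sketch. The names put_int48 and get_int48 are hypothetical, and the big-endian byte order is my assumption. The part that differs from the 8-byte case is the sign extension on the way back in.

```cpp
#include <cstdint>
#include <istream>
#include <ostream>
#include <sstream>

// Writes the low 48 bits of x as 6 raw bytes, most significant byte first.
// Assumes x fits in 48 bits, i.e. -2^47 <= x < 2^47.
std::ostream& put_int48(std::ostream& os, std::int64_t x)
{
    auto u = static_cast<std::uint64_t>(x);
    char buf[6];
    for (int i = 0; i < 6; ++i)
        buf[i] = static_cast<char>((u >> (8 * (5 - i))) & 0xFF);
    return os.write(buf, sizeof buf);
}

// Reads 6 bytes and sign-extends them back to 64 bits.
// If the read fails, the stream's failbit is set and 0 is returned.
std::int64_t get_int48(std::istream& is)
{
    char buf[6] = {};
    if (!is.read(buf, sizeof buf))
        return 0;
    std::uint64_t u = 0;
    for (char c : buf)
        u = (u << 8) | static_cast<unsigned char>(c);
    if (u & (std::uint64_t{1} << 47))      // sign bit of the 48-bit value
        u |= std::uint64_t{0xFFFF} << 48;  // fill the top 16 bits with ones
    return static_cast<std::int64_t>(u);
}
```

    Values outside the 48-bit range are silently truncated by put_int48, so a range check before writing is worth adding if such values can occur.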