
Date parsing and when to use utc/TimeUnits in Vega-Lite?


I am trying to understand how date parsing works in Vega-Lite. Specifically, I am confused about the default timezone assumptions and about how a date string that carries no timezone gets parsed.

Consider this minimal working example:

{
  "$schema": "https://vega.github.io/schema/vega-lite/v4.json",
  "data": {
    "values": [
      {"date": "2020-10-01", "distance": 1},
      {"date": "2020-11-01", "distance": 5}
    ]
  },
  "mark": "bar",
  "encoding": {
    "x": {
      "field": "date",
      "type": "temporal",
      "timeUnit": {"unit": "yearmonthdate", "utc": true},
      "axis": {"format": "%b. %y"}
    },
    "y": {"field": "distance", "aggregate": "sum"}
  }
}

In the above example, if I omit the line (or just the utc flag):

"timeUnit": {"unit": "yearmonthdate", "utc": true}

the dates seem to get parsed as:

Wed, 30 Sep 2020 05:00:00 GMT   
Sat, 31 Oct 2020 05:00:00 GMT   

Any guidance or explanation on the default assumption here would be extremely helpful. I understand from the docs that given non-ISO string inputs, Vega will parse times as local (https://vega.github.io/vega-lite/docs/timeunit.html#utc) but that does not seem to be the case here?

Thank you


Solution

  • There are two important things to know about how Vega/Vega-Lite handles dates:

    1. Dates are always displayed in local time, unless otherwise specified (e.g. by passing "utc": true to a timeUnit).
    2. Dates are parsed using standard JavaScript date parsing.

    Why is #2 important? Because JavaScript date parsing assumes different timezones depending on how the input string is formatted, and the timezone used can even depend on which browser you are using. (Read more than you ever wanted to know at https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/Date/parse#Description)

    The distilled summary is that an ISO date-time string with no timezone offset is parsed as local time in current browsers (I ran this on a computer set to PDT):

    > new Date("2020-10-01T00:00:00")
      Thu Oct 01 2020 00:00:00 GMT-0700 (Pacific Daylight Time)
    

    whereas a date-only ISO string (no time component) is parsed as UTC time (on most browsers):

    > new Date("2020-10-01")
      Wed Sep 30 2020 17:00:00 GMT-0700 (Pacific Daylight Time)
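
The printed results above depend on the machine's timezone. As a minimal sketch, the same distinction can be checked in a timezone-independent way by comparing against Date.UTC and the local-time Date constructor. The variable names are illustrative only, and the checks assume a spec-compliant engine such as a current browser or Node.js:

```javascript
// Date-only ISO string: interpreted as midnight UTC.
const dateOnly = new Date("2020-10-01");

// Date-time ISO string with no offset: interpreted as midnight local time.
const dateTime = new Date("2020-10-01T00:00:00");

// Months are zero-based in Date.UTC and the Date constructor (9 = October).
const isUtcMidnight = dateOnly.getTime() === Date.UTC(2020, 9, 1);
const isLocalMidnight = dateTime.getTime() === new Date(2020, 9, 1).getTime();

console.log(isUtcMidnight, isLocalMidnight);
```

Both checks should hold regardless of the timezone the code runs in, which makes them easier to rely on than comparing printed date strings.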
    

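    What this means is that if you are passing date-only strings such as "2020-10-01" to Vega-Lite, you must use UTC time units on the axes in order to see the correct representation of the data in the Vega/Vega-Lite chart. If you do not, the dates will be parsed as UTC and displayed in local time, resulting in an offset equal to the timezone offset of the browser used to view the visualization. The values in the question are consistent with this, assuming a viewer at UTC-5: midnight UTC on October 1 is 19:00 on September 30 local time, and truncating that to the local calendar day with "yearmonthdate" gives September 30 at 00:00 local, which is 05:00 GMT.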
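
The parse-as-UTC, display-as-local mismatch can be sketched in plain JavaScript. This is only an illustration of the mechanism, not Vega's internal code, and the variable names are hypothetical:

```javascript
// A date-only string is parsed as midnight UTC.
const parsed = new Date("2020-10-01");

// Reading the value back with UTC getters recovers the calendar date that was
// written in the string, whatever the viewer's timezone is.
const utcParts = [
  parsed.getUTCFullYear(),
  parsed.getUTCMonth() + 1, // getUTCMonth is zero-based
  parsed.getUTCDate(),
];

// Local midnight on the same calendar date, for comparison.
const localMidnight = new Date(2020, 9, 1);

// Difference between the parsed instant and local midnight, in minutes.
// It is the negative of the local timezone offset, so it is zero only when
// the viewer's clock is on UTC.
const shiftMinutes = (parsed.getTime() - localMidnight.getTime()) / 60000;

console.log(utcParts.join("-"), shiftMinutes, localMidnight.getTimezoneOffset());
```

Reading the parsed value in UTC is the same idea as setting "utc": true on the timeUnit: the value is interpreted in the timezone it was parsed in, so the shift never becomes visible.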