
Vega-Lite: ignore NaN and empty values


I'm new to Vega-Lite and have been progressing well, but I'm struggling to find the correct way to omit/ignore NaN and empty field values in my input data. In my example, the Y axis plots the values from a cholesterol panel, whilst the X axis represents the dates when each blood sample was taken. As some of the samples are incomplete, some fields are empty, and I'd like my line chart to simply continue on to the next valid point rather than breaking and leaving a gap.

Example of the broken line chart showing null/empty/NaN data:

Example of broken line chart

{
  "title": "Blood Panel Data",
  "width": 600,
  "height": 300,
  "config": {
    "axis": {
      "grid": true,
      "gridColor": "DarkSlateGrey",
      "background": "white"
    }
  },
  "repeat": {
    "layer": [
      "Cholesterol",
      "Triglycerides",
      "HDL Chol",
      "LDL Chol",
      "Non-HDL",
      "Cholesterol/HDL-C Ratio"
    ]
  },
  "spec": {
    "mark": {"type": "line", "interpolate": "monotone", "point": true},
    "encoding": {
      "x": {"field": "Date", "type": "temporal"},
      "y": {
        "field": {"repeat": "layer"},
        "type": "quantitative",
        "title": "Value"
      },
      "color": {"datum": {"repeat": "layer"}, "type": "nominal"}
    }
  }
}

Reading through the Vega-Lite documentation is really hard for inexperienced users, as the majority of the examples skip straight into complex methodology. I'm assuming that I need to ignore the empty data values via a transform, but my attempts to do this keep failing.

Appreciate any help with this. My example data is on Airtable.

Example Source Data layout


Solution

  • The best way to do this within a repeated layer chart would be to use a filter transform with isValid to filter out the invalid values for each line. Unfortunately, this is currently not possible within a repeated chart (the feature request is here: https://github.com/vega/vega-lite/issues/7398).

    Instead, you can achieve roughly the same thing using a fold transform followed by a filter transform; for your chart it might look something like this:

    {
      "title": "Blood Panel Data",
      "width": 600,
      "height": 300,
      "config": {
        "background": "white",
        "axis": {"grid": true, "gridColor": "DarkSlateGrey"}
      },
      "transform": [
        {
          "fold": [
            "Cholesterol",
            "Triglycerides",
            "HDL Chol",
            "LDL Chol",
            "Non-HDL",
            "Cholesterol/HDL-C Ratio"
          ],
          "as": ["Column", "Value"]
        },
        {"filter": "isValid(datum.Value)"}
      ],
      "mark": {"type": "line", "interpolate": "monotone", "point": true},
      "encoding": {
        "x": {"field": "Date", "type": "temporal"},
        "y": {"field": "Value", "type": "quantitative"},
        "color": {"field": "Column", "type": "nominal"}
      }
    }
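
    To see the effect in isolation, the same fold and filter steps can be tried on a few inlined rows. This is only a minimal sketch: the dates and values below are made up for illustration (they are not taken from the Airtable data), and only two of the six columns are folded to keep it short. It should be pasteable into the Vega editor as-is.

    ```json
    {
      "$schema": "https://vega.github.io/schema/vega-lite/v5.json",
      "description": "Hypothetical sample rows, for illustration only",
      "data": {
        "values": [
          {"Date": "2021-01-10", "Cholesterol": 5.2, "HDL Chol": 1.3},
          {"Date": "2021-04-12", "Cholesterol": null, "HDL Chol": 1.4},
          {"Date": "2021-07-15", "Cholesterol": 4.9, "HDL Chol": null},
          {"Date": "2021-10-20", "Cholesterol": 4.7, "HDL Chol": 1.5}
        ]
      },
      "transform": [
        {"fold": ["Cholesterol", "HDL Chol"], "as": ["Column", "Value"]},
        {"filter": "isValid(datum.Value)"}
      ],
      "mark": {"type": "line", "point": true},
      "encoding": {
        "x": {"field": "Date", "type": "temporal"},
        "y": {"field": "Value", "type": "quantitative"},
        "color": {"field": "Column", "type": "nominal"}
      }
    }
    ```

    The fold turns each input row into one row per listed column, so the four rows above become eight; the filter then drops the two rows whose Value is null, and each line is drawn through its remaining three points without a gap. Note that isValid only rejects null, undefined and NaN. If the empty cells arrive as empty strings instead (that depends on how the data is exported), a filter along the lines of "isValid(datum.Value) && datum.Value !== ''" may be needed.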