powerbi regression vega-lite vega deneb

In Vega-Lite (Deneb), how do I apply a parameter list of regression types to a calculated regression line?


I have created a "poly" regression line for a scatterplot in Vega-Lite, similar to the one below.

I have also created a parameter that lists the regression types (linear, log, etc.).

However, I cannot work out how to map that parameter to the regression transform so that I can switch between the different regression methods, so I am looking for some advice.

scatterplot with regression

In the code below, I have tried replacing: "method": "poly",

with

"method": {"expr": "regressionline"}

where "regressionline" is the name of my parameter, but I don't think it is correct.

Is this actually possible, or is there another way to do it?


Many thanks

{
  "data": {"name": "dataset"},
  "params": [
    {
      "name": "regressionline",
      "value": "linear",
      "bind": {
        "input": "radio",
        "options": [
          "linear",
          "log",
          "exp",
          "pow",
          "quad",
          "poly"
        ]
      }
    }
  ],
  "layer": [
    {
      "mark": "rect",
      "encoding": {
        "x": {
          "bin": {"maxbins": 20},
          "field": "Total Ref Estimate - RF",
          "title": "Total Estimate"
        },
        "y": {
          "bin": {"maxbins": 10},
          "field": "Premarket % Costs",
          "title": "Premarket % Costs"
        },
        "color": {
          "aggregate": "count",
          "scale": {
            "scheme": "goldgreen"
          },
          "legend": {
            "direction": "horizontal",
            "gradientLength": 120,
            "orient": "none",
            "legendX": 500,
            "legendY": 10
          }
        }
      }
    },
    {
      "mark": {
        "type": "point",
        "tooltip": true,
        "color": "white",
        "stroke": "black",
        "strokeWidth": 0.5,
        "size": 25
      },
      "encoding": {
        "x": {
          "field": "Total Ref Estimate - RF",
          "type": "quantitative",
          "axis": {
            "format": "$.1s",
            "labelFontSize": 10
          },
          "scale": {
            "domain": [0, 70000000]
          }
        },
        "y": {
          "field": "Premarket % Costs",
          "type": "quantitative",
          "scale": {"domain": [0, 1]},
          "axis": {
            "format": ".0%",
            "labelFontSize": 10
          }
        },
        "xOffset": {
          "field": "External ID"
        }
      }
    },
    {
      "mark": {
        "type": "line",
        "color": "firebrick",
        "strokeWidth": 2.5,
        "strokeDash": [1, 4]
      },
      "transform": [
        {
          "regression": "Premarket % Costs",
          "on": "Total Ref Estimate - RF",
          "method": "poly",
          "extent": [1000000, 30000000]
        }
      ],
      "encoding": {
        "x": {
          "field": "Total Ref Estimate - RF",
          "type": "quantitative"
        },
        "y": {
          "field": "Premarket % Costs",
          "type": "quantitative"
        }
      }
    },
    {
      "transform": [
        {
          "regression": "Premarket % Costs",
          "on": "Total Ref Estimate - RF",
          "method": "poly",
          "extent": [1000000, 30000000],
          "params": true
        },
        {
          "calculate": "'R²: '+format(datum.rSquared, '.2f')",
          "as": "R2"
        }
      ],
      "mark": {
        "type": "text",
        "color": "firebrick",
        "x": "width",
        "align": "right",
        "y": -5
      },
      "encoding": {
        "text": {
          "type": "nominal",
          "field": "R2"
        }
      }
    }
  ]
}

Solution

  • The following works for me:

    "method": {"signal": "regressionline"},

    The schema throws a validation error (Vega-Lite doesn't explicitly support signals for this property), but it does work, and I can successfully switch between linear, quad and poly. I'm not sure about the validity of the other methods, as I haven't looked at linear regressions in a while.
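
    As a rough sketch of how this could be applied to the spec in the question, the transform of the R² text layer might look like the fragment below. The regression line layer has its own regression transform, and it would presumably need the same "method" change so that the line and the R² label follow the same radio selection. All field and parameter names are taken from the question; expect the editor to still flag the "method" value as a schema warning.

    ```json
    {
      "transform": [
        {
          "regression": "Premarket % Costs",
          "on": "Total Ref Estimate - RF",
          "method": {"signal": "regressionline"},
          "extent": [1000000, 30000000],
          "params": true
        },
        {
          "calculate": "'R²: ' + format(datum.rSquared, '.2f')",
          "as": "R2"
        }
      ]
    }
    ```

    The "mark" and "encoding" blocks of both layers stay as they are in the question; only the "method" value changes.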