Search code examples
vega-litevega

Transforming Vega dataset to calculate and use min/max range


I have a dataset with a column called pval. I would like to get the minimum and maximum from this column and put it into an array.

I have the following transform array containing a formula expression:

transform = [{
  type: 'formula',
  expr: `[min(datum['pval']), max(datum['pval'])]`,
  as: 'pvalDomain'
}]

The idea here is to return a two-value array called pvalDomain, which contains the minimum and maximum of values in the pval column.

In the plot specification's encoding, I have a y.scale.domain property that currently works with a predefined array:

encoding: {
  y: {
    field: "pval",
    type: "quantitative",
    scale: {
      domain: [7e-5, 1e-9],
      type: "log"
    },
    ...
  },
  ...
}

If I replace the value of y.scale.domain with pvalDomain:

encoding: {
  y: {
    field: "pval",
    type: "quantitative",
    scale: {
      domain: "pvalDomain",
      type: "log"
    },
    ...
  },
  ...
}

I get the following error: TypeError: n.every is not a function.

What is the correct way to use a min-max range array built via a transform? Or is there a different way to calculate this range?


Solution

  • One way to do this is to use zero: false in the encoding, e.g.:

    encoding: {
      y: {
        field: "pval",
        type: "quantitative",
        scale: {
          zero: false,
          type: "log"
        },
        ...
      },
      ...
    }
    

    This doesn't seem to offer control over padding. Another way to do that is to perhaps use d3 or similar to build the domain:

    function paddedLogarithmicDomainForPValues(table, padding) {
      const rawPValues = d3.tsvParse(table, d3.autoType).map((d) => +d['pval']);
      return [d3.min(rawPValues) / padding, d3.max(rawPValues) * padding];
    }
    

    Then use the result in the encoding, specifying a reasonable padding value to taste:

    encoding: {
      y: {
        field: "pval",
        type: "quantitative",
        scale: {
          domain: paddedLogarithmicDomainForPValues(table, 1.5),
          type: "log"
        },
        ...
      },
      ...
    }