Tags: powerbi, visualization, powerbi-desktop, vega-lite, deneb

How to log transform my y-variable and set a domain scale to adjust max/min for various plots?


I'm trying to create a scatterplot with a loess regression, using one dependent variable and two independent variables. I need to log-transform my dependent variable and set the y-axis domain so that it adjusts to the variability of the data shown along the x-axis. How can I do both?

Problem: Whenever I use the log scale alone and filter my data in Power BI, the chart disappears because the y-axis values are too high for some parameters. Using the scale domain to set the min/max solves that issue, but I still need to log-transform my dependent variable to make the chart meaningful (the y-axis ranges from 4,000 to 1,000,000 depending on the independent variable).

What I want to do: log-transform my dependent variable and set the scale domain to the min/max of my data +/- 10%.

Previously I set the log scale in the encoding for my y-variable with "scale": {"type": "log"}, but after following this example my scale is now "scale": {"domain": {"expr": "[min,max]"}}. I also tried the calculate approach from the log-scaled histogram example: https://vega.github.io/vega-lite/examples/histogram_log.html. Here's my code currently:

{
  "data": {"name": "dataset"},
  "transform": [{
      "calculate": "log(datum.virus)/log(10)", "as": "log_v"
  }, 
  {
    "calculate": "pow(10, datum.bin_log_v)", "as": "log_virus"
    
  } 
  ],
    "transform": [
    {
      "joinaggregate": [
        {"op": "max", "field": "log_v", "as": "max"},
        {"op": "min", "field": "log_v", "as": "min"}
      ]
    },
    {"calculate": "datum.max * 1.2", "as": "max"},
    {"calculate": "datum.min * 0.8", "as": "min"}
  ],
  "params": [
    {"name": "max", "expr": "data('data_0')[0]['max']"},
    {"name": "min", "expr": "data('data_0')[0]['min']"}
  ],
  "layer": [
     {
      "mark": {
        "type": "point",
        "tooltip":true,
        "filled": true
      },
      "encoding": {
        "x": {
          "field": "date",
          "title": "Date",
          "type": "temporal"
        },
        "y": {
          "field": "virus",
          "title": "Normalized Levels (N1/PMMOV Concentrations)",
          "type": "quantitative",
          "scale":{"domain": {"expr": "[min,max]"}}
        },
         "tooltip": [
          {
            "field": "virus",
            "type": "quantitative"
          },
          {
            "field": "date",
            "type": "temporal",
            "format": "%B %d %Y"
          },
          {
            "field":"siteno",
            "type":"nominal"
          },
          {
            "field": "virus_o",
            "type": "nominal"
          }
        ]
      }
  
    },
    {
      "name": "Regression Line",
      "transform": [
        {
          "loess": "virus",
          "on": "date",
          "bandwidth": 0.25
        }
      ],
      "mark": {
        "type": "line",
        "color": "#D64550"
      },
      "encoding": {
        "x": {
          "field": "date",
          "type": "temporal",
          "axis": {
            "format": "MMMM yyyy",
            "formatType": "pbiFormat"
          }
        },
        "y": {
          "field": "virus",
          "type": "quantitative"
        }
      }
    }
    
    ]
}

Solution

  • When using loess regression with a log scale, make sure negative numbers (and zeros) are filtered out, since a log scale is only defined for positive values.
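
As a rough sketch of that advice (not tested in Deneb), the spec below keeps only positive `virus` values before plotting, filters the loess output again in case the smoothed curve dips to zero or below, and puts `"type": "log"` and an explicit domain in the same scale object. The parameter names `y_min` and `y_max` and the 0.9/1.1 padding factors are illustrative assumptions. The `data('data_0')` lookup is carried over from the question and assumes Vega-Lite gives the compiled dataset that name.

```json
{
  "description": "Sketch only: y_min/y_max and the 0.9/1.1 padding are assumptions, and data_0 is assumed to be the compiled dataset name.",
  "data": {"name": "dataset"},
  "transform": [
    {"filter": "datum.virus > 0"},
    {
      "joinaggregate": [
        {"op": "min", "field": "virus", "as": "y_min"},
        {"op": "max", "field": "virus", "as": "y_max"}
      ]
    }
  ],
  "params": [
    {"name": "y_min", "expr": "data('data_0')[0]['y_min'] * 0.9"},
    {"name": "y_max", "expr": "data('data_0')[0]['y_max'] * 1.1"}
  ],
  "layer": [
    {
      "mark": {"type": "point", "filled": true, "tooltip": true},
      "encoding": {
        "x": {"field": "date", "type": "temporal", "title": "Date"},
        "y": {
          "field": "virus",
          "type": "quantitative",
          "scale": {"type": "log", "domain": {"expr": "[y_min, y_max]"}}
        }
      }
    },
    {
      "transform": [
        {"loess": "virus", "on": "date", "bandwidth": 0.25},
        {"filter": "datum.virus > 0"}
      ],
      "mark": {"type": "line", "color": "#D64550"},
      "encoding": {
        "x": {"field": "date", "type": "temporal"},
        "y": {"field": "virus", "type": "quantitative"}
      }
    }
  ]
}
```

Because the bounds are computed on the raw values after the positive-only filter, both ends of the domain stay above zero, which the log scale requires. For data ranging from 4,000 to 1,000,000, this would give a domain of 3,600 to 1,100,000.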