I'm trying to create a scatterplot with a loess regression line, using one dependent variable and two independent variables. I need to log-transform my dependent variable and also set the y-axis domain so that it adjusts to the variability of the data along the x-axis. How can I do both?
Problem: Whenever I use the log scale alone and filter my data in Power BI, the chart disappears because the y-axis values are too high for some parameters. Using the domain function to set the min/max solves that issue, but I still need to log-transform my dependent variable to make the chart meaningful (the y-axis varies from 4,000 to 1,000,000 depending on the independent variable).
What I want to do: log-transform my dependent variable and set the scale using the domain function as +/- 10% from my data.
Previously I set the log scale in the encoding for my y-variable: "scale": {"type": "log"}. After following this example, my domain is now: "scale": {"domain": {"expr": "[min,max]"}}. I also tried the calculate transforms from the log-scaled histogram example: https://vega.github.io/vega-lite/examples/histogram_log.html. Here's my code currently:
{
  "data": {"name": "dataset"},
  "transform": [
    {
      "calculate": "log(datum.virus)/log(10)", "as": "log_v"
    },
    {
      "calculate": "pow(10, datum.bin_log_v)", "as": "log_virus"
    }
  ],
  "transform": [
    {
      "joinaggregate": [
        {"op": "max", "field": "log_v", "as": "max"},
        {"op": "min", "field": "log_v", "as": "min"}
      ]
    },
    {"calculate": "datum.max * 1.2", "as": "max"},
    {"calculate": "datum.min * 0.8", "as": "min"}
  ],
  "params": [
    {"name": "max", "expr": "data('data_0')[0]['max']"},
    {"name": "min", "expr": "data('data_0')[0]['min']"}
  ],
  "layer": [
    {
      "mark": {
        "type": "point",
        "tooltip": true,
        "filled": true
      },
      "encoding": {
        "x": {
          "field": "date",
          "title": "Date",
          "type": "temporal"
        },
        "y": {
          "field": "virus",
          "title": "Normalized Levels (N1/PMMOV Concentrations)",
          "type": "quantitative",
          "scale": {"domain": {"expr": "[min,max]"}}
        },
        "tooltip": [
          {
            "field": "virus",
            "type": "quantitative"
          },
          {
            "field": "date",
            "type": "temporal",
            "format": "%B %d %Y"
          },
          {
            "field": "siteno",
            "type": "nominal"
          },
          {
            "field": "virus_o",
            "type": "nominal"
          }
        ]
      }
    },
    {
      "name": "Regression Line",
      "transform": [
        {
          "loess": "virus",
          "on": "date",
          "bandwidth": 0.25
        }
      ],
      "mark": {
        "type": "line",
        "color": "#D64550"
      },
      "encoding": {
        "x": {
          "field": "date",
          "type": "temporal",
          "axis": {
            "format": "MMMM yyyy",
            "formatType": "pbiFormat"
          }
        },
        "y": {
          "field": "virus",
          "type": "quantitative"
        }
      }
    }
  ]
}
When using a loess regression together with a log scale, make sure that zero and negative values are pruned, since a log scale cannot display them.
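
As a rough sketch of how the log scale and the data-driven domain might be combined (untested in Deneb, so treat it as a starting point rather than a confirmed fix): the spec below filters out non-positive values before and after the loess step, computes the min/max from the raw virus field rather than from its logarithm, and puts "type": "log" and the expression-based domain on the same scale. The names virus_min, virus_max, y_min and y_max are illustrative, the 0.9 and 1.1 factors stand in for the +/- 10% padding, and the 'data_0' lookup is carried over from the question on the assumption that this is the name of the compiled dataset. Everything sits in a single "transform" array, because most JSON parsers keep only the last of two keys with the same name.

```json
{
  "description": "Sketch only: y_min/y_max, virus_min/virus_max and the 'data_0' dataset name are assumptions.",
  "data": {"name": "dataset"},
  "transform": [
    {"filter": "datum.virus > 0"},
    {
      "joinaggregate": [
        {"op": "min", "field": "virus", "as": "virus_min"},
        {"op": "max", "field": "virus", "as": "virus_max"}
      ]
    }
  ],
  "params": [
    {"name": "y_min", "expr": "data('data_0')[0]['virus_min'] * 0.9"},
    {"name": "y_max", "expr": "data('data_0')[0]['virus_max'] * 1.1"}
  ],
  "layer": [
    {
      "mark": {"type": "point", "filled": true, "tooltip": true},
      "encoding": {
        "x": {"field": "date", "type": "temporal", "title": "Date"},
        "y": {
          "field": "virus",
          "type": "quantitative",
          "scale": {"type": "log", "domain": {"expr": "[y_min, y_max]"}}
        }
      }
    },
    {
      "transform": [
        {"loess": "virus", "on": "date", "bandwidth": 0.25},
        {"filter": "datum.virus > 0"}
      ],
      "mark": {"type": "line", "color": "#D64550"},
      "encoding": {
        "x": {"field": "date", "type": "temporal"},
        "y": {"field": "virus", "type": "quantitative"}
      }
    }
  ]
}
```

Because the padding multiplies the raw values, both ends of the domain stay positive, which a log scale requires. If the loess fit dips to zero or below, the second filter drops those fitted points, so the line may show gaps there.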