Thanks in advance to anyone who tries to help. This is the first time I have asked a question, after struggling with this one for days! Eternal glory to whoever helps me out with this!
Let me explain my problem with a few lines of code and some screenshots.
I want to create a treemap showing the growth of values between 2 dates. More precisely, I want this treemap to have squares whose size is proportional to a value x at date 2 and whose colour follows a scale showing the growth of x from date 1 to date 2.
Let us consider the following example:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import plotly.express as px
import plotly
data = {'variable': ['a', 'b', 'c'],
'parent': ['I', 'I', 'II'],
'value_1': [1,4,5],
'value_2': [4,2,5]
}
df = pd.DataFrame(data)
df['growth'] = 100 * (df['value_2'] / df['value_1'] - 1)
fig = px.treemap(df, path=['parent', 'variable'], values = 'value_2', color='growth',
color_continuous_scale='plasma')
fig.show()
It gives me the beautiful treemap here: Growth treemap
But here is the problem. As you can see in the following screenshot, the growth shown for I is 183%, which is wrong!
However, when calculating manually, with a going from 1 to 4 and b from 4 to 2, the growth should be: 1/5 * 300% + 4/5 * -50% = 20% (I goes from 5 to 6).
This happens because the calculation actually performed is 4/6 * 300% + 2/6 * -50% = 183%. The method computes the weighted average using the new values (value_2) as weights, not the former ones (value_1) as it should in theory.
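To make the arithmetic above easy to check, here is a minimal pandas sketch that reproduces both numbers for the example data. The names totals, growth_from_totals, weighted_new and weighted_old are only illustrative, and this only verifies the figures; it does not change what the treemap displays.

```python
import pandas as pd

# Same example data as in the question.
df = pd.DataFrame({
    "variable": ["a", "b", "c"],
    "parent": ["I", "I", "II"],
    "value_1": [1, 4, 5],
    "value_2": [4, 2, 5],
})
df["growth"] = 100 * (df["value_2"] / df["value_1"] - 1)

# Growth of each parent computed from its summed values (I goes from 5 to 6).
totals = df.groupby("parent")[["value_1", "value_2"]].sum()
totals["growth_from_totals"] = 100 * (totals["value_2"] / totals["value_1"] - 1)

# Average of the child growths weighted by the NEW values (what the treemap shows).
weighted_new = (df["growth"] * df["value_2"]).groupby(df["parent"]).sum() / totals["value_2"]

# Average of the child growths weighted by the FORMER values (the expected result).
weighted_old = (df["growth"] * df["value_1"]).groupby(df["parent"]).sum() / totals["value_1"]

print(totals["growth_from_totals"].round(2).to_dict())
print(weighted_new.round(2).to_dict())
print(weighted_old.round(2).to_dict())
```

Weighting by value_1 gives the same result as computing the growth directly from the summed values, which is why 20% is the figure expected for I.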
Is there a way to get the correct growth when aggregating to a parent class?
Thank you very much for your help, and let me know if I can clarify anything further.
I couldn't find a way to get the data across as you're trying to depict it. However, I did come up with a workaround.
This requires the use of plotly.io.
I want to point out that the nice contrast you have with the colors is lost when you change the parent from 183.333333 to 20%: that parent becomes nearly the same color as II, because their values are 20 and 0, whereas 'a' is 300 and the low is only -50.
Additionally, I added px.Constant so that you don't get a really useless hover label for the root (the black-ish background parent of the parents).
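To put rough numbers on that loss of contrast, here is a small pure-Python sketch. It assumes the continuous color scale is stretched linearly between the lowest and highest growth values (-50 and 300 here); scale_position is a hypothetical helper written for illustration, not a Plotly function.

```python
# Assumption: the color scale maps the lowest value to 0.0 and the highest to 1.0.
LOW, HIGH = -50, 300

def scale_position(value, low=LOW, high=HIGH):
    """Return where a value falls on the color scale, from 0.0 (low) to 1.0 (high)."""
    return (value - low) / (high - low)

parent_I_before = scale_position(183.33333333333334)  # value shown by default
parent_I_after = scale_position(20)                   # corrected value
parent_II = scale_position(0)

print(round(parent_I_before, 3), round(parent_I_after, 3), round(parent_II, 3))
```

Under that assumption, I moves from about two thirds of the way up the scale to about one fifth, which is close to where II sits.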
import pandas as pd
import plotly.express as px
import plotly.io as pio
fig = px.treemap(df, path=[px.Constant('Total'), 'parent', 'variable'],
values = 'value_2', color='growth',
color_continuous_scale='plasma')
Now when you use pio, you will create an external file, but this is the only way, short of using Jupyter, to add JavaScript to your plot. The file will open automatically in your browser, like fig.show(), except that the hover data will show that the parent I has a growth of 20%.
pio.write_html(fig, 'index.html', auto_open = True, div_id = 'thisPlot',
include_mathjax = 'cdn', include_plotlyjs = 'cdn', full_html = True,
post_script = "setTimeout(function() {" +
"el = document.getElementById('thisPlot');" +
"el.data[0].marker.colors[3] = 20; /* change the calc value */" +
"Plotly.newPlot(el, el.data, el.layout); /* re-plot it */" +
"}, 200)")
You may notice that el.data[0].marker.colors[3] is the element being changed. That's the parent I.
Here's all of the data that is captured in el.data[0].marker.colors before this change is made: [300, -50, 0, 183.33333333333334, 0, 100].
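If more than one parent needs correcting, writing the JavaScript string by hand gets tedious. As a sketch only, the string could be assembled by a small helper; build_post_script is a hypothetical function of mine, not part of Plotly, and it assumes the positions in marker.colors are already known (3 for the parent I in this example).

```python
import json

def build_post_script(div_id, overrides, delay_ms=200):
    """Build a JavaScript snippet that overwrites entries of marker.colors and re-plots.

    overrides maps a position in el.data[0].marker.colors to its corrected value.
    """
    statements = [f"var el = document.getElementById({json.dumps(div_id)});"]
    for index, value in sorted(overrides.items()):
        statements.append(f"el.data[0].marker.colors[{int(index)}] = {json.dumps(value)};")
    statements.append("Plotly.newPlot(el, el.data, el.layout);")
    body = " ".join(statements)
    # Wait briefly so the plot exists before it is modified.
    return f"setTimeout(function() {{ {body} }}, {int(delay_ms)});"

script = build_post_script("thisPlot", {3: 20})
print(script)
```

The resulting string can then be passed as post_script in the pio.write_html call, with div_id set to the same id.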
By the way, whenever I go the route of pio.write_html, I always give the file the same name, so it always overwrites itself. Personally, I'm not interested in the saved file, just the outcome of post_script.