Tags: python, plotly, bar-chart, polar-coordinates, plotly-express

Bar polar with areas proportional to values


Based on this question I have the plot below. The issue is that Plotly does not keep the area of each block proportional to the data value. Higher values (e.g. going from 0.5 to 0.6) produce a large increase in area (the big dark green block), whereas going from 0 to 0.1 is barely noticeable, even though the data increment is the same 0.1 in both cases.

Plot

import numpy as np
import pandas as pd
import plotly.express as px

df = px.data.wind()
df_test = df[df["strength"]=='0-1']

df_test_sectors = pd.DataFrame(columns=df_test.columns)

## this only works if each group has one row
for direction, df_direction in df_test.groupby('direction'):
    frequency_stop = df_direction['frequency'].tolist()[0]
    frequencies = np.arange(0.1, frequency_stop+0.1, 0.1)
    df_sector = pd.DataFrame({
        'direction': [direction]*len(frequencies),
        'strength': ['0-1']*len(frequencies),
        'frequency': frequencies
    })
    df_test_sectors = pd.concat([df_test_sectors, df_sector])
df_test_sectors = df_test_sectors.reset_index(drop=True)
df_test_sectors['direction'] = pd.Categorical(
    df_test_sectors['direction'], 
    df_test.direction.tolist() #sort the directions into the same order as those in df_test
)
df_test_sectors['frequency'] = df_test_sectors['frequency'].astype(float)
df_test_sectors = df_test_sectors.sort_values(['direction', 'frequency'])

fig = px.bar_polar(df_test_sectors, r='frequency', theta='direction', color='frequency', color_continuous_scale='YlGn')

fig.show()

Is there a way to draw the plot so that block areas are proportional to the data, keeping a more "truthful" alignment between the aesthetics and the actual values? That is, blocks closer to the center would be radially "longer", so that all blocks have equal area. Is there an option in Plotly for this?


Solution

  • You can construct a new column called r_outer_diff that stores radius differences (going from the innermost to the outermost sector for each direction) chosen so that every sector has the same area. The values for this column can be calculated inside the loop that builds df_test_sectors, using the following steps:

    • start with the innermost sector of r = 0.1 and compute its area as a reference, since all subsequent sectors should have the same area
    • to construct the next sector, find r_outer so that pi*(r_outer**2 - r_inner**2) * (sector angle/360) = reference sector area
    • solve this formula for r_outer in each iteration of the loop, and use r_outer as r_inner for the next iteration. Since plotly stacks the radii (it draws their cumulative sum), the value to keep track of in each iteration is r_outer - r_inner, and this is what is stored in the r_outer_diff column
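The recurrence in the steps above can be checked in isolation before it is wired into the DataFrame loop. The following is a minimal sketch, not part of the original answer: the helper names `equal_area_radii` and `block_areas` are hypothetical, and the 22.5 degree sector angle assumes 16 evenly spaced directions. Note that the angle fraction cancels when solving for r_outer, leaving r_outer = sqrt(r_inner**2 + r_base**2).

```python
import numpy as np

def equal_area_radii(n_blocks, r_base=0.1):
    """Outer radii of n stacked blocks whose areas all equal the innermost one.

    Hypothetical helper for illustration only.
    """
    radii = [r_base]
    for _ in range(n_blocks - 1):
        r_inner = radii[-1]
        # pi*(r_outer**2 - r_inner**2)*fraction == pi*r_base**2*fraction
        # => r_outer = sqrt(r_inner**2 + r_base**2); the fraction cancels
        radii.append(np.sqrt(r_inner**2 + r_base**2))
    return np.array(radii)

def block_areas(radii, angle_deg=22.5):
    """Area of each annular block between consecutive radii.

    angle_deg=22.5 is an assumption (360 degrees / 16 directions).
    """
    edges = np.concatenate(([0.0], radii))
    return np.pi * (edges[1:]**2 - edges[:-1]**2) * (angle_deg / 360)

radii = equal_area_radii(5)
areas = block_areas(radii)
print(radii)   # outer radius of each block
print(areas)   # should all match the innermost block's area
```

Under these assumptions every entry of `areas` should agree up to floating-point error, which is the property the r_outer_diff column is meant to guarantee.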

    Putting this into code:

    import numpy as np
    import pandas as pd
    import plotly.express as px
    
    df = px.data.wind()
    df_test = df[df["strength"]=='0-1']
    
    df_test_sectors = pd.DataFrame(columns=df_test.columns)
    
    ## this only works if each group has one row
    for direction, df_direction in df_test.groupby('direction'):
        frequency_stop = df_direction['frequency'].tolist()[0]
        frequencies = np.arange(0.1, frequency_stop+0.1, 0.1)
    
        r_base = 0.1
        ## 16 directions, so each sector spans 360/16 = 22.5 degrees
        ## (this fraction cancels out when solving for r_outer)
        sector_fraction = 22.5/360
        sector_area = np.pi * r_base**2 * sector_fraction
    
        ## we can populate the list with the first radius of 0.1
        ## since that will stay fixed
        ## then we use the formula: sector_area = pi*(r_outer^2 - r_inner^2) * (sector angle/360)
        r_adjusted_for_area = [r_base]
        r_outer_diffs = [r_base]
        for i in range(len(frequencies)-1):
            r_inner = r_adjusted_for_area[-1]
            inner_sector_area = np.pi * r_inner**2 * sector_fraction
            outer_sector_area = inner_sector_area + sector_area
            r_outer = np.sqrt(outer_sector_area / (sector_fraction * np.pi))
            r_outer_diff = r_outer - r_inner
            r_adjusted_for_area.append(r_outer)
            r_outer_diffs.append(r_outer_diff)
        df_sector = pd.DataFrame({
            'direction': [direction]*len(frequencies),
            'strength': ['0-1']*len(frequencies),
            'frequency': frequencies,
            'r_outer_diff': r_outer_diffs
        })
        df_test_sectors = pd.concat([df_test_sectors, df_sector])
    df_test_sectors = df_test_sectors.reset_index(drop=True)
    df_test_sectors['direction'] = pd.Categorical(
        df_test_sectors['direction'], 
        df_test.direction.tolist() #sort the directions into the same order as those in df_test
    )
    df_test_sectors['frequency'] = df_test_sectors['frequency'].astype(float)
    df_test_sectors = df_test_sectors.sort_values(['direction', 'frequency'])
    
    fig = px.bar_polar(df_test_sectors, r='r_outer_diff', theta='direction', color='frequency', color_continuous_scale='YlGn')
    
    fig.show()
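
One side effect of plotting r_outer_diff is that the radial axis no longer reads in frequency units. A possible remedy, sketched here as an assumption rather than as part of the original answer, is to compute where each frequency level ends on the new radial scale and use those positions as custom ticks. The helper name `equal_area_ticks` is hypothetical; it relies on the closed form r_k = r_base*sqrt(k), which is what the loop's recurrence works out to.

```python
import numpy as np

def equal_area_ticks(frequencies, r_base=0.1):
    """Map stacked frequency levels to equal-area radii and tick labels.

    Hypothetical helper for illustration only.
    """
    frequencies = np.asarray(frequencies, dtype=float)
    k = np.arange(1, len(frequencies) + 1)
    outer = r_base * np.sqrt(k)           # closed form of the loop's r_outer
    diffs = np.diff(outer, prepend=0.0)   # same values as r_outer_diff
    labels = [f"{f:.1f}" for f in frequencies]
    return outer, diffs, labels

outer, diffs, labels = equal_area_ticks([0.1, 0.2, 0.3, 0.4, 0.5, 0.6])
print(outer)   # radial position where each frequency level ends
print(labels)  # frequency label to show at that position
```

If the relabeling is wanted, these arrays could be passed to the figure with something like `fig.update_layout(polar_radialaxis=dict(tickmode='array', tickvals=outer, ticktext=labels))`; that call has not been tested against the figure above.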
    
