
pandas.DataFrame.drop dropped wrong label


I'm using the code from https://python-graph-gallery.com/391-radar-chart-with-several-individuals/ and after I rename one of the column labels, it stops working. I have a DataFrame:

df = pd.DataFrame({
    'group': ['A', 'B', 'C', 'D'],
    'var1': [38, 1.5, 30, 4],
    'var2': [29, 10, 9, 34],
    'var3': [8, 39, 23, 24],
    'var4': [7, 31, 33, 14],
    'var5': [28, 15, 32, 14]
})

values = df.loc[0].drop('group').values.flatten().tolist()
values += values[:1]

values = df.loc[1].drop('group').values.flatten().tolist()
values += values[:1]

values = df.loc[2].drop('group').values.flatten().tolist()
values += values[:1]

This is the same code as on the website, and the radar chart drops group correctly. [Image: correct radar graph]

But if I rename var1 to a, or to anything else, group is no longer dropped correctly. [Image: incorrect radar graph]

I have tried everything I can think of, but nothing has solved the issue. Whenever a column such as var2 is renamed, group is not dropped. Please help me fix this or tell me what is wrong, thanks!

Full Code:

# Libraries
import matplotlib.pyplot as plt
import pandas as pd
from math import pi

# Set data
df = pd.DataFrame({
    'group': ['A', 'B', 'C', 'D'],
    'var1': [38, 1.5, 30, 4],
    'var2': [29, 10, 9, 34],
    'var3': [8, 39, 23, 24], # renaming var3 to something arbitrary (e.g. asdfs) triggers the issue
    'var4': [7, 31, 33, 14],
    'var5': [28, 15, 32, 14]
})

categories = list(df)[1:]
N = len(categories)

angles = [n / float(N) * 2 * pi for n in range(N)]
angles += angles[:1]

ax = plt.subplot(111, polar=True)

ax.set_theta_offset(pi / 2)
ax.set_theta_direction(-1)

plt.xticks(angles[:-1], categories)

# Draw ylabels
ax.set_rlabel_position(0)
plt.yticks([10, 20, 30], ["10", "20", "30"], color="grey", size=7)
plt.ylim(0, 40)

values = df.loc[0].drop('group').values.flatten().tolist()
values += values[:1]
ax.plot(angles, values, linewidth=1, linestyle='solid', label="group A")
ax.fill(angles, values, 'b', alpha=0.1)

values = df.loc[1].drop('group').values.flatten().tolist()
values += values[:1]
ax.plot(angles, values, linewidth=1, linestyle='solid', label="group B")
ax.fill(angles, values, 'r', alpha=0.1)

values = df.loc[2].drop('group').values.flatten().tolist()
values += values[:1]
ax.plot(angles, values, linewidth=1, linestyle='solid', label="group C")
ax.fill(angles, values, 'r', alpha=0.1)

plt.legend(loc='upper right', bbox_to_anchor=(0.1, 0.1))

plt.show()

Solution

  • The issue is that while you drop the correct values, you do not always drop the correct names. The problem is in the first few lines:

    df = pd.DataFrame({
        'group': ['A', 'B', 'C', 'D'],
        'var1': [38, 1.5, 30, 4],
        # ...
    })
    
    categories = list(df)[1:]
    

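    You construct the DataFrame from a dict, and a dict does not necessarily keep the order you wrote it in, as you have assumed. Before Python 3.7 the language did not guarantee dict order, and older pandas releases (before 0.23, as far as I know) sort the keys when building columns from a dict. So list(df)[1:] does not reliably skip group; it skips whichever column name happens to come first. Renaming var1 to a puts a ahead of group, so the slice removes a and keeps group.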

    An easy fix is:

    categories = df.columns.drop('group').tolist()
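
To make the difference concrete, here is a minimal sketch. It assumes a shortened version of the question's data, and it forces the sorted column order through the `columns=` argument so that it behaves the same on any pandas version; the variable names `positional` and `by_label` are only illustrative.

```python
import pandas as pd

# Columns deliberately placed in sorted order, so 'group' is no longer first.
# This mimics what the question's DataFrame looks like after renaming var1 to a.
df = pd.DataFrame(
    {
        "a": [38, 1.5, 30, 4],
        "group": ["A", "B", "C", "D"],
        "var2": [29, 10, 9, 34],
    },
    columns=["a", "group", "var2"],
)

# Positional slice: removes whatever column happens to be first.
positional = list(df)[1:]

# Label-based drop: removes 'group' wherever it sits.
by_label = df.columns.drop("group").tolist()

print(positional)  # ['group', 'var2']
print(by_label)    # ['a', 'var2']
```

The positional version keeps group as a category, which is why the chart breaks, while the label-based version does not depend on column order at all.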
    

    But note this may still leave you with a plot whose categories move around seemingly at random. To control the order, here is one solution:

    df = pd.DataFrame.from_items([
        ('group', ['A', 'B', 'C', 'D']),
        ('var1', [38, 1.5, 30, 4]),
        # ...
    ])
    

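    By using a list of (name, values) pairs instead of a dict, the ordering will be preserved, and list(df)[1:] will always exclude group. Note that DataFrame.from_items was deprecated in pandas 0.23 and removed in pandas 1.0, so this form only works on older versions.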
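
If `from_items` is not available, one alternative that should work across pandas versions is to pass the desired order through the `columns=` argument of the `DataFrame` constructor. The sketch below is not from the original answer; `column_order` and the helper `closed_values` are hypothetical names introduced for illustration.

```python
import pandas as pd

data = {
    "group": ["A", "B", "C", "D"],
    "var1": [38, 1.5, 30, 4],
    "var2": [29, 10, 9, 34],
    "var3": [8, 39, 23, 24],
    "var4": [7, 31, 33, 14],
    "var5": [28, 15, 32, 14],
}

# State the column order explicitly instead of relying on dict ordering.
column_order = ["group", "var1", "var2", "var3", "var4", "var5"]
df = pd.DataFrame(data, columns=column_order)

# Drop 'group' by label, so renaming any other column cannot affect it.
categories = df.columns.drop("group").tolist()


def closed_values(frame, row_label, names):
    """Return one row's values for the given columns, repeating the first
    value at the end so the radar polygon closes."""
    values = frame.loc[row_label, names].tolist()
    return values + values[:1]


values_a = closed_values(df, 0, categories)
values_b = closed_values(df, 1, categories)
```

Selecting the values with the same `categories` list that labels the axes keeps the numbers and the tick labels aligned, whatever the columns are called.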