python pandas data-visualization seaborn swarmplot

What is the problem with hue in my swarmplot?


I have this dataset: https://www.kaggle.com/abcsds/pokemon/download. I loaded it and made some changes:

import pandas as pd 
import matplotlib.pyplot as plt 
import seaborn as sns
import os
for dirname, _, filenames in os.walk('/kaggle/input'):
    for filename in filenames:
        print(os.path.join(dirname, filename))

pokemons=pd.read_csv('../input/pokemon/Pokemon.csv')

del pokemons['Type 2']
pokemons.rename(columns={'Type 1':'Type'},inplace=True)

What I want is to make swarmplots of each stat for each Pokémon type with hue=Legendary, so I can see where the legendary Pokémon sit among the others. I have already made the swarmplots without hue. First, I needed to melt the dataframe:

pok_melt=pd.melt(pokemons,id_vars=['Name','Type','Legendary'],value_vars=['HP','Defense','Attack','Sp. Atk','Sp. Def','Speed'])
pok_melt.head()  

Then, the code for the swarmplots (at one point I needed the type names in alphabetical order for another plot, which is why they are sorted):

list_types=pokemons['Type'].unique().tolist() 
list_types.sort()
list_types

plt.figure(figsize=(17,22))
k=1
for i in list_types:
    plt.subplot(6,3,k)
    k=k+1
    sns.swarmplot(x=pok_melt.variable,y=pok_melt[pok_melt.Type==i].value,palette='gist_stern')
    plt.title(i)
    plt.xlabel('')

These are some of the swarmplots:

[image: some of the resulting swarmplots]

So I tried to do this:

plt.figure(figsize=(17,22))
k=1
for i in list_types:
    plt.subplot(6,3,k)
    k=k+1
    sns.swarmplot(x=pok_melt.variable,y=pok_melt[pok_melt.Type==i].value,palette='gist_stern',
    hue=pok_melt.Legendary)
    plt.title(i)
    plt.xlabel('')

And I get this error: IndexError: boolean index did not match indexed array along dimension 0; dimension is 69 but corresponding boolean dimension is 800


Solution

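  • In the failing call, y contains only the rows for one type, while hue=pok_melt.Legendary still contains every row, so their lengths differ. Filter the Legendary column with the same mask used for the y parameter: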

    plt.figure(figsize=(17,22))
    k=1
    for i in list_types:
        plt.subplot(6,3,k)
        k=k+1
        sns.swarmplot(x=pok_melt.variable,
                      y=pok_melt[pok_melt.Type==i].value,
                      hue=pok_melt[pok_melt.Type==i].Legendary,
                      palette='gist_stern')
        plt.title(i)
        plt.xlabel('')
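
The length mismatch behind the error can be reproduced without any plotting. The following is a minimal sketch on a made-up frame (`toy` and its values are illustrative assumptions, not the Pokémon data): a column filtered by type has fewer rows than the unfiltered Legendary column, and applying the same mask to both brings them back in line.

```python
import pandas as pd

# Tiny made-up frame standing in for pok_melt (values are illustrative only).
toy = pd.DataFrame({
    "Type": ["Fire", "Fire", "Water", "Water", "Water"],
    "Legendary": [False, True, False, False, True],
    "value": [39, 90, 44, 59, 100],
})

mask = toy["Type"] == "Fire"

y = toy.loc[mask, "value"]                 # only the Fire rows
hue_unfiltered = toy["Legendary"]          # every row, so the length differs from y
hue_filtered = toy.loc[mask, "Legendary"]  # same mask as y, so same rows and index

print(len(y), len(hue_unfiltered), len(hue_filtered))  # 2 5 2
print(y.index.equals(hue_filtered.index))              # True
```
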
    

    Or, better, filter only once into a variable df, then pass df['value'] to y and df['Legendary'] to hue:

    plt.figure(figsize=(17,22))
    k=1
    for i in list_types:
        plt.subplot(6,3,k)
        k=k+1
        df = pok_melt.loc[pok_melt.Type==i]
    
        sns.swarmplot(x=pok_melt.variable,
                      y=df['value'],
                      hue=df['Legendary'],
                      palette='gist_stern')
        plt.title(i)
        plt.xlabel('')
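
A possible variation, not part of the answer above, is to hand the filtered frame to seaborn through the data parameter and refer to columns by name, so that x, y and hue all come from the same rows. The sketch below assumes a small made-up frame (`toy_melt`) in the same long format as pok_melt, and it draws on explicit axes from plt.subplots instead of plt.subplot; the Agg backend is selected only so it runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; an assumption so the sketch runs headless
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

# Made-up long-format data shaped like pok_melt (values are illustrative only).
toy_melt = pd.DataFrame({
    "Type": ["Fire"] * 4 + ["Water"] * 4,
    "Legendary": [False, True, False, True, False, True, False, True],
    "variable": ["HP", "HP", "Attack", "Attack"] * 2,
    "value": [39, 90, 52, 100, 44, 100, 48, 90],
})

types = sorted(toy_melt["Type"].unique())
fig, axes = plt.subplots(1, len(types), figsize=(8, 3), squeeze=False)

for ax, type_name in zip(axes.flat, types):
    # Filter once; x, y and hue are then all taken from this one subset.
    subset = toy_melt[toy_melt["Type"] == type_name]
    sns.swarmplot(data=subset, x="variable", y="value",
                  hue="Legendary", palette="gist_stern", ax=ax)
    ax.set_title(type_name)
    ax.set_xlabel("")

fig.tight_layout()
```

With the real data, the same loop would run over list_types and a 6 by 3 grid of axes, as in the code above.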