python matplotlib seaborn legend histplot

Duplicated labels in the legend of Seaborn histplot

I am trying to plot two datasets in a single histplot with the following code:


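import matplotlib.pyplot as plt
import numpy as np
import seaborn as sns

# 30 equal-width bins spanning 150 to 600, shared by both histograms
_, bins = np.histogram([150, 600], bins=30)
alpha = 0.4

fig, ax = plt.subplots(1, 1)
sns.histplot(df1['Tm/K Pred.'], bins=bins, alpha=alpha, label='df1')
sns.histplot(vispilsExp298Tm_bert['Tm/K Pred.'], bins=bins, alpha=alpha, label='df2')

plt.yscale('log')
plt.legend()
plt.show()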

[Figure: the histplot with duplicated labels]

However, the labels are duplicated in the legend. How can I remove them?

I checked:

handles, labels = ax.get_legend_handles_labels()
handles, labels

([<BarContainer object of 1 artists>,
  <BarContainer object of 30 artists>,
  <BarContainer object of 1 artists>,
  <BarContainer object of 30 artists>],
 ['df1', 'df1', 'df2', 'df2'])

Solution

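You could remove the duplicated labels using a dictionary keyed by label, since a dictionary keeps only one entry per key:

handles, labels = ax.get_legend_handles_labels()
by_label = dict(zip(labels, handles))  # a repeated label keeps its last handle
plt.legend(by_label.values(), by_label.keys())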
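
To make it clearer why the dictionary trick works, here is a minimal sketch that uses plain strings as stand-ins for the `BarContainer` handles. The helper name `dedupe_legend_entries` and the placeholder strings are illustrative only, not part of matplotlib or seaborn. Each label is kept once, in first-seen order, and the handle that survives is the last one recorded for that label, which in the listing from the question is the 30-bar container.

```python
def dedupe_legend_entries(handles, labels):
    """Return (handles, labels) with one entry per label.

    Hypothetical helper: the last handle seen for a label wins,
    and labels stay in first-seen order (dict insertion order).
    """
    by_label = dict(zip(labels, handles))
    return list(by_label.values()), list(by_label.keys())


# Placeholder strings standing in for the four BarContainer objects.
handles = ["df1 (1 bar)", "df1 (30 bars)", "df2 (1 bar)", "df2 (30 bars)"]
labels = ["df1", "df1", "df2", "df2"]

unique_handles, unique_labels = dedupe_legend_entries(handles, labels)
print(unique_labels)
print(unique_handles)
```

With a real axes, the same helper could presumably be applied as `ax.legend(*dedupe_legend_entries(*ax.get_legend_handles_labels()))`.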

Alternatively, what about merging the datasets and letting seaborn handle the legend?

import pandas as pd

sns.histplot(pd.concat({'df1': df1[['Tm/K Pred.']],
                        'df2': vispilsExp298Tm_bert[['Tm/K Pred.']]},
                       names=['dataset']).reset_index('dataset'),
             x='Tm/K Pred.', hue='dataset', bins=bins, alpha=alpha,
             hue_order=['df2', 'df1']
            )
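
To see what the `pd.concat(...).reset_index('dataset')` step hands to seaborn, here is a small sketch with made-up values. The frames `a` and `b` are hypothetical stand-ins for `df1` and `vispilsExp298Tm_bert`. The result is a long-form frame with a `dataset` column that `hue='dataset'` can group on.

```python
import pandas as pd

# Made-up values, only to show the shape of the result.
a = pd.DataFrame({"Tm/K Pred.": [300.0, 350.0, 410.0]})
b = pd.DataFrame({"Tm/K Pred.": [280.0, 520.0]})

# The dict keys become an outer index level named "dataset";
# reset_index moves that level into an ordinary column.
long_df = (
    pd.concat({"df1": a[["Tm/K Pred."]], "df2": b[["Tm/K Pred."]]},
              names=["dataset"])
    .reset_index("dataset")
)
print(long_df)
```

Because the dataset name is now a column rather than a per-call `label`, seaborn builds a single legend entry per dataset on its own.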

Or, without pandas:

sns.histplot({'df1': df1['Tm/K Pred.'],
              'df2': vispilsExp298Tm_bert['Tm/K Pred.']},
             bins=bins, alpha=alpha,
             hue_order=['df2', 'df1']
            )
