I have multiple datasets of different sizes and I want to draw a violin plot from them. My dataset looks like this:
Input.CSV:
city_A city_B city_C city_D
cluster1 2 5 4 4
cluster2 3 3 2 8
cluster3 2 4 5 5
cluster4 3 5 4
cluster5 3 3
cluster6 5
Note: Each city has a different size and number of clusters.
I looked into a few posts, such as here, but I could not understand how to plot this dataset in a single plot like:
Some of the examples from seaborn or matplotlib use fake data, whereas my data is in CSV format as shown above. It would be great if you could help with code that uses data like mine.
If you have multiple lists to plot, you can put them in a list of lists and pass that to violinplot. The lists do not need to have the same length. You can read the documentation here: https://matplotlib.org/3.1.1/api/_as_gen/matplotlib.pyplot.violinplot.html
import numpy as np
from matplotlib import pyplot as plt

A = [2, 5, 6, 10, 12, 8, 5]
B = [2, 2, 6, 8, 14, 5, 5]
C = [5, 7, 5, 13, 17, 7, 5]
D = [1, 4, 7, 12, 12, 5, 5]
E = [4, 1, 2, 11, 13, 7, 5]

fig, ax = plt.subplots(figsize=(5,5))
# A is drawn at position 5 (top) and E at position 1 (bottom).
# The data is not reversed here, so it matches the reversed tick labels below.
ax.violinplot([A,B,C,D,E],positions =[5,4,3,2,1],vert=False,showmeans=True)

def set_axis_style(ax, labels):
    # Label the y axis with one tick per sample, starting at position 1
    ax.get_yaxis().set_tick_params(direction='out')
    ax.xaxis.set_ticks_position('bottom')
    ax.set_yticks(np.arange(1, len(labels) + 1))
    ax.set_yticklabels(labels)
    ax.set_ylim(0.25, len(labels) + 0.75)
    ax.set_ylabel('Sample name')

# Labels run bottom to top: E at tick 1, A at tick 5
set_axis_style(ax,['A','B','C','D','E'][::-1])
plt.show()
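The same approach can be applied to data shaped like the CSV in the question. The following is only a sketch: it assumes the file is comma-separated with the cluster names in the first column, and the placement of the blank cells in `csv_text` is a guess, since the question does not show which city each missing value belongs to. The name `samples` is illustrative. The violins are drawn vertically here (the matplotlib default) rather than horizontally.

```python
import io

import matplotlib
matplotlib.use("Agg")  # headless backend; remove this line for interactive use
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical stand-in for Input.CSV; blank cells mark clusters a city lacks.
csv_text = """cluster,city_A,city_B,city_C,city_D
cluster1,2,5,4,4
cluster2,3,3,2,8
cluster3,2,4,5,5
cluster4,3,5,,4
cluster5,3,,,3
cluster6,,,,5
"""
df = pd.read_csv(io.StringIO(csv_text), index_col=0)

# One list per city with the missing (NaN) entries dropped,
# so each list keeps its own length.
samples = [df[col].dropna().tolist() for col in df.columns]

positions = list(range(1, len(samples) + 1))
fig, ax = plt.subplots(figsize=(5, 5))
parts = ax.violinplot(samples, positions=positions, showmeans=True)

# One tick per city, labelled with the column name.
ax.set_xticks(positions)
ax.set_xticklabels(df.columns)
ax.set_ylabel("Value")
# Call plt.show() or fig.savefig("violins.png") to view the result.
```

To read the real file instead, replace `io.StringIO(csv_text)` with the path to the CSV and adjust `sep` if the file is not comma-separated.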
Seaborn looks like a better and more aesthetic solution when the data is already in a dataframe.
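import pandas as pd
import seaborn as sns
from matplotlib import pyplot as plt

# Assumes Input.CSV is comma-separated with the cluster names in the first column;
# blank cells are read as NaN
df = pd.read_csv('Input.CSV', index_col=0)
sns.set(style="whitegrid")  # set the style before the figure is created so it applies
fig, axes = plt.subplots(figsize=(5,5))
sns.violinplot(data=df, ax = axes, orient ='h')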
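If the wide table becomes awkward to work with, one alternative is to reshape it to long form first, so that each row holds one city and one value. This is a sketch under the same assumptions as before: a comma-separated file with cluster names in the first column, and a guessed placement of the blank cells. The column names `city` and `value` are chosen here for illustration.

```python
import io

import matplotlib
matplotlib.use("Agg")  # headless backend; remove this line for interactive use
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

# Hypothetical stand-in for Input.CSV; blank cells mark clusters a city lacks.
csv_text = """cluster,city_A,city_B,city_C,city_D
cluster1,2,5,4,4
cluster2,3,3,2,8
cluster3,2,4,5,5
cluster4,3,5,,4
cluster5,3,,,3
cluster6,,,,5
"""
df = pd.read_csv(io.StringIO(csv_text), index_col=0)

# Wide -> long: one row per (city, value) pair, dropping the missing cells.
long_df = df.melt(var_name="city", value_name="value").dropna()

sns.set(style="whitegrid")
fig, ax = plt.subplots(figsize=(5, 5))
# Numeric x and categorical y give horizontal violins, one per city.
sns.violinplot(data=long_df, x="value", y="city", ax=ax)
# Call plt.show() or fig.savefig("violins.png") to view the result.
```

The long form also makes it straightforward to add a `hue` column later, for example if the clusters need to be grouped.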