python pandas matplotlib errorbar

How can I control coherence between error bar plots in Python?


I have this data:

import pandas as pd
URL = "https://stepik.org/media/attachments/lesson/9250/atherosclerosis.csv"
data = pd.read_csv(URL)
atherosclerosis = data.groupby(["age", "dose"]).agg(['mean', 'std'])
atherosclerosis.columns = ['_'.join(col) for col in atherosclerosis.columns]
atherosclerosis

Result:

          expr_mean  expr_std
age dose                      
1   D1    104.758464  5.863454
    D2    105.545864  4.369024
2   D1    101.004805  5.116310
    D2    102.273629  5.135374

And I plot error bars like this:

import matplotlib.pyplot as plt

plot_data1 = atherosclerosis.xs('D1', level=1, drop_level=False)
plot_data2 = atherosclerosis.xs('D2', level=1, drop_level=False)
plot_index1 = [str(idx) for idx in plot_data1.index]
plot_index2 = [str(idx) for idx in plot_data2.index]
plt.errorbar(plot_index1, plot_data1["expr_mean"], 
             yerr=plot_data1["expr_std"]/2,
             marker="s", mfc='green', 
             markeredgewidth=2, capsize=4, capthick=2, 
             fmt='o-', ecolor="magenta")
plt.errorbar(plot_index2, plot_data2["expr_mean"], 
             yerr=plot_data2["expr_std"]/2,
             marker="s", mfc='green', 
             markeredgewidth=2, capsize=4, capthick=2, 
             fmt='o-', ecolor="magenta")
plt.show()

Result: one line per dose, joining (1, 'D1') to (2, 'D1') and (1, 'D2') to (2, 'D2').

Can I somehow connect (1, 'D1') with (1, 'D2') and (2, 'D1') with (2, 'D2') instead?


Solution

  • You can just change your plot data so that it is selected by age (level 0) instead of by dose (level 1):

    plot_data1 = atherosclerosis.xs(1, level=0, drop_level=False)
    plot_data2 = atherosclerosis.xs(2, level=0, drop_level=False)
    

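To see why switching the level changes which points get joined, here is a minimal sketch. The `summary` frame is rebuilt by hand from the numbers printed in the question (an assumption, so that it runs without downloading the CSV), and the names `summary`, `by_dose` and `by_age` are only illustrative.

```python
import pandas as pd

# Summary table rebuilt by hand from the values shown in the question.
index = pd.MultiIndex.from_product([[1, 2], ["D1", "D2"]], names=["age", "dose"])
summary = pd.DataFrame(
    {
        "expr_mean": [104.758464, 105.545864, 101.004805, 102.273629],
        "expr_std": [5.863454, 4.369024, 5.116310, 5.135374],
    },
    index=index,
)

# Selecting on level 1 (dose) keeps one row per age,
# so a line drawn from it joins the two ages of one dose.
by_dose = summary.xs("D1", level=1, drop_level=False)

# Selecting on level 0 (age) keeps one row per dose,
# so a line drawn from it joins the two doses of one age.
by_age = summary.xs(1, level=0, drop_level=False)

print(list(by_dose.index))
print(list(by_age.index))
```

Because `drop_level=False` keeps both index levels, the (age, dose) pairs are still available for building tick labels.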


    Update: to get what you are asking for, I would sort the data by dose and then age, store each row's integer position in a new column, and plot against that position:

    import numpy as np

    # Sort by dose, then age, and number the rows 0..n-1 to use as x positions
    atherosclerosis = atherosclerosis.sort_index(level=[1, 0])
    atherosclerosis['range'] = np.arange(len(atherosclerosis))

    plot_data1 = atherosclerosis.xs(1, level=0, drop_level=False)
    plot_data2 = atherosclerosis.xs(2, level=0, drop_level=False)
    plot_index1 = [str(idx) for idx in plot_data1.index]
    plot_index2 = [str(idx) for idx in plot_data2.index]
    
    # atherosclerosis.expr_mean.sort_index(level=['dose','age']).plot(alpha=0)
    plt.errorbar(plot_data1['range'], plot_data1["expr_mean"], 
                 yerr=plot_data1["expr_std"]/2,
                 marker="s", mfc='green', 
                 markeredgewidth=2, capsize=4, capthick=2, 
                 fmt='o-', ecolor="magenta")
    plt.errorbar(plot_data2['range'], plot_data2["expr_mean"], 
                 yerr=plot_data2["expr_std"]/2,
                 marker="s", mfc='green', 
                 markeredgewidth=2, capsize=4, capthick=2, 
                 fmt='o-', ecolor="magenta")
    
    plt.xticks(atherosclerosis['range'], atherosclerosis.index);
    

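The same idea can be sketched as one loop over the age groups instead of one `errorbar` call per age. This is only an illustration under stated assumptions: the helper name `plot_by_age`, the `pos` column and the hand-built `summary` frame (copied from the numbers in the question) are not part of the original answer.

```python
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd


def plot_by_age(summary, ax=None):
    """Draw one error-bar line per age group across the doses.

    `summary` is assumed to have an (age, dose) MultiIndex and the
    columns `expr_mean` and `expr_std`, as in the question.
    """
    if ax is None:
        _, ax = plt.subplots()

    # Sort by dose first, then age, and give every row an integer x position.
    summary = summary.sort_index(level=["dose", "age"])
    summary = summary.assign(pos=np.arange(len(summary)))

    # One errorbar call per age, so points of the same age are joined.
    for age, group in summary.groupby(level="age"):
        ax.errorbar(group["pos"], group["expr_mean"],
                    yerr=group["expr_std"] / 2,
                    fmt="s-", mfc="green", ecolor="magenta",
                    markeredgewidth=2, capsize=4, capthick=2,
                    label=f"age {age}")

    # Label each position with its (age, dose) pair.
    ax.set_xticks(summary["pos"].to_numpy())
    ax.set_xticklabels([f"({age}, {dose})" for age, dose in summary.index])
    ax.legend()
    return ax


# Summary table rebuilt by hand from the values shown in the question.
index = pd.MultiIndex.from_product([[1, 2], ["D1", "D2"]], names=["age", "dose"])
summary = pd.DataFrame(
    {
        "expr_mean": [104.758464, 105.545864, 101.004805, 102.273629],
        "expr_std": [5.863454, 4.369024, 5.116310, 5.135374],
    },
    index=index,
)

ax = plot_by_age(summary)
```

Call `plt.show()` afterwards to display the figure. Looping over `groupby(level="age")` keeps the code the same size if the data later contains more than two age groups.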