I have this data:
import pandas as pd
URL = "https://stepik.org/media/attachments/lesson/9250/atherosclerosis.csv"
data = pd.read_csv(URL)
atherosclerosis = data.groupby(["age", "dose"]).agg(['mean', 'std'])
atherosclerosis.columns = ['_'.join(col) for col in atherosclerosis.columns]
atherosclerosis
Result:
          expr_mean   expr_std
age dose
1   D1    104.758464  5.863454
    D2    105.545864  4.369024
2   D1    101.004805  5.116310
    D2    102.273629  5.135374
And I plot error bars like this:
import matplotlib.pyplot as plt

plot_data1 = atherosclerosis.xs('D1', level=1, drop_level=False)
plot_data2 = atherosclerosis.xs('D2', level=1, drop_level=False)
plot_index1 = [str(idx) for idx in plot_data1.index]
plot_index2 = [str(idx) for idx in plot_data2.index]
plt.errorbar(plot_index1, plot_data1["expr_mean"],
yerr=plot_data1["expr_std"]/2,
marker="s", mfc='green',
markeredgewidth=2, capsize=4, capthick=2,
fmt='o-', ecolor="magenta")
plt.errorbar(plot_index2, plot_data2["expr_mean"],
yerr=plot_data2["expr_std"]/2,
marker="s", mfc='green',
markeredgewidth=2, capsize=4, capthick=2,
fmt='o-', ecolor="magenta")
plt.show()
Result: [plot image: one line joins (1, 'D1') to (2, 'D1'), another joins (1, 'D2') to (2, 'D2')]
Can I somehow connect (1, 'D1') with (1, 'D2') and (2, 'D1') with (2, 'D2')? Like this:
[image: desired plot]
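You can just change how you slice the plot data: select by age (level 0) instead of by dose (level 1), so that each line connects the two doses for one age: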
plot_data1 = atherosclerosis.xs(1, level=0, drop_level=False)
plot_data2 = atherosclerosis.xs(2, level=0, drop_level=False)
Output: [plot image]
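The two near-identical `errorbar` calls could also be collapsed into a loop over the age groups. This is only a sketch: it rebuilds the summary table from the values printed in the question instead of downloading the CSV, and the legend labels and the output filename `errorbars_by_age.png` are my own additions, not part of the original code.

```python
import matplotlib.pyplot as plt
import pandas as pd

# Summary values copied from the table in the question (assumed to match the CSV).
index = pd.MultiIndex.from_tuples(
    [(1, "D1"), (1, "D2"), (2, "D1"), (2, "D2")], names=["age", "dose"]
)
atherosclerosis = pd.DataFrame(
    {
        "expr_mean": [104.758464, 105.545864, 101.004805, 102.273629],
        "expr_std": [5.863454, 4.369024, 5.116310, 5.135374],
    },
    index=index,
)

fig, ax = plt.subplots()
containers = []
# One errorbar line per age; each line connects that age's doses.
for age, group in atherosclerosis.groupby(level="age"):
    labels = [str(idx) for idx in group.index]
    containers.append(
        ax.errorbar(
            labels,
            group["expr_mean"].to_numpy(),
            yerr=group["expr_std"].to_numpy() / 2,
            fmt="s-", mfc="green", markeredgewidth=2,
            capsize=4, capthick=2, ecolor="magenta",
            label=f"age {int(age)}",  # hypothetical legend label
        )
    )
ax.legend()
fig.savefig("errorbars_by_age.png")  # hypothetical filename; use plt.show() interactively
```

Grouping on the index level keeps the code the same if more ages or doses appear in the data.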
Update: to get what you are asking for, I would sort the data by dose and then age, add a column with each row's integer position, and plot against that column:
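import numpy as np

# Sort by dose (level 1), then age (level 0), and number the rows left to right
atherosclerosis = atherosclerosis.sort_index(level=[1, 0])
atherosclerosis['range'] = np.arange(len(atherosclerosis))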
plot_data1 = atherosclerosis.xs(1, level=0, drop_level=False)
plot_data2 = atherosclerosis.xs(2, level=0, drop_level=False)
plot_index1 = [str(idx) for idx in plot_data1.index]
plot_index2 = [str(idx) for idx in plot_data2.index]
# atherosclerosis.expr_mean.sort_index(level=['dose','age']).plot(alpha=0)
plt.errorbar(plot_data1['range'], plot_data1["expr_mean"],
yerr=plot_data1["expr_std"]/2,
marker="s", mfc='green',
markeredgewidth=2, capsize=4, capthick=2,
fmt='o-', ecolor="magenta")
plt.errorbar(plot_data2['range'], plot_data2["expr_mean"],
yerr=plot_data2["expr_std"]/2,
marker="s", mfc='green',
markeredgewidth=2, capsize=4, capthick=2,
fmt='o-', ecolor="magenta")
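# Label each integer position with its (age, dose) pair
plt.xticks(atherosclerosis['range'], [str(idx) for idx in atherosclerosis.index])
plt.show()
Output: [plot image: x-axis ordered (1, 'D1'), (2, 'D1'), (1, 'D2'), (2, 'D2'); one line joins the two age-1 points, another joins the two age-2 points]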
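To see why the sorted position column produces those connections, it may help to look at the positions on their own, without any plotting. The sketch below again rebuilds the index from the table in the question; the names `summary` and `positions_by_age` are hypothetical helpers introduced only for this illustration.

```python
import numpy as np
import pandas as pd

# Index and means copied from the table in the question.
index = pd.MultiIndex.from_tuples(
    [(1, "D1"), (1, "D2"), (2, "D1"), (2, "D2")], names=["age", "dose"]
)
summary = pd.DataFrame(
    {"expr_mean": [104.758464, 105.545864, 101.004805, 102.273629]},
    index=index,
)

# Sort by dose first, then age, and number the rows left to right.
summary = summary.sort_index(level=["dose", "age"])
summary["range"] = np.arange(len(summary))

# x positions that each age's line passes through.
positions_by_age = {
    int(age): group["range"].tolist()
    for age, group in summary.groupby(level="age")
}

print(list(summary.index))
print(positions_by_age)
```

After sorting, both D1 rows come before both D2 rows, so the age-1 points sit at positions 0 and 2 and the age-2 points at positions 1 and 3. Each `errorbar` call therefore draws a line that skips over the other age's point, which is the connection asked for in the question.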