I have a dataframe with columns including POSITION_x.VALUE, where x runs from 1 to 6. I loop over these columns and carry out some statistical analysis to create a matching OUTLIER_x column for each one. I would like to plot POSITION_x.VALUE and OUTLIER_x together for each position, as 6 subplots in a single figure.
Instead, each plot shows up as its own figure, followed by a figure of blank subplots (7 figures, not 1), rather than each plot being added to the subplot grid. The code is structured like:
## Initialize stats & axes
SD = {}; LCL = {}; UCL = {}; ax = {}
for (name, values) in df.items():
    ## If the columnname includes '.VALUE'...
    if '.VALUE' in name:
        ## Get the position number
        pos = int(name[9])
        ## Add a moving range column for each position
        df['MR_'+str(pos)] = np.absolute(df[name].shift(1) - df[name])
        ## Create (temporary) mean of data & moving range
        Mean = df[name].mean()
        Mean_MR = df['MR_'+str(pos)].mean()
        ## Calculate StdDev, LCL and UCL for the VALUE column (using MR data to create StdDev)
        SD[pos-1] = Mean_MR/1.128
        LCL[pos-1] = df[name].mean() - SD[pos-1]*3
        UCL[pos-1] = df[name].mean() + SD[pos-1]*3
        ## Add conditional 'OUTLIER_x' column for outside of 3SD
        df['Outlier_' + str(pos)] = np.where((df[name] > UCL[pos-1]) | (df[name] < LCL[pos-1]), df[name], np.nan)
        ## Create axis for each VALUE & OUTLIER combination
        ax[pos-1].plot(use_index = True, y = [df[name], df['Outlier_' + str(pos)]], style = ['o', 'o'], color = ['blue', 'red'], markersize = 2)
fig, ax = plt.subplots(6, 1, sharex=True)
plt.show()
Each individual plot displays correctly, except that I don't want them plotted individually.
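For context, the limits computed in the loop follow what I understand to be the usual individuals (moving range) control chart calculation: the standard deviation is estimated as the mean moving range divided by 1.128 (the d2 constant for a moving range of two points), and the limits sit 3 of those standard deviations either side of the mean. Here is a minimal sketch of that step on its own. The function name `individuals_limits`, the constant name `D2` and the readings are made up for illustration, and `Series.where` stands in for the `np.where` call in my code.

```python
import pandas as pd

D2 = 1.128  # d2 constant for a moving range spanning 2 consecutive points


def individuals_limits(values: pd.Series) -> tuple[float, float, float]:
    """Return (sigma, lcl, ucl) estimated from the mean moving range."""
    # Absolute difference between consecutive points; the first entry is NaN
    # and is skipped by mean()
    moving_range = values.diff().abs()
    sigma = moving_range.mean() / D2
    centre = values.mean()
    return sigma, centre - 3 * sigma, centre + 3 * sigma


# Made-up readings with one obviously high point at index 5
readings = pd.Series([10.0, 10.2, 9.9, 10.1, 10.0, 13.0, 10.1, 9.8, 10.0, 10.2])
sigma, lcl, ucl = individuals_limits(readings)

# Keep the value where it falls outside the limits, NaN everywhere else
outliers = readings.where((readings > ucl) | (readings < lcl))
print(outliers.dropna())
```

Only the point at index 5 falls outside the limits here, so it is the only non-null entry in `outliers`.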
Solved: The solution was to move
fig, ax = plt.subplots(...
to above the loop (thank you @Lfppfs), rename ax to axes, and change the ax[pos-1].plot(...
command to
df.plot(ax=axes[pos-1], use_index = True, y = [name, 'Outlier_' + str(pos)], style = ['x', 'o'], color = ['blue', 'red'], markersize = 2)
The full code is now:
import numpy as np
import matplotlib.pyplot as plt

## df is the dataframe described above, already loaded
## Initialize stats
SD = {}; LCL = {}; UCL = {}
## Create the figure and its 6 subplots once, before the loop
fig, axes = plt.subplots(6, 1, sharex=True)
for (name, values) in df.items():
    ## If the columnname includes '.VALUE'...
    if '.VALUE' in name:
        ## Get the position number
        pos = int(name[9])
        ## Replace '-' with null and convert to float
        ## (assigned back rather than using inplace=True on a column selection)
        df[name] = df[name].replace({'-': np.nan}).astype(float)
        ## Add a moving range column for each position
        df['MR_'+str(pos)] = np.absolute(df[name].shift(1) - df[name])
        ## Create (temporary) mean of data & moving range
        Mean = df[name].mean()
        Mean_MR = df['MR_'+str(pos)].mean()
        ## Calculate StdDev, LCL and UCL for the VALUE column (using MR data to create StdDev)
        SD[pos-1] = Mean_MR/1.128
        LCL[pos-1] = Mean - SD[pos-1]*3
        UCL[pos-1] = Mean + SD[pos-1]*3
        ## Add conditional 'OUTLIER_x' column for outside of 3SD
        df['Outlier_' + str(pos)] = np.where((df[name] > UCL[pos-1]) | (df[name] < LCL[pos-1]), df[name], np.nan)
        ## Draw each VALUE & OUTLIER combination on its own existing subplot
        df.plot(ax=axes[pos-1], use_index = True, y = [name, 'Outlier_' + str(pos)], style = ['x', 'o'], color = ['blue', 'red'], markersize = 2)
plt.show()
Note - ax[index].plot(... has now changed to df.plot(ax=axes[index]...
This now gives the desired result (although I obviously need to tweak a couple of things!)
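In case it helps anyone else, here is a self-contained sketch of the same pattern that runs without my data. As far as I understand it, calling df.plot() without ax= opens a new figure on every call, and creating the subplots after the loop just adds one more empty figure, which would explain the 7 figures. Everything about the data here is made up (random values plus one planted outlier), the output filename is arbitrary, and the Agg backend line is only there so the sketch runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; remove this line to view the figure interactively
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Made-up data standing in for the real dataframe
rng = np.random.default_rng(0)
df = pd.DataFrame({f"POSITION_{i}.VALUE": rng.normal(10, 1, 50) for i in range(1, 7)})
df.loc[20, "POSITION_3.VALUE"] = 25.0  # plant one obvious outlier

# Create the figure and all six axes ONCE, before the loop
fig, axes = plt.subplots(6, 1, sharex=True, figsize=(6, 10))

for pos in range(1, 7):
    name = f"POSITION_{pos}.VALUE"
    # Same limits as in my code: sigma from the mean moving range, then +/- 3 sigma
    sigma = df[name].diff().abs().mean() / 1.128
    centre = df[name].mean()
    outside = (df[name] > centre + 3 * sigma) | (df[name] < centre - 3 * sigma)
    df[f"Outlier_{pos}"] = df[name].where(outside)
    # ax= makes pandas draw on the existing subplot instead of opening a new figure
    df.plot(ax=axes[pos - 1], y=[name, f"Outlier_{pos}"],
            style=["x", "o"], color=["blue", "red"], markersize=2)

# Swap this for plt.show() when running interactively
fig.savefig("outlier_subplots.png")
```

The key point is just the order: plt.subplots() first, then pass each axes object to df.plot() through ax=.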