I have a dataframe that has rows of survey response values (1-5) which I'm trying to get a standard deviation for.
The final column indicates which survey group the data belongs to (column name = Respondants). Because this column contains text instead of integers, the standard deviation returns NaN, and skipna=True doesn't help in this scenario. I need to keep that column because the analysis will compare the responses from each group in a single scatterplot, so I can't just delete it. I can't seem to find a way to get the standard deviation to ignore that column.
The code being used is:
df1['std dev'] = df.std(skipna=True)
df1.head()
I'm not sure what I can add to ignore the column "Respondants" for the std.
EDIT
I found a workaround, wasn't ideal, but it did the job.
I split my data into two Excel sheets and dropped the offending column in each. Then I performed my standard deviations, added the "Respondants" column back into each dataframe, and merged them into a new DF.
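The workaround above can be done entirely in pandas, without round-tripping through Excel. This is a minimal sketch under assumed column names (Q1, Q2, and the group labels are made up for illustration): split by the Respondants column, drop it, compute the per-column standard deviation, then reattach the label and combine the results.

```python
import pandas as pd

# Toy data standing in for the survey responses (column names are assumptions).
df = pd.DataFrame({
    "Q1": [1, 3, 5, 2, 4, 4],
    "Q2": [2, 2, 4, 5, 3, 1],
    "Respondants": ["A", "A", "A", "B", "B", "B"],
})

# Split by group, drop the text column, compute per-column std,
# then reattach the group label and combine into one dataframe.
parts = []
for group, sub in df.groupby("Respondants"):
    stds = sub.drop(columns="Respondants").std()
    stds["Respondants"] = group
    parts.append(stds)

result = pd.DataFrame(parts).reset_index(drop=True)
print(result)
```

This keeps everything in memory and avoids the drop-then-re-add dance per sheet; `groupby` does the splitting and `pd.DataFrame(parts)` does the merge.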
Try:
df.iloc[:, :-1].std()
In English, this means: use all rows, and use all but the last column.
If you want a standard deviation per row, then you will need:
df.iloc[:, :-1].std(axis=1)
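Putting both together, here is a small runnable sketch (the column names and values are assumptions for illustration):

```python
import pandas as pd

# Small example frame: numeric survey answers plus a trailing text column.
df = pd.DataFrame({
    "Q1": [1, 2, 3],
    "Q2": [4, 5, 6],
    "Respondants": ["A", "B", "A"],
})

col_std = df.iloc[:, :-1].std()        # per-column std, text column excluded
row_std = df.iloc[:, :-1].std(axis=1)  # per-row std across the numeric columns

print(col_std)
print(row_std)
```

If the text column isn't guaranteed to be last, `df.select_dtypes(include="number").std()` selects the numeric columns by dtype instead of position; recent pandas versions also accept `df.std(numeric_only=True)`.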