I have a dataframe from which I want to create subsets in a loop, according to the values of one column.
Here is an example df :
c1 c2 c3
A 1 2
A 2 2
B 0 2
B 1 1
I would like to create the subsets in a loop like so: in the first iteration, select all rows in which C1 = A and only columns 2 and 3; in the second, select all rows in which C1 = B and only columns 2 and 3.
I've tried the following code :
for level in enumerate(df.loc[:,"C1"].unique()):
    df_s = df.loc[df["C1"]==level].iloc[:, 1:len(df.columns)]
    #other actions on the subsetted dataframe
but the subset isn't performed. How can I iterate through the levels of a column?
For instance, in R it would be:
for (le in levels(df$C1)) {
  dfs <- df[df$C1==le,2:ncol(df)]
}
Thanks
There is no need for enumerate, which yields both the index and the value; just loop through the unique values of the c1 column directly:
for level in df.c1.unique():
    # keep the rows for this level, then drop the grouping column
    df_s = df.loc[df.c1 == level].drop(columns='c1')
    print(level + ":\n", df_s)
#A:
# c2 c3
#0 1 2
#1 2 2
#B:
# c2 c3
#2 0 2
#3 1 1
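To make the point about enumerate concrete, here is a minimal sketch that rebuilds the example frame from the question (the construction of df and the names items and subsets are my own assumptions, not part of the original post). It shows that enumerate hands back (position, value) tuples, and that unpacking the tuple is enough if the position is really wanted:

```python
import pandas as pd

# Example frame from the question (construction assumed for illustration)
df = pd.DataFrame({"c1": ["A", "A", "B", "B"],
                   "c2": [1, 2, 0, 1],
                   "c3": [2, 2, 2, 1]})

# enumerate yields (position, value) tuples, so in the original loop
# `level` was a tuple such as (0, 'A') rather than the value 'A'
items = list(enumerate(df["c1"].unique()))

# If the position is needed, unpack the tuple and compare on the value only
subsets = {}
for i, level in enumerate(df["c1"].unique()):
    subsets[level] = df.loc[df["c1"] == level].drop(columns="c1")
```

Storing the pieces in a dict keyed by level is just one way to keep them around after the loop; whether that is useful depends on what the "other actions" are.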
Most likely, what you need is df.groupby('c1').apply(lambda g: ...), which should be a more efficient approach; here g is the sub data frame with a unique c1 value.
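As a rough sketch of the groupby route, assuming the same example frame as in the question (the per-group sum is only a stand-in for whatever the real per-subset action is, and the names pieces and totals are hypothetical): iterating over a groupby object yields (key, sub-frame) pairs, and many per-group computations can be written without an explicit loop at all.

```python
import pandas as pd

# Example frame from the question (construction assumed for illustration)
df = pd.DataFrame({"c1": ["A", "A", "B", "B"],
                   "c2": [1, 2, 0, 1],
                   "c3": [2, 2, 2, 1]})

# Iterating a groupby yields (key, sub-frame) pairs, one per unique c1 value
pieces = {}
for level, g in df.groupby("c1"):
    pieces[level] = g.drop(columns="c1")

# Placeholder per-group action: column sums for each level of c1
totals = df.groupby("c1")[["c2", "c3"]].sum()
print(totals)
```

If the per-group work reduces to an aggregation like this, the built-in groupby methods are usually preferable to apply with a lambda.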