I have some data and want to assign categories to it.
Currently the data looks like this:
Var Category
a cat1
a cat1
b cat2
a cat1
b cat2
a cat1
But it should look like this:
Var Category
a cat1
a cat1
b cat2
a cat2
b cat3
a cat3
So, whenever 'Var' != 'a', 'Category' should move on to the next category, and that category should stay in effect until the next row where 'Var' != 'a'. How could I do this?
You can compare 'Var' for inequality with 'a', take the cumulative sum of the result with Series.cumsum, add 1 if necessary (here, so that the numbering starts at cat1), convert to strings, and prepend 'cat':
df['Category'] = 'cat' + df.Var.ne('a').cumsum().add(1).astype(str)
Alternative:
df['Category'] = 'cat' + (df.Var != 'a').cumsum().add(1).astype(str)
print(df)
Var Category
0 a cat1
1 a cat1
2 b cat2
3 a cat2
4 b cat3
5 a cat3
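To see why this works, it may help to look at the intermediate steps. The following is a minimal sketch that rebuilds the sample data from the question; the names is_new and group_no are only illustrative and not part of the original answer.

```python
import pandas as pd

# Sample data reconstructed from the question (only the 'Var' column is needed).
df = pd.DataFrame({"Var": ["a", "a", "b", "a", "b", "a"]})

# Step 1: boolean mask, True on every row where Var is not 'a'.
is_new = df["Var"].ne("a")

# Step 2: the cumulative sum counts how many non-'a' rows have been seen so far;
# adding 1 makes the numbering start at 1 instead of 0.
group_no = is_new.cumsum().add(1)

# Step 3: convert the counter to strings and prepend the 'cat' prefix.
df["Category"] = "cat" + group_no.astype(str)

print(df.assign(is_new=is_new, group_no=group_no))
```

Each True in the mask bumps the running total by one, so every non-'a' row starts a new category that carries forward until the next non-'a' row.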