Does anyone know a way to count values while annotating them? For example.
We have a reparative state with 3 different values. A, B, C. We want to count how many times we saw state A but values A are different length of A's in a column. We have Column A and want to get column B. So we have time series A, B, C and we want to count time wise that A is 1 then B is 1 and C is 1 but when we again see A is 2 and B is 2 and C is 2 and the A again is 3 and so on... (The order is always the same first A then B then C and again A...)
Any ideas?
Column A | Column B |
---|---|
A | 1 |
A | 1 |
A | 1 |
B | 1 |
B | 1 |
C | 1 |
A | 2 |
B | 2 |
C | 2 |
A | 3 |
A | 3 |
A | 3 |
A | 3 |
A | 3 |
A | 3 |
B | 3 |
B | 3 |
B | 3 |
C | 3 |
C | 3 |
Tried to get a loop but don't know how to count the state.
Try:
from itertools import count
from collections import defaultdict
d = defaultdict(count)
df['Column B new'] = df.groupby((df['Column A'] != df['Column A'].shift()).cumsum())['Column A'].transform(lambda x: next(d[x.iat[0]]) + 1)
print(df)
Prints:
Column A Column B Column B new
0 A 1 1
1 A 1 1
2 A 1 1
3 B 1 1
4 B 1 1
5 C 1 1
6 A 2 2
7 B 2 2
8 C 2 2
9 A 3 3
10 A 3 3
11 A 3 3
12 A 3 3
13 A 3 3
14 A 3 3
15 B 3 3
16 B 3 3
17 B 3 3
18 C 3 3
19 C 3 3