Is there a fast way with pandas or numpy to XOR an array with a value and write the result into the next row?
Basically I have a pandas data frame named 'ss' like so:
rst no1 no2 no3 no4 no5 no6 no7
0 1 6 2 15 14 9 5 1
1 11 0 0 0 0 0 0 0
2 9 0 0 0 0 0 0 0
3 11 0 0 0 0 0 0 0
4 3 0 0 0 0 0 0 0
5 15 0 0 0 0 0 0 0
6 0 0 0 0 0 0 0 0
To copy-paste the dataframe above into a variable, use:
ss = pd.read_clipboard()
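For anyone who would rather not go through the clipboard, here is a minimal sketch that builds the same frame directly. The helper names `no_cols` and `data` are just for illustration, and the dtype is assumed to be a plain 64-bit integer:

```python
import numpy as np
import pandas as pd

# Column names no1..no7, as in the table above
no_cols = [f"no{i}" for i in range(1, 8)]

# 7 rows x 8 columns of zeros, then fill in the known values
data = np.zeros((7, 8), dtype=np.int64)
data[:, 0] = [1, 11, 9, 11, 3, 15, 0]   # the 'rst' column
data[0, 1:] = [6, 2, 15, 14, 9, 5, 1]   # the first row of the 'no' columns

ss = pd.DataFrame(data, columns=["rst"] + no_cols)
```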
What I want to do is fill each row of the 'no' columns with the XOR of that row's 'rst' value and the previous row's 'no' values, something like ss.loc[i, ['no1', 'no2', ...]] = ss.loc[i, 'rst'] ^ ss.loc[i - 1, ['no1', 'no2', ...]]. The first step would create a dataframe like this:
rst no1 no2 no3 no4 no5 no6 no7
0 1 6 2 15 14 9 5 1
1 11 13 9 4 5 2 14 10
2 9 0 0 0 0 0 0 0
3 11 0 0 0 0 0 0 0
4 3 0 0 0 0 0 0 0
5 15 0 0 0 0 0 0 0
6 0 0 0 0 0 0 0 0
This takes ss.loc[1, 'rst'], which is 11, and computes 11 ^ np.array([6, 2, 15, 14, 9, 5, 1]). The result is np.array([13, 9, 4, 5, 2, 14, 10]), which I then assign to the 'no' columns of row 1 in order, as you can see above.
The next step takes ss.loc[2, 'rst'], which is 9, and repeats the process:
rst no1 no2 no3 no4 no5 no6 no7
0 1 6 2 15 14 9 5 1
1 11 13 9 4 5 2 14 10
2 9 4 0 13 12 11 7 3
3 11 0 0 0 0 0 0 0
4 3 0 0 0 0 0 0 0
5 15 0 0 0 0 0 0 0
6 0 0 0 0 0 0 0 0
So 9 ^ np.array([13, 9, 4, 5, 2, 14, 10]) gives np.array([4, 0, 13, 12, 11, 7, 3]), which I then assign to the 'no' columns of row 2 in order, as you can see above.
My question is how to do this with numpy or pandas in a fast way, ideally without any loops. I'm working with a data set of one million and looping is slow, so I'm hoping there is a shortcut or better method for setting each row of the 'no*' columns to the XOR of that row's 'rst' value with the previous row's 'no*' values.
IIUC, you can use numpy.bitwise_xor, once in its accumulate variant on rst, then combined with the no columns:
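import numpy as np
import pandas as pd

# 'rst' as a column vector; zero the first entry so row 0 is left unchanged
rst = ss['rst'].to_numpy(copy=True)[:,None]
rst[0] = 0
# first row of the 'no' columns
no = ss.filter(like='no').iloc[0].to_numpy()
# cumulative XOR of 'rst', broadcast against the first row
x = np.bitwise_xor(np.bitwise_xor.accumulate(rst, axis=0), no)
out = ss[['rst']].join(
    pd.DataFrame(x, index=ss.index, columns=list(ss.filter(like='no')))
)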
This works because XOR is commutative and associative, so A^B^C equals (A^C)^B. Here we first accumulate the XOR over rst, then apply each intermediate result to the first row.
Output:
rst no1 no2 no3 no4 no5 no6 no7
0 1 6 2 15 14 9 5 1
1 11 13 9 4 5 2 14 10
2 9 4 0 13 12 11 7 3
3 11 15 11 6 7 0 12 8
4 3 12 8 5 4 3 15 11
5 15 3 7 10 11 12 0 4
6 0 3 7 10 11 12 0 4
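If you want to convince yourself that the accumulate trick really matches the row-by-row recurrence from the question, a small check along these lines may help. It is only a sketch on the sample values; the names `first`, `looped`, `keys` and `vectorized` are made up for the example:

```python
import numpy as np

rst = np.array([1, 11, 9, 11, 3, 15, 0])
first = np.array([6, 2, 15, 14, 9, 5, 1])  # row 0 of the 'no' columns

# Reference: the explicit loop described in the question
looped = np.zeros((len(rst), len(first)), dtype=rst.dtype)
looped[0] = first
for i in range(1, len(rst)):
    looped[i] = rst[i] ^ looped[i - 1]

# Vectorized: cumulative XOR of rst (first entry zeroed so row 0 is
# unchanged), broadcast against the first row
keys = rst.copy()
keys[0] = 0
vectorized = np.bitwise_xor.accumulate(keys)[:, None] ^ first

print(np.array_equal(looped, vectorized))
```

The loop is only there as a reference; on a large frame you would keep just the accumulate version.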