Basic Example:
# Given params such as:
params = {
    'cols': 8,
    'rows': 4,
    'n': 4
}
# I'd like to produce (or equivalent):
       col0  col1  col2  col3  col4  col5  col6  col7
row_0     0     1     2     3     0     1     2     3
row_1     1     2     3     0     1     2     3     0
row_2     2     3     0     1     2     3     0     1
row_3     3     0     1     2     3     0     1     2
Axis Value Counts:
df.apply(lambda x: x.value_counts(), axis=1)
       0  1  2  3
row_0  2  2  2  2
row_1  2  2  2  2
row_2  2  2  2  2
row_3  2  2  2  2
df.apply(lambda x: x.value_counts())
   col0  col1  col2  col3  col4  col5  col6  col7
0     1     1     1     1     1     1     1     1
1     1     1     1     1     1     1     1     1
2     1     1     1     1     1     1     1     1
3     1     1     1     1     1     1     1     1
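The two value-count tables above boil down to one condition: each of the n values should appear cols/n times in every row and rows/n times in every column. A minimal sketch of a check for that, assuming a hypothetical helper named is_balanced (it is not part of pandas) and using the desired table from the basic example as input:

```python
import pandas as pd

def is_balanced(df, n):
    """Hypothetical helper: True if each value 0..n-1 appears equally
    often in every row and in every column of df."""
    a = df.to_numpy()
    rows, cols = a.shape
    for v in range(n):
        hits = (a == v)
        # Each value must occur cols/n times per row ...
        if not (hits.sum(axis=1) == cols / n).all():
            return False
        # ... and rows/n times per column.
        if not (hits.sum(axis=0) == rows / n).all():
            return False
    return True

# The desired table from the basic example.
desired = pd.DataFrame([
    [0, 1, 2, 3, 0, 1, 2, 3],
    [1, 2, 3, 0, 1, 2, 3, 0],
    [2, 3, 0, 1, 2, 3, 0, 1],
    [3, 0, 1, 2, 3, 0, 1, 2],
])
# A table that is balanced per row only (first column is 0, 0, 0, 1).
rows_only = pd.DataFrame([
    [0, 1, 2, 3, 0, 1, 3, 2],
    [0, 2, 1, 3, 0, 2, 3, 1],
    [0, 3, 1, 2, 0, 3, 2, 1],
    [1, 0, 2, 3, 1, 0, 3, 2],
])
print(is_balanced(desired, 4), is_balanced(rows_only, 4))
```

The check compares against cols / n and rows / n, so a shape that is not a multiple of n can never pass.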
My attempt thus far:
import itertools
import numpy as np
import pandas as pd

def create_df(cols, rows, n):
    x = itertools.cycle(list(itertools.permutations(range(n))))
    df = pd.DataFrame(index=range(rows), columns=range(cols))
    df[:] = np.reshape([next(x) for _ in range((rows*cols)//n)], (rows, cols))
    #df = df.T.add_prefix('row_').T
    #df = df.add_prefix('col_')
    return df

params = {
    'cols': 8,
    'rows': 4,
    'n': 4
}
df = create_df(**params)
Output:
   0  1  2  3  4  5  6  7
0  0  1  2  3  0  1  3  2
1  0  2  1  3  0  2  3  1
2  0  3  1  2  0  3  2  1
3  1  0  2  3  1  0  3  2
# Correct on this Axis:
>>> df.apply(lambda x: x.value_counts(), axis=1)
   0  1  2  3
0  2  2  2  2
1  2  2  2  2
2  2  2  2  2
3  2  2  2  2
# Incorrect on this Axis:
>>> df.apply(lambda x: x.value_counts())
     0  1    2    3    4  5    6    7
0  3.0  1  NaN  NaN  3.0  1  NaN  NaN
1  1.0  1  2.0  NaN  1.0  1  NaN  2.0
2  NaN  1  2.0  1.0  NaN  1  1.0  2.0
3  NaN  1  NaN  3.0  NaN  1  3.0  NaN
So, I have the conditions I need on one axis, but not on the other.
How can I update my method/create a method to meet both conditions?
You can tile your input and use a custom roll to shift each row independently:
import numpy as np
import pandas as pd

c = params['cols']
r = params['rows']
n = params['n']
a = np.arange(n) # or any input
b = np.tile(a, (r, c//n))
# array([[0, 1, 2, 3, 0, 1, 2, 3],
# [0, 1, 2, 3, 0, 1, 2, 3],
# [0, 1, 2, 3, 0, 1, 2, 3],
# [0, 1, 2, 3, 0, 1, 2, 3]])
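# row index of every output cell, as a column vector for broadcasting
idx = np.arange(r)[:, None]
# column to read from: row i is offset by -i, and negative indices wrap
# around, so row i ends up rolled right by i positions
shift = (np.tile(np.arange(c), (r, 1)) - np.arange(r)[:, None])
df = pd.DataFrame(b[idx, shift])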
Output:
   0  1  2  3  4  5  6  7
0  0  1  2  3  0  1  2  3
1  3  0  1  2  3  0  1  2
2  2  3  0  1  2  3  0  1
3  1  2  3  0  1  2  3  0
Alternative order:
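idx = np.arange(r)[:, None]
# row i is offset by +i; the modulo keeps indices in range, so row i ends
# up rolled left by i positions
shift = (np.tile(np.arange(c), (r, 1)) + np.arange(r)[:, None]) % c
df = pd.DataFrame(b[idx, shift])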
Output:
   0  1  2  3  4  5  6  7
0  0  1  2  3  0  1  2  3
1  1  2  3  0  1  2  3  0
2  2  3  0  1  2  3  0  1
3  3  0  1  2  3  0  1  2
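For convenience, the tile-and-index steps above can be wrapped into a function with the same signature as create_df in the question. This is only a sketch: the row_/col labels follow the question's desired output, and the multiple-of-n check is an added assumption (without it the counts cannot be equal on both axes).

```python
import numpy as np
import pandas as pd

def create_df(cols, rows, n):
    # Assumption: both dimensions are multiples of n, otherwise equal
    # value counts on both axes are impossible.
    if cols % n or rows % n:
        raise ValueError("cols and rows must be multiples of n")
    # Repeat 0..n-1 across each row, identical in every row.
    b = np.tile(np.arange(n), (rows, cols // n))
    # Row index of every output cell, shaped for broadcasting.
    idx = np.arange(rows)[:, None]
    # Column to read from: row i is rolled left by i positions.
    shift = (np.arange(cols) + idx) % cols
    df = pd.DataFrame(b[idx, shift])
    df.index = [f"row_{i}" for i in range(rows)]
    df.columns = [f"col{j}" for j in range(cols)]
    return df

params = {'cols': 8, 'rows': 4, 'n': 4}
df = create_df(**params)
print(df)
```

Because cols is a multiple of n, the value in row i, column j is simply (i + j) % n, which is why every value shows up equally often along both axes.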
Other alternative: use a custom strided_indexing_roll function.
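A rough sketch of what such a function might look like is below. It is not necessarily the implementation referred to above: it assumes NumPy 1.20 or later for sliding_window_view, and the signature (a 2D array plus one shift per row) is an assumption made for illustration.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def strided_indexing_roll(a, r):
    """Roll each row of the 2D array `a` to the right by the matching
    entry of `r` (negative entries roll left)."""
    a = np.asarray(a)
    r = np.asarray(r)
    n = a.shape[1]
    # Append the first n-1 columns so every rotation of a row is a
    # contiguous window of the extended row.
    a_ext = np.concatenate((a, a[:, :-1]), axis=1)
    # windows[i, j] is a_ext[i, j:j+n], i.e. row i rolled left by j.
    windows = sliding_window_view(a_ext, n, axis=1)
    # A right roll by r is the same as a left roll by (n - r) % n.
    return windows[np.arange(a.shape[0]), (n - r) % n]

b = np.tile(np.arange(4), (4, 2))
# Roll row i left by i, matching the "alternative order" output.
rolled = strided_indexing_roll(b, -np.arange(4))
print(rolled)
```

The windows are views into the extended array, so only the final fancy-indexing step copies data.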