I have a problem with a DataFrame in pandas. Let's suppose I have this DataFrame:
import pandas as pd
k = [1,2,3,4,5,6,7,8,9,10,11,12]
k = pd.DataFrame(k).T
This is a 1x12 DataFrame, and I want to turn it into a DataFrame with 4 columns, like k4:
k1 = pd.DataFrame([1,2,3,4])
k2 = pd.DataFrame([5,6,7,8])
k3 = pd.DataFrame([9,10,11,12])
frames = [k1,k2,k3]
k4 = pd.concat(frames, axis=1).T
My original df is much larger than k, but its number of columns is a multiple of 4, and I want to slice it into a 4-column df. I guess it could be something related to i%4 == 0, but I don't really know how to do it.
Thanks in advance.
I missed a problem: I should have transposed k4. Sorry, guys.
To sum up, I have a single long row whose length is a multiple of 4, much larger than 12:
   0   1   2   3   4   5   6   7   8   9   10  11
0   1   2   3   4   5   6   7   8   9  10  11  12
And I need to make a df with 4 columns, starting a new row after every 4 elements:
   0   1   2   3
0  1   2   3   4
0  5   6   7   8
0  9  10  11  12
You can first create a MultiIndex in the columns by floor division and modulo, then use stack. To remove the first level of the resulting MultiIndex in the index, add reset_index:
import pandas as pd
k = [1,2,3,4,5,6,7,8,9,10,11,12]
k = pd.DataFrame(k).T
k.columns = [k.columns // 4, k.columns % 4]
print(k)
   0           1           2
   0  1  2  3  0  1  2  3  0   1   2   3
0  1  2  3  4  5  6  7  8  9  10  11  12
print(k.stack().reset_index(level=0, drop=True))
   0  1   2
0  1  5   9
1  2  6  10
2  3  7  11
3  4  8  12
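To see why this first result comes out as 4 rows by 3 columns, it can help to look at the two label arrays on their own. The following is a small sketch, assuming the default integer column labels of the 1x12 frame; the names `block` and `position` are illustrative only and do not appear in the answer's code.

```python
import pandas as pd

columns = pd.RangeIndex(12)            # default column labels 0..11 of the 1x12 frame

# Floor division says which group of four a column belongs to.
block = (columns // 4).tolist()
# Modulo says where the column sits inside its group.
position = (columns % 4).tolist()

print(block)     # [0, 0, 0, 0, 1, 1, 1, 1, 2, 2, 2, 2]
print(position)  # [0, 1, 2, 3, 0, 1, 2, 3, 0, 1, 2, 3]
```

By default stack moves the last column level (here the position within the group) into the rows, which gives one row per position and one column per group. Moving the group level into the rows instead gives one row per group of four.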
EDIT:
You only need to pass 0 to stack so that it moves the first level of the column MultiIndex into the rows, instead of the default last level:
print(k.stack(0).reset_index(level=0, drop=True))
   0   1   2   3
0  1   2   3   4
1  5   6   7   8
2  9  10  11  12
Or swap the modulo and floor-division levels, so that the default stack (last level) gives the same result:
k = [1,2,3,4,5,6,7,8,9,10,11,12]
k = pd.DataFrame(k).T
k.columns = [k.columns % 4, k.columns // 4]
print(k)
   0  1  2  3  0  1  2  3  0   1   2   3
   0  0  0  0  1  1  1  1  2   2   2   2
0  1  2  3  4  5  6  7  8  9  10  11  12
print(k.stack().reset_index(level=0, drop=True))
   0   1   2   3
0  1   2   3   4
1  5   6   7   8
2  9  10  11  12
Another solution, using numpy.ndarray.reshape, is faster:
import numpy as np
import pandas as pd
k = [1,2,3,4,5,6,7,8,9,10,11,12]
print(pd.DataFrame(np.array(k).reshape(-1,4)))
   0   1   2   3
0  1   2   3   4
1  5   6   7   8
2  9  10  11  12
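The reshape above starts from a plain list, while the question starts from an existing one-row DataFrame. A minimal sketch of the same idea applied to a DataFrame is shown below; `reshape_row` and its `width` parameter are hypothetical names introduced here for illustration, and the length check is an added assumption based on the question's statement that the column count is a multiple of 4.

```python
import pandas as pd

def reshape_row(df, width=4):
    """Hypothetical helper: lay out all values of df in rows of `width` values."""
    values = df.to_numpy().ravel()          # flatten the frame to a 1-D array
    if values.size % width != 0:
        raise ValueError(
            f"{values.size} values cannot be split into rows of {width}"
        )
    # -1 lets NumPy work out the number of rows from the total size.
    return pd.DataFrame(values.reshape(-1, width))

k = pd.DataFrame([[1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12]])  # 1x12, as in the question
result = reshape_row(k)
print(result)
```

Raising an error on a length that is not a multiple of `width` is one possible design choice; padding the last row would be another, but the question does not need it.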