I have a dataframe
as shown below:
8964_real 8964_imag 8965_real 8965_imag 8966_real 8966_imag 8967_real ... 8984_imag 8985_real 8985_imag 8986_real 8986_imag 8987_real 8987_imag
0 112.653120 0.000000 117.104887 0.000000 127.593406 0.000000 129.522106 ... 0.000000 125.423552 0.000000 127.888477 0.000000 136.160979 0.000000
1 -0.315831 16.363974 -2.083329 22.443628 -2.166950 15.026253 0.110502 ... -26.613220 8.454297 -35.000742 11.871405 -24.914035 7.448329 -16.370041
2 -1.863497 10.672129 -6.152232 15.980813 -5.679352 18.976117 -5.775777 ... -11.131600 -18.990022 -9.520732 -11.947319 -4.641286 -17.104710 -5.691642
3 -6.749938 14.870590 -12.222749 15.012352 -10.501423 9.345518 -9.103459 ... -2.860546 -29.862724 -5.237663 -28.791194 -5.685985 -24.565608 -10.385683
4 -2.991405 -10.332938 -4.097638 -10.204587 -12.056221 -5.684882 -12.861357 ... 0.821902 -8.787235 -1.521650 -3.798446 -2.390519 -6.527762 -1.145998
I have to convert above dataframe such that values in columns "_real"
should come under one column and values under "_imag"
should come under another column
That is totally there should be two columns at the end , one for real and other for imag
.What could be the most efficient way to do it?
I refer this link . But this is good for one column,but I need two.
Another idea , I got was use regex to select columns containing "real"
and do as said in above link (and similarly for imag) ,but felt it a bit round about.
Any help appreciated.
EDIT:
For example, real
should be like
real
112.653120
-0.315831
-1.863497
-6.749938
-2.991405
---------
117.104887
-2.083329
-6.152232
-12.222749
-4.097638
---------
127.593406
-2.166950
-5.679352
-10.501423
-12.056221
I have made a dotted line to make it clear
Create MultiIndex
by split
, so possible reshape by DataFrame.stack
:
df.columns = df.columns.str.split('_', expand=True)
print (df.head(10))
8964 8965 8966 \
real imag real imag real imag
0 112.653120 0.000000 117.104887 0.000000 127.593406 0.000000
1 -0.315831 16.363974 -2.083329 22.443628 -2.166950 15.026253
2 -1.863497 10.672129 -6.152232 15.980813 -5.679352 18.976117
3 -6.749938 14.870590 -12.222749 15.012352 -10.501423 9.345518
4 -2.991405 -10.332938 -4.097638 -10.204587 -12.056221 -5.684882
8967 8984 8985 8986 \
real imag real imag real imag
0 129.522106 0.000000 125.423552 0.000000 127.888477 0.000000
1 0.110502 -26.613220 8.454297 -35.000742 11.871405 -24.914035
2 -5.775777 -11.131600 -18.990022 -9.520732 -11.947319 -4.641286
3 -9.103459 -2.860546 -29.862724 -5.237663 -28.791194 -5.685985
4 -12.861357 0.821902 -8.787235 -1.521650 -3.798446 -2.390519
8987
real imag
0 136.160979 0.000000
1 7.448329 -16.370041
2 -17.104710 -5.691642
3 -24.565608 -10.385683
4 -6.527762 -1.145998
df = df.stack(0).reset_index(level=0, drop=True).rename_axis('a').reset_index()
print (df.head(10))
a imag real
0 8964 0.000000 112.653120
1 8965 0.000000 117.104887
2 8966 0.000000 127.593406
3 8967 NaN 129.522106
4 8984 0.000000 NaN
5 8985 0.000000 125.423552
6 8986 0.000000 127.888477
7 8987 0.000000 136.160979
8 8964 16.363974 -0.315831
9 8965 22.443628 -2.083329
EDIT: For new structure of data is possible reshape values by ravel
:
a = df.filter(like='real')
b = df.filter(like='imag')
c = a.columns.str.replace('_real', '').astype(int)
print (c)
Int64Index([8964, 8965, 8966, 8967, 8985, 8986, 8987], dtype='int64')
df = pd.DataFrame({'r':a.T.to_numpy().ravel(), 'i':b.T.to_numpy().ravel()},
index=np.tile(c, len(df)))
print (df.head(10))
r i
8964 112.653120 0.000000
8965 -0.315831 16.363974
8966 -1.863497 10.672129
8967 -6.749938 14.870590
8985 -2.991405 -10.332938
8986 117.104887 0.000000
8987 -2.083329 22.443628
8964 -6.152232 15.980813
8965 -12.222749 15.012352
8966 -4.097638 -10.204587