I have the following dataframe...
df1:
playerA playerB PlayerC PlayerD
kim lee b f
jackson kim d g
dan lee a d
I want to generate a new data frame with all possible combinations of two columns. For example,
df_new:
Target Source
kim lee
kim kim
kim lee
kim b
kim d
kim a
kim f
kim g
kim d
jackson lee
jackson kim
jackson lee
jackson b
.
.
.
.
lee kim
lee jackson
lee dan
lee b
lee d
.
.
.
Thus, I tried this code:
import itertools

def comb(df1):
    return [df1.loc[:, list(x)].set_axis(['Target', 'Source'], axis=1)
            for x in itertools.combinations(df1.columns, 2)]
However, it only shows combinations between columns within the same row.
Is there any way to generate all possible combinations between the columns? Thanks in advance!
One way uses itertools, via permutations, product and chain.from_iterable:
from itertools import chain, permutations, product

import pandas as pd

df = pd.DataFrame(
    chain.from_iterable(product(df1[col_1], df1[col_2])
                        for col_1, col_2 in permutations(df1.columns, r=2)),
    columns=["Target", "Source"]
)
where we first get the 2-permutations of all columns, then for each pair, form the product of their values. After doing this for all permutations, flatten them with chain.from_iterable and pass the result to the dataframe constructor.
I get a 108 x 2 dataframe:
Target Source
0 kim lee
1 kim kim
2 kim lee
3 jackson lee
4 jackson kim
.. ... ...
103 g d
104 g a
105 d b
106 d d
107 d a
(where 108 = 3*9*4: 3 = rows, 9 = rows * other columns, 4 = total columns).
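To check the count, here is a minimal self-contained sketch of the same approach, split into its steps. The df1 below is rebuilt by hand from the table in the question, and the names column_pairs, pairs and df_new are only illustrative:

```python
from itertools import chain, permutations, product

import pandas as pd

# Sample data rebuilt by hand from the question's table.
df1 = pd.DataFrame({
    "playerA": ["kim", "jackson", "dan"],
    "playerB": ["lee", "kim", "lee"],
    "PlayerC": ["b", "d", "a"],
    "PlayerD": ["f", "g", "d"],
})

# Step 1: ordered pairs of distinct columns (4 * 3 = 12 pairs).
column_pairs = list(permutations(df1.columns, r=2))

# Step 2: for each column pair, every (target, source) value pair
# (3 * 3 = 9 per column pair).
# Step 3: flatten the per-pair products into one list of tuples.
pairs = list(chain.from_iterable(
    product(df1[col_1], df1[col_2]) for col_1, col_2 in column_pairs
))

df_new = pd.DataFrame(pairs, columns=["Target", "Source"])

print(len(column_pairs))  # 12
print(df_new.shape)       # (108, 2)
```

Because the sample columns contain repeated values (for example, lee appears twice in playerB), some rows such as kim / lee occur more than once. If repeats are not wanted, df_new.drop_duplicates() should remove them.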