Search code examples
pythonpandastransformationshift

How to transform dataframe to from-to pairs?


If I have a dataframe:

>>> import pandas as pd
>>> df = pd.DataFrame([
...     ['A', 'B', 'C', 'D'],
...     ['E', 'B', 'C']
... ])
>>> df
   0  1  2     3
0  A  B  C     D
1  E  B  C  None
>>>  

I shoudl transform the dataframe to two columns format:

x, y
-----
A, B
B, C
C, D
E, B
B, C

For each row, from left to right, take two neighbor values and make a pair of it. It is kind of from-to if you consider each row as a path.

How to do the transformation?


Solution

  • We can do explode with zip

    s=pd.DataFrame(df.apply(lambda x : list(zip(x.dropna()[:-1],x.dropna()[1:])),axis=1).explode().tolist())
    Out[336]: 
       0  1
    0  A  B
    1  B  C
    2  C  D
    3  E  B
    4  B  C
    

    Update

    s=df.apply(lambda x : list(zip(x.dropna()[:-1],x.dropna()[1:])),axis=1).explode()
    s=pd.DataFrame(s.tolist(),index=s.index)
    
    s
    Out[340]: 
       0  1
    0  A  B
    0  B  C
    0  C  D
    1  E  B
    1  B  C