Search code examples
pythonpython-3.xpandasdataframeswap

Compare round-trip list values and change them Pandas


I have a df dataframe. The dataframe has three columns named "Source", "Destination" and "Path". The Path column is a list and the values ​​are the paths of various routers. The "Source" indicates the beginning of the path of the various routers and "Destination" indicates the end of the path of the various routers:

Source      Destination   Path
---------- ------------ -------------------------------------------------
10.0.13.100 10.0.14.100 ['10.0.1.26']
10.0.14.100 10.0.13.100 ['10.0.1.25']
10.0.13.100 10.0.12.100 ['10.0.1.17', '10.0.1.5', '10.0.1.2']
10.0.12.100 10.0.13.100 ['10.0.1.1', '10.0.1.6', '10.0.1.18']
10.0.13.100 10.0.22.100 ['10.0.1.17', '10.0.1.14', '10.0.4.6', '10.0.2.5']
10.0.22.100 10.0.13.100 ['10.0.2.6', '10.0.4.5', '10.0.1.13', '10.0.1.18']
10.0.11.100 10.0.13.100 ['10.0.1.6', '10.0.1.18']
10.0.13.100 10.0.11.100 ['10.0.1.17', '10.0.1.5']
.....................................................................

What I would like to create is an additional column called "Path_Def" where the return paths of the Path column are evaluated referring to the Source and Destination columns.

Example: I evaluate the origin -> 10.0.13.100 and destination -> 10.0.14.100 with path [....] and evaluate its inverse path or Origin -> 10.0.14.100 and Destination -> 10.0.13.100 with the its path [....]. What I would like to do is change the values ​​of the second path to those of the first path (always referring to the origin and the destination) by inverting those values.

So what you should do is like this for each value in the list:

  Source        Destination   Path
  10.0.13.100   10.0.22.100  ['10.0.1.17', '10.0.1.14', '10.0.4.6', '10.0.2.5']
  10.0.22.100   10.0.13.100  ['10.0.2.5', '10.0.4.6', '10.0.1.14', '10.0.1.17']
.............................................................................

..

The result I would like in the dataframe is like this:

Source      Destination   Path_Def                                      
10.0.13.100 10.0.14.100  ['10.0.1.26']                                
10.0.14.100 10.0.13.100  ['10.0.1.26']
10.0.13.100 10.0.12.100  ['10.0.1.17', '10.0.1.5', '10.0.1.2']
10.0.12.100 10.0.13.100  ['10.0.1.2', '10.0.1.5', '10.0.1.17']
10.0.13.100 10.0.22.100  ['10.0.1.17', '10.0.1.14', '10.0.4.6', '10.0.2.5']
10.0.22.100 10.0.13.100  ['10.0.2.5', '10.0.4.6', '10.0.1.14', '10.0.1.17']
10.0.11.100 10.0.13.100  ['10.0.1.6', '10.0.1.18']
10.0.13.100 10.0.11.100  ['10.0.1.18', '10.0.1.6']
.....................................................................

How can this be done? I really don't know where to start and would be very grateful

Edit after @Simon code:

enter image description here

Edit Merge DF:

enter image description here

Edit after last code:

enter image description here


Solution

  • Merge the dataframe to itself, and then applying a conditional and finally cleaning up the columns:

    import pandas as pd
    
    # setup
    source = ["10.0.14.100", "10.0.13.100", "10.0.12.100", "10.0.13.100", "10.0.22.100", "10.0.11.100", "10.0.13.100", "10.0.13.100"]
    destination = ["10.0.13.100", "10.0.12.100", "10.0.13.100", "10.0.22.100", "10.0.13.100", "10.0.13.100", "10.0.11.100", "10.0.14.100"]
    path = [['10.0.1.25'], ['10.0.1.17', '10.0.1.5', '10.0.1.2'], 
            ['10.0.1.1', '10.0.1.6', '10.0.1.18'], ['10.0.1.17', '10.0.1.14', '10.0.4.6', '10.0.2.5'], 
            ['10.0.2.6', '10.0.4.5', '10.0.1.13', '10.0.1.18'], ['10.0.1.6', '10.0.1.18'], ['10.0.1.17', '10.0.1.5'], ['10.0.1.26']]
    
    df = pd.DataFrame({"source":source, "destination":destination, "path":path})
    
    # join column onto itself
    df = df.merge(df, left_on=["destination", "source"], right_on=["source", "destination"])
    # Conditionally choose the right path
    df["Path_Def"] = np.where(df["source_x"] < df["destination_x"], df["path_x"], df["path_y"])
    # Flip the ones that need to be flipped
    df["Path_Def"] = [path[::-1] if source > destination else path for path, source, destination in zip(df["Path_Def"].values, df["source_x"].values, df["destination_x"].values)]
    
    
    # select columns you want to keep
    df = df[["source_x", "destination_x", "Path_Def"]]
    print(df)
    

    output:

          source_x destination_x                                    Path_Def
    0  10.0.14.100   10.0.13.100                                 [10.0.1.26]
    1  10.0.13.100   10.0.12.100             [10.0.1.18, 10.0.1.6, 10.0.1.1]
    2  10.0.12.100   10.0.13.100             [10.0.1.1, 10.0.1.6, 10.0.1.18]
    3  10.0.13.100   10.0.22.100  [10.0.1.17, 10.0.1.14, 10.0.4.6, 10.0.2.5]
    4  10.0.22.100   10.0.13.100  [10.0.2.5, 10.0.4.6, 10.0.1.14, 10.0.1.17]
    5  10.0.11.100   10.0.13.100                       [10.0.1.6, 10.0.1.18]
    6  10.0.13.100   10.0.11.100                       [10.0.1.18, 10.0.1.6]
    7  10.0.13.100   10.0.14.100                                 [10.0.1.26]