python pandas dataframe graph networkx

Detect presence of inverse pairs in two columns of a DataFrame


I have a DataFrame with two columns, source and target. I would like to detect inverse rows: for each pair of values (source, target), if a row with the pair (target, source) also exists, assign True to a new column.

My attempt:

cols = ['source', 'target']
_cols = ['target', 'source']
sub_edges = edges[cols]
sub_edges['oneway'] = sub_edges.apply(lambda x: True if x[x.isin(x[_cols])] else False, axis=1)

Solution

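  • You can apply a lambda function with logic similar to your attempt: for each row, check whether any other row in the DataFrame holds the reversed source/target pair. Comparing the index against the current row's name keeps a row whose source equals its target from matching itself.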

    Incidentally, the column name 'oneway' suggests the opposite of the logic described in your question. To flag the rows that have no reversed pair instead, just remove the 'not' in the lambda function.

    Code

    import pandas as pd
    import random
    
    edges = {"source": random.sample(range(20), 20),
             "target": random.sample(range(20), 20)}
    
    df = pd.DataFrame(edges)
    
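    # True when some other row holds the reversed (target, source) pair.
    # The index check stops a row from matching itself.
    df["oneway"] = df.apply(
        lambda x: not df[
            (df["source"] == x["target"]) & (df["target"] == x["source"]) & (df.index != x.name)
        ].empty,
        axis=1,
    )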
    

    Output (from one run; the input is random, so the values will differ)

        source  target  oneway
    0        9      11   False
    1       16       1    True
    2        1      16    True
    3       11      14   False
    4        4      13   False
    5       18      15   False
    6       14      17   False
    7       13      12   False
    8       19      19   False
    9       12       3   False
    10      10       6   False
    11      15       5   False
    12       3      18   False
    13      17       0   False
    14       6       7   False
    15       5      10   False
    16       7       2   False
    17       8       9   False
    18       0       4   False
    19       2       8   False
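
The row-wise apply above filters the whole DataFrame once per row, so its cost grows quadratically with the number of edges. For larger edge lists, a counting approach may be worth considering. The following is only a minimal sketch, not the method from the answer: the helper name flag_inverse_pairs, the column name has_inverse, and the sample data are all hypothetical, and it assumes the values in both columns are hashable.

```python
from collections import Counter

import pandas as pd


def flag_inverse_pairs(df, source="source", target="target"):
    """Return a boolean Series: True where another row holds the reversed pair.

    Hypothetical helper for illustration; assumes hashable column values.
    """
    # Count how many times each (source, target) pair occurs.
    counts = Counter(zip(df[source], df[target]))
    flags = [
        # A self-loop (s == t) always finds itself once, so it needs a
        # second occurrence before it counts as having an inverse.
        counts[(t, s)] > (1 if s == t else 0)
        for s, t in zip(df[source], df[target])
    ]
    return pd.Series(flags, index=df.index)


# Small hand-made edge list (illustrative data, not from the question).
edges = pd.DataFrame({"source": [1, 2, 3, 4, 5], "target": [2, 1, 4, 4, 3]})
edges["has_inverse"] = flag_inverse_pairs(edges)
print(edges)
```

Here only the first two rows, (1, 2) and (2, 1), are flagged; the self-loop (4, 4) is not, which mirrors the index check in the apply version. Each row is visited once after a single counting pass, at the price of holding every pair in memory.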