Search code examples
pythonpandasdataframe

Adding a combination in a datafra, which is missing. Pandas data frame


Problem: I have the following dataframe (this is a part of a large data frame)

        Name X      Name Y       Value
0       joe         mark         12
1       joe         alice        13
2       joe         bert         14
3       mark        joe          15
4       mark        alice        14
5       alice       joe          16
6       alice       bert         30
7       alice       mark         50

Requested change to the data frame:

        Name X      Name Y       Value
0       joe         joe          0
1       joe         mark         12
2       joe         alice        13
3       joe         bert         14
4       mark        mark         0
5       mark        joe          15
6       mark        alice        14
7       alice       alice        0
8       alice       joe          16
9       alice       bert         30
10       alice       mark        50

Note the order. Reason: I want the pivot table to get the diagonal combination (Joe,Joe) with value 0, (mark, mark) with value 0 etc.

I have looked at pandas documenation, and I do not find the proper way/implement to do it.


Solution

  • Create helper DataFrame by DataFrame.drop_duplicates and DataFrame.assign and add to original DataFrame by concat, last sort output by DataFrame.sort_index:

    out = (pd.concat([df.drop_duplicates('Name X')
                        .assign(**{'Name Y': lambda x: x['Name X'], 'Value':0}),
                     df])
             .sort_index(kind='stable', ignore_index=True))
    print (out)
       Name X Name Y  Value
    0     joe    joe      0
    1     joe   mark     12
    2     joe  alice     13
    3     joe   bert     14
    4    mark   mark      0
    5    mark    joe     15
    6    mark  alice     14
    7   alice  alice      0
    8   alice    joe     16
    9   alice   bert     30
    10  alice   mark     50