Search code examples
pythonpandascomparestring-comparison

Comparing two columns in pandas data frame and return difference


I am comparing the content of two excel sheets that I have turned into a pandas data frame and placed the columns next to each other.

I have created some code that compares the two columns and gives the output below, but some of the strings contain large amounts of text so I would like to show only the difference.

+---------------------+-------------------------+---------------------------------------------+
|         Old         |           New           |                   Changes                   |
+---------------------+-------------------------+---------------------------------------------+
| Apple, Egg, Ham     | Apple, Egg, Norway, Ham | Apple, Egg, Ham --> Apple, Egg, Norway, Ham |
| Instagram, Facebook | Instagram, Twitter      | Instagram, Facebook --> Instagram, Twitter  |
+---------------------+-------------------------+---------------------------------------------+

The optimal result would look something like this:

+---------------------+-------------------------+---------------------+
|         Old         |           New           |       Changes       |
+---------------------+-------------------------+---------------------+
| Apple, Egg, Ham     | Apple, Egg, Norway, Ham | +Norway             |
| Instagram, Facebook | Instagram, Twitter      | +Twitter, -Facebook |
+---------------------+-------------------------+---------------------+

Norway has been added in row 1, Twitter has been added to row 2, and Facebook has been removed from row 2.

How can I solve this?


Solution

  • Convert values to sets and then use their difference, also add + and - in f-strings and join by , in last step:

    def f(x):
        old, new = set(x['Old'].split(', ')), set(x['New'].split(', '))
        d = old.difference(new)
        e = new.difference(old)
        return ', '.join([f'+{y}' for y in e] + [f'-{y}' for y in d])
    
    
    df['Changes'] = df.apply(f, axis=1)
    print (df)
                       Old                      New              Changes
    0      Apple, Egg, Ham  Apple, Egg, Norway, Ham              +Norway
    1  Instagram, Facebook       Instagram, Twitter  +Twitter, -Facebook