Python - Add Missing Data to New Dataframe

I am comparing two dataframes by looping through both and determining if data (row) from one does not exist in the other. If there is no match, I want to add that specific row to a third dataframe. I was able to achieve this with the append() method, but Python warns me that it is deprecated and tells me to use concat instead. Unfortunately, I cannot get concat to achieve the same result.

import pandas as pd

df1 = pd.DataFrame(
    {
         "Animal": ["Corgi", 
                       "Labrador",
                       "Orange Cat",
                       "Black Cat"],
         "Owner": ["Bob",
                        "Joe",
                        "Sara",
                        "Glen"]         
     })

df2 = pd.DataFrame(
    {
         "Animal": ["Corgi", 
                       "Labrador",
                       "Black Cat"],
         "Owner": ["Bob",
                        "Joe",
                        "Glen"]         
     })

dfF = pd.DataFrame(columns=["Animal", "Owner"])

df1Sort = df1.sort_values(by=["Animal"]).reset_index(drop=True)
df2Sort = df2.sort_values(by=["Animal"]).reset_index(drop=True)

for ind in df1Sort.index:
    for subInd in df2Sort.index:   
        if df1Sort["Animal"][ind] == df2Sort["Animal"][subInd]:
            print("Found match:", 
                  df1Sort["Animal"][ind],
                  "-",
                  df1Sort["Owner"][ind])            
            break
        elif df2Sort.index[subInd] == df2Sort.index[-1]:
            print("No match for:",
                  df1Sort["Animal"][ind],
                  "-",
                  df1Sort["Owner"][ind])
            dfF = pd.concat([dfF, df1Sort.iloc[ind]], axis=0, ignore_index=True)
            break

print(dfF)

The output is:

Found match: Black Cat - Glen
Found match: Corgi - Bob
Found match: Labrador - Joe
No match for: Orange Cat - Sara
Animal Owner 0
0 NaN NaN Orange Cat
1 NaN NaN Sara

I would like the new dataframe to read as such:

Animal Owner
0 Orange Cat Sara

How do I alter the concat line to achieve this?

Solution

This issue happens because df1Sort.iloc[ind] returns a Series, not a Dataframe, and pd.concat expects a Dataframe. You need to wrap the Series in a DataFrame to ensure proper concatenation.

dfF = pd.concat([dfF, pd.DataFrame([df1Sort.iloc[ind]])], ignore_index=True)

Here's the updated code

import pandas as pd

df1 = pd.DataFrame({
    "Animal": ["Corgi", "Labrador", "Orange Cat", "Black Cat"],
    "Owner": ["Bob", "Joe", "Sara", "Glen"]
})

df2 = pd.DataFrame({
    "Animal": ["Corgi", "Labrador", "Black Cat"],
    "Owner": ["Bob", "Joe", "Glen"]
})

dfF = pd.DataFrame(columns=["Animal", "Owner"])

df1Sort = df1.sort_values(by=["Animal"]).reset_index(drop=True)
df2Sort = df2.sort_values(by=["Animal"]).reset_index(drop=True)

for ind in df1Sort.index:
    for subInd in df2Sort.index:
        if df1Sort["Animal"][ind] == df2Sort["Animal"][subInd]:
            print("Found match:", df1Sort["Animal"][ind], "-", df1Sort["Owner"][ind])
            break
        elif df2Sort.index[subInd] == df2Sort.index[-1]:
            print("No match for:", df1Sort["Animal"][ind], "-", df1Sort["Owner"][ind])
            dfF = pd.concat([dfF, pd.DataFrame([df1Sort.iloc[ind]])], ignore_index=True)
            break

print(dfF)

I hope this will help you a little.