Pandas: How can I preserve row ordering after subtraction?

I have two data frames with three columns, with identical column names. I want to subtract the value of the third column where the values of the first, and second columns match. I've tried the following:

# Common column names
columns = ["month", "category", "sum"]

# First data frame
data1 = [("jan", "j", 10), ("feb", "f", 20)]
df1 = pd.DataFrame.from_records(data1, columns=columns)

# Second data frame
data2 = [("jan", "j", 9.5), ("mar", "m", 30)]
df2 = pd.DataFrame.from_records(data2, columns=columns)

print(df1)  # Observe order of `month`s: jan, feb
print(df2)  # Observe order of `month`s: jan, mar

# Subtract `sum` where `month`, and `category` match:
df1.set_index(["month", "category"]).subtract(df2.set_index(["month", "category"])).reset_index()

This yields the following output. Observe that rows are sorted alphabetically on month.

  month category  sum
0   feb        f  NaN
1   jan        j  0.5
2   mar        m  NaN

How can I maintain the row order of the left hand operand? I.e. how can get the following output (or similar):

  month category  sum
1   jan        j  0.5
0   feb        f  NaN
2   mar        m  NaN

Solution

pd.merge will preserve the order of the left operand, then you can calculate the difference between two columns. You can do this for example:

df3 = pd.merge(df1, df2, on=["month", "category"], how="outer")
df3.loc[:, "difference"] = df3["sum_x"] - df3["sum_y"]

Which yields on your data:

  month category  sum_x  sum_y  difference
0   jan        j   10.0    9.5         0.5
1   feb        f   20.0    NaN         NaN
2   mar        m    NaN   30.0         NaN