
How to stop overwriting values with NaN when the referenced index is missing?


I'm trying to copy the 'Name' column values from two dataframes, stocks and major_indices, into another dataframe, mport, wherever an index value of mport is present in stocks or major_indices.

Here is what I have so far:

>>> print(port_tickers_temp)
['ABBV', 'ABT', 'ACN', 'AXP', 'BA', 'BABA', 'BAC', 'BCS', 'BHP', 'BMY', 'BP', 'BRK-A', 'BRK-B', 'CAT', 'COP', 'CRM', 'CVX', 'DE', 'DHR', 'DJI']
mport = pd.DataFrame(index=port_tickers_temp)
mport['Name'] = stocks['Name']
mport['Name'] = major_indices['Name']

However, the second assignment replaces the whole column, so every ticker in port_tickers_temp that isn't present in major_indices ends up as NaN:

>>> print(mport)
                               Name
ABBV                            NaN
ABT                             NaN
ACN                             NaN
AXP                             NaN
BA                              NaN
BABA                            NaN
BAC                             NaN
BCS                             NaN
BHP                             NaN
BMY                             NaN
BP                              NaN
BRK-A                           NaN
BRK-B                           NaN
CAT                             NaN
COP                             NaN
CRM                             NaN
CVX                             NaN
DE                              NaN
DHR                             NaN
DJI    Dow Jones Industrial Average

I also tried this, but it does the exact same thing:

mport['Name'] = stocks.loc[stocks.index.intersection(port_tickers_temp), 'Name']
mport['Name'] = major_indices.loc[major_indices.index.intersection(port_tickers_temp), 'Name']

Is there a way to keep the existing values for tickers that are missing from major_indices, instead of overwriting them with NaN?


Solution

  • If I understand you correctly, you want to use .update, which overwrites values only where the other Series has a non-NaN value at a matching index label:

    mport["Name"] = stocks["Name"]
    mport["Name"].update(major_indices["Name"])
    
    print(mport)
    

    Prints:

                  Name
    ABBV   AABV_stocks
    ABT         ABT_mi
    ACN         ACN_mi
    AXP            NaN
    BA             NaN
    BABA           NaN
    BAC            NaN
    BCS            NaN
    BHP            NaN
    BMY            NaN
    BP             NaN
    BRK-A          NaN
    BRK-B          NaN
    CAT            NaN
    COP            NaN
    CRM            NaN
    CVX            NaN
    DE             NaN
    DHR            NaN
    DJI            NaN
    

    Initial dataframes used for this example:

    # stocks
                 Name
    ABBV  AABV_stocks
    ACN    ACN_stocks
    
    # major_indices
           Name
    ABT  ABT_mi
    ACN  ACN_mi
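
To reproduce this end to end, here is a minimal sketch built from the sample frames above, with port_tickers_temp shortened to five tickers for brevity. One assumption: it calls update on mport itself rather than on mport["Name"], since on pandas versions with Copy-on-Write enabled, updating the selected column may not write back to mport.

```python
import pandas as pd

# Sample data from this answer; the ticker list is shortened for brevity.
port_tickers_temp = ["ABBV", "ABT", "ACN", "AXP", "DJI"]
stocks = pd.DataFrame({"Name": ["AABV_stocks", "ACN_stocks"]}, index=["ABBV", "ACN"])
major_indices = pd.DataFrame({"Name": ["ABT_mi", "ACN_mi"]}, index=["ABT", "ACN"])

mport = pd.DataFrame(index=port_tickers_temp)

# Plain assignment aligns on the index; tickers missing from stocks become NaN.
mport["Name"] = stocks["Name"]

# update() modifies mport in place, using only the non-NaN values of the
# other frame at matching index/column labels. Existing values elsewhere stay.
mport.update(major_indices[["Name"]])

print(mport)
```

Here ACN ends up as ACN_mi, because update lets major_indices win wherever both sources have a name.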
    

    If you want to leave previous non-NaN values as they are, you can use .combine_first:

    mport["Name"] = stocks["Name"]
    mport["Name"] = mport["Name"].combine_first(major_indices["Name"])
    
    print(mport)
    

    Prints:

                  Name
    ABBV   AABV_stocks
    ABT         ABT_mi
    ACN     ACN_stocks
    AXP            NaN
    BA             NaN
    
    ...
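
The same sample data can be used to check which source wins under combine_first. This is a minimal sketch with a shortened ticker list, not the asker's real data: combine_first keeps the caller's non-NaN values and only fills the gaps from the other Series, so stocks takes precedence.

```python
import pandas as pd

# Sample data from this answer; the ticker list is shortened for brevity.
stocks = pd.DataFrame({"Name": ["AABV_stocks", "ACN_stocks"]}, index=["ABBV", "ACN"])
major_indices = pd.DataFrame({"Name": ["ABT_mi", "ACN_mi"]}, index=["ABT", "ACN"])

mport = pd.DataFrame(index=["ABBV", "ABT", "ACN", "AXP", "DJI"])
mport["Name"] = stocks["Name"]

# Keep names already taken from stocks; fill only the NaN slots
# from major_indices. Assignment realigns the result to mport's index.
mport["Name"] = mport["Name"].combine_first(major_indices["Name"])

print(mport)
```

ACN stays ACN_stocks here, while ABT, which stocks does not contain, is filled with ABT_mi. Tickers found in neither source remain NaN.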