Search code examples
pythonpandasmultiple-columnsfillna

How to use Pandas to fill some cells with a defined rule?


This's a sample data from real.

df = pd.DataFrame({'P/N':['302-462-326','302-462-012','302-462-009'],
                  'Description':['CAP CER 0402 100pF 5% 50V','CAP CER 0402 6.8pF 0.25pF 50V','CAP CER 0402 3.9pF 0.25pF 50V'],
                  'Supplier.0':['MURATA','AVX Corporation','AVX Corporation'],
                  'Supplier PN.0':['GRM1555C1H101JA01D',np.nan,'04025A3R9CAT2A'],
                  'Supplier.1':[np.nan,'KEMET',np.nan],
                  'Supplier PN.1':['CC0402JRNPO9BN101','C0402C689C5GACTU',np.nan],
                  'Supplier.2':['Murata Electronics North America',np.nan,np.nan],
                  'Supplier PN.2':['GRM1555C1H101JA01J',np.nan,np.nan]
                  })

The Supplier and Supplier PN are paired up in the columns. The requirement is that if one of the Supplier and supplier PN is empty, but another one isn't empty, then we must fill 'NOT CLEAR' in the empty cell. We need to keep the cell to be empty when both of the Supplier and Supplier PN are empty. enter image description here

How to use Pandas to get the expected result which just as the picture below?Thanks.

The real data has many rows and columns may be reached to Supplier.20 and Supplier PN.20.

enter image description here


Solution

  • Another approach would be unpivotting using wide_to_long the dataframe and then check if sum of notna is 1 then fillna else dont , and pivot back again, this would help if you have many combinations of Supplier and Supplier PN.

    unpivotted_df = (pd.wide_to_long(df,['Supplier','Supplier PN'],
                     ['P/N','Description'],'idx',sep='.'))
    
    cond = unpivotted_df.notna().sum(1).eq(1)
    unpivotted_df[:]=np.where(cond[:,None],unpivotted_df.fillna('NOT CLEAR'),unpivotted_df)
    out = unpivotted_df.unstack().sort_index(level=1,axis=1).reindex(df[['P/N','Description']])
    out.columns=out.columns.map('{0[0]}.{0[1]}'.format) 
    

    enter image description here

    This is how the unpivoted_df looks like after the fillna: enter image description here