Search code examples
pythonpandasdataframecell

Appending cell Value in Pandas for emptied cell in row


I have following table, based on the St_date, En_date is empty or not we have to merge the data in Des with upcoming rows till we find notnull

        St_date     En_date     Des         Ch      Deb     Cr      Tot
    0   01/06/18    01/06/18    CSH         NaN     NaN     1000    5786
    1   NaN         NaN         DEPOSIT     NaN     NaN     NaN     NaN
    2   01/06/18    01/06/18    DEP TFR     NaN     100     Nan     5686
    3   NaN         NaN         through     NaN     NaN     NaN     NaN

What I want is like below:

        St_date     En_date     Des             Ch      Deb     Cr      Tot
    0   01/06/18    01/06/18    CSH DEPOSIT     NaN     NaN     1000    5786    
    1   01/06/18    01/06/18    DEP TFR through NaN     100     Nan     5686

Anyone has Idea how to do using pandas?


Solution

  • You can do it like that (Note that I consider St_Date Nan like an empty string in the answer below):

    # Add a field containing previous index if St_date is empty
    df["idx"] = df.apply(lambda x: x.name if x.St_date!='' else None, axis=1).ffill()
    df
    

    Should return this :

        St_date     En_date     Des     Ch  Deb          Cr     Tot     idx
    0   01/06/18    01/06/18    CSH     nan     nan     1000    5786    0.0
    1                  nan    DEPOSIT   nan     nan     nan     nan     0.0
    2   01/06/18    01/06/18  DEP TFR   nan     100     nan     5686    2.0
    3                  nan    through   nan     nan     nan     nan     2.0
    

    Then you can group with this new column and concat your Des field:

    dfg = pd.DataFrame(df.groupby('idx')["Des"].apply(lambda x: "{%s}" % ', '.join(x)))
    # Then you merge the result with the original dataframe on index
    df = pd.merge(df.drop('Des',axis=1), dfg , left_index=True, right_index=True, how='left')
    # Filter rows with empty values in Des (not merged) and reset index
    df = df[df.Des.isna()==False].reset_index(drop=True)
    df
    

    There you go :

            St_date     En_date     Ch      Deb     Cr      Tot     idx     Des
      0     01/06/18    01/06/18    nan     nan     1000    5786    0.0     {CSH, DEPOSIT}
      1     01/06/18    01/06/18    nan     100     nan     5686    2.0     {DEP TFR, through}