Search code examples
pythondataframerows

Merge above and below rows in DataFrame


I have a dataframe, but I need to perform some cleaning, in some cases the values from above and below rows in column "Code" are separated, is there any way to merge these rows? df - is what I have, and df1 what I need to have (the values can be both in numeric and string format).

import pandas as pd
import numpy as np 


df = pd.DataFrame({"DT":[np.nan,65,np.nan, 65,45,np.nan,34,np.nan],
      "Code":[56,np.nan,"TD","TR",52,1,np.nan,76],
      "DV":[np.nan,"CAT",np.nan,"B54","T65",np.nan,765,np.nan],
      "GT":[np.nan,76,np.nan,np.nan,np.nan,np.nan,65,np.nan]})


df1 = pd.DataFrame({"DT":[65,65,45,34],
      "Code":["56 TD","TR",52,"1 76"],
      "DV":["CAT","B54","T65",765],
      "GT":[76,np.nan,np.nan,65]})

Solution

  • Here is my solution:

    Original dataset (df):

         DT Code   DV    GT
    0   NaN   56  NaN   NaN
    1  65.0  NaN  CAT  76.0
    2   NaN   TD  NaN   NaN
    3  65.0   TR  B54   NaN
    4  45.0   52  T65   NaN
    5   NaN    1  NaN   NaN
    6  34.0  NaN  765  65.0
    7   NaN   76  NaN   NaN
    

    Code:

    df['temp1']=df['Code'].shift()
    df['temp2']=df['Code'].shift(-1)
    df['temp1']=df['temp1'].fillna('')
    df['temp2']=df['temp2'].fillna('')
    df['temp1']=df['temp1'].astype(str)
    df['temp2']=df['temp2'].astype(str)
    df['Code']=np.where(df['Code'].isna(), df['temp1']+' '+df['temp2'], df['Code'])
    df['temp1']=df['temp1'].replace('', np.nan)
    df['temp2']=df['temp2'].replace('', np.nan)
    df.dropna(subset=['temp1', 'temp2'], how='any', inplace=True)
    df.drop(['temp1', 'temp2'], axis=1, inplace=True)
    df.reset_index(inplace=True, drop=True)
    df['DT']=df['DT'].astype(int)
    print(df)
    

    Output:

       DT   Code   DV    GT
    0  65  56 TD  CAT  76.0
    1  65     TR  B54   NaN
    2  45     52  T65   NaN
    3  34   1 76  765  65.0