Search code examples
pythonpandasdataframedrop

Python : Fill a specific column with the same value in a Dataframe and remove the rows useless


Let's say that I have this dataframe :

data3 = ['ID','ID','','','','','']
data4 = [12,34,465,678,896,'','']
data5 = [8798,67,2313,'','','','']
data6 = [56,67,'','','','','']

df2 = pd.DataFrame(list(zip(data3,data4,data5,data6)),columns = ['Name','Data1','Data2','Data3'])
print(df2)

  Name Data1 Data2 Data3
0   ID    12  8798    56
1   ID    34    67    67
2        465  2313
3        678
4        896
5
6

I want to fill the column "Name" with always the same value that we can find it and for all the rows where there are values and drop the rows useless where there is nothing. So I would like to get this result :

  Name Data1 Data2 Data3
0   ID    12   8798   56
1   ID    34   67     67
2   ID    465  2313
3   ID    678
4   ID    896

Anyone has an idea efficient to do that ?

Thank you


Solution

  • Use DataFrame.replace if empty strings, not NaNs, then DataFrame.dropna and last forward filling missing values in Name column by ffill:

    df2 = df2.replace('', np.nan)
    
    df2 = df2.dropna(how='all')
    df2['Name'] = df2['Name'].ffill()
    print(df2)
      Name  Data1   Data2  Data3
    0   ID   12.0  8798.0   56.0
    1   ID   34.0    67.0   67.0
    2   ID  465.0  2313.0    NaN
    3   ID  678.0     NaN    NaN
    4   ID  896.0     NaN    NaN