I have a dataframe with multiple columns, two of these columns have the same name ('mobilephone'), some values are empty, and some aren't but there will always be one of the two columns populated with a phone number:
mobilephone mobilephone
0 999000111 999000111
1 999000222
2 999000333
3 999000444 999000444
How would I keep only one of these columns but populate the empty values in the first column with the values in the second column?
First, find the index of the second duplicate column. Then change the name of this column and use .loc[]
to fill in the gaps:
col_name = 'mobilephone'
index_second_column = np.where(df.columns.duplicated())[0][0]
df.columns.values[index_second_column] += '_1'
df.loc[df[col_name].isnull(), colname] = df[col_name + '_1']
df.drop(columns = col_name + '_1', inplace=True)