I am not sure how hard this problem is, but I am having trouble with it and need help.
I have a sample pandas DataFrame as follows:
df
C A V D
9 apar 1 0
8 bpar 4 8
7 cpar 7 7
0 apar 8 6
8 apar 9 4
9 bpar 3 2
What I need to do is append rows to the existing DataFrame: whenever column 'A' has the value 'apar', add a copy of that row with 'A' set to 'apar_t' and 'V' reduced by some amount (say 0.5). For this toy example, the updated DataFrame should look like:
df
C A V D
9 apar 1.0 0
8 bpar 4.0 8
7 cpar 7.0 7
0 apar 8.0 6
8 apar 9.0 4
9 bpar 3.0 2
9 apar_t 0.5 0
0 apar_t 7.5 6
8 apar_t 8.5 4
I have been able to solve the problem, but I think my approach is not Pythonic and not efficient enough for a huge dataset. Is there a better way to do this?
What I did was the following:
import pandas as pd

sub_df = df[df['A'] == 'apar'].copy()  # copy to avoid SettingWithCopyWarning
colsOrder = df.columns
sub_df = sub_df.rename(columns={'A': 'A1', 'V': 'V1'})
sub_df['A'] = 'apar_t'
sub_df['V'] = sub_df['V1'] - 0.5
sub_df = sub_df.drop(columns=['A1', 'V1'])  # drop returns a new frame, so reassign
sub_df = sub_df[colsOrder]
frames = [df, sub_df]
DF = pd.concat(frames).reset_index(drop=True)
DF
The code works and I get what I want, but I am looking for a more elegant, Pythonic, and efficient solution. Any help will be appreciated.
This is a clean way to do it when you are only adding to or subtracting from the values in each row (for the string column, adding '_t' concatenates it):
pd.concat([df, df.loc[df.A == 'apar'].apply(
    lambda row: row.add([0, '_t', -0.5, 0]), axis=1)])
Resulting DataFrame:
C A V D
0 9 apar 1.0 0
1 8 bpar 4.0 8
2 7 cpar 7.0 7
3 0 apar 8.0 6
4 8 apar 9.0 4
5 9 bpar 3.0 2
0 9 apar_t 0.5 0
3 0 apar_t 7.5 6
4 8 apar_t 8.5 4
Otherwise, you can define a function for your row transformation:
def transform_row(row):
    row['A'] = row['A'] + '_t'
    row['V'] = row['V'] - 0.5
    return row
and then use apply:
pd.concat([df, df.loc[df.A == 'apar'].apply(transform_row, axis=1)])
The resulting dataframe is identical to the above.
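Since the question mentions efficiency on large datasets, here is a minimal sketch of a column-wise alternative that avoids the row-by-row apply. It is not part of the approaches above; it assumes the toy data from the question, and the names mask, extra, and result are only illustrative. Passing ignore_index=True renumbers the rows 0 to 8 rather than keeping the original labels 0, 3, 4 on the appended rows.

```python
import pandas as pd

# Toy data from the question
df = pd.DataFrame({
    "C": [9, 8, 7, 0, 8, 9],
    "A": ["apar", "bpar", "cpar", "apar", "apar", "bpar"],
    "V": [1, 4, 7, 8, 9, 3],
    "D": [0, 8, 7, 6, 4, 2],
})

# Boolean mask selecting the rows to duplicate
mask = df["A"] == "apar"

# assign returns a new frame, so df itself is left unchanged;
# both columns are modified with whole-column operations
extra = df.loc[mask].assign(
    A=lambda d: d["A"] + "_t",
    V=lambda d: d["V"] - 0.5,
)

# Append the modified rows and renumber the index
result = pd.concat([df, extra], ignore_index=True)
print(result)
```

Because assign overwrites existing columns in place, the column order is preserved and no renaming or reordering step is needed.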