I have the following data frame:
import pandas as pd
data = {"A1": [22,25,27,35],
"A2": [23,54,71,52],
"A3": [21,27,26,31],
"A1_L1": [21,27,26,31],
"A1_L2": [21,27,26,31],
"A2_L1": [21,27,26,31],
"A3_L1": [21,27,26,31],
"A3_L2": [21,27,26,31],
"A3_L3": [21,27,26,31]
}
df = pd.DataFrame(data, columns = ["A1", "A2","A3","A1_L1","A1_L2","A2_L1","A3_L1","A3_L2","A3_L3"])
I want to add automaticaly a new column for each column with an L in it. The information in this column will be the subtraction column A - A_L
For example,
A1_L1_new=A1-A1_L1
A1_L2_new=A1-A1_L2
A2_L1_new=A2-A2_L1
A3_L1_new=A3-A3_L1
A3_L2_new=A3-A3_L2
A3_L3_new=A3-A3_L3
Any help will be appreciated.
First, import library regex:
import re
Then, take columns' names as list.
col_names = df.columns.tolist()
for i in col_names:
if 'L' in i: #checks if name contains 'L'
x = [x.group() for x in re.finditer(r'[A,L][1-9]', i)] # place in list, the 'A' and 'L'
foo = str(x[0]) + "-" + str(x[1]) + '_new' # Create new name, based on 'A' and 'L'
df[foo] = df[x[0]] - df[i] # Subtract the corresponding columns.
df
In case you wonder what regex does: