I have a list of column names in string format, like below:
lst = ["plug", "[plug+wallet]", "(wallet-phone)"]
I want to wrap each column name in df[] with single quotes, using regex. My attempt works for a plain name, but when the list has a string like (wallet-phone) it gives df['(wallet']-df['phone)'] instead of (df['wallet']-df['phone']).
Is my pattern wrong? Please see it below:
import re
lst = ["plug", "[plug+wallet]", "(wallet-phone)"]
x=[]
y=[]
for l in lst:
    x.append(re.sub(r"([^+\-*\/'\d]+)", r"'\1'", l))
for f in x:
    y.append(re.sub(r"('[^+\-*\/'\d]+')", r'df[\1]', f))
print(x)
print(y)
gives:
x:["'plug'", "'[plug'+'wallet]'", "'(wallet'-'phone)'"]
y:["df['plug']", "df['[plug']+df['wallet]']", "df['(wallet']-df['phone)']"]
Is the pattern wrong? Expected output:
x:["'plug'", "['plug'+'wallet']", "('wallet'-'phone')"]
y:["df['plug']", "[df['plug']+df['wallet']]", "(df['wallet']-df['phone'])"]
I also tried the pattern ([^+\-*\/()[]'\d]+), but it doesn't exclude () or [].
It might be easier to match the words themselves and wrap each one in df['...']:
import re
lst = ["plug", "[plug+wallet]", "(wallet-phone)"]
z = [re.sub(r"(\w+)",r"df['\1']",w) for w in lst]
print(z)
["df['plug']", "[df['plug']+df['wallet']]", "(df['wallet']-df['phone'])"]
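
As for whether the original pattern is wrong: the negated class [^+\-*\/'\d] excludes operators, quotes and digits, but not brackets, so ( and [ are swallowed into the match. In the second attempt, the first unescaped ] closes the class early, so the rest ('\d]+) is read as literal text to match. A minimal sketch of the character-class approach, assuming the brackets are escaped inside the class (the names pattern and fixed are only for illustration):

```python
import re

lst = ["plug", "[plug+wallet]", "(wallet-phone)"]

# Escape [ and ] inside the class so they are excluded along with ( and ).
# The quote and \d exclusions are kept from the pattern in the question.
pattern = r"([^+\-*/()\[\]'\d]+)"

# Wrap every run of allowed characters in df['...'] in a single pass.
fixed = [re.sub(pattern, r"df['\1']", s) for s in lst]
print(fixed)
```

Note that the two approaches differ on digits: \w+ also matches them, so a numeric literal such as the 2 in plug*2 would be wrapped as df['2'], while the class above skips digits entirely and would therefore split a column name like plug2. Which behaviour is right depends on what the column names look like.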