Cannot find a regex pattern that substitutes the strings in the correct order (Python)


I have a list of column names stored as strings, like this:

lst = ["plug", "[plug+wallet]", "(wallet-phone)"]

Now I want to wrap each column name as df['name'] using regex. My attempt works for plain names, but when an item contains parentheses or brackets, such as (wallet-phone), they end up inside the quotes: I get df['(wallet']-df['phone)'] instead of (df['wallet']-df['phone']). Is my pattern wrong? My code is below:

import re
lst = ["plug", "[plug+wallet]", "(wallet-phone)"]
x=[]
y=[]
# First pass: quote each name
for l in lst:
    x.append(re.sub(r"([^+\-*\/'\d]+)", r"'\1'", l))
# Second pass: wrap each quoted name in df[...]
for f in x:
    y.append(re.sub(r"('[^+\-*\/'\d]+')", r'df[\1]', f))

print(x)
print(y)

gives:

x:["'plug'", "'[plug'+'wallet]'", "'(wallet'-'phone)'"]
y:["df['plug']", "df['[plug']+df['wallet]']", "df['(wallet']-df['phone)']"]

Is the pattern wrong? Expected output:

x:["'plug'", "['plug'+'wallet']", "('wallet'-'phone')"]
y:["df['plug']", "[df['plug']+df['wallet']]", "(df['wallet']-df['phone'])"]

I also tried the pattern ([^+\-*\/()[]'\d]+), but it still doesn't exclude () or [].


Solution

  • It might be easier to match the words themselves with \w+ and wrap each one in the df[...] reference, which leaves brackets, parentheses and operators untouched:

    import re
    lst = ["plug", "[plug+wallet]", "(wallet-phone)"]
    
    z = [re.sub(r"(\w+)",r"df['\1']",w) for w in lst]
    
    print(z)
    # Output:
    # ["df['plug']", "[df['plug']+df['wallet']]", "(df['wallet']-df['phone'])"]
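
For completeness, here is a hedged sketch of two variations. The first keeps the two-step approach from the question and only escapes the square brackets inside the character class: unescaped, the first ] closes the class early, which is probably why the pattern with ()[] did not behave as intended. The second is a single pass that matches identifiers only, so a bare number is left alone (plain \w+ would also wrap digits, e.g. turning plug*2 into df['plug']*df['2']). The extra test strings "plug*2" and "col1+2" and the names NAME_CLASS and IDENT are assumptions added for illustration; they are not in the original question.

```python
import re

# "plug*2" and "col1+2" are hypothetical extra inputs, added for illustration.
lst = ["plug", "[plug+wallet]", "(wallet-phone)", "plug*2"]

# Option A: the question's negated character class, with the square
# brackets escaped so that "]" no longer closes the class early.
NAME_CLASS = r"[^+\-*/()\[\]'\d]+"

# Step 1: quote each name; step 2: wrap each quoted name in df[...].
quoted = [re.sub(f"({NAME_CLASS})", r"'\1'", s) for s in lst]
wrapped_a = [re.sub(f"('{NAME_CLASS}')", r"df[\1]", q) for q in quoted]

# Option B: one pass matching identifiers only (a letter or underscore
# followed by word characters), so standalone numbers are not wrapped.
IDENT = r"\b[A-Za-z_]\w*"
wrapped_b = [re.sub(f"({IDENT})", r"df['\1']", s) for s in lst]

print(wrapped_a)
print(wrapped_b)

# Unlike Option A, Option B keeps names that contain digits intact.
print(re.sub(f"({IDENT})", r"df['\1']", "col1+2"))
```

Both options give the same result for the four strings above. Option A excludes digits from names altogether, so a column such as col1 would be split; Option B is the safer choice if column names can contain digits.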