Search code examples
pythonexcelopenpyxlpython-re

Using regex or wildcards in a python loop


I'm tidying up an Excel workbook and I need to find all instances of Microsoft.* - for example Microsoft.Compute and Microsoft.Cdn - and copy them into the column to the left.

I've put this code together but it doesn't like the regex

import openpyxl as op
import re

wb = op.load_workbook("original-workbook.xlsx")
ws = wb["Sheet1"]

pattern = re.compile(r'Microsoft.+')

for row in ws.iter_rows(min_row=1, max_col=3):
    if row[1].value in pattern:
        row[0].value = pattern[row[1].value]

wb.save("new-improved-workbook.xlsx")

This is the error:

TypeError: argument of type 're.Pattern' is not iterable

I'm sure what I want to do it really simple but it's got me stumped.

Thanks in advance.


Solution

  • Besides the TypeError, I assume that you need to check if the cells on column B match Microsoft* and then update the corresponding A cells. If so, regex doesn't seem necessary, you can simply check if the values startswith Microsoft.

    from openpyxl import load_workbook
    
    wb = load_workbook("original-workbook.xlsx")
    ws = wb["Sheet1"]
    
    # pattern = re.compile(r"Microsoft.+") # uncomment if regex needed
    
    for row in ws.iter_rows(min_row=1, max_col=3):
        if row[1].value and row[1].value.startswith("Microsoft"):
        # if row[1].value and pattern.match(row[1].value): # to fix your code
            row[0].value = row[1].value
    
    wb.save("new-improved-workbook.xlsx")
    

    Output :

    enter image description here