I'm tidying up an Excel workbook and I need to find all instances of Microsoft.* - for example Microsoft.Compute and Microsoft.Cdn - and copy them into the column to the left.
I've put this code together but it doesn't like the regex
import openpyxl as op
import re
wb = op.load_workbook("original-workbook.xlsx")
ws = wb["Sheet1"]
pattern = re.compile(r'Microsoft.+')
for row in ws.iter_rows(min_row=1, max_col=3):
if row[1].value in pattern:
row[0].value = pattern[row[1].value]
wb.save("new-improved-workbook.xlsx")
This is the error:
TypeError: argument of type 're.Pattern' is not iterable
I'm sure what I want to do it really simple but it's got me stumped.
Thanks in advance.
Besides the TypeError
, I assume that you need to check if the cells on column B
match Microsoft*
and then update the corresponding A
cells. If so, regex doesn't seem necessary, you can simply check if the values startswith
Microsoft
.
from openpyxl import load_workbook
wb = load_workbook("original-workbook.xlsx")
ws = wb["Sheet1"]
# pattern = re.compile(r"Microsoft.+") # uncomment if regex needed
for row in ws.iter_rows(min_row=1, max_col=3):
if row[1].value and row[1].value.startswith("Microsoft"):
# if row[1].value and pattern.match(row[1].value): # to fix your code
row[0].value = row[1].value
wb.save("new-improved-workbook.xlsx")
Output :