
Regex for address works in Regex101 (Python flavor) but not in Python with re.match?


I have the following Python script (in Jupyter) which is supposed to extract address information using regex (unit numbers are already cleaned up and street types are abbreviated before this step):

type_opts = r"Terrace|Way|Walk|St|Rd|Ave|Cl|Ct|Cres|Blvd|Dr|Ln|Pl|Sq|Pde"
road_attrs_pattern = r"(?P<rd_no>\w?\d+(\-\d+)?\w?\s+)(?P<rd_nm>[a-zA-z \-]+)(?#\s+(?P<rd_tp>" + type_opts + ")"
print("Road Attr Pattern: ", road_attrs_pattern)
road_attrs = re.match(road_attrs_pattern, proc_addr)
road_num = road_attrs.group('rd_no').strip()
print("Road number: ", road_num)
road_name = road_attrs.group('rd_nm').strip()
print("Road name: ", road_name)
road_type = road_attrs.group('rd_tp').strip()
print("Road type: ", road_type)

I'm using this address:

Burrah lodge, 15 Anne Jameson Pl

This results in the following print-out:

Road Attr Pattern:  (?P<rd_no>\w?\d+(\-\d+)?\w?\s+)(?P<rd_nm>[a-zA-z \-]+)(?#\s+(?P<rd_tp>Terrace|Way|Walk|St|Rd|Ave|Cl|Ct|Cres|Blvd|Dr|Ln|Pl|Sq|Pde)

It then raises an error as soon as the street number is read: AttributeError: 'NoneType' object has no attribute 'group'.

However, pasting the same pattern and address into Regex101 shows a match, and having read over the regex I also think it should work.

It should print the following:

Road Attr Pattern:  (?P<rd_no>\w?\d+(\-\d+)?\w?\s+)(?P<rd_nm>[a-zA-z \-]+)(?#\s+(?P<rd_tp>Terrace|Way|Walk|St|Rd|Ave|Cl|Ct|Cres|Blvd|Dr|Ln|Pl|Sq|Pde)
Road number: 15
Road name: Anne Jameson
Road type: Pl

Solution

  • According to the docs, re.match checks for a match at the beginning of the string.

    Since you're looking for a match that starts partway through the string, you'll want re.search instead.
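
    As a rough sketch of the difference, the snippet below runs the pattern from the question against the sample address with both functions. The names question_pattern, adjusted_pattern, at_start, anywhere and adjusted are made up for this illustration. The adjusted pattern rests on an assumption that the question does not confirm: that the trailing (?# was meant to open an ordinary group. In Python's re syntax, (?#...) is a comment, so the pattern as posted never defines rd_tp.

```python
import re

proc_addr = "Burrah lodge, 15 Anne Jameson Pl"
type_opts = r"Terrace|Way|Walk|St|Rd|Ave|Cl|Ct|Cres|Blvd|Dr|Ln|Pl|Sq|Pde"

# Pattern built exactly as in the question.
question_pattern = (
    r"(?P<rd_no>\w?\d+(\-\d+)?\w?\s+)(?P<rd_nm>[a-zA-z \-]+)(?#\s+(?P<rd_tp>"
    + type_opts
    + ")"
)

# re.match only tries position 0, where "Burrah" cannot satisfy the number part.
at_start = re.match(question_pattern, proc_addr)
print(at_start)  # None

# re.search scans forward and finds the match that starts at "15".
anywhere = re.search(question_pattern, proc_addr)
print(anywhere.group("rd_no").strip())  # 15
print(anywhere.group("rd_nm").strip())  # Anne Jameson Pl

# (?#...) is a regex comment, so no rd_tp group exists in this pattern.
print("rd_tp" in anywhere.groupdict())  # False

# Assumption: the street type was meant to be a real named group.
# This variant drops the (?# and tightens A-z to A-Z.
adjusted_pattern = (
    r"(?P<rd_no>\w?\d+(\-\d+)?\w?\s+)(?P<rd_nm>[a-zA-Z \-]+)\s+(?P<rd_tp>"
    + type_opts
    + ")"
)
adjusted = re.search(adjusted_pattern, proc_addr)
print(adjusted.group("rd_no").strip())  # 15
print(adjusted.group("rd_nm").strip())  # Anne Jameson
print(adjusted.group("rd_tp").strip())  # Pl
```

    Switching to re.search alone clears the AttributeError. With the pattern as posted, though, road_attrs.group('rd_tp') would then raise IndexError, because that group sits inside the comment. Since re.search can also return None for addresses without a street number, checking the result before calling .group() is probably worthwhile.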