python, python-3.x, python-re

Getting an "unterminated subpattern" error I don't understand


I'm learning Python from the No Starch Press book Automate the Boring Stuff with Python. For one of the projects, we write a script that scrapes phone numbers from the text on the clipboard.

I'm getting an error (shown below). What did I do wrong and how can I fix it?

My code:

import re

# Phone number regex
phoneRegex = re.compile(r'''(
(\d{3}|\(\d{3}\))?             #area code
(\s|-|\.)?                     #separator
(\d{3}\)                       #first 3 digits
(\s|-|\.)                      #separator
(\d{4})                        #last 4 digits
(\s*(ext|x|ext.)\s*(\d{2,5}))? #extension
)''', re.VERBOSE)

The error:

Traceback (most recent call last):
  File "*path snipped for privacy*/Automate the Boring Stuff with Python/Chapter 7 project phoneAndEmail.py", line 7, in <module>
    phoneRegex = re.compile(r'''(
  File "/Applications/Mu Editor.app/Contents/Resources/Python/lib/python3.8/re.py", line 252, in compile
    return _compile(pattern, flags)
  File "/Applications/Mu Editor.app/Contents/Resources/Python/lib/python3.8/re.py", line 304, in _compile
    p = sre_compile.compile(pattern, flags)
  File "/Applications/Mu Editor.app/Contents/Resources/Python/lib/python3.8/sre_compile.py", line 764, in compile
    p = sre_parse.parse(p, flags)
  File "/Applications/Mu Editor.app/Contents/Resources/Python/lib/python3.8/sre_parse.py", line 948, in parse
    p = _parse_sub(source, state, flags & SRE_FLAG_VERBOSE, 0)
  File "/Applications/Mu Editor.app/Contents/Resources/Python/lib/python3.8/sre_parse.py", line 443, in _parse_sub
    itemsappend(_parse(source, state, verbose, nested + 1,
  File "/Applications/Mu Editor.app/Contents/Resources/Python/lib/python3.8/sre_parse.py", line 836, in _parse
    raise source.error("missing ), unterminated subpattern",
re.error: missing ), unterminated subpattern at position 0 (line 1, column 1)

Solution

  • The mistake is in this part of the phoneRegex pattern:

    (\d{3}\)
    

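    The opening ( starts a group, but the intended closing ) is escaped with a \, so \) matches a literal ) instead of ending the group.

    Thus, the opening ( of that group is never balanced by its own closing ) (i.e., a group, which is a subpattern, is left "unterminated"). The group instead borrows the final ) of the pattern, which leaves the outermost ( with nothing to close it. That is why the error points at position 0 (line 1, column 1) rather than at the line with the actual mistake.

    To fix it, change \) to ) so that the line reads (\d{3}).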
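For reference, here is a minimal sketch of the fix described above. It first reproduces the error with just the faulty group, then compiles the full pattern from the question with the only change being \) replaced by ). The sample text and the names error_message, sample_text, and numbers are made up for illustration and are not from the book or the question.

```python
import re

# Reproduce the problem in isolation: the "(" opens a group, but "\)" is a
# literal ")" and therefore never closes it.
try:
    re.compile(r'(\d{3}\)')
    error_message = None
except re.error as exc:
    error_message = str(exc)

print(error_message)

# The pattern from the question with the single fix applied:
# "(\d{3}\)" becomes "(\d{3})". Everything else is left unchanged.
phoneRegex = re.compile(r'''(
(\d{3}|\(\d{3}\))?             # area code
(\s|-|\.)?                     # separator
(\d{3})                        # first 3 digits (group now closed properly)
(\s|-|\.)                      # separator
(\d{4})                        # last 4 digits
(\s*(ext|x|ext.)\s*(\d{2,5}))? # extension
)''', re.VERBOSE)

# Hypothetical sample text, only used to check that the pattern compiles
# and matches.
sample_text = "Call 415-555-1234 or (415) 555-9999 ext 123."

# group(0) is the whole match; findall() would return tuples of all groups.
numbers = [match.group(0) for match in phoneRegex.finditer(sample_text)]
print(numbers)
```

If the pattern compiles without raising re.error, the parentheses are balanced again. A quick way to track down this kind of mistake is to compile the pattern one line at a time, as done with the single group at the top of the sketch.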