python, regex, regex-negation

Regex to check for only one double-quote


If a string has a dot (.) surrounded by double quotes, then it's valid. A dot on its own or a single, unpaired double quote is invalid.

# Valid str examples
str1 = 'Don "B." White'
str10 = 'Don "M.dom" White'
str2 = 'Don "B." White "H." Joe'

# Invalid str examples
str3 = 'Don "B. White'
str4 = 'Don "B." White "H Simpson'
str5 = 'Don B. White' # dot must have double quotes around it e.g. "B."

I can check that a dot is surrounded by double quotes using

re.search(r'(?!")\.(?!")', str)

but I'm struggling a bit to construct a regex that detects the unpaired double quote in str3 or str4.

I tried different variants of a negative lookahead such as r'"(?!")' (I know it's wrong) and the [^"] character class, but can't seem to get it working. Any ideas?


Solution

  • You may be able to use this regex:

    ^(?:[^".\n]*"[^"\n.]*\.[^"\n]*")*[^".\n]*$
    

    RegEx Demo

    RegEx Details:

    • ^: Start
    • (?:: Start non-capture group
      • [^".\n]*: Match 0 or more characters that are not ", not ., and not a line break
      • ": Match a "
      • [^"\n.]*: Match 0 or more characters that are not ", not ., and not a line break
      • \.: Match a .
      • [^"\n]*: Match 0 or more characters that are not " and not a line break
      • ": Match a "
    • )*: End non-capture group. Repeat this group 0 or more times
    • [^".\n]*: Match 0 or more characters that are not ", not ., and not a line break
    • $: End
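
    As a quick check, the pattern can be run against the sample strings from the question. This is only a minimal sketch: the names PATTERN and is_valid are illustrative and not part of the original answer.

```python
import re

# Pattern copied from the answer; PATTERN and is_valid are hypothetical names.
PATTERN = re.compile(r'^(?:[^".\n]*"[^"\n.]*\.[^"\n]*")*[^".\n]*$')


def is_valid(text: str) -> bool:
    """Return True when the whole string matches the answer's pattern."""
    return PATTERN.search(text) is not None


# Sample strings taken from the question.
valid = [
    'Don "B." White',
    'Don "M.dom" White',
    'Don "B." White "H." Joe',
]
invalid = [
    'Don "B. White',               # opening quote is never closed
    'Don "B." White "H Simpson',   # second quote is never closed
    'Don B. White',                # dot is not inside double quotes
]

for text in valid + invalid:
    print(f"{text!r}: {is_valid(text)}")
```

    One consequence of this pattern worth noting: a quoted segment that contains no dot, such as 'Don "B" White', is also rejected, because every quoted group is required to contain a dot.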