If a string has a dot (.) surrounded by double quotes, then it's valid. A dot on its own or a single (unpaired) double quote is invalid.
# Valid str examples
str1 = 'Don "B." White'
str10 = 'Don "M.dom" White'
str2 = 'Don "B." White "H." Joe'
# Invalid str examples
str3 = 'Don "B. White'
str4 = 'Don "B." White "H Simpson'
str5 = 'Don B. White' # dot must have double quotes around it e.g. "B."
I can check that a dot is surrounded by double quotes using
re.search(r'(?!")\.(?!")', str)
but I'm struggling a bit to construct a regex that detects the single (unpaired) double quote in str3 or str4.
I tried different variants of negative lookahead, r'"(?!")' (I know it's wrong), and [^"], but can't seem to get it working. Any ideas?
You may be able to use this regex:
^(?:[^".\n]*"[^"\n.]*\.[^"\n]*")*[^".\n]*$
RegEx details:
^ : Start
(?: : Start non-capture group
[^".\n]* : Match 0 or more characters that are not ", not . and not a line break
" : Match a "
[^"\n.]* : Match 0 or more characters that are not ", not . and not a line break
\. : Match a .
[^"\n]* : Match 0 or more characters that are not " and not a line break
" : Match a "
)* : End non-capture group. Repeat this group 0 or more times
[^".\n]* : Match 0 or more characters that are not ", not . and not a line break
$ : End
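As a quick check, the pattern can be run against the example strings from the question. This is only a minimal sketch: the helper name `is_valid` and the `samples` dict are made up for illustration, and the pattern itself is copied unchanged from the answer.

```python
import re

# Pattern copied verbatim from the answer above.
PATTERN = re.compile(r'^(?:[^".\n]*"[^"\n.]*\.[^"\n]*")*[^".\n]*$')


def is_valid(s):
    """Hypothetical helper: True if the whole string matches the pattern,
    i.e. quotes come in pairs, each quoted part contains a dot,
    and no dot appears outside quotes."""
    return PATTERN.search(s) is not None


# Example strings from the question, keyed by their original names.
samples = {
    "str1": 'Don "B." White',             # valid
    "str10": 'Don "M.dom" White',         # valid
    "str2": 'Don "B." White "H." Joe',    # valid
    "str3": 'Don "B. White',              # invalid: unpaired quote
    "str4": 'Don "B." White "H Simpson',  # invalid: unpaired quote
    "str5": 'Don B. White',               # invalid: dot outside quotes
}

for name, s in samples.items():
    print(name, is_valid(s))
```

Note that, as written, the pattern also rejects a quoted part with no dot in it (for example "B"), because every repetition of the group requires a \. between the two quotes.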