I know that regex is a pretty hot topic and that there's a plethora of similar questions; however, I have not found one that matches my needs.
I need to check the formatting of my string to be as follows:
I haven't tried to implement the colon, semicolon, or parentheses yet; so far I'm stuck on just the period. These characters are optional, so I can't make a hard check. I'm trying to catch them, but I'm still getting a match in cases like these:
00000 *TEST .FINAL STATEMENT. //Matches, but it shouldn't match.
00001 *TEST2 . FINAL STATEMENT. //Matches, but it shouldn't match.
00002 *TEST3. FINAL STATEMENT. //Matches, **should** match.
This is the regex I have so far:
^\d{5}\s{6}[\s\*][^.]*([^.\s]+\.\s)?[^.]*\..*$
I really don't see how this is happening, especially because I'm using [^.] to accept anything except a period, and the optional group looks correct at a glance: if there's a period, it should not have whitespace before it and it should have whitespace after it.
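One way to see where the match comes from is to check which part of the input each piece of the pattern consumes. The sketch below uses Python's re module, which is an assumption since the question does not name a regex flavor. The named groups (head, optional, middle, tail) are added only for inspection, and the sample line assumes six spaces between the line number and the asterisk.

```python
import re

# The pattern from the question, with named groups added purely for inspection.
ORIGINAL = re.compile(
    r"^\d{5}\s{6}[\s\*]"
    r"(?P<head>[^.]*)"             # first "anything but a period" run
    r"(?P<optional>[^.\s]+\.\s)?"  # the optional "word. " group
    r"(?P<middle>[^.]*)"           # second "anything but a period" run
    r"\."                          # a literal period
    r"(?P<tail>.*)$"               # anything at all, including more periods
)

# Assumption: six spaces separate the line number from the asterisk.
line = "00000      *TEST .FINAL STATEMENT."

m = ORIGINAL.match(line)
parts = m.groupdict() if m else {}
for name, text in parts.items():
    print(f"{name:>8}: {text!r}")
```

Here head swallows "TEST " (including the space), the optional group is skipped because it is allowed to match nothing, the literal \. takes the first period, and .* accepts everything after it. Nothing in the pattern forces a period to go through the optional group, so its whitespace rules never get enforced.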
Try this:
^\d{5}\s{6}[\s\*] # Your original pattern
(?: # Repeat 0 or more times:
[^.:;()]*| # Unconstrained characters
(?<!\s)[.:;](?=\s)| # Punctuation after non-space, followed by space
\((?!\s)| # Opening parenthesis not followed by space
(?<!\s)\) # Closing parenthesis not preceded by space
)*
\.$ # Period, then end of string
https://regex101.com/r/WwpssV/1
In the last part of the pattern, the characters with special requirements are .:;() so use a negative character set to match anything but those characters: [^.:;()]*
Then alternate with:
if there is any period, colon or semicolon before the final period, the character must not be preceded by a white space, but it must be followed by a white space.
Fulfilled by (?<!\s)[.:;](?=\s), which matches one of those characters only if it is not preceded by a space and is followed by a space.
opening parentheses cannot be followed by a white space.
Fulfilled by \((?!\s)
closing parentheses cannot be preceded by a white space.
Fulfilled by (?<!\s)\)
Then just alternate between those 4 possibilities at the end of the pattern.
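As a quick sanity check, here is a minimal sketch of the same idea in Python with re.VERBOSE, so the comments can stay inside the pattern. Python is an assumption, since the regex flavor is not stated. The helper name is_valid and the parenthesis and colon test lines are made up for illustration, and the samples assume six spaces after the line number. One deliberate tweak: the unconstrained alternative matches a single character, [^.:;()], instead of [^.:;()]*. It accepts the same strings, but avoids nesting one * inside another, which can backtrack badly on long non-matching lines.

```python
import re

PATTERN = re.compile(r"""
    ^\d{5}\s{6}[\s*]         # five digits, six whitespace chars, then whitespace or '*'
    (?:                      # repeat 0 or more times:
        [^.:;()]             #   one unconstrained character (tweak: no inner '*')
      | (?<!\s)[.:;](?=\s)   #   . : ; not preceded by whitespace, followed by whitespace
      | \((?!\s)             #   '(' not followed by whitespace
      | (?<!\s)\)            #   ')' not preceded by whitespace
    )*
    \.$                      # final period, then end of string
""", re.VERBOSE)


def is_valid(line: str) -> bool:
    """Return True if the line satisfies the punctuation rules (hypothetical helper)."""
    return PATTERN.search(line) is not None


# Sample lines from the question, assuming six spaces after the line number.
samples = [
    "00000      *TEST .FINAL STATEMENT.",
    "00001      *TEST2 . FINAL STATEMENT.",
    "00002      *TEST3. FINAL STATEMENT.",
    # Made-up lines exercising the parenthesis and colon rules.
    "00003      *CALL (ARG) DONE.",
    "00004      *CALL ( ARG) DONE.",
    "00005      *CALL (ARG ) DONE.",
    "00006      *NOTE: DONE.",
    "00007      *NOTE :DONE.",
]

for s in samples:
    print(is_valid(s), repr(s))
```

Only the third, fourth, and seventh lines should be accepted. Note that in Python $ also matches just before a trailing newline, so strip the line first (or use \Z) if that matters for your input.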