I need to match a specific pattern but I can't get it right with regular expressions. I'm looking for people's names, which always follow the same patterns. Some combinations are:
My problem comes when I sometimes have things like: Mr. Snow and Ms. Stark. It also captures the "and". So I'm looking for a regular expression that skips the second name only if it is "and". Here I'm looking for ["Mr. Snow", "Ms. Stark"].
My best try is as follows:
(M[rs].\s\w+(?:\s[\w-]+)(?:\s\([^\)]*\))?)
Note that the second name is in a non-capturing group because I was thinking of using a negative look-ahead. But if I do that, the first word is not captured (because the entire name does not match), and I need it to be captured.
Any ideas?
Here is some text for a quick check.
Here are my two cents:
\bM[rs]\.\h(\p{Lu}\p{Ll}+(?:[\h-]\p{Lu}\p{Ll}+)*)\b
See an online demo
\b - A word-boundary;
M[rs]\.\h - Match Mr. or Ms. followed by a horizontal whitespace;
(\p{Lu}\p{Ll}+(?:[\h-]\p{Lu}\p{Ll}+)*) - A capture group with a nested non-capture group to match an uppercase letter followed by lowercase letters and 0+ 2nd names concatenated through whitespace or hyphen;
\b - A word-boundary.
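
For a quick local check, here is a minimal sketch in Python. Note the assumptions: Python's built-in re module supports neither \h nor \p{..}, so this is an ASCII-only approximation of the pattern above ([ \t] in place of \h, [A-Z] and [a-z] in place of \p{Lu} and \p{Ll}), and find_names is just a hypothetical helper name. It will not handle accented or other non-ASCII names the way the original pattern would in a PCRE-style engine.

```python
import re

# ASCII-only approximation of the pattern above (an assumption, not the
# original): [ \t] stands in for \h, [A-Z]/[a-z] for \p{Lu}/\p{Ll}.
NAME = re.compile(r"\bM[rs]\.[ \t]([A-Z][a-z]+(?:[ \t-][A-Z][a-z]+)*)\b")


def find_names(text):
    """Return every full match (title plus name), not only the capture group."""
    return [m.group(0) for m in NAME.finditer(text)]


print(find_names("Mr. Snow and Ms. Stark"))
# ['Mr. Snow', 'Ms. Stark']
```

The lowercase "and" is never consumed because every additional name part must start with an uppercase letter, so no look-ahead is needed. Use m.group(1) instead of m.group(0) if you only want the name without the title.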