Comparing 2 lists in Excel with VBA Regex

I want to use them to compare two lists (columns) in Excel to find matches. As this is quite a complex operation, I have performed it in the past, using several different functions (non-VBA) in Excel, but it has proved to be awkward at best, and so I wanted to try an all-in-one VBA solution, if possible.

The first column has names with irregularities (such as quoted nicknames, suffixes such as 'jr' or 'sr', and parentheses around 'preferred' versions of first names). Additionally, when middle names are present, they may be either the name or the initial.

The order in the first column is:

 <first name or initial>
 <space>
 <any parenthetical 'preferred' names - if they exist>
 <space>
 <middle name or initial - if it exists>
 <space>
 <quoted nickname or initial - if it exists>
 <space>
 <last name>
 <comma - if necessary><space - if necessary><suffix - if it exists>

The order in the second column is:

 `<lastname><space><suffix>,<firstname><space><middle name, if it exists>`

, with none of the 'irregularities' that the first column has.

My main objective is to 'clean' the first column into this order:

 `lastname-space-suffix,firstname-space-preferred name-space-
 middle name-space-nickname`

Although I keep the 'irregularities' in here, I could possibly use some sort of 'flags' in my compare code to alert me of them on a case-by-case basis.

I've been trying several patterns, and this is my most recent:

["]?([A-Za-z]?)[.]?["]?[.]?[\s]?[,]?[\s]?

However, I would like to allow for the last name and suffixes (if they exist). I had tested it with 'global,' however I couldn't figure out how to separate the last name and suffixes, via backreferencing, for example.

I would then like to compare the last,first, middle initial (as most names are only initials in the first list) between the two lists.

 An example would be:
 (1st list)
 John (Johnny) B. "Abe" Smith, Jr.
 turned into:
 Smith Jr,John (Johnny) B "Abe"
 or
 Smith Jr,John B

 and
 (2nd list)
 Smith Jr,John Bertrand
 turned into:
 Smith Jr,John B

 Then run a comparison between the two columns.

What would be a good starting or continuing point for this list comparison?

Apr 10, 2012 addendum:

As a side note, I will need to eliminate the quotes from the nicknames and the parentheses from the preferred names. Can I just break the grouped references down further into sub groups (in the below examples)?

 (?:  ([ ] \( [^)]* \)))?  # (2) parenthetical 'preferred' name (optional) 
 (?:  ([ ] (["'] ) .*?) \6 )? # (5,6) quoted nickname or initial (optional)

Can I group them like this:

 (?:(([ ])(\()([^)]*)(\))))? # (2) parenthetical 'preferred' name (optional) 
 not sure how to do this one -  # (5,6) quoted nickname or initial (optional)

I tried them in 'Regex Coach' and 'RegExr', and they worked fine, but in VBA, when I wanted the backreferences returned as in \11,\5 all that was returned was First Name, the numeral one, and a comma (e.g. "Carl1,"). I'm going back to check for any typos. Thanks for any help.

Apr 17, 2012 addendum:

There was one name 'situation' I overlooked, and that is last names consisting of 2 or more words, e.g. 'St Cyr' or 'Von Wilhelm'.
Would the following addition

 `((St|Von)[ ])?

work in this Regex, which you offered?

 `((St|Von)[ ])?([^\,()"']+)

My tests in Regex Coach and RegExr haven't quite worked, in that the replacement returns 'St' with a space preceding.

Solution

Redo -

This is different approach. It might work in your VBA, and is just an example. I tested this in Perl and it worked great. But, I won't show the perl code,
just regex's and some explanations.

This is a two step process.

Normalize the column text
Do the main parse

Normalize Process

Get a column value
Strip out all dots . - Globally search for \. , replace with nothing ''
Turn whitespaces into spaces - Globally search for \s+ , replace with a single space [ ]

(Note that if it can't be normalized, I don't see much chance for sucess no matter what is tried)

Main Parse Process

After normalizing a column value (do for both columns), run it through these regex's.

Column 1 regex

^
  [ ]?
  ([^\ ,()"']+)                        # (1)     first name or initial          (required)
  (?:  ([ ] \( [^)]* \))    )?         # (2)     parenthetical 'preferred' name (optional)
  (?:
       ([ ] [^\ ,()"'] )               # (3,4)   middle initial OR name         (optional)
       ([^\ ,()"']*)                   #         name and initial are both captured
  )?
  (?:  ([ ] (["'] ) .*?) \6 )?         # (5,6)   quoted nickname or initial     (optional)
  [ ]  ([^\ ,()"']+)                   # (7)     last name                      (required)
  (?:
        [, ]* ([ ].+?) [ ]?            # (8)     suffix                         (optional)
      | .*?
  )?
$

The replacement depends on what you want.
Three types are defined (replace $ with \ as needed):

type 1a full middle - $7$8,$1$2$3$4$5$6
type 1b middle initial - $7$8,$1$2$3$5$6
type 2 middle initial - $7$8,$1$3

Example conversion:

Input (raw)               = 'John (Johnny) Bertrand "Abe" Smith, Jr.  '
Out type 1 full middle    = 'Smith Jr,John (Johnny) Bertrand "Abe"'
Out type 1 middle initial = 'Smith Jr,John (Johnny) B "Abe"'
Out type 2 middle initial = 'Smith Jr,John B'

Column 2 regex

^
  [ ]?
  ([^\ ,()"']+)                  # (1)     last name                      (required)
  (?: ([ ] [^\ ,()"']+) )?       # (2)     suffix                         (optional)
  ,
  ([^\ ,()"']+)                  # (3)     first name or initial          (required)
  (?:
      ([ ] [^\ ,()"'])           # (4,5)   middle initial OR name         (optional)
      ([^\ ,()"']*)
  )?
  .*
$

The replacement depends on what you want.
Two types are defined (replace $ with \ as needed):

type 1a full middle - $1$2,$3$4$5
type 1b middle initial - $1$2,$3$4

Example conversion:

Input                     = 'Smith Jr.,John Bertrand  '
Out type 1 full middle    = 'Smith Jr,John Bertrand'
Out type 1 middle initial = 'Smith Jr,John B'

VBA Replacement Help

This works on a very old copy of Excel, creating a VBA project.
These are two modules created to show an example.
They both do the same thing.

The first is a verbose example of all the replacement types possible.
The second is a trimmed down version using just a type-2 comparison.

I haven't done VB before as you can tell, but it should be simple enough
for you to glean how the replacement works, and how to tie in the excel
columns.

If you are doing just a flat comparison, you might want to do a col 1 val
once, then check each value in column 2 against it, then go to next val in
column 1, then repeat.

For the fastest way to do this, create 2 extra columns, convert respected
columns vals to type-2 (variables strC1_2 and strC2_2, see example), then copy them
to the new columns.
After that, you don't need regex, just compare columns, find the matching rows,
then delete the type-2 columns.

Verbose -

Sub RegexColumnValueComparison()

' Column 1 and 2 , Sample values
' These should probably be passed in values
' ============================================
strC1 = "John (Johnny)   Bertrand ""Abe""   Smith, Jr.  "
strC2 = "Smith Jr.,John Bertrand  "

' Normalization Regexs for whitespace's and period's
' (use for both column values)
' =============================================
Set rxDot = CreateObject("vbscript.regexp")
rxDot.Global = True
rxDot.Pattern = "\."
Set rxWSp = CreateObject("vbscript.regexp")
rxWSp.Global = True
rxWSp.Pattern = "\s+"

' Column 1 Regex
' ==================
Set rxC1 = CreateObject("vbscript.regexp")
rxC1.Global = False
rxC1.Pattern = "^[ ]?([^ ,()""']+)(?:([ ]\([^)]*\)))?(?:([ ][^ ,()""'])([^ ,()""']*))?(?:([ ]([""']).*?)\6)?[ ]([^ ,()""']+)(?:[, ]*([ ].+?)[ ]?|.*?)?$"

' Column 2 Regex
' ==================
Set rxC2 = CreateObject("vbscript.regexp")
rxC2.Global = False
rxC2.Pattern = "^[ ]?([^ ,()""']+)(?:([ ][^ ,()""']+))?,([^ ,()""']+)(?:([ ][^ ,()""'])([^ ,()""']*))?.*$"

' Normalize column 1 and 2, Copy to new var
' ============================================
strC1_Normal = rxDot.Replace(rxWSp.Replace(strC1, " "), "")
strC2_Normal = rxDot.Replace(rxWSp.Replace(strC2, " "), "")


' ------------------------------------------------------
' This section is informational
' Shows some sample replacements before comparison
' Just pick 1 replacement from each column, discard the rest
' ------------------------------------------------------

' Create Some Replacement Types for Column 1
' =====================================================
strC1_1a = rxC1.Replace(strC1_Normal, "$7$8,$1$2$3$4$5$6")
strC1_1b = rxC1.Replace(strC1_Normal, "$7$8,$1$2$3$5$6")
strC1_2 = rxC1.Replace(strC1_Normal, "$7$8,$1$3")

' Create Some Replacement Types for Column 2
' =====================================================
strC2_1b = rxC2.Replace(strC2_Normal, "$1$2,$3$4$5")
strC2_2 = rxC2.Replace(strC2_Normal, "$1$2,$3$4")

' Show Types in Message Box
' =====================================================
c1_t1a = "Column1 Types:" & Chr(13) & "type 1a full middle    - " & strC1_1a
c1_t1b = "type 1b middle initial - " & strC1_1b
c1_t2 = "type 2 middle initial - " & strC1_2
c2_t1b = "Column2 Types:" & Chr(13) & "type 1b middle initial - " & strC2_1b
c2_t2 = "type 2 middle initial - " & strC2_2

MsgBox (c1_t1a & Chr(13) & c1_t1b & Chr(13) & c1_t2 & Chr(13) & Chr(13) & c2_t1b & Chr(13) & c2_t2)

' ------------------------------------------------------
' Compare a Value from Column 1 vs Column 2
' For this we will compare Type 2 values
' ------------------------------------------------------
If strC1_2 = strC2_2 Then
   MsgBox ("Type 2 values are EQUAL: " & Chr(13) & strC1_2)
Else
   MsgBox ("Type 2 values are NOT Equal:" & Chr(13) & strC1_2 & " != " & strC1_2)
End If

' ------------------------------------------------------
' Same comparison (Type 2) of Normalized column 1,2 values
' In esscense, this is all you need
' ------------------------------------------------------
If rxC1.Replace(strC1_Normal, "$7$8,$1$3") = rxC2.Replace(strC2_Normal, "$1$2,$3$4") Then
   MsgBox ("Type 2 values are EQUAL")
Else
   MsgBox ("Type 2 values are NOT Equal")
End If

End Sub

Type 2 only -

Sub RegexColumnValueComparison()

' Column 1 and 2 , Sample values
' These should probably be passed in values
' ============================================
strC1 = "John (Johnny)   Bertrand ""Abe""   Smith, Jr.  "
strC2 = "Smith Jr.,John Bertrand  "

' Normalization Regexes for whitespace's and period's
' (use for both column values)
' =============================================
Set rxDot = CreateObject("vbscript.regexp")
rxDot.Global = True
rxDot.Pattern = "\."
Set rxWSp = CreateObject("vbscript.regexp")
rxWSp.Global = True
rxWSp.Pattern = "\s+"

' Column 1 Regex
' ==================
Set rxC1 = CreateObject("vbscript.regexp")
rxC1.Global = False
rxC1.Pattern = "^[ ]?([^ ,()""']+)(?:([ ]\([^)]*\)))?(?:([ ][^ ,()""'])([^ ,()""']*))?(?:([ ]([""']).*?)\6)?[ ]([^ ,()""']+)(?:[, ]*([ ].+?)[ ]?|.*?)?$"

' Column 2 Regex
' ==================
Set rxC2 = CreateObject("vbscript.regexp")
rxC2.Global = False
rxC2.Pattern = "^[ ]?([^ ,()""']+)(?:([ ][^ ,()""']+))?,([^ ,()""']+)(?:([ ][^ ,()""'])([^ ,()""']*))?.*$"

' Normalize column 1 and 2, Copy to new var
' ============================================
strC1_Normal = rxDot.Replace(rxWSp.Replace(strC1, " "), "")
strC2_Normal = rxDot.Replace(rxWSp.Replace(strC2, " "), "")

' Comparison (Type 2) of Normalized column 1,2 values
' ============================================
strC1_2 = rxC1.Replace(strC1_Normal, "$7$8,$1$3")
strC2_2 = rxC2.Replace(strC2_Normal, "$1$2,$3$4")

If strC1_2 = strC2_2 Then
   MsgBox ("Type 2 values are EQUAL")
Else
   MsgBox ("Type 2 values are NOT Equal")
End If

End Sub

Paren/Quote Response

As a side note, I will need to eliminate the quotes from the nicknames and the parentheses from the preferred names.

If I understand correctly ..

Yes, you can capture the contents inside quotes and parenthesis separetly.
It just requires some modifications. The below regex has the ability to
formulate a replacement with or without quotes and/or parenthesis,
or other forms.

Samples below give ways to formulate replacements.

Very Important Note here

IF you are talking about eliminating quotes "" and parenthesis () from the
matching regex, that could be done as well. It requires a new regex.

The only problem is that ALL distinction between preferred/middle/nick
gets thrown out the window because these were positional as well as
delimited (ie: (preferred) middle "nick" ).

Taking away consideration of that would require regex subexpressions like this

(?:[ ]([^ ,]+))?   # optional preferred
(?:[ ]([^ ,]+))?   # optional middle
(?:[ ]([^ ,]+))?   # optional nick

And, thier being optional, loses all positional reference and renders the midle-initial
expression invalid.

End Note

Regex template (use to formulate replacement strings)

^
  [ ]?

# (required) 
   # First
   #  $1  name
   # -----------------------------------------
    ([^\ ,()"']+)                 # (1) name     

# (optional)
   # Parenthetical 'preferred'
   #  $2    all
   #  $3$4  name
   # -----------------------------------------
    (?: (                         #  (2)   all  
           ([ ]) \( ([^)]*) \)    #  (3,4) space and name
        )
    )?  

# (optional)
  # Middle
  #   $5    initial
  #   $5$6  name
  # -----------------------------------------
    (?:  ([ ] [^\ ,()"'] )       #  (5) first character
         ([^\ ,()"']*)           #  (6) remaining characters

    )?                                   

# (optional)
   # Quoted nick                       
   #  $7$8$9$8  all
   #  $7$9      name
   # -----------------------------------------
    (?: ([ ])                    # (7) space
        (["'])                   # (8) quote
        (.*?)                    # (9) name
        \8 
    )?

# (required)
   #  Last
   #  $10  name
   # -----------------------------------------
    [ ] ([^\ ,()"']+)            # (10) name

# (optional)
   # Suffix 
   #  $11  suffix
   # -----------------------------------------
    (?:    [, ]* ([ ].+?) [ ]?   # (11) suffix
        |  .*?
    )?
$

VBA regex (2nd edition, tested in my VBA project from above)

rxC1.Pattern = "^[ ]?([^ ,()""']+)(?:(([ ])\(([^)]*)\)))?(?:([ ][^ ,()""'])([^ ,()""']*))?(?:([ ])([""'])(.*?)\8)?[ ]([^ ,()""']+)(?:[, ]*([ ].+?)[ ]?|.*?)?$"


strC1_1a  = rxC1.Replace( strC1_Normal, "$10$11,$1$2$5$6$7$8$9$8" )
strC1_1aa = rxC1.Replace( strC1_Normal, "$10$11,$1$3$4$5$6$7$9" )
strC1_1b  = rxC1.Replace( strC1_Normal, "$10$11,$1$2$5$7$8$9$8" )
strC1_1bb = rxC1.Replace( strC1_Normal, "$10$11,$1$3$4$5$7$9" )
strC1_2   = rxC1.Replace( strC1_Normal, "$10$11,$1$5" )

Sample input/output possibilities

Input (raw)                 = 'John (Johnny) Bertrand "Abe" Smith, Jr.  '

Out type 1a  full middle    = 'Smith Jr,John (Johnny) Bertrand "Abe"'
Out type 1aa full middle    = 'Smith Jr,John Johnny Bertrand Abe'
Out type 1b  middle initial = 'Smith Jr,John (Johnny) B "Abe"'
Out type 1bb middle initial = 'Smith Jr,John Johnny B Abe'
Out type 2   middle initial = 'Smith Jr,John B'

Input (raw)                 = 'John  (Johnny)  Smith, Jr.'

Out type 1a  full middle    = 'Smith Jr,John (Johnny)'
Out type 1aa full middle    = 'Smith Jr,John Johnny'
Out type 1b  middle initial = 'Smith Jr,John (Johnny)'
Out type 1bb middle initial = 'Smith Jr,John Johnny'
Out type 2   middle initial = 'Smith Jr,John'


Input (raw)                 = 'John  (Johnny)  "Abe" Smith, Jr.'

Out type 1a  full middle    = 'Smith Jr,John (Johnny) "Abe"'
Out type 1aa full middle    = 'Smith Jr,John Johnny Abe'
Out type 1b  middle initial = 'Smith Jr,John (Johnny) "Abe"'
Out type 1bb middle initial = 'Smith Jr,John Johnny Abe'
Out type 2   middle initial = 'Smith Jr,John'


Input (raw)                 = 'John   "Abe" Smith, Jr.'

Out type 1a  full middle    = 'Smith Jr,John "Abe"'
Out type 1aa full middle    = 'Smith Jr,John Abe'
Out type 1b  middle initial = 'Smith Jr,John "Abe"'
Out type 1bb middle initial = 'Smith Jr,John Abe'
Out type 2   middle initial = 'Smith Jr,John'

Re: 4/17 concern

last names that have 2 or more words. Would the allowance for certain literal names, rather than generic word patterns, be the solution?

Actually, no it wouldn't. In this case, for your form, allowing multiple words in the last name
injects the space field delimeter into the last name field.

However, for your particular form, it could be done, since the only handicap is when the
"nick" field is missing. When it is missing and given that there is only one word in the
middle name, 2 permutations are presented.

Hopefully You can get the solution from the 3 regexes and test case output(s) below. The regexes have expunged space delimeters from the capture. So, you can either compose
replacements with the Replace method, or just store the capture buffers to compare to
the results of capture scenario's of other columns.

Nick_rx.Pattern (template)

* This pattern is multi-word last name, NICK is required 

^
  [ ]?

   # First (req'd)
    ([^\ ,()"']+)              # (1) first name

   # Preferred first
    (?: [ ]
       (                       # (2) (preferred), -or-
         \( ([^)]*?) \)        # (3) preferred
       )
    )?  

   # Middle
    (?: [ ]
        (                      # (4) full middle, -or-
          ([^\ ,()"'])         # (5) initial
          [^\ ,()"']*
        )
    )?

   # Quoted nick (req'd)
     [ ]
     (                         # (6) "nick",
       (["'])   # (7) n/a        -or-
       (.*?)                   # (8)  nick
       \7
     )

   # Single/Multi Last (req'd)
    [ ]
    (                          # (9) multi/single word last name
      [^\ ,()"']+
      (?:[ ][^\ ,()"']+)*
    )

   # Suffix 
    (?: [ ]? , [ ]? (.*?) )?   # (10) suffix

  [ ]?
$

-----------------------------------

FLs_rx.Pattern (template)

* This pattern has no MIDDLE/NICK, is single-word last name,
* and has no permutations.

^
  [ ]?

   # First (req'd)
    ([^\ ,()"']+)              # (1) first name

   # Preferred first
    (?: [ ]
       (                       # (2) (preferred), -or-
         \( ([^)]*?) \)        # (3)  preferred
       )
    )?  

   # Single Last (req'd)
     [ ]
     ([^\ ,()"']+)             # (4) single word last name

   # Suffix 
    (?: [ ]? , [ ]? (.*?) )?   # (5) suffix

  [ ]?
$

-----------------------------------

FLm_rx.Pattern (template)

* This pattern has no NICK, is multi-word last name,
* and has 2 permutations.
* 1. Middle as part of Last name.
* 2. Middle is separate from Last name.

^
  [ ]?

   # First (req'd)
    ([^\ ,()"']+)              # (1) first name

   # Preferred first
    (?: [ ]
       (                       # (2) (preferred), -or-
         \( ([^)]*?) \)        # (3)  preferred
       )
    )?  

   # Multi Last (req'd)
    [ ]
    (                         # (4) Multi, as Middle + Last,
                              # -or-
       (?:                         # Middle
          (                        # (5) full middle, -or-
             ([^\ ,()"'])          # (6) initial
             [^\ ,()"']*
          )
          [ ]
       )
                                   # Last (req'd)
       (                           # (7) multi/single word last name
          [^\ ,()"']+
          (?:[ ][^\ ,()"']+)* 
       )
    )

   # Suffix 
    (?: [ ]? , [ ]? (.*?) )?   # (8) suffix

  [ ]?
$

-----------------------------------

Each of these regexes are mutually exclusive and should be checked
in an if-then-else like this (Pseudo code):

str_Normal = rxDot.Replace(rxWSp.Replace(str, " "), "")

If  Nick_rx.Test(str_Normal) Then
     N_1a  = rxWSp.Replace( Nick_rx.Replace(str_Normal, "$9 $10 , $1 $2 $4 $6 "), " ")
     N_1aa = rxWSp.Replace( Nick_rx.Replace(str_Normal, "$9 $10 , $1 $3 $4 $8 "), " ")
     N_1b  = rxWSp.Replace( Nick_rx.Replace(str_Normal, "$9 $10 , $1 $2 $5 $6 "), " ")
     N_1bb = rxWSp.Replace( Nick_rx.Replace(str_Normal, "$9 $10 , $1 $3 $5 $8 "), " ")
     N_2   = rxWSp.Replace( Nick_rx.Replace(str_Normal, "$9 $10 , $1 $5 "), " ")

     ' see test case results in output below
Else

If FLs_rx.Test(str_Normal) Then

     FLs_1a  = rxWSp.Replace( FLs_rx.Replace(str_Normal, "$4 $5 , $1 $2 "), " ")
     FLs_1aa = rxWSp.Replace( FLs_rx.Replace(str_Normal, "$4 $5 , $1 $3 "), " ")
     FLs_2   = rxWSp.Replace( FLs_rx.Replace(str_Normal, "$4 $5 , $1 "), " ")

Else

If FLm_rx.Test(str_Normal) Then

   ' Permutation 1:
     FLm1_1a  = rxWSp.Replace( FLm_rx.Replace(str_Normal, "$4 $8 , $1 $2 "), " ")
     FLm1_1aa = rxWSp.Replace( FLm_rx.Replace(str_Normal, "$4 $8 , $1 $3 "), " ")
     FLm1_2   = rxWSp.Replace( FLm_rx.Replace(str_Normal, "$4 $8 , $1 "), " ")

  ' Permutation 2:
     FLm2_1a  = rxWSp.Replace( FLm_rx.Replace(str_Normal, "$7 $8 , $1 $2 $5 "), " ")
     FLm2_1aa = rxWSp.Replace( FLm_rx.Replace(str_Normal, "$7 $8 , $1 $3 $5 "), " ")
     FLm2_1b  = rxWSp.Replace( FLm_rx.Replace(str_Normal, "$7 $8 , $1 $2 $6 "), " ")
     FLm2_1bb = rxWSp.Replace( FLm_rx.Replace(str_Normal, "$7 $8 , $1 $3 $6 "), " ")
     FLm2_2   = rxWSp.Replace( FLm_rx.Replace(str_Normal, "$7 $8 , $1 $6 "), " ")

  ' At this point, the odds are that only one of these permutations will match 
  ' a different column.

Else

     ' The data could not be matched against a valid form
End If

-----------------------------

Test Cases

Found form 'Nick'
Input (raw)                 = 'John1 (JJ) Bert "nick" St Van Helsing ,Jr '
Normal                      = 'John1 (JJ) Bert "nick" St Van Helsing ,Jr '

Out type 1a  full middle    = 'St Van Helsing Jr , John1 (JJ) Bert "nick" '
Out type 1aa full middle    = 'St Van Helsing Jr , John1 JJ Bert nick '
Out type 1b  middle initial = 'St Van Helsing Jr , John1 (JJ) B "nick" '
Out type 1bb middle initial = 'St Van Helsing Jr , John1 JJ B nick '
Out type 2   middle initial = 'St Van Helsing Jr , John1 B '

=======================================================

Found form 'Nick'
Input (raw)                 = 'John2 Bert "nick" Helsing ,Jr '
Normal                      = 'John2 Bert "nick" Helsing ,Jr '

Out type 1a  full middle    = 'Helsing Jr , John2 Bert "nick" '
Out type 1aa full middle    = 'Helsing Jr , John2 Bert nick '
Out type 1b  middle initial = 'Helsing Jr , John2 B "nick" '
Out type 1bb middle initial = 'Helsing Jr , John2 B nick '
Out type 2   middle initial = 'Helsing Jr , John2 B '

=======================================================

Found form 'Nick'
Input (raw)                 = 'John3 Bert "nick" St Van Helsing ,Jr '
Normal                      = 'John3 Bert "nick" St Van Helsing ,Jr '

Out type 1a  full middle    = 'St Van Helsing Jr , John3 Bert "nick" '
Out type 1aa full middle    = 'St Van Helsing Jr , John3 Bert nick '
Out type 1b  middle initial = 'St Van Helsing Jr , John3 B "nick" '
Out type 1bb middle initial = 'St Van Helsing Jr , John3 B nick '
Out type 2   middle initial = 'St Van Helsing Jr , John3 B '

=======================================================

Found form 'First-Last (single)'
Input (raw)                 = 'John4 Helsing '
Normal                      = 'John4 Helsing '

Out type 1a  no middle      = 'Helsing  , John4  '
Out type 1aa no middle      = 'Helsing  , John4  '
Out type 2                  = 'Helsing  , John4 '

=======================================================

Found form 'First-Last (single)'
Input (raw)                 = 'John5 (JJ) Helsing '
Normal                      = 'John5 (JJ) Helsing '

Out type 1a  no middle      = 'Helsing  , John5 (JJ) '
Out type 1aa no middle      = 'Helsing  , John5 JJ '
Out type 2                  = 'Helsing  , John5 '

=======================================================

Found form 'First-Last (multi)'
Input (raw)                 = 'John6 (JJ) Bert St Van Helsing ,Jr '
Normal                      = 'John6 (JJ) Bert St Van Helsing ,Jr '

Permutation 1:
Out type 1a  no middle      = 'Bert St Van Helsing Jr , John6 (JJ) '
Out type 1aa no middle      = 'Bert St Van Helsing Jr , John6 JJ '
Out type 2                  = 'Bert St Van Helsing Jr , John6 '
Permutation 2:
Out type 1a  full middle    = 'St Van Helsing Jr , John6 (JJ) Bert '
Out type 1aa full middle    = 'St Van Helsing Jr , John6 JJ Bert '
Out type 1b  middle initial = 'St Van Helsing Jr , John6 (JJ) B '
Out type 1bb middle initial = 'St Van Helsing Jr , John6 JJ B '
Out type 2   middle initial = 'St Van Helsing Jr , John6 B '

=======================================================

Found form 'First-Last (multi)'
Input (raw)                 = 'John7 Bert St Van Helsing ,Jr '
Normal                      = 'John7 Bert St Van Helsing ,Jr '

Permutation 1:
Out type 1a  no middle      = 'Bert St Van Helsing Jr , John7 '
Out type 1aa no middle      = 'Bert St Van Helsing Jr , John7 '
Out type 2                  = 'Bert St Van Helsing Jr , John7 '
Permutation 2:
Out type 1a  full middle    = 'St Van Helsing Jr , John7 Bert '
Out type 1aa full middle    = 'St Van Helsing Jr , John7 Bert '
Out type 1b  middle initial = 'St Van Helsing Jr , John7 B '
Out type 1bb middle initial = 'St Van Helsing Jr , John7 B '
Out type 2   middle initial = 'St Van Helsing Jr , John7 B '

=======================================================

Form  ***  (unknown)
Input (raw)                 = ' do(e)s not. match ,'
Normal                      = ' do(e)s not match ,'

=======================================================