"GEDCOM file - How to split names and eliminate duplicate places."


I have two questions, both related to the GEDCOM file for my family tree (I use both Notepad++ and TextPad):

1.)

I have around 1000 people who have De, La, Le, Van, Von, etc. at the beginning of their surname, and I would like that prefix (for instance “Von”) to be moved to the end of their given name instead.

How can I make this change globally for everyone whose surname starts with, for example, “Von”?

2.)

I have many place names in which the village/town/city is doubled, for instance “Copenhagen, Copenhagen, Denmark”. I would like the doubled word to appear only once, so that it becomes “Copenhagen, Denmark”.

How can I make a global change so that a doubled word becomes a single word?

Hope someone can help me with these two questions.

Thanks in advance!

Best regards, Nick

Here is an example of what I mean:

    0 @I@ INDI
    1 NAME Anna /Von Hat/
    2 GIVN Anna
    2 SURN Von Hat
    1 BIRT
    2 DATE 01 Jan 2000
    2 PLAC Copenhagen, Copenhagen, Denmark

To:

    0 @I@ INDI
    1 NAME Anna von /Hat/
    2 GIVN Anna von
    2 SURN Hat
    1 BIRT
    2 DATE 01 Jan 2000
    2 PLAC Copenhagen, Denmark


Solution

  • For the first question, put the prefixes in parentheses to form a capture group, separated by the alternation operator |, like this: (De|La|Le|Van|Von). The group matches any one of the prefixes and captures it. Then capture the given name that precedes the opening slash, using whatever pattern fits your data (here, a run of letters). Note the trailing space at the end of the pattern: it consumes the space between the prefix and the rest of the surname. For example:

    ([a-zA-Z]+) \/(De|La|Le|Van|Von) 
    

    Then replace with:

    $1 $2 /
    

    Demo: https://regex101.com/r/9QT99V/2/

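    As for the second question, you can use a backreference, \1, which matches the same text that the first group has just captured. Match a word followed by a comma and a space, put it in a group with ( ), and then add \1 so that the pattern only matches when that text is immediately repeated. Because [a-zA-Z]+ only matches unaccented letters, doubled place names that contain spaces or characters such as ø would need a wider character class. Example: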

    ([a-zA-Z]+, )\1
    

    Replace with:

    $1
    

    Demo: https://regex101.com/r/Dm76wn/1/
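
If you would rather script the change than run it in an editor, the two substitutions above can be sketched in Python roughly as follows. This is only a minimal illustration: the names `fix_line`, `NAME_RE` and `DUP_RE` are made up for the example, the prefix list is the one from the answer, and the sample lines come from the question. Python's `re` module writes the replacement groups as `\1` and `\2` instead of `$1` and `$2`. Like the editor patterns, it only touches the NAME and PLAC text; GIVN and SURN lines are left unchanged.

```python
import re

# Prefixes to move from the surname to the given name (list from the answer).
PREFIXES = r"(De|La|Le|Van|Von)"

# Given name, space, opening slash, prefix, trailing space.
# The slash does not need escaping in Python.
NAME_RE = re.compile(r"([a-zA-Z]+) /" + PREFIXES + " ")

# A word followed by ", " that is immediately repeated.
DUP_RE = re.compile(r"([a-zA-Z]+, )\1")


def fix_line(line: str) -> str:
    """Apply both substitutions to a single GEDCOM line."""
    line = NAME_RE.sub(r"\1 \2 /", line)  # editor replacement: $1 $2 /
    line = DUP_RE.sub(r"\1", line)        # editor replacement: $1
    return line


sample = [
    "1 NAME Anna /Von Hat/",
    "2 PLAC Copenhagen, Copenhagen, Denmark",
]
for line in sample:
    print(fix_line(line))
```

On the two sample lines this prints `1 NAME Anna Von /Hat/` and `2 PLAC Copenhagen, Denmark`. To process a whole file you could read it line by line and write each result of `fix_line` to a new file, keeping the original as a backup.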