
Remove lines that have a duplicated word at the beginning, before the first comma


Hello,

Here is the text file:

Soma, ID 6588, 1988

Lara, ID 4652, 1995

John, ID 1098, 1987

Soma, ID 7898, 1998

John, ID 1024, 1996

I want to delete lines so that each word before the first comma appears on only one line.

So the text will be:

Soma, ID 6588, 1988

Lara, ID 4652, 1995

John, ID 1024, 1996

The order of the lines does not matter.

Any ideas?

Using: Notepad++, VB.NET.


Solution

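  • There is no need for VB.NET; a regular expression in Notepad++ can do this.

    1. Open your text file in Notepad++.
    2. Open the Replace window by pressing CTRL+H.
    3. In "Find what", enter ^([^,\r\n]+),.*\r\n((.|\r\n)*?)^\1,
    4. In "Replace with", enter \2\1,
    5. Under "Search Mode", choose "Regular expression" and leave ". matches newline" unchecked.
    6. Click the "Replace All" button repeatedly until it reports that 0 occurrences were replaced. A single pass can miss duplicates that overlap an earlier match, so several passes may be needed.
    7. Each word before the first comma now appears on only one line; where a word was repeated, its last line is the one that is kept.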
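
  • If you would rather script this than click "Replace All" several times, here is a minimal sketch of the same idea. It is written in Python rather than VB.NET, and the function name dedupe_by_first_field is made up for illustration. It treats the text before the first comma as the key and, like the Replace All steps above, keeps the last line seen for each key. Blank lines are skipped, and the surviving lines may come out in a different order than in Notepad++, which the question says is acceptable.

```python
def dedupe_by_first_field(lines):
    """Keep one line per key, where the key is the text before the first comma.

    When a key repeats, the last line carrying it is the one kept.
    (Hypothetical helper, not part of any library.)
    """
    last_line_for_key = {}
    for line in lines:
        if not line.strip():
            continue  # ignore blank lines
        key = line.split(",", 1)[0].strip()
        # A later line with the same key overwrites the earlier one.
        last_line_for_key[key] = line
    return list(last_line_for_key.values())


# Sample data taken from the question.
sample = [
    "Soma, ID 6588, 1988",
    "Lara, ID 4652, 1995",
    "John, ID 1098, 1987",
    "Soma, ID 7898, 1998",
    "John, ID 1024, 1996",
]

if __name__ == "__main__":
    for kept in dedupe_by_first_field(sample):
        print(kept)
```

    To keep the first line for each key instead of the last, only assign when the key is not already in the dictionary.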