I have a pipe-delimited CSV file. Each row should have exactly three pipes to separate the columns. I need to find any lines that do NOT have exactly three pipes: lines with more OR fewer should match.
I'm learning regex and I came up with this (kind of hacked together finding parts I thought would work...)
^(?:[^|\r\n]*\|){3,}.*$
However, it's just matching all rows, regardless of the number of pipes in the row.
What's the correct syntax for what I want to do?
[UPDATE]
As @anubhava pointed out, I should provide an example.
This is example data in my file:
John Doe|1hgds234|Some comment|
Mary Jane|5df678|This column is the end of this record|Harry Jones|3456|Harry's record should be on the next line|
Sue Anderson|037dsf533|Another comment|
Harry Jones' record should start on a new line, starting at "Harry". Each line ends in a pipe and CRLF.
So I need a find/replace with a regex that would match on that second line and put a CRLF after the third pipe in the second line.
Assuming you don't have an escaped | or a | inside a quoted cell value, you can match using this regex:
^((?:[^|\n]*\|){3})(?![\r\n])
And replace this with:
$1\n
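RegEx Details:
^ : Start of a line
( : Start capture group #1
(?:[^|\n]*\|){3} : Match zero or more characters that are neither | nor \n, followed by a |, and repeat this exactly 3 times
) : End capture group
(?![\r\n]) : Negative lookahead to assert that we don't have \r or \n ahead of the current position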
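To try the find/replace above outside an editor, here is a minimal Python sketch. It is only an illustration under a few assumptions: the function names `split_merged_records` and `lines_with_wrong_pipe_count` are made up for this example, the `newline` default of CRLF is taken from the question's statement that lines end in CRLF (the replacement above inserts a bare \n), and the substitution is repeated until nothing changes because a single pass only splits off the first record of a line. Python's `re` writes the backreference as `\1` rather than `$1`, so the sketch uses a function replacement instead.

```python
import re

# The answer's pattern; re.MULTILINE makes ^ match at the start of every line.
PATTERN = re.compile(r"^((?:[^|\n]*\|){3})(?![\r\n])", re.MULTILINE)


def split_merged_records(text, newline="\r\n"):
    """Insert `newline` after the third pipe of any line that keeps going."""
    previous = None
    # Repeat until stable: one pass only splits off the first record of a
    # line, so a line holding three or more merged records needs more passes.
    while previous != text:
        previous = text
        text = PATTERN.sub(lambda m: m.group(1) + newline, text)
    return text


def lines_with_wrong_pipe_count(text, expected=3):
    """Return the non-empty lines whose pipe count differs from `expected`."""
    return [ln for ln in text.splitlines() if ln and ln.count("|") != expected]


# Sample data from the question, with CRLF line endings.
sample = (
    "John Doe|1hgds234|Some comment|\r\n"
    "Mary Jane|5df678|This column is the end of this record|"
    "Harry Jones|3456|Harry's record should be on the next line|\r\n"
    "Sue Anderson|037dsf533|Another comment|\r\n"
)

fixed = split_merged_records(sample)
print(lines_with_wrong_pipe_count(sample))  # the merged Mary Jane / Harry Jones line
print(lines_with_wrong_pipe_count(fixed))   # []
```

Note that the pattern only repairs lines with too many pipes. A line with fewer than three pipes is left untouched, so the counting helper is still useful for spotting those.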