Update: Please keep in mind that regex is my only option.
Update 2: Actually, I can use a bash based solution as well.
I'm trying to replace the pipes (there can be more than one) that are between double quotes with commas, using a Perl regex.
Example
continuer|"First, Name"|123|12412|10/21/2020|"3|7"||Yes|No|No|
Expected output (3 and 7 are separated by a comma)
continuer|"First, Name"|123|12412|10/21/2020|"3,7"||Yes|No|No|
There may be more digits; it may not be just the two in \d\|\d. It could be "3|7|2", and the correct output has to be "3,7,2" for that one. I've tried the following
cat <filename> | perl -pi -e 's/"\d+\|[\|\d]+/\d+,[\|\d]+/g'
but it just puts in the literal string d+ etc.
I'd really appreciate your help. ty
If it must be a regex, here is a simpler one
perl -wpe's/("[^"]+")/ $1 =~ s{\|}{,}gr /eg' file
Not bullet-proof but it should work for the shown use case.†
Explanation. With the /e modifier the replacement side is evaluated as code. There, a regex runs on $1 under /r so that the original ($1) is unchanged; the $N variables are read-only, so we can't change $1 and thus couldn't run a "normal" s/// on it. With the /r modifier the changed string is returned, or the original if there were no changes. Just as ordered.
Once it's tested well enough, add -i to change the input file "in-place" if wanted.
I must add, I see no reason that at least this part of the job can't be done using a CSV parser...
Thanks to ikegami for an improved version
perl -wpe's/"[^"]+"/ $& =~ tr{|}{,}r /eg' file
It's simpler, with no need to capture, and tr is faster.
† Tested with strings like in the question, extended only as far as this
con|"F, N"|12|10/21|"3|7"||Yes|"2||4|12"|"a|b"|No|""|end|