How can I remove string data between double quotes in a pipe-delimited file? I would like to maintain the file structure; the string data could be replaced with "NULL" or a space. The file can have multiple rows with 150+ columns of alphanumeric data between the double quotes.
Sample Data (one row):"900085322P"|"Fred Flinstone"|"GUMBY"|" "|"83562315"|"00010101"|"M"|"FRED"|"9999999999"|"Address1"|" "|"city"|"state"|"zip"|"US"|"street"|" "|"city"|"State"|"Zipcode"|""|""|""|""|""|""|""|""|""|""|""|""|""|""|""|""|""|""|""|""|""|""|""|""|""|""|""|""|""|""|""|""|""|""|""|""|""|""|""|""|""|""|""|""|""|""|""|""|""|""|""|""|""|""|""|""|""|""|""|""|""|""|""|""|""|""|""|""|""|""|""|""|""|""|""|""|""|""|""|""|""|""|""|""|""|""|""|""|""|""|""|""|""|""|""|""|""|""|""|""|""|"021"|"Y"|"Y"|"FILLER"|""|""|""|""|""|""|""|"1538637137"|"003"|"20200601"|"20200630"|" "|"Address1"|" "|"City"|"State"|"Zip"|""
Expected output:
""|""|""|""|""|""|""|""|""|""|""|""|""|""|""|""|""|""|""|""|""|""|""|""|""|""|""|""|""|""|""|""|""|""|""|""|""|""|""|""|""|""|""|""|""|""|""|""|""|""|""|""|""|""|""|""|""|""|""|""|""|""|""|""|""|""|""|""|""|""|""|""|""|""|""|""|""|""|""|""|""|""|""|""|""|""|""|""|""|""|""|""|""|""|""|""|""|""|""|""|""|"",etc...
Looking at your data, there are no escaped double quotes, so you might use a pattern that matches the text between the double quotes but does not match a pipe char:
"\K[^"|]+(?=")
Explanation
"         Match a double quote
\K        Forget what is matched so far
[^"|]+    Match 1+ chars except a double quote or pipe
(?=")     Assert a double quote to the right
Replacing each match with an empty string (or with NULL or a space) leaves the quotes and pipes in place.
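As a rough illustration of applying the pattern in a script, here is a minimal Python sketch. Python's built-in re module does not support \K, so this version swaps in a lookbehind, (?<="), which should behave the same way on this data. The function name blank_quoted_fields and the shortened sample row are made up for the example.

```python
import re

# Python's re has no \K, so a lookbehind (?<=") stands in for the leading "\K.
# [^"|]+ matches 1+ chars that are neither a double quote nor a pipe.
QUOTED_VALUE = re.compile(r'(?<=")[^"|]+(?=")')

def blank_quoted_fields(line: str, replacement: str = "") -> str:
    """Replace the text inside each non-empty quoted field.

    The quotes and the pipe delimiters are left untouched, so the
    column count of the row does not change.
    """
    # A function is used as the replacement so the text is inserted literally.
    return QUOTED_VALUE.sub(lambda m: replacement, line)

# Shortened, hypothetical version of the sample row.
row = '"900085322P"|"Fred Flinstone"|"GUMBY"|" "|""|"M"'
print(blank_quoted_fields(row))          # ""|""|""|""|""|""
print(blank_quoted_fields(row, "NULL"))  # "NULL"|"NULL"|"NULL"|"NULL"|""|"NULL"
```

Because the quantifier is +, fields that are already empty ("") are not matched, so they stay empty even when a replacement such as NULL is used.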
If there can be escaped double quotes:
"\K[^"|\\]*(?:\\.[^"|\\]*)*(?=")
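The escaped-quote variant can be sketched the same way in Python, again assuming a lookbehind in place of \K because the built-in re module lacks it. The function name blank_escaped_fields and the test row are hypothetical.

```python
import re

# Lookbehind (?<=") replaces the leading "\K, which Python's re does not support.
# [^"|\\]*   : any run of chars other than quote, pipe, or backslash
# (?:\\.[^"|\\]*)* : a backslash-escaped char followed by another such run, repeated
ESCAPED_VALUE = re.compile(r'(?<=")[^"|\\]*(?:\\.[^"|\\]*)*(?=")')

def blank_escaped_fields(line: str, replacement: str = "") -> str:
    """Replace the content of each quoted field, allowing \\" inside a field."""
    # A function is used as the replacement so the text is inserted literally.
    return ESCAPED_VALUE.sub(lambda m: replacement, line)

row = r'"a\"b"|"c"|""'
print(blank_escaped_fields(row))  # ""|""|""
```

Since this pattern uses * rather than +, it also matches the empty position inside "", so a non-empty replacement such as NULL would be written into empty fields as well.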