I have a pipe delimited file where I want to remove all text before a comma in field 9.
Example line:
www.upstate.edu|upadhyap|Prashant K Upadhyaya, MD||General Surgery|http://www.upstate.edu/hospital/providers/doctors/?docID=upadhyap|Patricia J. Numann Center for Breast, Endocrine & Plastic Surgery|Upstate Specialty Services at Harrison Center|Suite D, 550 Harrison Street||Syracuse|NY|13202|
so the targeted field is: |Suite D, 550 Harrison Street|
and I want it to look like: |550 Harrison Street|
So far what I have tried has either deleted information from other fields (usually the name in field 3) or has had no effect.
The .awk script I have been trying to write looks like this:
mv $1 $1.bak4
cat $1.bak4 | awk -F "|" '{
gsub(/*,/,"", $9);
print $0
}' > $1
The pattern argument to gsub is a regex, not a glob, so your * isn't matching what you expect it to. You want /.*,/ there. You are also going to need to set OFS to | to keep that delimiter: assigning to $9 makes awk rebuild $0 using OFS, which defaults to a single space.
mv $1 $1.bak4
awk 'BEGIN{ FS = OFS = "|" }{ gsub(/.*,/,"",$9) } 1' $1.bak4 > $1
I also replaced the verbose print line you had with a true pattern (1) that uses the fact that the default action is print.
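One thing to watch: /.*,/ stops at the comma, so the space after it stays and field 9 comes out as " 550 Harrison Street" rather than "550 Harrison Street". If you want the space gone too, something like the sketch below might work. It assumes the separator is always a comma followed by zero or more spaces, and that you want everything up to the last comma in the field removed. The function name strip_field9 and the shortened sample line are made up for illustration.

```shell
# Hypothetical helper: reads pipe-delimited lines on stdin and removes
# everything up to and including the last comma (plus any spaces that
# follow it) from field 9. Other fields are left untouched.
strip_field9() {
    awk 'BEGIN { FS = OFS = "|" }      # keep | as both input and output delimiter
         { sub(/.*, */, "", $9) }      # one substitution is enough; .* is greedy
         1'                            # true pattern, default action is print
}

# Shortened sample line; field 3 also contains a comma and must survive.
printf '%s\n' 'a|b|c, d|e|f|g|h|i|Suite D, 550 Harrison Street|j' | strip_field9
```

sub is used instead of gsub because the greedy .* already consumes everything up to the last comma in one match, so there is nothing left for a second substitution to do.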