
Delete text before comma in a delimited field


I have a pipe-delimited file where I want to remove all text up to and including the comma in field 9.

Example line:

www.upstate.edu|upadhyap|Prashant K Upadhyaya, MD||General Surgery|http://www.upstate.edu/hospital/providers/doctors/?docID=upadhyap|Patricia J. Numann Center for Breast, Endocrine & Plastic Surgery|Upstate Specialty Services at Harrison Center|Suite D, 550 Harrison Street||Syracuse|NY|13202|

so the targeted field is: |Suite D, 550 Harrison Street|

and I want it to look like: |550 Harrison Street|

So far what I have tried has either deleted information from other fields (usually the name in field 3) or has had no effect.

The .awk script I have been trying to write looks like this:

mv $1 $1.bak4 
cat $1.bak4 | awk -F "|" '{
    gsub(/*,/,"", $9);
    print $0
}'  > $1

Solution

  • The pattern argument to gsub is a regex, not a glob, so your * isn't matching what you expect; you want /.*,/ there. Assigning to a field also makes awk rebuild the record using OFS, which defaults to a single space, so you need to set OFS to | to keep that delimiter.

    mv $1 $1.bak4 
    awk 'BEGIN{ FS = OFS = "|" }{ gsub(/.*,/,"",$9) } 1' $1.bak4 > $1
    

    I also replaced the verbose print line you had with an always-true pattern (1), which relies on the default action being print.
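
One detail worth checking: /.*,/ stops at the comma, so the space that follows it stays in the field and the result is "| 550 Harrison Street|" rather than "|550 Harrison Street|". A minimal sketch of a variant that also consumes the spaces after the comma is below. The function name strip_field9_prefix and the shortened sample line are made up for illustration, and because .* is greedy it assumes you want everything through the last comma in field 9 removed.

```shell
# Hypothetical helper (name is illustrative): read pipe-delimited records on
# stdin and drop everything through the last comma, plus any spaces after it,
# in field 9.
strip_field9_prefix() {
    awk 'BEGIN { FS = OFS = "|" }   # split on | and rebuild the record with |
         { sub(/.*, */, "", $9) }   # greedy match through the last comma and spaces
         1'                         # always-true pattern, default action prints
}

# Shortened sample record with the same shape as the line in the question.
printf '%s\n' 'a|b|Name, MD||e|f|g|h|Suite D, 550 Harrison Street||Syracuse|NY|13202|' |
    strip_field9_prefix
```

Field 3 keeps its own comma ("Name, MD") because only $9 is passed to sub. A single sub is enough here, since the greedy pattern can match at most once per field.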