bash pipe cut io-redirection

How do you process multiple fields running through a pipe?


Suppose I have a tab-delimited data file, input.dat, with a format like:

#id  acct    name   city          age
 12  100290  Sally  San Francisco 24
 15  102911  Jerry  Sacramento    40
 99  102134  Amir   Eureka        82

Can I use cut(1) or something similar to run a different processing function (e.g. lookup_id, scrub_acct, scrub_name, lookup_city, scrub_age) on each field as the data runs through a pipe?

It's easy to do this with one field:

cat input.dat | cut -f1 | lookup_id > output.dat

but I'm wondering if there's a way to do this for every field and have the results redirected to output.dat, so that it ends up looking like:

#id  acct    name   city          age
 AA  XXXXX0  SXXXX  city-57       20s
 AC  XXXXX1  JXXXX  city-29       40s
 AF  XXXXX4  AXXXX  city-100      80s

Maybe the prior question is: can this be done simply at all?

I'm also considering using paste(1) to glue the columns back together, but maybe there's a better way.


Solution

  • Row-and-column data is usually easier to process in awk, but because shell functions are involved here, it is simpler to handle this in the shell itself.

    Assuming lookup_id, scrub_acct, scrub_name, lookup_city and scrub_age are shell functions or scripts that read their input from stdin, you can put them in an array and call them in turn while looping over each record of the input file:

    # example shell functions; each reads one field value from stdin
    # (IFS= and -r keep whitespace and backslashes in the value intact)
    lookup_id() { IFS= read -r str; printf "lookup_id: %s\n" "$str"; }
    scrub_acct() { IFS= read -r str; printf "scrub_acct: %s\n" "$str"; }
    scrub_name() { IFS= read -r str; printf "scrub_name: %s\n" "$str"; }
    lookup_city() { IFS= read -r str; printf "lookup_city: %s\n" "$str"; }
    scrub_age() { IFS= read -r str; printf "scrub_age: %s\n" "$str"; }

    # array of functions or scripts to be invoked, in column order
    fnarr=(lookup_id scrub_acct scrub_name lookup_city scrub_age)

    # main processing: split each record on tabs into the array ary
    while IFS=$'\t' read -ra ary; do
       for ((i=0; i<${#ary[@]}; i++)); do
          # call the function for this column, feeding it the field value
          "${fnarr[i]}" <<< "${ary[i]}"
       done
       echo '============================='
    done < <(tail -n +2 input.dat)   # tail -n +2 skips the header line
    

    Output:

    lookup_id: 12
    scrub_acct: 100290
    scrub_name: Sally
    lookup_city: San Francisco
    scrub_age: 24
    =============================
    lookup_id: 15
    scrub_acct: 102911
    scrub_name: Jerry
    lookup_city: Sacramento
    scrub_age: 40
    =============================
    lookup_id: 99
    scrub_acct: 102134
    scrub_name: Amir
    lookup_city: Eureka
    scrub_age: 82
    =============================
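
The loop above prints one result per line. To get the tab-separated rows in output.dat that the question asks for, the same loop could collect each function's result and join them again. What follows is only a rough sketch: the function bodies are hypothetical stand-ins (the real lookup_id, scrub_acct, and so on are not shown in the question), it assumes each function prints exactly one line, and it assumes no field is empty, since read collapses consecutive tabs.

```bash
#!/usr/bin/env bash
# Sample input, written here only so the sketch is self-contained.
printf '#id\tacct\tname\tcity\tage\n12\t100290\tSally\tSan Francisco\t24\n15\t102911\tJerry\tSacramento\t40\n' > input.dat

# Hypothetical stand-ins for the real filters; each reads one value on stdin.
lookup_id()   { IFS= read -r s; printf 'id-%s\n' "$s"; }
scrub_acct()  { IFS= read -r s; printf 'XXXXX%s\n' "${s: -1}"; }   # keep last digit
scrub_name()  { IFS= read -r s; printf '%sXXXX\n' "${s:0:1}"; }    # keep first letter
lookup_city() { IFS= read -r s; printf 'city-%s\n' "${#s}"; }      # fake lookup: name length
scrub_age()   { IFS= read -r s; printf '%ss\n' "$(( s / 10 * 10 ))"; }

# One function per column, in column order.
fnarr=(lookup_id scrub_acct scrub_name lookup_city scrub_age)

{
  # Pass the header line through unchanged.
  head -n 1 input.dat
  while IFS=$'\t' read -ra ary; do
    out=()
    for ((i = 0; i < ${#ary[@]}; i++)); do
      # Command substitution captures the one-line result of each function.
      out+=("$("${fnarr[i]}" <<< "${ary[i]}")")
    done
    # "${out[*]}" joins the elements with the first character of IFS (a tab here);
    # the subshell keeps the IFS change local.
    (IFS=$'\t'; printf '%s\n' "${out[*]}")
  done < <(tail -n +2 input.dat)
} > output.dat
```

With these stand-in functions, the first data row of output.dat is id-12, XXXXX0, SXXXX, city-13, 20s, separated by tabs. Each field costs one subshell, so this is fine for small files but slow for large ones.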