Search code examples
awktransposeline-breaks

awk transpose specific colums with added text and done split on nth column since given text after


i have this input file

input: file1

LIB SITE LINUX A FEDORA 1a 22 X64 DC58 ROOM DC LOC 2.b
LIB SITE LINUX A FEDORA 1b 22 X64 DC58 ROOM DC LOC 2.b
LIB SITE OPENV A OPENV 1a 22 X64 DC22 ROOM DC LOC 2.b
LIB SITE OPENV A OPENV 1b 22 X64 DC22 ROOM DC LOC 2.b
LIB SITE OPENV A OPENV 1c 22 X64 DC22 ROOM DC LOC 2.b
LIB SITE WIN9K A WIN9K 1a 22 X64 DC22 ROOM DC LOC 2.b
LIB SITE WIN9K A WIN9K 1b 22 X64 DC22 ROOM DC LOC 2.b
LIB SITE WIN9K A WIN9K 1c 22 X64 DC22 ROOM DC LOC 2.b
LIB SITE WIN9K A WIN9K 1d 22 X64 DC22 ROOM DC LOC 2.b
LIB SITE SUSE B SUSE 2a 22 X64 DC10 ROOM DE LOC 3.ma
LIB SITE SUSE B SUSE 2b 22 X64 DC10 ROOM DE LOC 3.ma
LIB SITE SUSE B SUSE 3c 22 X64 DC10 ROOM DE LOC 3.ma
LIB SITE SUSE B SUSE 4a 22 X64 DC10 ROOM DE LOC 3.ma
LIB SITE LINUX A CENTOS 8b 22 X64 DC41 ROOM DF LOC 5.p
LIB SITE LINUX A CENTOS 9a 22 X64 DC41 ROOM DF LOC 5.p
LIB SITE LINUX A CENTOS 3c 22 X64 DC41 ROOM DF LOC 5.p
LIB SITE LINUX A CENTOS 8a 22 X64 DC41 ROOM DF LOC 5.p
LIB SITE OPENV A OPEN1 7 22 X64 DC41 ROOM DF LOC 5.p
LIB SITE OPENV A OPEN1 7de 22 X64 DC41 ROOM DF LOC 5.p
LIB SITE OPENV A OPEN1 7e 22 X64 DC41 ROOM DF LOC 5.p
LIB SITE OPENV A OPEN1 7r 22 X64 DC41 ROOM DF LOC 5.p
LIB SITE OPENV A OPEN1 7t 22 X64 DC41 ROOM DF LOC 5.p
LIB SITE OPENV A OPEN1 74 22 X64 DC41 ROOM DF LOC 5.p
LIB SITE OPENV A OPEN1 71 22 X64 DC41 ROOM DF LOC 5.p
LIB SITE OPENV A OPEN1 73 22 X64 DC41 ROOM DF LOC 5.p
LIB SITE OPENV A OPEN1 744 22 X64 DC41 ROOM DF LOC 5.p
LIB SITE OPENV A OPEN1 77 22 X64 DC41 ROOM DF LOC 5.p
LIB SITE OPENV A OPEN1 75 22 X64 DC41 ROOM DF LOC 5.p
LIB SITE OPENV A OPEN1 724 22 X64 DC41 ROOM DF LOC 5.p
LIB SITE OPENV A OPEN1 75 22 X64 DC41 ROOM DF LOC 5.p
LIB SITE OPENV A OPEN1 71 22 X64 DC41 ROOM DF LOC 5.p
LIB SITE OPENV A OPEN1 72 22 X64 DC41 ROOM DF LOC 5.p

i have this:

awk  -v OFS=" " 'NR > 1 {d[$3]=$1 OFS $4; c[$3] = c[$3] OFS $6} END {for (p in d) print d[p] OFS parent c[p]}' file1

will output to this:

LIB A WIN9K 1a 1b 1c 1d
LIB A OPENV 1a 1b 1c 7 7de 7e 7r 7t 74 71 73 744 77 75 724 75 71 72
LIB B SUSE 2a 2b 3c 4a
LIB A LINUX 1b 8b 9a 3c 8a

i tried severeal piped commands to put into below format where is on any row as well since id> after each ten columns done is done break/split to shorter version of previous same part e.g "LIB A OPENV" with added text.

expected output:

place library>LIB site>A plat>WIN9K id>1a 1b 1c 1d
place library>LIB site>A plat>OPENV id>1a 1b 1c 7 7de 7e 7r 7t 74 71 
place library>LIB site>A plat>OPENV id>73 744 77 75 724 75 71 72
place library>LIB site>B plat>SUSE id>2a 2b 3c 4a
place library>LIB site>A plat>LINUX id>1b 8b 9a 3c 8a

Solution

  • awk '
      NR > 1{
        d[$3]="place library>" $1 " site>" $4 " plat>" $3 " id>"
        if ($3 in c) c[$3]=c[$3] "," $6  # append value with comma
        else c[$3]=$6                    # set initial value
      } 
      END{
        nl=""
        for (p in d){
          n=split(c[p], a, ",")          # split value into array a
          for (i=1;i<=n;i++){
            printf (i%10==1 ? nl d[p] : OFS) a[i]
            nl=ORS                       # add newline
          }
        }
        print ""
      }
    ' file1