I am trying to extract column "m" from multiple txt files (file1.txt, file2.txt, etc.) and transpose each column to a row in a new file.
Below is file1.txt:
contig_1 contig_1 geneX ctg1_886;ctg1_887;ctg1_888
contig_2 contig_2 geneY ctg1_886;ctg1_887;ctg1_888
contig_3 contig_3 geneZ ctg1_886;ctg1_887;ctg1_888
I would like to have a summary.txt file which looks like:
file1 geneX geneY geneZ
file2 geneA geneY
.
.
.
etc.
The total number of rows may vary between files. I tried using awk without success.
Following glenn jackman's advice from the comments, a GNU AWK solution would look like this:
awk 'BEGIN {ORS=" "} BEGINFILE{print FILENAME} {print $3} ENDFILE{ printf("\n")}' file*.txt
And a plain awk solution could look like this (sorry, I only have GNU awk for testing):
awk 'BEGIN {ORS=" "} FNR==1 {printf("%s%s ", (NR>1 ? "\n" : ""), FILENAME)} {print $3} END {printf("\n")}' file*.txt
Explanation
There are several special patterns:
BEGIN: its action is executed once at the beginning. Here the ORS (output record separator) is set to a space, so each original row becomes a new column in the output; this is the transpose step.
END: its action is executed once at the end.
BEGINFILE and ENDFILE: their actions are executed once at the beginning and once at the end of the processing of each file. Here BEGINFILE prints the FILENAME and ENDFILE prints a linefeed. These two patterns are GNU awk extensions.
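The commands above print the full file name (file1.txt) and leave a trailing space on each line, while the question asks for lines starting with "file1". The following is a minimal sketch of one way to get closer to the requested summary.txt using only plain awk features. The sample data, the second file's contents, and the temporary directory are assumptions made up for illustration; only the third-column layout comes from the question.

```shell
#!/bin/sh
# Hypothetical sample data, modelled on the file1.txt shown in the question.
tmpdir=$(mktemp -d)
cd "$tmpdir" || exit 1

printf '%s\n' \
  'contig_1 contig_1 geneX ctg1_886;ctg1_887;ctg1_888' \
  'contig_2 contig_2 geneY ctg1_886;ctg1_887;ctg1_888' \
  'contig_3 contig_3 geneZ ctg1_886;ctg1_887;ctg1_888' > file1.txt
printf '%s\n' \
  'contig_1 contig_1 geneA ctg2_001' \
  'contig_2 contig_2 geneY ctg2_002' > file2.txt

# FNR == 1 is true for the first record of each input file: end the previous
# output line (if there is one) and start a new line with the file name,
# minus its ".txt" suffix. Every record then appends its third field,
# preceded by a single space, so no trailing space is written.
awk '
  FNR == 1 {
    if (NR > 1) printf "\n"
    name = FILENAME
    sub(/\.txt$/, "", name)
    printf "%s", name
  }
  { printf " %s", $3 }
  END { if (NR > 0) printf "\n" }
' file*.txt > summary.txt

cat summary.txt
```

Because the separator is written before each field rather than after it via ORS, the lines end cleanly. Note that an empty input file never triggers FNR == 1, so it would not get a row of its own in this sketch.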