I want to add a column with a constant value at the end of each line of a file in bash, whilst selecting columns, doing a mathematical operation, and changing the field separator (from what I think is just tab) to space.
My input file:
10:100968448:T:AA 0.3519 10 100968448 t aa 1.0024 0.01 0.812
10:101574552:A:ATG 0.4493 10 101574552 a atg 0.98906 0.0097 0.2585
10:102244152:A:AG 0.2008 10 102244152 a ag 0.996705 0.0114 0.7701
10:102290698:A:AG 0.1899 10 102290698 a ag 0.993024 0.0114 0.5431
10:104999458:T:TG 0.3449 10 104999458 t tg 0.956763 0.0101 1.149e-05
If I throw the constant at the second to last column:
awk -v OFS=" " 'BEGIN { FS = "\t" } ; {print $1, $5, $6, log($7)/log(10), '105318', $9}' input
It works:
10:100968448:T:AA t aa 0.00104106 105318 0.812
10:101574552:A:ATG a atg -0.00477736 105318 0.2585
10:102244152:A:AG a ag -0.00143336 105318 0.7701
10:102290698:A:AG a ag -0.00304026 105318 0.5431
10:104999458:T:TG t tg -0.0191956 105318 1.149e-05
But when I try putting the constant at the end of each line, as I need it:
awk -v OFS=" " 'BEGIN { FS = "\t" } ; {print $1, $5, $6, log($7)/log(10), $9, '105318'}' input
It doesn't really work (it's adding the constant to the first field):
10531868448:T:AA t aa 0.00104106 0.812
10531874552:A:ATG a atg -0.00477736 0.2585
10531844152:A:AG a ag -0.00143336 0.7701
10531890698:A:AG a ag -0.00304026 0.5431
10531899458:T:TG t tg -0.0191956 1.149e-05
I even tried using the file where it works and shuffling the columns, and the constant ends up somewhere seemingly random... I have run dos2unix on this file, thinking there might be some weird character in it, but the problem remains the same. When I use a comma as the output field separator, I see that multiple commas are generated at the end of each line (when I try to include the constant as the last column).
For clarification, desired output:
10:100968448:T:AA t aa 0.00104106 0.812 105318
10:101574552:A:ATG a atg -0.00477736 0.2585 105318
10:102244152:A:AG a ag -0.00143336 0.7701 105318
10:102290698:A:AG a ag -0.00304026 0.5431 105318
10:104999458:T:TG t tg -0.0191956 1.149e-05 105318
Any ideas?
Could you please try the following?
awk '{print $1,$5,$6,log($7)/log(10),$NF,105318}' Input_file
If your file has Control-M (carriage return) characters, as Kamil's answer suggests, run the following instead.
awk '{gsub(/\r/,"");print $1,$5,$6,log($7)/log(10),$NF,105318}' Input_file
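To check whether carriage returns are really the cause, here is a minimal sketch, assuming the input is tab-separated with CRLF line endings (the temporary file and the variable names cr_lines and fixed are only for illustration, they are not from your setup). It builds one sample line ending in \r\n, counts the lines that end in a carriage return, and then strips that character before printing. A trailing \r in the last field sends the terminal cursor back to the start of the line, which would explain why the constant seemed to land on top of the first field.

```shell
# Hypothetical one-line sample with a CRLF ending, written to a temp file.
tmp=$(mktemp)
printf '10:100968448:T:AA\t0.3519\t10\t100968448\tt\taa\t1.0024\t0.01\t0.812\r\n' > "$tmp"

# Count the lines that end in a carriage return; non-zero means CRLF endings.
cr_lines=$(awk '/\r$/ { n++ } END { print n + 0 }' "$tmp")

# Strip the trailing carriage return first (this re-splits the fields),
# then print the selected columns with the constant as the last one.
fixed=$(awk -v OFS=' ' 'BEGIN { FS = "\t" }
    { sub(/\r$/, ""); print $1, $5, $6, log($7)/log(10), $9, 105318 }' "$tmp")

echo "lines ending in CR: $cr_lines"
echo "$fixed"
rm -f "$tmp"
```

If the count is non-zero on your real file, the carriage returns are still there and need to be removed (as in the second command above) before the last column will print where you expect.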