Which sed/awk command can I use to cut the last number from each string in column 1
(dropping the _ delimiter before it) and move that number to a new column (column 3)?
For example,
$ head -3 test1.bed
HLA:HLA00001_A*01:01:01:01_3503 1
HLA:HLA02169_A*01:01:01:02N_3291 1
HLA:HLA14798_A*01:01:01:03_2903 1
Should become:
$ head -3 test1.bed
HLA:HLA00001_A*01:01:01:01 1 3503
HLA:HLA02169_A*01:01:01:02N 1 3291
HLA:HLA14798_A*01:01:01:03 1 2903
$ sed -E 's/(.*)_([0-9]+)(.*)/\1\3 \2/' file
HLA:HLA00001_A*01:01:01:01 1 3503
HLA:HLA02169_A*01:01:01:02N 1 3291
HLA:HLA14798_A*01:01:01:03 1 2903
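This works because the leading .* is greedy, so _([0-9]+) matches the last underscore followed by digits on each line, and the final (.*) picks up the rest of the line (here " 1"). The -E option (Extended Regexps) is supported by OSX sed and newer GNU seds. With any sed: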
$ sed 's/\(.*\)_\([0-9]*\)\(.*\)/\1\3 \2/' file
HLA:HLA00001_A*01:01:01:01 1 3503
HLA:HLA02169_A*01:01:01:02N 1 3291
HLA:HLA14798_A*01:01:01:03 1 2903
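Since the question also asks about awk, here is a minimal sketch of an awk equivalent. It assumes the input is whitespace-separated with exactly two columns, as in the sample, and that the number to move is always at the very end of column 1. The temp file and the variable names (input, result, pos, num) are only there to make the example self-contained; in practice you would run the awk program directly on test1.bed.

```shell
# Sample input taken from the question, written to a temp file for the demo.
input=$(mktemp)
printf '%s\n' \
  'HLA:HLA00001_A*01:01:01:01_3503 1' \
  'HLA:HLA02169_A*01:01:01:02N_3291 1' \
  'HLA:HLA14798_A*01:01:01:03_2903 1' > "$input"

result=$(awk '{
    # Find a trailing underscore followed by digits at the end of column 1.
    pos = match($1, /_[0-9]+$/)
    if (pos) {
        num = substr($1, pos + 1)       # the digits after the underscore
        $1  = substr($1, 1, pos - 1)    # column 1 without the _digits suffix
        print $1, $2, num               # number becomes column 3
    } else {
        print                           # no trailing number: leave the line as is
    }
}' "$input")

printf '%s\n' "$result"
rm -f "$input"
```

Unlike the sed versions, this only looks at column 1, so an underscore followed by digits in a later column would not be touched. Lines without a trailing number in column 1 are printed unchanged.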