Move the last number of column 1 to a new column


Which sed or awk command can I use to cut the trailing number from each string in column 1 (dropping the _ delimiter in front of it) and move it to a new column (column 3)?

For example,

$ head -3 test1.bed
HLA:HLA00001_A*01:01:01:01_3503  1
HLA:HLA02169_A*01:01:01:02N_3291 1
HLA:HLA14798_A*01:01:01:03_2903  1

Should become:

$ head -3 test1.bed
HLA:HLA00001_A*01:01:01:01  1 3503
HLA:HLA02169_A*01:01:01:02N 1 3291
HLA:HLA14798_A*01:01:01:03  1 2903

Solution

  • $ sed -E 's/(.*)_([0-9]+)(.*)/\1\3 \2/' file
    HLA:HLA00001_A*01:01:01:01  1 3503
    HLA:HLA02169_A*01:01:01:02N 1 3291
    HLA:HLA14798_A*01:01:01:03  1 2903
    

    The above works with OSX (BSD) sed and with newer GNU seds, where -E enables extended regular expressions (EREs). With any sed, use the basic regular expression (BRE) form instead:

    $ sed 's/\(.*\)_\([0-9][0-9]*\)\(.*\)/\1\3 \2/' file
    HLA:HLA00001_A*01:01:01:01  1 3503
    HLA:HLA02169_A*01:01:01:02N 1 3291
    HLA:HLA14798_A*01:01:01:03  1 2903
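
Since the question also asks about awk, here is a minimal awk sketch of the same idea. It is not part of the answer above: the helper name move_last_number is hypothetical, and it assumes the number to move is always the trailing _digits of column 1. Note that assigning to a field makes awk rebuild the line with single spaces, so the alignment padding in the sed output is not preserved.

```shell
# Hypothetical helper: wraps the awk program so it can read a file or stdin.
move_last_number() {
    awk 'match($1, /_[0-9]+$/) {
        num = substr($1, RSTART + 1)       # digits after the last underscore
        $1  = substr($1, 1, RSTART - 1)    # column 1 without the "_digits" suffix
        $(NF + 1) = num                    # append the digits as a new last column
    } 1' "$@"                              # "1" prints every line, changed or not
}

# Sample input taken from the question.
printf '%s\n' \
    'HLA:HLA00001_A*01:01:01:01_3503  1' \
    'HLA:HLA02169_A*01:01:01:02N_3291 1' \
    'HLA:HLA14798_A*01:01:01:03_2903  1' | move_last_number
```

Lines whose first column does not end in _digits are printed unchanged, because the match() condition fails and only the trailing 1 pattern applies.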