sed awk cut

Separating a nested field into two new fields, maintaining order


I'm trying to split the third column of a file like the sample below into two fields, while keeping the order of the lines and of the other fields.

    100 400 500.00APPLE 5.8 9.2
    200 300 600.00DOG 5.3 9.1
    300 763 454.44KITTEN 5.7 9.2

The result should be:

    100 400 500.00 APPLE 5.8 9.2
    200 300 600.00 DOG 5.3 9.1
    300 763 454.44 KITTEN 5.7 9.2

I've toyed with doing this in awk but keep running into issues.

PS: The split point is always a digit ([0-9]) immediately followed by a letter ([a-zA-Z]).


Solution

  • Try:

    sed 's/\([0-9]\)\([a-zA-Z]\)/\1 \2/' ./infile


    Proof of Concept

    $ sed 's/\([0-9]\)\([a-zA-Z]\)/\1 \2/' ./infile
    100 400 500.00 APPLE 5.8 9.2
    200 300 600.00 DOG 5.3 9.1
    300 763 454.44 KITTEN 5.7 9.2
    

    Or, if you have gawk, you can limit the split to just the 3rd field with gensub (a gawk extension); the third argument, 1, replaces only the first match:

    awk '{$3=gensub(/([0-9])([a-zA-Z])/,"\\1 \\2",1,$3)}1' ./infile


    Proof of Concept

    $ awk '{$3=gensub(/([0-9])([a-zA-Z])/,"\\1 \\2",1,$3)}1' ./infile
    100 400 500.00 APPLE 5.8 9.2
    200 300 600.00 DOG 5.3 9.1
    300 763 454.44 KITTEN 5.7 9.2
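
If gawk is not available, the same field-limited split can probably be done with the POSIX awk functions match() and substr() instead of gensub. This is a minimal sketch, not part of the original answer. It assumes the question's three sample lines; the temporary file and the `result` variable are there only to make it self-contained.

```shell
# Recreate the question's sample data in a temporary file (illustration only).
infile=$(mktemp)
printf '%s\n' \
  '100 400 500.00APPLE 5.8 9.2' \
  '200 300 600.00DOG 5.3 9.1' \
  '300 763 454.44KITTEN 5.7 9.2' > "$infile"

# match() sets RSTART to the position of the first digit in field 3 that is
# directly followed by a letter; the field is rebuilt with a space after it.
result=$(awk '{
    if (match($3, /[0-9][a-zA-Z]/))
        $3 = substr($3, 1, RSTART) " " substr($3, RSTART + 1)
    print
}' "$infile")

printf '%s\n' "$result"
rm -f "$infile"
```

As with the gawk version, assigning to $3 makes awk rebuild the line with single spaces between fields, which is harmless for this input but would collapse any wider spacing.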