For column 2 in my input files, I want to keep only the part after the hyphen. I have tried a cut
command, but I don't know how to apply it to the second column only:
echo TCCCATATGGTCTAGCGGTTAGGATTCCT 1-230823 | cut -d - -f 2
230823
Input:
TCCCATATGGTCTAGCGGTTAGGATTCCT 1-230823
GCATTGGTGGTTCAGTGGTAGAATTCTC 2-172580
Expected output:
TCCCATATGGTCTAGCGGTTAGGATTCCT 230823
GCATTGGTGGTTCAGTGGTAGAATTCTC 172580
You can use the following sed command:
sed -E 's/^([^[:space:]]+[[:blank:]]+)[0-9]+-/\1/' file
See the online sed demo:
s='TCCCATATGGTCTAGCGGTTAGGATTCCT 1-230823
GCATTGGTGGTTCAGTGGTAGAATTCTC 2-172580'
sed -E 's/^([^[:space:]]+[[:blank:]]+)[0-9]+-/\1/' <<< "$s"
# TCCCATATGGTCTAGCGGTTAGGATTCCT 230823
# GCATTGGTGGTTCAGTGGTAGAATTCTC 172580
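The regex uses POSIX ERE syntax (enabled by the -E option) and matches:
^ - start of the line
([^[:space:]]+[[:blank:]]+) - Group 1 (\1 refers to this group's value): one or more non-whitespace chars followed by one or more horizontal whitespace chars
[0-9]+- - one or more digits and a -.
Since the replacement is just \1, the first column and its separator are kept while the digits and hyphen at the start of the second column are removed.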
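To check how the anchoring and the capture group behave on edge cases, here is a minimal sketch that wraps the same sed command in a function. The function name strip_col2_prefix and the short sample lines are made up for illustration; they are not from the question. It assumes a sed that supports -E (GNU and BSD sed both do).

```shell
# Hypothetical helper: the sed command from above, reading from stdin.
strip_col2_prefix() {
  sed -E 's/^([^[:space:]]+[[:blank:]]+)[0-9]+-/\1/'
}

# Three illustrative lines (assumed data, not from the input files):
#   1. a normal line with a "digits-" prefix in column 2
#   2. a line whose first column also contains "digits-"
#   3. a line whose second column has no prefix at all
printf '%s\n' 'TCCC 1-230823' '1-2 3-4' 'GCAT 172580' | strip_col2_prefix
# TCCC 230823
# 1-2 4
# GCAT 172580
```

The second line shows that only the second column is changed: the first column is consumed by Group 1 and put back unchanged. The third line is left as-is because the pattern does not match when no digits and hyphen follow the whitespace.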