bash sed whitespace end-of-line

bash sed processing data with end of line or potentially something else


I have these two types of output:

UID:474D229F698D494E889D85CEF9303B97:480 f
UID:474D229F698D494E889D85CEF9303B97:480

I want to extract the 32-character UID together with the 480 that follows it. (Note that nothing comes after 480 in the second type of input.) Desired output:

474D229F698D494E889D85CEF9303B97:480
474D229F698D494E889D85CEF9303B97:480

I am using sed:

cat input.txt | sed 's!UID:\(.*\):\([0-9]*\)[\s]*!Captured:\1:\2!'

but the output is:

Captured:474D229F698D494E889D85CEF9303B97:480 f
Captured:474D229F698D494E889D85CEF9303B97:480

Solution

  • awk to the rescue?

    $ awk -F"[: ]" '{print $2":"$3}' file
    474D229F698D494E889D85CEF9303B97:480
    474D229F698D494E889D85CEF9303B97:480
    

    Explanation: -F"[: ]" sets the field separator to either a colon or a space. Once the line has been split, we print the 2nd and 3rd fields, joined by a colon.

    A sed approach could be the following:

    $ sed 's/UID:\([^:]*\):\([^ ]*\).*/Captured:\1:\2/g' file
    Captured:474D229F698D494E889D85CEF9303B97:480
    Captured:474D229F698D494E889D85CEF9303B97:480
    

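    Explanation: The text follows the pattern UID:id:number, optionally followed by something else. Hence, we match it with UID:\([^:]*\):\([^ ]*\).*. With \( expression \) we capture the text we need, so that it can be reused later with \1, \2... The trailing .* consumes whatever follows the number, so it is dropped from the output. The original attempt kept " f" because nothing in its pattern matched the rest of the line: inside a bracket expression, [\s] matches a literal backslash or the letter s, not whitespace.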
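
The two approaches above can be sketched as small shell functions, assuming a POSIX-compatible awk and sed. The helper names uid_awk, uid_sed and sample are hypothetical and not part of the original answer. Unlike the sed command above, this version omits the Captured: prefix so that the result matches the desired output in the question. It also skips lines that do not start with UID, which is an added assumption.

```shell
#!/usr/bin/env bash
# Sketch only: uid_awk, uid_sed and sample are hypothetical helper names.

# awk variant: split on a colon or a space, keep fields 2 and 3.
# The $1 == "UID" guard (an assumption) ignores unrelated lines.
uid_awk() {
  awk -F'[: ]' '$1 == "UID" { print $2 ":" $3 }'
}

# sed variant: capture the id and the digits, drop the rest of the line.
# -n plus the p flag prints only lines where the substitution succeeded.
uid_sed() {
  sed -n 's/^UID:\([^:]*\):\([0-9][0-9]*\).*$/\1:\2/p'
}

# Sample input copied from the question, one line per printf argument.
sample() {
  printf '%s\n' \
    'UID:474D229F698D494E889D85CEF9303B97:480 f' \
    'UID:474D229F698D494E889D85CEF9303B97:480'
}

awk_out=$(sample | uid_awk)
sed_out=$(sample | uid_sed)
printf '%s\n' "$awk_out" "$sed_out"
```

Either function reads from standard input, so it could also be used as uid_sed < input.txt without the extra cat.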