bash awk sed mv

Using sed to remove text from a file


I have a log file with lines like the following:

"TSAGE_20160304193254_AAA_29792A_1103.tgz:Binary file (standard input) matches"

I need to remove the first part of each line up to 29792A, and the text after it, so that the file looks like this:

29745gv92A
297342A
2934792A
29755692A
29778892A

Solution

  • You can use cut to extract a field between delimiters such as _. The value you want is the fourth underscore-separated field.
    To redirect the result to another file, use

    cut -d"_" -f4 logfile > otherfile
    

    You can do something like this with sed, but you need to tell sed to skip the pattern [^_]*_ (any character except an underscore, repeated 0 or more times, followed by an underscore). This pattern has to be skipped {3} times, starting from ^, the beginning of the line.
    The next string you match, ([^_]*), is the part you want. The .* matches the rest of the line, which is discarded.
    The pattern contains 2 remembered groups, so use \2 in the replacement to recall the second one.
    With the backslashes that sed's basic regular expressions require, this becomes

    sed 's/^\([^_]*_\)\{3\}\([^_]*\).*/\2/' logfile
    

    I did not test the sed command; cut is the better choice here.
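
Since the sed command above was posted untested, here is a small check that runs both commands on the sample line from the question. It is only a sketch: the variable names are made up for illustration, and it assumes the real log lines do not start with a literal quote character that matters for the result.

```shell
# Sample line copied from the question (surrounding quotes dropped).
line='TSAGE_20160304193254_AAA_29792A_1103.tgz:Binary file (standard input) matches'

# cut: take field 4, with fields separated by underscores.
cut_result=$(printf '%s\n' "$line" | cut -d'_' -f4)

# sed: skip three "field_" groups, keep the fourth field, drop the rest.
sed_result=$(printf '%s\n' "$line" | sed 's/^\([^_]*_\)\{3\}\([^_]*\).*/\2/')

echo "cut: $cut_result"   # prints: cut: 29792A
echo "sed: $sed_result"   # prints: sed: 29792A
```

Both commands give the same result on this line. One difference to keep in mind: a line that contains no underscore at all is printed unchanged by both commands, and cut has the -s option to suppress such lines.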