
Unix console: how to delete lines that are contained in other lines


I have a text file that contains sorted paths, e.g.:

/abc
/abc/def
/abc/jkl
/def
/def/jkl
/def/jkl/yui
/def/xsd
/zde

Now I'd like to delete the lines that are contained in other lines. In this case, the following lines should stay:

/abc/def
/abc/jkl
/def/jkl/yui
/def/xsd
/zde

Solution

  • Using awk and tac (which concatenates files and prints their lines in reverse order):

    $ tac test.txt | awk '{ if (substr(prev, 1, length($0)) != $0) print $0; prev = $0}' | tac
    /abc/def
    /abc/jkl
    /def/jkl/yui
    /def/xsd
    /zde
    

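    Here's a more readable version of the awk script. Because the input is reversed, prev holds the line that follows the current one in the original file: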

    {
        # Print the line unless it is a leading substring of the previously read line
        if (substr(prev, 1, length($0)) != $0)
            print $0;
        prev = $0  # Remember this line for the next comparison
    }
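
The same comparison can also be done in one forward pass, which may help on systems where tac (part of GNU coreutils) is not installed. The following is only a sketch, not part of the original answer: the function name drop_prefix_lines is made up for illustration, and it assumes the input is sorted so that a line and the lines that start with it are adjacent. Like the tac version, it compares plain strings, so /abc would also be dropped in favour of /abcd.

```shell
# Hypothetical helper: reads sorted lines on stdin and prints only the lines
# that are not a leading substring of the line that follows them.
drop_prefix_lines() {
    awk '
        # prev is not a prefix of the current line, so it is safe to print it
        NR > 1 && substr($0, 1, length(prev)) != prev { print prev }
        # remember the current line for the next comparison
        { prev = $0 }
        # the last line has no successor, so it always stays
        END { if (NR > 0) print prev }
    '
}

# Usage with the sample paths from the question
printf '%s\n' /abc /abc/def /abc/jkl /def /def/jkl /def/jkl/yui /def/xsd /zde |
    drop_prefix_lines
```

With a file, the call would be drop_prefix_lines < test.txt. Each line is held back until the next one has been read, which is why the END block is needed to emit the final line.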