
Keep only last line of a repeated pattern


I would like to know whether it is possible to delete all the lines matching a given pattern except the last one. It is not easy to explain, so I will give an example.

I have a text file with content similar to this:

A sent (1)
A received (1)
B sent (1)
B sent (2)
B sent (3)
B received (1)

I would like the output to alternate between "sent" and "received" messages, keeping only the last "sent" line from each run of consecutive "sent" messages with the same letter. So I need an output like:

A sent (1)
A received (1)
B sent (3)
B received (1)

Is there some program that can do something like that? I can use either Ubuntu or Windows, or build a simple C/C++ application, if necessary.


Solution

  • Here's a simple way:

    tac FILE | uniq -w 6 | tac
    

    We:

    1. Reverse-print the file using tac, so that the last line of each run of "sent" messages comes first.
    2. Weed out duplicate lines, comparing only the first 6 characters of each line (-w 6) and thereby ignoring the incrementing number in parentheses. uniq keeps only the first line of each set of adjacent duplicates, which is why we reversed the file first. Six characters are enough here to tell "A sent" from "A rece".
    3. Reverse-print the result again so it's back in the order you want.
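
If tac or uniq -w is not available (both are part of GNU coreutils, and -w is not among the options POSIX specifies for uniq), the same steps can be sketched in a single awk pass. This is only a sketch under assumptions: it treats the first two whitespace-separated fields (the letter and "sent"/"received") as the key instead of the first 6 characters, and keep_last_of_run is a made-up function name used for illustration.

```shell
# Hypothetical helper: print only the last line of each run of adjacent
# lines that share the same first two fields (e.g. "B sent").
# Reads the files given as arguments, or standard input if there are none.
keep_last_of_run() {
  awk '
    { key = $1 " " $2 }                    # letter plus message type
    NR > 1 && key != prev { print last }   # key changed: emit final line of the previous run
    { prev = key; last = $0 }              # remember the current line
    END { if (NR > 0) print last }         # emit the final line of the last run
  ' "$@"
}
```

Calling keep_last_of_run FILE on the sample input from the question should give the same four lines as the tac pipeline. Because the key is built from fields rather than a fixed width, it should also cope with names longer than one letter, as long as they contain no spaces.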