
Print lines that have no duplicates in a file, preserving their original order (Linux)


I have the following file:

2
1
4
3
2
1

I want output like this (only the lines that have no duplicates, keeping their original order):

4
3

I tried sort file.txt | uniq -u. It works, but the output is sorted:

3
4

I tried awk '!x[$0]++' file.txt. It keeps the order, but it only removes the repeats, so every value is still printed once:

2
1
4
3

Solution

  • A couple of ideas to choose from:

    a) read the input file twice (see the note on the FNR==NR idiom below):

    awk '
    FNR==NR         { counts[$0]++; next }  # 1st pass: keep count
    counts[$0] == 1                         # 2nd pass: print rows with count == 1
    ' file.txt file.txt
    

    b) read the input file once:

    awk '
        { lines[NR] = $0                    # maintain ordering of rows
          counts[$0]++
        }
    END { for ( i=1;i<=NR;i++ )             # run thru the indices of the lines[] array and ...
              if ( counts[lines[i]] == 1 )  # if the associated count == 1 then ...
                 print lines[i]             # print the array entry to stdout
        }
    ' file.txt
    

    Both of these generate:

    4
    3
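
    A note on the FNR==NR test in option a) (this explanation and the sketch
    below are additions, not part of the original answer): FNR is the record
    number within the current input file and resets for each file, while NR
    keeps counting across all files. Because file.txt is named twice, FNR==NR
    is true only during the first pass, so the first block builds counts[]
    and the second pass prints the lines whose count is 1. A minimal way to
    watch the two counters diverge:

    awk '{ print FILENAME, "FNR=" FNR, "NR=" NR }' file.txt file.txt

    During the first pass the two numbers match; during the second pass NR
    runs ahead of FNR.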
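
    For completeness, one more sketch that is not from the original answer:
    the same output can be produced with plain coreutils by numbering the
    lines, sorting on the content so uniq -u can drop anything duplicated,
    and then restoring the original order. This assumes GNU coreutils
    behavior for cat, sort, uniq and cut:

    # cat -n      : prefix each line with its original line number
    # sort -k2    : sort on the content so duplicate lines become adjacent
    # uniq -u -f1 : skip the line-number field when comparing; keep only
    #               lines whose content occurs exactly once
    # sort -n     : restore the original order via the line numbers
    # cut -f2-    : strip the line-number prefix again
    cat -n file.txt | sort -k2 | uniq -u -f1 | sort -n | cut -f2-

    On the sample input this also prints 4 then 3, at the cost of a few
    extra processes compared to the awk versions.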