I have a file called "test.txt" which looks like this:
10
10
10
8
10
9
10
10
9
10
8
For some reason, when I ran uniq test.txt, I got this output:
10
8
10
9
10
9
10
8
Why am I getting this output? I am using BSD uniq. Is there some sort of bug in the program?
I'm not an expert, but I'm pretty sure uniq only compares adjacent lines. I don't use it often, but running man uniq on my system gives:
The uniq utility reads the specified input_file comparing adjacent lines, and writes a copy of each unique input line to the output_file. If input_file is a single dash (`-') or absent, the standard input is read. If output_file is absent, standard output is used for output. The second and succeeding copies of identical adjacent input lines are not written. Repeated lines in the input will not be detected if they are not adjacent, so it may be necessary to sort the files first.
So duplicates have to be adjacent to be detected. That's why 10, 9, and 8 each show up more than once in your output: every repeated value differs from the line directly before it, and that's all uniq actually tests for. It's not a bug.
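If it helps, here is a rough sketch of the "sort the files first" advice from the man page, using the sample data from the question. The printf line only recreates test.txt so the snippet runs on its own. The awk one-liner at the end is not from the man page; it is a common alternative I'm assuming you might want if the original order matters.

```shell
# Recreate the sample file from the question.
printf '%s\n' 10 10 10 8 10 9 10 10 9 10 8 > test.txt

# Sort numerically so equal lines become adjacent, then let uniq collapse them.
sort -n test.txt | uniq
# prints:
# 8
# 9
# 10

# Same result in one step: sort's -u flag drops duplicate lines itself.
sort -n -u test.txt

# Assumption: if you need to keep first-seen order instead of sorted order,
# awk can remember which lines it has already printed.
awk '!seen[$0]++' test.txt
# prints:
# 10
# 8
# 9
```

The trade-off is that sorting loses the original order of the lines, while the awk version keeps it but holds every distinct line in memory.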
Hope that helps. Lemme know if I missed anything, I'm actually sort of curious, too.