I'm sorting a list of usernames. When the letters are lowercase, the sort command works as expected.
Expected and actual output for lowercase:
n
n_123
na
na_123
When the characters are uppercase and followed by an underscore, things get weird.
Expected output for uppercase:
N
N_123
NA
NA_123
Actual output for uppercase using sort:
N
NA
NA_123
N_123
I thought I'd be able to solve this using
env LC_COLLATE=C sort $file
but no dice.
Actual output using env LC_COLLATE=C sort:
N
NA
NA_123
N_123
I'm running GNU bash, version 4.4.12(1)-release (x86_64-apple-darwin16.3.0) on Mac OS X 10.12.3
Any help would be much appreciated.
Underscore is ASCII 95, which comes after all the uppercase letters (A-Z), i.e. 65-90. So when sorting by byte value, uppercase letters will always come before _.
If you want to delimit at _, you can use -t _ to get your expected output:
sort -t _ -k1,1 file
N
N_123
NA
NA_123
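To try this without a file, here is a minimal sketch that feeds the four names in through a pipe. The function name sort_usernames is just an illustrative wrapper, and LC_ALL=C is an assumption added so the comparison is byte-based regardless of the locale settings:

```shell
# Sort usernames on the text before the first underscore (field 1 only).
# LC_ALL=C is an assumption: it forces byte-order comparison so the
# result does not depend on the current locale.
sort_usernames() {
    LC_ALL=C sort -t _ -k1,1
}

# Feed the names in a scrambled order.
printf '%s\n' NA_123 N_123 NA N | sort_usernames
```

Lines whose first field is identical (N and N_123, for example) are then ordered by comparing the whole line, which is why the bare name comes before the one with a suffix.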
The reason your sort command worked with lowercase letters is that lowercase letters come after _, i.e. 97-122.
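If you want to check those code points yourself, something like the following should work in bash. The helper name ascii_code is made up for this example; it relies on printf treating an argument with a leading quote as the numeric value of the character that follows:

```shell
# Print the decimal code of a single ASCII character.
# A leading quote in a printf argument yields the numeric value
# of the character that follows it.
ascii_code() {
    LC_ALL=C printf '%d\n' "'$1"
}

ascii_code A   # 65
ascii_code Z   # 90
ascii_code _   # 95
ascii_code a   # 97
```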