I wanted to sort a file named reportA with the following contents:
pat_int_parallel_all
/projects/test
-v ../../../../../../te
min_custom.v
-v ../../../../../../tes
-y ../../../../../../test_
-y ../../../../../../test_lib/test
../../../../../../tesla
/projects/checklist
../../../../../../test_lib/LIB
../../../../../../telib/av
../../../../../../telib/te
+libext+.v
+incdir+/projectsst_relea/ana
When I ran sort -u -r reportA >output, I got this result:
-y ../../../../../../test_lib/test
-y ../../../../../../test_
-v ../../../../../../tes
-v ../../../../../../te
../../../../../../test_lib/LIB
../../../../../../test
../../../../../../telib/te
../../../../../../telib/av
/projects/test
/projects/checklist
pat_int_parallel_all
min_custom.v
+libext+.v
+incdir+/projectsst_relea/ana
My locale output is en_US:
LANG=en_US
LC_CTYPE="en_US"
LC_NUMERIC="en_US"
LC_TIME="en_US"
LC_COLLATE="en_US"
LC_MONETARY="en_US"
LC_MESSAGES="en_US"
LC_PAPER="en_US"
LC_NAME="en_US"
LC_ADDRESS="en_US"
LC_TELEPHONE="en_US"
LC_MEASUREMENT="en_US"
LC_IDENTIFICATION="en_US"
LC_ALL=
But for another user, the same sort command produced a different output:
pat_int_parallel_all
min_custom.v
/projects/test
/projects/checklist
../../../../../../test_lib/LIB
../../../../../../tesla
../../../../../../telib/te
../../../../../../telib/av
-y ../../../../../../test_lib/test
-y ../../../../../../test_
-v ../../../../../../tes
-v ../../../../../../te
+libext+.v
+incdir+/projectsst_relea/ana
My friend's locale output is C:
LANG=C
LC_CTYPE="C"
LC_NUMERIC="C"
LC_TIME="C"
LC_COLLATE="C"
LC_MONETARY="C"
LC_MESSAGES="C"
LC_PAPER="C"
LC_NAME="C"
LC_ADDRESS="C"
LC_TELEPHONE="C"
LC_MEASUREMENT="C"
LC_IDENTIFICATION="C"
LC_ALL=C
I was wondering why a normal Unix sort command gives two different results when my sort alias and shell version are the same as the other user's. Even the cshrc settings are the same. Is it due to the special characters?
Can someone explain what's wrong here?
The root cause of the different behavior of sort is the value of LC_COLLATE. The output of man 7 locale says:
LC_COLLATE
This category governs the collation rules used for sorting and regular expressions, including character equivalence classes and multicharacter collating elements. This locale category changes the behavior of the functions strcoll(3) and strxfrm(3), which are used to compare strings in the local alphabet. For example, the German sharp s is sorted as "ss".
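Which value actually applies depends on more than LC_COLLATE itself: LC_ALL, when non-empty, overrides every LC_* category, and LANG is only the fallback when neither is set. A minimal sketch of that precedence follows; effective_collate is a hypothetical helper written for illustration, and the final POSIX fallback is an assumption about what the system does when nothing is set.

```shell
# Hypothetical helper (not part of any tool): given the values of LC_ALL,
# LC_COLLATE and LANG, print the one that decides collation.
# The first non-empty value wins, in that order.
effective_collate() {
    if [ -n "$1" ]; then
        printf '%s\n' "$1"
    elif [ -n "$2" ]; then
        printf '%s\n' "$2"
    elif [ -n "$3" ]; then
        printf '%s\n' "$3"
    else
        # Assumption: with nothing set, the system falls back to POSIX/C.
        printf '%s\n' 'POSIX'
    fi
}

# Report the collation locale for the current environment.
effective_collate "${LC_ALL:-}" "${LC_COLLATE:-}" "${LANG:-}"
```

With the settings shown in the question, this would print en_US for the first user (LC_ALL is empty, LC_COLLATE is en_US) and C for the second user.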
From my (very quick) reading of the sort source code, it appears to transform the lines of text to be sorted with strxfrm() to get a basis of comparison, so that byte strings that would otherwise be considered different are considered equal here even if their bytes differ.
The fact that you still get the same output is, as @Amadan said, quite strange. Are you sure you have set the locale properly? Could you try LC_COLLATE="C" sort -ru your_file?
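To take the environment out of the picture entirely, the locale can be forced for a single sort invocation. The sketch below uses LC_ALL=C rather than LC_COLLATE because LC_ALL overrides every other locale variable; sort_bytes is a hypothetical wrapper name, and the input is a handful of lines taken from reportA in the question.

```shell
# Hypothetical helper (not from the original post): reverse, de-duplicated
# sort of standard input using plain byte-value comparison.
sort_bytes() {
    LC_ALL=C sort -ru
}

# A few lines from reportA, with one duplicate to exercise -u.
printf '%s\n' \
    '-v ../../../../../../te' \
    '/projects/test' \
    'pat_int_parallel_all' \
    '+libext+.v' \
    '/projects/test' |
    sort_bytes
```

In the C locale the comparison is by byte value, so in reverse order the line starting with a letter comes first, followed by the ones starting with "/", "-" and "+". That is the same relative order as in the second user's output.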