
Linux whole-line sort does not sort correctly


I want to sort a file by whole lines with the Linux sort command.

My file (cat hello):

#_
*
#1

When I run sort hello, I get:

#_
*
#1

Because # comes before * in the ASCII table, my expected result is:

#_
#1
*

Can anyone explain why? Thank you.


Solution

  • By default, GNU sort does not sort bytewise, so it does not follow the order of the ASCII table; it uses the collation rules of the current locale instead. Check this example:

    kent$  cat f1
    a
    b
    c
    A
    B
    C
    
    kent$  sort f1          
    a
    A
    b
    B
    c
    C
    

    If you want sort to compare bytewise, set LC_ALL to C and export it so that sort sees it:

    kent$  export LC_ALL=C
    kent$  sort f1
    A
    B
    C
    a
    b
    c
    

    Thus, with LC_ALL=C, # sorts before *, as you expected. Note that #1 also comes before #_, because 1 precedes _ in the ASCII table.

    kent$  cat f
    #_
    *
    #1
    
    kent$  sort f
    #1
    #_
    *
    

    Update

    I just checked the man page, and it states this explicitly as well:

    *** WARNING *** The locale specified by the environment affects sort order. Set LC_ALL=C to get the traditional sort order that uses native byte values.
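
Setting LC_ALL for the whole session changes the locale for every later command in that shell. As a minimal sketch, assuming a POSIX shell with mktemp available, the locale can instead be set for a single sort invocation by prefixing the command with the assignment. The names tmp and bytewise are illustrative only and do not come from the answer above.

```shell
#!/bin/sh
# Sketch: bytewise sort for one command only, without touching the session locale.

# Hypothetical scratch file holding the three lines from the question.
tmp=$(mktemp)
printf '%s\n' '#_' '*' '#1' > "$tmp"

# The VAR=value prefix puts LC_ALL=C into the environment of this one
# sort process; the shell's own locale settings stay as they were.
bytewise=$(LC_ALL=C sort "$tmp")
printf '%s\n' "$bytewise"

# Clean up the scratch file.
rm -f "$tmp"
```

This form is handy in scripts, where changing the locale globally could affect the output of other tools such as date or ls.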