sorting unicode case-sensitive icu

ICU: demonstrate case-sensitive sorting


On the ICU collation demo page I entered the following words into the Input textbox:

Adam
apple
Bob

How do I set up case-sensitive sorting where either

  1. small letters come first, i.e. apple < Adam < Bob,
  2. capital letters come first, i.e. Adam < Bob < apple?

Could you give some explanation?


Solution

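  • If you set case first: lower, lowercase letters sort before uppercase letters among strings whose primary weights are equal. In the example below, the strings that start with primary weight 2A sort before the strings that start with 5C. Among the 2A strings with identical primary weights (Adam, ADam, ADAM), the second field is always 05, so the third field decides the order: _05 (lowercase) sorts before u1C (uppercase), which is why Adam comes before ADam.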

    I specified the following settings:

    • strength: primary
    • case level: on
    • case first: lower

    Input: ADam, Za, ZA, zzz, Zb, Adam, apple, ADAM
    Output:

    Adam  [2A,05,u1C][30,05,_05][2A,05,_05][42,05,_05]  
    ADam  [2A,05,u1C][30,05,u1C][2A,05,_05][42,05,_05]  
    ADAM  [2A,05,u1C][30,05,u1C][2A,05,u1C][42,05,u1C]  
    apple [2A,05,_05][48,05,_05][48,05,_05][40,05,_05][32,05,_05]  
    Za    [5C,05,u1C][2A,05,_05]  
    ZA    [5C,05,u1C][2A,05,u1C]  
    Zb    [5C,05,u1C][2C,05,_05]  
    zzz   [5C,05,_05][5C,05,_05][5C,05,_05]
    

    Switch case first to upper and the three Adam variants appear in the reverse order.
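To make the comparison above easier to follow, here is a minimal Python sketch that imitates the settings strength: primary, case level: on, and a switchable case first. It does not call ICU. The function name `sort_key` and its `case_first` parameter are made up for this illustration, and it assumes the input contains only ASCII letters, where lowercasing stands in for the primary weights.

```python
def sort_key(word, case_first="lower"):
    """Rough imitation of: strength primary, case level on (ASCII letters only)."""
    # Primary level: compare letters while ignoring case.
    primary = word.lower()
    # Case level: one flag per letter, consulted only when the primaries tie.
    # False sorts before True, so the "first" case gets the False flag.
    if case_first == "lower":
        case_level = tuple(ch.isupper() for ch in word)
    else:
        case_level = tuple(ch.islower() for ch in word)
    return (primary, case_level)


words = ["ADam", "Za", "ZA", "zzz", "Zb", "Adam", "apple", "ADAM"]

lower_first = sorted(words, key=sort_key)
upper_first = sorted(words, key=lambda w: sort_key(w, "upper"))

print(lower_first)  # ['Adam', 'ADam', 'ADAM', 'apple', 'Za', 'ZA', 'Zb', 'zzz']
print(upper_first)  # ['ADAM', 'ADam', 'Adam', 'apple', 'ZA', 'Za', 'Zb', 'zzz']
```

Note that apple stays after all the Adam variants in both runs: the second letter (d versus p) already differs at the primary level, so the case setting is never consulted for that pair.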

    You can also tailor the sort by adding your own rules.

     & z <* A-Z
    

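    This rule places the uppercase letters A-Z after z as primary differences, so strings that start with a lowercase letter sort before strings that start with an uppercase letter. With all other settings left at their defaults, the output is: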

     apple [2A,05,_05][48,05,_05][48,05,_05][40,05,_05][32,05,_05]
     zzz   [5C,05,_05][5C,05,_05][5C,05,_05]
     Adam  [5D02,05,u05][30,05,_05][2A,05,_05][42,05,_05]
     ADam  [5D02,05,u05][5D0502,05,u05][2A,05,_05][42,05,_05]
     ADAM  [5D02,05,u05][5D0502,05,u05][5D02,05,u05][5D050B,05,u05]
     Za    [5D0518,05,u05][2A,05,_05]
     Zb    [5D0518,05,u05][2C,05,_05]
     ZA    [5D0518,05,u05][5D02,05,u05]
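The effect of the rule can be imitated in plain Python as a rough sketch, again without calling ICU. The names `RANK` and `tailored_key` are hypothetical, and the sketch assumes ASCII letters only: every uppercase letter simply gets a rank higher than any lowercase letter, which mirrors what the rule does at the primary level.

```python
import string

# Assumed ranking that mirrors "& z <* A-Z": a..z first, then A..Z,
# each letter being a distinct primary weight.
RANK = {ch: i for i, ch in enumerate(string.ascii_lowercase + string.ascii_uppercase)}


def tailored_key(word):
    """Sort key under the assumed ranking (ASCII letters only)."""
    return [RANK[ch] for ch in word]


tailored_words = ["ADam", "Za", "ZA", "zzz", "Zb", "Adam", "apple", "ADAM"]
tailored = sorted(tailored_words, key=tailored_key)
print(tailored)
# ['apple', 'zzz', 'Adam', 'ADam', 'ADAM', 'Za', 'Zb', 'ZA']

# The three words from the question, for ordering 1 (small letters first).
question = sorted(["Adam", "apple", "Bob"], key=tailored_key)
print(question)
# ['apple', 'Adam', 'Bob']
```

Because case is now a primary difference, Zb sorts before ZA here: the second letters b and A are compared by rank, and every lowercase letter ranks below every uppercase one.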