Tags: c#, .net, string, sorting, lexicographic

What is the shortest way in .NET to sort strings starting with 1, 10 and 2 and respect the number ordering?


I need to sort file names as follows: 1.log, 2.log, 10.log

But when I use OrderBy(fn => fn) it will sort them as: 1.log, 10.log, 2.log

I obviously know that this could be done by writing another comparer, but is there a simpler way to change from lexicographical order to natural sort order?

Edit: the objective is to obtain the same ordering as when selecting "order by name" in Windows Explorer.


Solution

  • You can use the Win32 CompareStringEx function. On Windows 7 and later it supports the sorting you need through the SORT_DIGITSASNUMBERS flag. You will have to use P/Invoke:

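    // Namespaces used by the snippets in this answer: System, System.Collections.Generic,
    // System.Globalization, System.Linq, System.Runtime.InteropServices.
    static readonly Int32 NORM_IGNORECASE = 0x00000001;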
    static readonly Int32 NORM_IGNORENONSPACE = 0x00000002;
    static readonly Int32 NORM_IGNORESYMBOLS = 0x00000004;
    static readonly Int32 LINGUISTIC_IGNORECASE = 0x00000010;
    static readonly Int32 LINGUISTIC_IGNOREDIACRITIC = 0x00000020;
    static readonly Int32 NORM_IGNOREKANATYPE = 0x00010000;
    static readonly Int32 NORM_IGNOREWIDTH = 0x00020000;
    static readonly Int32 NORM_LINGUISTIC_CASING = 0x08000000;
    static readonly Int32 SORT_STRINGSORT = 0x00001000;
    static readonly Int32 SORT_DIGITSASNUMBERS = 0x00000008; 
    
    static readonly String LOCALE_NAME_USER_DEFAULT = null;
    static readonly String LOCALE_NAME_INVARIANT = String.Empty;
    static readonly String LOCALE_NAME_SYSTEM_DEFAULT = "!sys-default-locale";
    
    [DllImport("kernel32.dll", CharSet = CharSet.Unicode)]
    static extern Int32 CompareStringEx(
      String localeName,
      Int32 flags,
      String str1,
      Int32 count1,
      String str2,
      Int32 count2,
      IntPtr versionInformation,
      IntPtr reserved,
      Int32 param
    );
    

    You can then create an IComparer that uses the SORT_DIGITSASNUMBERS flag:

    class LexicographicalComparer : IComparer<String> {
    
      readonly String locale;
    
      public LexicographicalComparer() : this(CultureInfo.CurrentCulture) { }
    
      public LexicographicalComparer(CultureInfo cultureInfo) {
        if (cultureInfo.IsNeutralCulture)
          this.locale = LOCALE_NAME_INVARIANT;
        else
          this.locale = cultureInfo.Name;
      }
    
      public Int32 Compare(String x, String y) {
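        // CompareStringEx returns 1 (less than), 2 (equal), or 3 (greater than).
        // Subtract 2 to get the IComparer result. A return value of 0 means failure
        // and is not handled here.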
        return CompareStringEx( 
          this.locale, 
          SORT_DIGITSASNUMBERS, // Add other flags if required.
          x, 
          x.Length, 
          y, 
          y.Length, 
          IntPtr.Zero, 
          IntPtr.Zero, 
          0) - 2; 
      }
    
    }
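
    The "subtract 2" step assumes the call succeeded. CompareStringEx returns 0 on failure, and 0 - 2 = -2 would be read by the sort as "x is less than y". A minimal sketch of a guard, assuming you are willing to throw on failure; CompareResult and ToComparison are hypothetical names, not part of the Win32 API or of the code above:

```csharp
using System;

static class CompareResult
{
    // Win32 result codes: CSTR_LESS_THAN = 1, CSTR_EQUAL = 2, CSTR_GREATER_THAN = 3.
    // A return value of 0 signals that CompareStringEx failed.
    public static int ToComparison(int cstr)
    {
        if (cstr < 1 || cstr > 3)
            throw new InvalidOperationException(
                "CompareStringEx failed (returned " + cstr + ").");

        // Map 1, 2, 3 to -1, 0, 1 as expected by IComparer<T>.Compare.
        return cstr - 2;
    }
}
```

    Wrapping the CompareStringEx call in ToComparison(...) instead of subtracting 2 inline would surface bad arguments (for example an unknown locale name) as an exception rather than as a wrong ordering.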
    

    You can then use the IComparer in various sorting APIs:

    var names = new [] { "2.log", "10.log", "1.log" };
    var sortedNames = names.OrderBy(s => s, new LexicographicalComparer());
    

    You can also use StrCmpLogicalW, which is the function used by Windows Explorer. It has been available since Windows XP:

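    // The exported function is StrCmpLogicalW; EntryPoint makes the binding explicit.
    [DllImport("shlwapi.dll", CharSet = CharSet.Unicode, EntryPoint = "StrCmpLogicalW")]
    static extern Int32 StrCmpLogical(String x, String y);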
    
    class LexicographicalComparer : IComparer<String> {
    
      public Int32 Compare(String x, String y) {
        return StrCmpLogical(x, y);
      }
    
    }
    

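    This is simpler, but you have less control over the comparison. Microsoft also documents that the results of StrCmpLogicalW can change between Windows releases, so avoid it where a stable, canonical order is required.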
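
    Both options above call into Win32 and therefore only work on Windows. If that is a problem, a purely managed comparer is one possible fallback. The following is a rough sketch, not a reimplementation of the Explorer rules: ManagedNaturalComparer is a hypothetical name, it only treats ASCII digits as numbers, and it compares everything else ordinally while ignoring case, so its results may differ from CompareStringEx and StrCmpLogicalW for culture-sensitive input.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper, not part of the answer above: compares runs of ASCII
// digits by numeric value and all other characters ordinally, ignoring case.
sealed class ManagedNaturalComparer : IComparer<string>
{
    static bool IsDigit(char c) { return c >= '0' && c <= '9'; }

    public int Compare(string x, string y)
    {
        if (ReferenceEquals(x, y)) return 0;
        if (x == null) return -1;
        if (y == null) return 1;

        int i = 0, j = 0;
        while (i < x.Length && j < y.Length)
        {
            if (IsDigit(x[i]) && IsDigit(y[j]))
            {
                // Skip leading zeros so that "007" and "7" have the same value.
                while (i < x.Length && x[i] == '0') i++;
                while (j < y.Length && y[j] == '0') j++;

                // Find the end of each digit run.
                int endX = i, endY = j;
                while (endX < x.Length && IsDigit(x[endX])) endX++;
                while (endY < y.Length && IsDigit(y[endY])) endY++;

                // Without leading zeros, the longer run is the larger number.
                int byLength = (endX - i).CompareTo(endY - j);
                if (byLength != 0) return byLength;

                // Same length: the first differing digit decides.
                for (; i < endX; i++, j++)
                {
                    int byDigit = x[i].CompareTo(y[j]);
                    if (byDigit != 0) return byDigit;
                }
            }
            else
            {
                int byChar = char.ToUpperInvariant(x[i])
                    .CompareTo(char.ToUpperInvariant(y[j]));
                if (byChar != 0) return byChar;
                i++;
                j++;
            }
        }

        // One string is a prefix of the other, or both are exhausted.
        // Fall back to an ordinal comparison so the order stays deterministic.
        int byRest = (x.Length - i).CompareTo(y.Length - j);
        return byRest != 0 ? byRest : string.CompareOrdinal(x, y);
    }
}
```

    It plugs into the same APIs as the comparers above, e.g. names.OrderBy(s => s, new ManagedNaturalComparer()), which orders the sample input as 1.log, 2.log, 10.log.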