Sorting a set of values with special characters


I need to sort a set of values in a DataTable. The tricky part is that the values contain special characters such as '.' and '-', and some of them are negative. The following picture shows my current output: [image]

The values are separated by ';' in the picture. I am using the following code to sort the data:

        DataTable myDataTable = new DataTable();
        myDataTable.Columns.Add("SN", typeof(string));
        string myValues = "";

        if (!File.Exists("d:\\DUDBC-values.txt")) //DUDBC-values.txt
        {
            Console.WriteLine("No file found");
            Console.ReadKey();
            return;
        }
        StreamReader file = new StreamReader("d:\\DUDBC-values.txt");
        string line;
        while ((line = file.ReadLine()) != null)
        {
            myValues += line;
            myValues += ";";
        }
        file.Close();


        string[] myValuesArray = myValues.Split(';');
        myValuesArray = myValuesArray.Take(myValuesArray.Count() - 1).ToArray();

        foreach (string myValue in myValuesArray)
        {
            DataRow myRow = myDataTable.NewRow();
            myRow["SN"] = myValue;
            myDataTable.Rows.Add(myRow);
        }

        string beforeSort = string.Join(";", myDataTable.AsEnumerable().Select(x => x["SN"]));
        Console.WriteLine("Before Sorting:");
        Console.WriteLine();

        Console.WriteLine(beforeSort);

        Console.WriteLine();


        IEnumerable<DataRow> sortedValues = myDataTable.AsEnumerable()
                                                 .OrderBy(x =>
                                                 {
                                                     string currentStringValue = x["SN"].ToString();
                                                     char[] SplitChar = new char[] { '.', '-' };
                                                     string[] currentStringValueArray = new string[1];
                                                     try
                                                     {
                                                         float val = float.Parse(currentStringValue);
                                                         currentStringValueArray[0] = currentStringValue;

                                                     }
                                                     catch { 
                                                     currentStringValueArray = currentStringValue.Split(SplitChar);
                                                     }

                                                     string currentPart = "";
                                                     int currentPartNumeric = 0;


                                                     if (currentStringValueArray.Length > 1)
                                                     {
                                                         for (int i = 0; i < currentStringValueArray.Length; i++)
                                                         {
                                                             if (int.TryParse(currentStringValueArray[i], out currentPartNumeric))
                                                             {
                                                                 if (i >= 1)
                                                                     currentPart += ".";
                                                                 currentPart += currentPartNumeric.ToString();
                                                             }
                                                             else
                                                             {
                                                                 try
                                                                 {
                                                                     if (i >= 1)
                                                                         currentPart += ".";
                                                                     currentPart += (((int)(char.ToUpper(char.Parse(currentStringValueArray[i])))) - 64).ToString();
                                                                 }
                                                                 catch { }
                                                             }

                                                         }
                                                         return Convert.ToString(currentPart, CultureInfo.InvariantCulture);
                                                     }

                                                     else
                                                         return 0m.ToString();
                                                 });
        string afterSort = string.Join(";", sortedValues.Select(x => x["SN"]));
        Console.WriteLine("After Sorting:");
        Console.WriteLine();

        Console.WriteLine(afterSort);
        //Copy to your existing datatable
        myDataTable = sortedValues.CopyToDataTable();
        Console.ReadKey();

I was expecting it to be like this:

-1
1.1.a.1
1.2.a.1
1.2.a.2
1.2.a.3
1.3.1
2.1.2
2.1a.1
2.1a.2
2.5
2.6.1
2.7.1
2.7.2
2.7.16
2.25a
2.25b
2.42.1
2.42.2
3.1.1
3.1.2
3.5.2
3.6a.1
3.6a.2
3.6b.2
5.1a.1
5.1a.2
5.1a.3
5.1b.1
5.1b.2
5.1b.6
6.3.1
6.3.2
6.3.3
6.3.4
6.3.5
6.5.1
6.5.2-C11
6.5.3-C12
17.06.01.b.i
17.06.02.b.i
17.06.02.b.vi
18.01.b
18.02.01.b.iii
1000

What am I doing wrong? I asked a similar question in this post before, but the set of values kept changing as users added different kinds of values.


Solution

  • It looks like you need what is called "natural sort order", where embedded numbers are compared by value rather than character by character.

    There is a Windows API function, StrCmpLogicalW(), that you can use for such a comparison. Because it is exported from shlwapi.dll, this approach only works on Windows.

    You can wrap this in a set of extension methods for sorting List<T> or arrays like so:

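    // Directives needed by this class and by the sample program that follows.
    using System;
    using System.Collections.Generic;
    using System.Diagnostics;
    using System.Linq;
    using System.Runtime.InteropServices;

    public static class NaturalSortExt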
    {
        /// <summary>Sorts a list in "Natural sort order", i.e. "9" sorts before "10".</summary>
        /// <typeparam name="T">The type of elements in the list to be sorted.</typeparam>
        /// <param name="self">The list to be sorted.</param>
        /// <param name="stringSelector">A projection to convert list elements to strings for comparison.</param>
    
        public static void SortNatural<T>(this List<T> self, Func<T, string> stringSelector)
        {
            self.Sort((lhs, rhs) => StrCmpLogicalW(stringSelector(lhs), stringSelector(rhs)));
        }
    
        /// <summary>Sorts a list in "Natural sort order", i.e. "9" sorts before "10".</summary>
        /// <param name="self">The list to be sorted.</param>
    
        public static void SortNatural(this List<string> self)
        {
            self.Sort(StrCmpLogicalW);
        }
    
        /// <summary>Sorts an array in "Natural sort order", i.e. "9" sorts before "10".</summary>
        /// <param name="self">The array to be sorted.</param>
    
        public static void SortNatural(this string[] self)
        {
            Array.Sort(self, StrCmpLogicalW);
        }
    
        [DllImport("shlwapi.dll", CharSet = CharSet.Unicode)]
        static extern int StrCmpLogicalW(string lhs, string rhs);
    }
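
Where shlwapi.dll is not available (for example on Linux or macOS), a purely managed comparer can approximate the same ordering. The following is a minimal sketch, not a drop-in replacement for StrCmpLogicalW: the class name NaturalStringComparer is hypothetical, it assumes ASCII digits, and it compares non-digit characters by their upper-cased ordinal value, so results may differ from the Windows function for other punctuation or culture-specific text.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical managed comparer (not part of the original answer).
// Digit runs are compared by numeric value; everything else is compared
// character by character, ignoring case, using ordinal values.
public sealed class NaturalStringComparer : IComparer<string>
{
    public int Compare(string x, string y)
    {
        if (ReferenceEquals(x, y)) return 0;
        if (x == null) return -1;
        if (y == null) return 1;

        int i = 0, j = 0;
        while (i < x.Length && j < y.Length)
        {
            int result;
            if (IsAsciiDigit(x[i]) && IsAsciiDigit(y[j]))
            {
                // Both sides start a number here: compare the whole digit runs.
                result = CompareDigitRuns(x, ref i, y, ref j);
            }
            else
            {
                result = char.ToUpperInvariant(x[i]).CompareTo(char.ToUpperInvariant(y[j]));
                i++;
                j++;
            }
            if (result != 0) return result;
        }

        // One string is exhausted: the one with nothing left sorts first.
        return (x.Length - i).CompareTo(y.Length - j);
    }

    private static bool IsAsciiDigit(char c) => c >= '0' && c <= '9';

    private static int CompareDigitRuns(string x, ref int i, string y, ref int j)
    {
        int startX = i, startY = j;
        while (i < x.Length && IsAsciiDigit(x[i])) i++;
        while (j < y.Length && IsAsciiDigit(y[j])) j++;

        // Ignore leading zeros so that "06" and "6" compare as equal numbers.
        while (startX < i - 1 && x[startX] == '0') startX++;
        while (startY < j - 1 && y[startY] == '0') startY++;

        // Without leading zeros, a longer run is a larger number; equal-length
        // runs can be compared digit by digit. This avoids integer overflow.
        int lengthX = i - startX, lengthY = j - startY;
        if (lengthX != lengthY) return lengthX.CompareTo(lengthY);
        return string.CompareOrdinal(x, startX, y, startY, lengthX);
    }
}
```

An instance can be passed to Array.Sort, List<T>.Sort, or LINQ's OrderBy, for example Array.Sort(values, new NaturalStringComparer()).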
    

    With the SortNatural extension methods in place, you can sort your array (or List<T>) as demonstrated in the sample code below:

    class Program
    {
        static void Main()
        {
            string[] test =
            {
                "3.1.2",
                "1.2.a.1",
                "1.2.a.2",
                "1.3.1",
                "2.1.2",
                "2.1a.2",
                "2.1a.1",
                "-1",
                "2.5",
                "2.7.1",
                "1.1.a.1",
                "2.7.16",
                "2.7.2",
                "2.25a",
                "2.6.1",
                "5.1a.3",
                "2.42.2",
                "2.25b",
                "2.42.1",
                "3.6a.2",
                "5.1b.1",
                "3.1.1",
                "3.5.2",
                "3.6a.1",
                "3.6b.2",
                "5.1a.1",
                "1.2.a.3",
                "5.1b.2",
                "5.1b.6",
                "6.3.1",
                "6.3.2",
                "17.06.02.b.i",
                "6.3.3",
                "5.1a.2",
                "6.3.4",
                "6.3.5",
                "6.5.1",
                "1000",
                "6.5.2-C11",
                "6.5.3-C12",
                "17.06.01.b.i",
                "17.06.02.b.vi",
                "18.01.b",
                "18.02.01.b.iii"
            };
    
            string[] expected =
            {
                "-1",
                "1.1.a.1",
                "1.2.a.1",
                "1.2.a.2",
                "1.2.a.3",
                "1.3.1",
                "2.1.2",
                "2.1a.1",
                "2.1a.2",
                "2.5",
                "2.6.1",
                "2.7.1",
                "2.7.2",
                "2.7.16",
                "2.25a",
                "2.25b",
                "2.42.1",
                "2.42.2",
                "3.1.1",
                "3.1.2",
                "3.5.2",
                "3.6a.1",
                "3.6a.2",
                "3.6b.2",
                "5.1a.1",
                "5.1a.2",
                "5.1a.3",
                "5.1b.1",
                "5.1b.2",
                "5.1b.6",
                "6.3.1",
                "6.3.2",
                "6.3.3",
                "6.3.4",
                "6.3.5",
                "6.5.1",
                "6.5.2-C11",
                "6.5.3-C12",
                "17.06.01.b.i",
                "17.06.02.b.i",
                "17.06.02.b.vi",
                "18.01.b",
                "18.02.01.b.iii",
                "1000"
            };
    
            test.SortNatural();
    
            Debug.Assert(test.SequenceEqual(expected));
    
            Console.WriteLine(string.Join("\n", test));
        }
    }
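
To apply this to the DataTable from the question rather than to a plain array, the comparison can be handed to OrderBy as an IComparer<string>. The sketch below assumes the sorted column holds strings. LogicalStringComparer and SortedBy are hypothetical names introduced for illustration, and the comparer still depends on shlwapi.dll, so it is Windows only.

```csharp
using System;
using System.Collections.Generic;
using System.Data;
using System.Linq;
using System.Runtime.InteropServices;

// Hypothetical adapter: exposes StrCmpLogicalW as an IComparer<string> (Windows only).
public sealed class LogicalStringComparer : IComparer<string>
{
    [DllImport("shlwapi.dll", CharSet = CharSet.Unicode)]
    private static extern int StrCmpLogicalW(string lhs, string rhs);

    // Null cells are treated as empty strings before calling the native function.
    public int Compare(string x, string y) => StrCmpLogicalW(x ?? "", y ?? "");
}

public static class DataTableSortExt
{
    // Returns a new DataTable with the same schema, its rows ordered by one
    // string column using the supplied comparer.
    public static DataTable SortedBy(this DataTable table, string columnName, IComparer<string> comparer)
    {
        DataRow[] rows = table.AsEnumerable()
            .OrderBy(row => row.Field<string>(columnName), comparer)
            .ToArray();

        // Clone + ImportRow is used instead of CopyToDataTable, which throws
        // when the source sequence contains no rows.
        DataTable result = table.Clone();
        foreach (DataRow row in rows)
        {
            result.ImportRow(row);
        }
        return result;
    }
}
```

With the table from the question, the call would look like myDataTable = myDataTable.SortedBy("SN", new LogicalStringComparer()). Any other IComparer<string> can be substituted for the comparer argument.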