Search code examples
c#winformsfrequency-distributioncode-statistics

C# WinsForm, Frequency Distribution Table [Updated]


Update 01 Thanks to Caius, found the main problem, the logic on the "if" was wrong, now fixed and giving the correct results. The loop still create more positions than needed on the secondary List, an extra position for each number on the main List.

I've updated the code bellow for refence for the following question:

-001 I can figure out why it create positions that needed, the for loop should run only after the foreach finishes its loops correct? -002 To kind of solving this issue, I've used a List.Remove() to remove all the 0's, so far no crashes, but, the fact that I'm creating the extra indexes, and than removing them, does means a big performance down if I have large list of numbers? Or is an acceptable solution?

Description

It supposed to read all numbers in a central List1 (numberList), and count how many numbers are inside a certain (0|-15 / 15|-20) range, for that I use another List, that each range is a position on the List2 (numberSubList), where each number on List2, tells how many numbers exists inside that range. -The range changes as the numbers grows or decrease

Code:

void Frequency()
        {

            int minNumb = numberList.Min();
            int maxNumb = numberList.Max();
            int size = numberList.Count();

            numberSubList.Clear();
            dGrdVFrequency.Rows.Clear();
            dGrdVFrequency.Refresh();

            double k = (1 + 3.3 * Math.Log10(size));
            double h = (maxNumb - minNumb) / k;

            lblH.Text = $"H: {Math.Round(h, 2)} / Rounded = {Math.Round(h / 5) * 5}";
            lblK.Text = $"K: {Math.Round(k, 4)}";

            if (h <= 5) { h = 5; }
            else { h = Math.Round(h / 5) * 5; }
            

            int counter = 1;
            for (int i = 0; i < size; i++)
            {
                numberSubList.Add(0); // 001 HERE, creating more positions than needed, each per number.
                foreach (int number in numberList)
                {
                    if (number >= (h * i) + minNumb && number < (h * (i + 1)) + minNumb)
                    {
                        numberSubList[i] = counter++;
                    }
                }
                numberSubList.Remove(0); // 002-This to remove all the extra 0's that are created.
                counter = 1;
            }

            txtBoxSubNum.Clear();
            foreach (int number in numberSubList)
            {
                txtBoxSubNum.AppendText($"{number.ToString()} ,  ");
            }

            lblSubTotalIndex.Text = $"Total in List: {numberSubList.Count()}";
            lblSubSumIndex.Text = $"Sum of List: {numberSubList.Sum()}";

            int inc = 0;
            int sum = 0;
            foreach (int number in numberSubList)
            {
                sum = sum + number;
                int n = dGrdVFrequency.Rows.Add();
                dGrdVFrequency.Rows[n].Cells[0].Value = $"{(h * inc) + minNumb} |- {(h * (1 + inc)) + minNumb}";
                dGrdVFrequency.Rows[n].Cells[1].Value = $"{number}";
                dGrdVFrequency.Rows[n].Cells[2].Value = $"{sum}";
                dGrdVFrequency.Rows[n].Cells[3].Value = $"{(number * 100) / size} %";
                dGrdVFrequency.Rows[n].Cells[4].Value = $"{(sum * 100) / size} %";
                inc++;
            }
        }

Screen shot showing the updated version.

Visually Working


Solution

  • I think, if your aim is to only store eg 17 in the "15 to 25" slot, this is wonky:

    if (number <= (h * i) + minNumb) // Check if number is smaller than the range limit
    

    Because it's found inside a loop that will move on to the next range, "25 to 35" and it only asks if the number 17 is less than the upper limit (and 17 is less than 35) so 17 is accorded to the 25-35 range too

    FWIW the range a number should be in can be derived from the number, with (number - min) / number_of_ranges - at the moment you create your eg 10 ranges and then you visit each number 10 times looking to put it in a range, so you do 9 times more operations than you really need to