Distinct() doesn't see uppercase letter changed by ToLower() method


This is my code:

string textToEncode = File.ReadAllText(@"C:\Users\ASUS\Desktop\szyfrowanie2\TextSample.txt");
textToEncode = textToEncode.ToLower();
char[] distinctLetters = textToEncode.Distinct().ToArray();
var count = textToEncode.Distinct().Count();
Console.WriteLine("Letters used in text: \n\n");
for (int i = 0; i < count; i++)
{
    if (Equals(distinctLetters[i]," "))
    {
        Console.Write("<space>"); 
    }
    else
    {
        Console.Write(" " + distinctLetters[i] + " ");
    }
}

I want to read a .txt file, convert it all to lowercase with ToLower(), and then print every distinct character it contains. But when I write the distinct characters to the screen, some of them don't show up. Yet later, when I use

for (int i = 0; i < distinctLetters.Length; i++)
{
    Console.Write("Swap " + distinctLetters[i] + " with "); 

it shows a letter that was indeed converted to lowercase but was not printed by the first for loop. The first word in my TextSample.txt file is "With". The first loop only shows

i t h

But as the second loop starts, it asks

Swap w with

and I have no idea why. Also, the if statement in the first loop doesn't work: it doesn't detect the space.


Solution

  • I've modified your code a bit:

    // Needs: using System; using System.IO; using System.Linq;
    string textToEncode = File.ReadAllText(@"C:\Users\ASUS\Desktop\szyfrowanie2\TextSample.txt").ToLower();
    // Run Distinct() once and reuse the resulting array.
    char[] distinctLetters = textToEncode.Distinct().ToArray();
    var count = distinctLetters.Length;
    Console.WriteLine("Letters used in text: \n\n");
    for (int i = 0; i < count; i++)
    {
        // Compare against char literals ('x'), not strings ("x").
        if (Equals(distinctLetters[i], ' ')) { Console.Write("<space>"); }
        // Print visible placeholders instead of the raw line-ending characters.
        else if (Equals(distinctLetters[i], '\r')) { Console.Write("<cr>"); }
        else if (Equals(distinctLetters[i], '\n')) { Console.Write("<lf>"); }
        else { Console.Write(" " + distinctLetters[i] + " "); }
    }
    

    Just a few minor things. I merged the first two lines, changed " " to ' ' so that it compares a character with a character (Equals with a char and a string is always false, which is why your space check never fired), changed the count to use distinctLetters instead of running the same Distinct() query again, and added two conditions to handle the carriage return and line feed. (I always mix them up, btw.)
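To see why the `" "` to `' '` change matters, here is a minimal sketch. The class and method names are made up for illustration. `object.Equals(object, object)` boxes both arguments, and a boxed `char` never equals a `string`, so the original check could not succeed.

```csharp
using System;

public static class EqualsDemo
{
    // Mirrors the original check: a char compared with a string.
    public static bool CompareWithString(char c) => object.Equals(c, " ");

    // Mirrors the fixed check: a char compared with a char.
    public static bool CompareWithChar(char c) => object.Equals(c, ' ');

    public static void Main()
    {
        Console.WriteLine(CompareWithString(' ')); // False: char vs. string
        Console.WriteLine(CompareWithChar(' '));   // True: char vs. char
    }
}
```

For plain chars, `distinctLetters[i] == ' '` would probably be the more common way to write the comparison, and it fails at compile time if a string literal is used by mistake.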
    This now shows the right result, and it also explains why characters went missing. The reason is simple: your text file contains a carriage return character ('\r'), which sends the cursor back to the start of the line. This causes the first character to be overwritten by a space...

    So your code actually prints " w  i ..." and then reaches the '\r'. For that it prints a space, moves back to the beginning of the line, and writes another space over the leading ' '. The '\n' comes next: it prints a second space over the 'w', moves to the next line, and prints a space again. Then the rest gets printed...
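The overwrite described above can be replayed without a real console. The following is only a sketch: `Render` is a hypothetical helper that models a console in which '\r' moves the cursor to column 0 and '\n' starts a new line at column 0 (a simplifying assumption about the terminal), and `FirstLoopOutput` rebuilds what the original first loop writes.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;

public static class CarriageReturnDemo
{
    // Hypothetical console model: returns the visible lines after writing `output`.
    public static string[] Render(string output)
    {
        var lines = new List<StringBuilder> { new StringBuilder() };
        int col = 0;
        foreach (char c in output)
        {
            if (c == '\r') { col = 0; }                                      // back to line start
            else if (c == '\n') { lines.Add(new StringBuilder()); col = 0; } // next line
            else
            {
                var line = lines[lines.Count - 1];
                if (col < line.Length) line[col] = c;  // overwrite what is already there
                else line.Append(c);
                col++;
            }
        }
        return lines.Select(l => l.ToString()).ToArray();
    }

    // What the original first loop writes: each distinct char wrapped in spaces.
    public static string FirstLoopOutput(string text) =>
        string.Concat(text.ToLower().Distinct().Select(c => " " + c + " "));

    public static void Main()
    {
        // Sample input assumed for illustration; the real file starts with "With".
        string[] screen = Render(FirstLoopOutput("With\r\nx"));
        foreach (string line in screen) Console.WriteLine("[" + line + "]");
        // The first rendered line no longer contains 'w'.
    }
}
```

In this model the 'w' is still in `distinctLetters`, which is why the second loop can ask "Swap w with"; only its picture on screen was overwritten.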

    Simple, isn't it? Capturing these two special characters with the two extra conditions fixes it... :-) The '\r' and '\n' characters are often overlooked in console applications and give unexpected results when they get printed.
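If more invisible characters turn up (tabs, for example), the same idea can be pulled into one helper. This is just a sketch that goes slightly beyond the code above: `Describe` is a hypothetical name, and the `<tab>` and `<U+XXXX>` labels are my own choice, not something the original code prints.

```csharp
using System;
using System.Linq;

public static class CharDisplay
{
    // Returns a printable label for a character so nothing moves the cursor.
    public static string Describe(char c)
    {
        switch (c)
        {
            case ' ': return "<space>";
            case '\r': return "<cr>";
            case '\n': return "<lf>";
            case '\t': return "<tab>";
            default:
                // Any other control character is shown by its code point.
                return char.IsControl(c)
                    ? "<U+" + ((int)c).ToString("X4") + ">"
                    : " " + c + " ";
        }
    }

    public static void Main()
    {
        // Sample text assumed for illustration.
        string text = "With\r\n\tit".ToLower();
        foreach (char c in text.Distinct()) Console.Write(Describe(c));
        Console.WriteLine();
    }
}
```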