c# string text separator string-length

Finding longest text fragment in string array with conditions


Good evening. I have a text file that contains multiple lines of text with separators. I need to find the longest text fragment in which the last letter of each word is the first letter of the next word. The fragment can span several lines; it is not limited to a single line.

For example:

We have this string array:

Hello, obvious smile eager ruler.
Rave, eyes, random.

From these two lines, our text fragment is:

Hello, obvious smile eager ruler.
Rave, eyes

Our text fragment ends at the word "eyes" because "random" doesn't start with "s".

The next two lines from our txt file:

Johnny, you use.
Eye eager sun.

From these two lines, our text fragment is:

Johnny, you use.
Eye eager

Our text fragment ends at the word "eager" because "sun" doesn't start with "r".

So we have multiple lines of text with separators in our input file (txt), and I need to find the longest text fragment across all of that text. The fragment contains both words and separators.

I don't even know where to begin. I guess I'll have to use String.Length, Substring and String.Split; maybe Regex might come in handy too, but I'm not really familiar with Regex and its functions yet.

I tried to explain it as clearly as I could; English isn't my native language, so it's kind of difficult.

My question is: what kind of algorithm should I use to break my text into separate strings, where each string contains a word and the separator after that word?


Solution

  • You need to do the following:

    • Split the text into individual words
    • Compare the last letter of each word with the first letter of the next word
    • If they are not the same, go back to the original text and take the fragment that precedes the word where the mismatch occurred.

    One way to do this, for a fragment that starts at the first word of the text, would be the following:

    using System;
    using System.Text.RegularExpressions;
    
    String text = "Johnny, you use.\nEye eager sun.";
    
    // Finds the individual words together with their positions in the original text
    MatchCollection words = Regex.Matches(text, @"[^ ,.\n]+");
    
    String newText = text;
    char lastLetter = '\0';
    
    // Checks whether the last letter of the previous word matches the first letter of the next word
    for (int i = 0; i < words.Count; i++)
    {
        String word = words[i].Value.ToLowerInvariant();
    
        if (i == 0 || word[0] == lastLetter)
        {
            lastLetter = word[word.Length - 1];
        }
        else
        {
            // Cut the original text just before the word where the inconsistency happens.
            // Using the match position avoids searching for the word again, which could hit
            // an earlier occurrence or miss the word because of its casing.
            newText = text.Substring(0, words[i].Index);
            break;
        }
    }
    
    Console.WriteLine(text);
    Console.WriteLine(newText);
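
    The snippet above only follows the chain that starts at the first word. To answer the question as asked, one possible approach is to break the text into tokens, where each token is a word plus the separators that follow it, and then scan for the longest run of tokens that keeps the chain intact. The following is a minimal sketch, not a definitive implementation. The names ChainFinder, Tokenize, WordOf, Follows and LongestFragment are made up for illustration, and it assumes that the separators are spaces, commas, periods and line breaks, that "longest" means the most characters, that the comparison ignores case, and that separators after the last word of a fragment are not part of it.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class ChainFinder
{
    // Assumed separator set; extend it if the input file uses other separators.
    private static readonly char[] Separators = { ' ', ',', '.', '\r', '\n' };

    // Breaks the text into tokens: one word followed by the separators after it.
    public static List<string> Tokenize(string text)
    {
        var tokens = new List<string>();
        foreach (Match m in Regex.Matches(text, @"[^ ,.\r\n]+[ ,.\r\n]*"))
        {
            tokens.Add(m.Value);
        }
        return tokens;
    }

    // Returns the lowercased word of a token, without its trailing separators.
    private static string WordOf(string token)
    {
        return token.TrimEnd(Separators).ToLowerInvariant();
    }

    // True when the current word starts with the last letter of the previous word.
    private static bool Follows(string previousToken, string currentToken)
    {
        string previous = WordOf(previousToken);
        string current = WordOf(currentToken);
        return current[0] == previous[previous.Length - 1];
    }

    // Scans the tokens once and keeps the longest run in which the chain holds.
    public static string LongestFragment(string text)
    {
        List<string> tokens = Tokenize(text);
        string best = "";
        int runStart = 0;

        for (int i = 1; i <= tokens.Count; i++)
        {
            bool chainBroken = i == tokens.Count || !Follows(tokens[i - 1], tokens[i]);
            if (chainBroken)
            {
                // Join the tokens of the finished run and drop the trailing separators.
                string candidate = string.Concat(tokens.GetRange(runStart, i - runStart))
                                         .TrimEnd(Separators);
                if (candidate.Length > best.Length)
                {
                    best = candidate;
                }
                runStart = i;
            }
        }
        return best;
    }
}
```

    With the four example lines from the question joined by "\n", ChainFinder.LongestFragment returns the fragment that runs from "Hello" to "eyes". To read the input from a file, the text could be loaded with File.ReadAllText and passed in unchanged, since line breaks are treated as separators.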