c# asp.net string substring capitalization

Find substring between two capitalized words in a text


I would like to get the substring between two capitalized words in a string, using C#. For the string "THIS is the first line in this PARAGRAPH", the output should be "is the first line in this". I tried using str.All(char.IsUpper) and str.Contains to check, but without much luck. Can anyone point me in the right direction? I did some research, but the only answers I could find check for a single capital character or whether a string starts with a capital character.

string str = "THIS is the first line in this PARAGRAPH";

Solution

  • To match any word of two or more letters written entirely in capitals, I would use a simple regular expression pattern:

    \b[A-Z]{2,}\b
    

    Explanation:

    \b - match word boundary

    [A-Z] - match one uppercase letter

    {2,} - match at least 2 occurrences of the preceding element
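
    To see which words the pattern picks out, here is a minimal sketch that lists the matches with Regex.Matches. The helper name FindCapitalizedWords and the sample sentence are made up for illustration. Note that [A-Z] only covers ASCII letters; \p{Lu} could be swapped in if other uppercase letters need to match.

    ```csharp
    using System;
    using System.Linq;
    using System.Text.RegularExpressions;

    // Hypothetical helper: returns every all-uppercase word of two or more letters.
    string[] FindCapitalizedWords(string text) =>
        Regex.Matches(text, @"\b[A-Z]{2,}\b")
             .Cast<Match>()            // works on both .NET Framework and modern .NET
             .Select(m => m.Value)
             .ToArray();

    // The single-letter word "A" is skipped because of the {2,} quantifier.
    Console.WriteLine(string.Join(", ", FindCapitalizedWords("THIS is A first line in this PARAGRAPH")));
    // prints "THIS, PARAGRAPH"
    ```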

    With that pattern, we can use the Regex.Split method to extract the parts between such words. Here's a simple example:

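    using System;
    using System.Text.RegularExpressions;
    
    SplitByCapitalizedWords("THIS is the first line in this PARAGRAPH");
    Console.WriteLine("===");
    SplitByCapitalizedWords("THIS is THE first LINE in THIS paragraph");
    
    // Splits the input at every all-uppercase word and prints the pieces in between.
    // Pieces keep their surrounding spaces, and a capitalized word at the very start
    // or end of the input produces an empty piece.
    void SplitByCapitalizedWords(string testString)
    {
        var parts = Regex.Split(testString, @"\b[A-Z]{2,}\b");
    
        foreach (var part in parts)
            Console.WriteLine(part);
    }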
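
    Regex.Split leaves the surrounding spaces in each part and returns empty strings when the input starts or ends with a capitalized word. If only the text between the first two capitalized words is needed, as in the question, one possible alternative is a capture group. This is a sketch rather than the only way to do it, and the helper name TryGetTextBetweenCapitalizedWords is made up for illustration.

    ```csharp
    using System;
    using System.Text.RegularExpressions;

    // Hypothetical helper: captures the text between the first two all-uppercase
    // words, without the whitespace next to them. Returns false if the input
    // does not contain two such words.
    bool TryGetTextBetweenCapitalizedWords(string text, out string between)
    {
        var match = Regex.Match(text, @"\b[A-Z]{2,}\b\s*(.*?)\s*\b[A-Z]{2,}\b");
        between = match.Success ? match.Groups[1].Value : string.Empty;
        return match.Success;
    }

    if (TryGetTextBetweenCapitalizedWords("THIS is the first line in this PARAGRAPH", out var extracted))
        Console.WriteLine(extracted); // prints "is the first line in this"
    ```

    The lazy group (.*?) stops at the nearest following capitalized word, so with more than two capitalized words only the first pair is used.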
    
