c# css string-parsing css-parsing

Parsing nested CSS-styled text in C#


I receive text as a string input in C#, as shown under BEFORE_PROCESSING. The text needs to be formatted as follows:

  1. Bare sentences without any style tag (e.g. Sentence 1) must be wrapped in a style tag that makes the entire sentence bold.
  2. Sentences that already have a style tag must be identified, and their foreground color element must be set to "fg:Red" so that the whole sentence appears in red.
  3. Sentences that already have a style tag might contain nested style tags, so this needs to be taken into account.

For example, once the formatting is complete, the text under BEFORE_PROCESSING should look like the text under AFTER_PROCESSING.

My question is: what would be the most efficient way to implement this text processing in C#? Would a regex be appropriate, or is it overkill? Is there a better alternative? Thanks.

(I am using C# 4.)

BEFORE_PROCESSING

"Sentence 1 <style styles='B;fg:Green'>STYLED SENTENCE</style> Sentence 2"

AFTER_PROCESSING

"<style styles='B'>Sentence 1 </style> 
 <style styles='B;fg:Red'>STYLED  SENTENCE</style>  
 <style styles='B'>Sentence 2</style>"

Solution

  • You can try the solution below, based on a regular expression:

    // Requires: using System.Collections.Generic; using System.Text.RegularExpressions;
    string myLine = "Sentence 1<style styles='B;fg:Green'>STYLED SENTENCE</style>Sentence 2";
    // "Styled" matches a non-nested <style ...>...</style> element; "NoStyle" matches bare text.
    const string splitLinesRegex = @"(?<Styled><style[^>]*>[^<>]*</style>)|(?<NoStyle>[^<>]+)";
    
    var splitLinesMatch = Regex.Matches(myLine, splitLinesRegex, RegexOptions.Compiled);
    List<string> styledLinesBis = new List<string>();
    
    foreach (Match item in splitLinesMatch)
    {
        if (item.Groups["Styled"].Success)
            // Already styled: set the foreground color to red
            // (assumes the tag already has an fg: entry).
            styledLinesBis.Add(Regex.Replace(item.Value, @"fg:[^;']*", "fg:Red"));
        else if (item.Groups["NoStyle"].Success)
            // Bare text: wrap it in a bold style tag.
            styledLinesBis.Add(string.Format("<style styles='B'>{0}</style>", item.Value));
    }
    

    Then join the pieces, for instance with string.Join. Note that this pattern only matches style tags whose content contains no other tag, so nested style tags are not handled.
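
    For nested style tags (point 3 of the question), one possible alternative is a single left-to-right scan that tracks the nesting depth. The following is only a minimal sketch under a few assumptions: the class and method names (StyleRewriter, Rewrite) are hypothetical, tags are assumed to be well formed, the styles attribute is assumed to be single-quoted, and every style tag, including nested ones, has its fg entry forced to Red. Only text outside any style tag is wrapped in a bold tag.

```csharp
using System;
using System.Text;
using System.Text.RegularExpressions;

// Hypothetical helper, not part of the original answer.
public static class StyleRewriter
{
    // Matches an opening <style ...> tag (group "open") or a closing </style> tag.
    private static readonly Regex TagRegex =
        new Regex(@"<style\b(?<open>[^>]*)>|</style>");

    // Matches an existing fg:<color> entry inside a styles value.
    private static readonly Regex FgRegex = new Regex(@"\bfg:[^;']*");

    public static string Rewrite(string input)
    {
        var output = new StringBuilder();
        int depth = 0;     // current nesting level of <style> tags
        int position = 0;  // index of the first character not yet copied

        foreach (Match tag in TagRegex.Matches(input))
        {
            // Copy the text that sits between the previous tag and this one.
            AppendText(output, input.Substring(position, tag.Index - position), depth);
            position = tag.Index + tag.Length;

            if (tag.Groups["open"].Success)
            {
                depth++;
                output.Append(ForceRed(tag.Value));
            }
            else
            {
                if (depth > 0) depth--;
                output.Append(tag.Value);
            }
        }

        // Copy whatever follows the last tag.
        AppendText(output, input.Substring(position), depth);
        return output.ToString();
    }

    // Wraps non-blank text found outside any style tag in a bold tag;
    // text inside a style tag (and blank text) is copied unchanged.
    private static void AppendText(StringBuilder output, string text, int depth)
    {
        if (text.Length == 0) return;
        if (depth == 0 && text.Trim().Length > 0)
            output.AppendFormat("<style styles='B'>{0}</style>", text);
        else
            output.Append(text);
    }

    // Replaces an existing fg entry with fg:Red, or appends one to a
    // single-quoted styles value when no fg entry is present (assumption).
    private static string ForceRed(string openTag)
    {
        if (FgRegex.IsMatch(openTag))
            return FgRegex.Replace(openTag, "fg:Red");
        return Regex.Replace(openTag, @"styles='([^']*)'", "styles='$1;fg:Red'");
    }
}
```

    With the input from the question, StyleRewriter.Rewrite("Sentence 1 <style styles='B;fg:Green'>STYLED SENTENCE</style> Sentence 2") returns "<style styles='B'>Sentence 1 </style><style styles='B;fg:Red'>STYLED SENTENCE</style><style styles='B'> Sentence 2</style>". Whether inner tags should also be turned red is a guess at the intended behavior; if only the outermost tag should change, call ForceRed only when depth is 0 before the increment.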