c# regex bbcode

How to find the index positions of a particular BBCode tag (Regex)


I have a string, let's say:

[s]AB[/s]23[sb]45[/sb]AB45ABABAB

I want to find the indices of all characters that are enclosed by a tag whose name contains the letter s, which includes both [s] and [sb].

A function call to findIndices("[s]AB[/s]23[sb]45[/sb]AB45ABABAB", "s") would return the List [0, 1, 4, 5]. Note that the indices ignore all BBCode. In other words, the index of the first 'A' character is 0, not 3.

How does one implement findIndices in C#? I tried using System.Text.RegularExpressions, but I am having trouble. The difficulty is in finding each index relative to the string that has the BBCode stripped out.


Solution

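  • This is only an example, but you can try it this way (test it here: http://rextester.com/FMTZ35054). It reports the start and end index of every tagged span relative to the stripped text; it does not filter by tag letter.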

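        using System;
        using System.Collections.Generic;
        using System.Text.RegularExpressions;

        // Describes one tagged span, with positions in both the raw and the stripped text.
        public class Entity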
        {
            public string Text {get; set;}
            public int Index {get; set;}
    
            public int CountDirty {get; set;}
    
            public int CountClean {get; set;}
            public int CountGross {get; set;}
    
            public int IndexStart {get; set;}
            public int IndexEnd {get; set;}
    
            public int IndexStartClean {get; set;}
            public int IndexEndClean {get; set;}
    
            public int IndexStartGross {get; set;}
            public int IndexEndGross {get; set;}
    
            public int CountBefore {get;set;}
            public int CountAfter {get;set;}
        }
    
        public static List<Entity> findIndices(string text)
        {
            // Group 1: opening tag, group 2: enclosed text, group 3: closing tag.
            string regex = @"(\[[a-zA-Z]*\])(.*?)(\[/[a-zA-Z]*\])";
            Regex r = new Regex(regex);
    
            MatchCollection matches = r.Matches(text);
    
            List<Entity> list = new List<Entity>();
    
            // Total length of all tags seen so far; subtracting it maps a raw index to a clean one.
            int accumulation = 0;
            foreach (Match match in matches)
            {
                Entity t = new Entity();

                // Lengths of the opening tag, the closing tag, and the text between them.
                t.CountBefore = match.Groups[1].Length;
                t.CountAfter = match.Groups[3].Length;

                t.CountClean = match.Groups[2].Length;
                t.CountGross = match.Length;
                // Number of tag characters in this match.
                t.CountDirty = t.CountGross - t.CountClean;
                t.Text = match.Value;
                // Span of the whole match (tags included) in the raw text.
                t.IndexStart = match.Index;
                t.IndexEnd = match.Index + t.CountGross - 1;

                // Span of the enclosed text in the raw text.
                t.IndexStartGross = t.IndexStart + t.CountBefore;
                t.IndexEndGross = t.IndexStartGross + t.CountClean - 1;

                // Span of the enclosed text once all BBCode has been stripped.
                t.IndexStartClean = t.IndexStartGross - t.CountBefore - accumulation;
                t.IndexEndClean = t.IndexStartClean + t.CountClean - 1;

                list.Add(t);

                accumulation += t.CountBefore + t.CountAfter;
            }
    
            return list;
        }
    

    And this is how to call it:

            List<Entity> list = findIndices("[s]AB[/s]23[sb]45[/sb]AB45ABABAB[a]test[/a]");
    
            for (var i = 0; i < list.Count; i++)
            {
                var l = list[i];
    
                Console.WriteLine("Text = " + l.Text);
    
                Console.WriteLine("IndexStartClean = " + l.IndexStartClean);
                Console.WriteLine("IndexEndClean = " + l.IndexEndClean);
    
                Console.WriteLine("---");
            }
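
If you need the exact signature from the question (a flat list of character indices, filtered by a letter in the tag name), one possible variation is sketched below. It is not the code from the rextester link. The class name `BbCodeIndices` and the method name `FindIndices` are hypothetical. The sketch assumes that tag names consist only of ASCII letters, that the letter match is case-sensitive, and that a matching tag which is never closed covers the rest of the string.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical helper, not part of the original answer.
public static class BbCodeIndices
{
    // Matches an opening tag such as [s] or a closing tag such as [/sb].
    // Group 1 is "/" for closing tags, group 2 is the tag name.
    private static readonly Regex TagPattern = new Regex(@"\[(/?)([a-zA-Z]+)\]");

    // Returns the indices, relative to the text with all tags stripped,
    // of every character enclosed by a tag whose name contains `letter`.
    public static List<int> FindIndices(string text, string letter)
    {
        var result = new List<int>();
        int depth = 0;       // number of matching tags currently open
        int cleanIndex = 0;  // position in the tag-free text
        int pos = 0;         // position in the raw text

        foreach (Match tag in TagPattern.Matches(text))
        {
            // Characters between the previous tag and this one are plain text.
            for (; pos < tag.Index; pos++, cleanIndex++)
            {
                if (depth > 0) result.Add(cleanIndex);
            }
            pos = tag.Index + tag.Length; // skip over the tag itself

            bool closing = tag.Groups[1].Length > 0;
            bool relevant = tag.Groups[2].Value.Contains(letter);
            if (relevant)
            {
                // Assumption: a stray closing tag never drives the depth below zero.
                depth = closing ? Math.Max(0, depth - 1) : depth + 1;
            }
        }

        // Text after the last tag counts only if a matching tag was left open.
        for (; pos < text.Length; pos++, cleanIndex++)
        {
            if (depth > 0) result.Add(cleanIndex);
        }
        return result;
    }
}
```

With the string from the question, `BbCodeIndices.FindIndices("[s]AB[/s]23[sb]45[/sb]AB45ABABAB", "s")` yields 0, 1, 4, 5. Because it walks the tags one at a time instead of matching open/close pairs, it also handles nested tags, for example [b] inside [s].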