c# regex word-boundary

Regular expression with specific word boundary


Let's say I have a string like

(Price+Discounted_Price)*2-Max.Price

and a dictionary that maps each name to its replacement:

Price: A1
Discounted_Price: A2
Max.Price: A3

How can I replace each phrase exactly, without touching the others? That is, replacing Price should not modify the Price inside Discounted_Price. The result should be (A1+A2)*2-A3, not (A1+Discounted_A1)*2-Max.A1 or anything else.

Thank you.


Solution

  • If your variables can consist of alphanumeric, underscore, and dot characters, you can match them with the [\w.]+ regex pattern and surround it with custom boundaries that also treat the dot as a word character:

    using System;
    using System.Collections.Generic;
    using System.Text.RegularExpressions;
    public class Test
    {
        public static void Main()
        {
            var s = "(Price+Discounted_Price)*2-Max.Price";
            var dct = new Dictionary<string, string>();
            dct.Add("Price", "A1");
            dct.Add("Discounted_Price", "A2");
            dct.Add("Max.Price","A3");
            var res = Regex.Replace(s, @"(?<![\w.])[\w.]+(?![\w.])",     // Find all matches with the regex inside s
                x => dct.ContainsKey(x.Value) ?   // Does the dictionary contain the key that equals the matched text?
                      dct[x.Value] :              // Use the value for the key if it is present to replace current match
                      x.Value);                   // Otherwise, insert the match found back into the result
            Console.WriteLine(res);
        }
    }
    

    See the IDEONE demo

    The (?<![\w.]) negative lookbehind fails the match if it is immediately preceded by a word or dot character, and the (?![\w.]) negative lookahead fails the match if it is immediately followed by a word or dot character.
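To see why the lookarounds are needed, the following minimal sketch compares a plain \b boundary with the lookaround-based boundary for the single key Price. The class and method names (BoundaryDemo, ReplaceWithWordBoundary, ReplaceWithLookarounds) are made up for illustration and are not part of the original answer.

```csharp
using System;
using System.Text.RegularExpressions;

public static class BoundaryDemo
{
    // Replaces key using the plain \b word boundary.
    // A dot is not a word character, so \b also holds right after "Max.".
    public static string ReplaceWithWordBoundary(string input, string key, string value) =>
        Regex.Replace(input, @"\b" + Regex.Escape(key) + @"\b", value);

    // Replaces key only when it is not adjacent to a word character or a dot.
    public static string ReplaceWithLookarounds(string input, string key, string value) =>
        Regex.Replace(input, @"(?<![\w.])" + Regex.Escape(key) + @"(?![\w.])", value);

    public static void Main()
    {
        var s = "(Price+Discounted_Price)*2-Max.Price";
        Console.WriteLine(ReplaceWithWordBoundary(s, "Price", "A1")); // (A1+Discounted_Price)*2-Max.A1
        Console.WriteLine(ReplaceWithLookarounds(s, "Price", "A1"));  // (A1+Discounted_Price)*2-Max.Price
    }
}
```

The underscore counts as a word character, so \b already protects Discounted_Price. Only the dot in Max.Price needs the extra lookarounds.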

    Note that [\w.]+ allows a dot in the leading and trailing positions, so you might want to replace it with \w+(?:\.\w+)* and use the pattern @"(?<![\w.])\w+(?:\.\w+)*(?![\w.])".
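The difference between the two token patterns shows up on input with stray dots. The sketch below lists what each pattern matches. The input string and the names DotPatternDemo, Loose, Strict, and Tokens are assumptions made for this illustration only.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

public static class DotPatternDemo
{
    // Accepts dots anywhere in the token, including at its start or end.
    public const string Loose = @"(?<![\w.])[\w.]+(?![\w.])";

    // Accepts dots only between runs of word characters.
    public const string Strict = @"(?<![\w.])\w+(?:\.\w+)*(?![\w.])";

    // Returns the text of every match of pattern in input.
    public static string[] Tokens(string input, string pattern) =>
        Regex.Matches(input, pattern).Cast<Match>().Select(m => m.Value).ToArray();

    public static void Main()
    {
        var s = ".5+Price.+Max.Price";
        Console.WriteLine(string.Join(" | ", Tokens(s, Loose)));  // .5 | Price. | Max.Price
        Console.WriteLine(string.Join(" | ", Tokens(s, Strict))); // Max.Price
    }
}
```

With the strict pattern, ".5" and "Price." are not matched at all, because the boundary lookarounds still reject a neighbouring dot.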

    UPDATE

    Since you have already extracted the keywords to replace as a list, you can run one replacement per keyword, using a custom word boundary that also treats the dot as a word character:

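    // Assumes s and dct are defined as in the first example.
    var listAbove = new List<string> { "Price", "Discounted_Price", "Max.Price" };
    var result = s;
    foreach (string phrase in listAbove)
    {
        // Regex.Escape makes the dot in "Max.Price" match a literal dot.
        // The \b anchors are redundant while every phrase starts and ends with a word character:
        // the lookarounds already imply them.
        result = Regex.Replace(result, @"\b(?<![\w.])" + Regex.Escape(phrase) +  @"\b(?![\w.])", dct[phrase]);
    }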
    

    See IDEONE demo.
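For reference, here is a self-contained sketch of the per-keyword loop that can be compiled on its own. It departs from the snippet above in two ways, both of which are assumptions rather than the answer's method: it drops the \b anchors and relies on the lookarounds alone, and it passes the replacement through a MatchEvaluator so that a '$' in a value is inserted literally instead of being read as a substitution such as $1. The names PhraseReplacer and ReplaceAll are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class PhraseReplacer
{
    // Hypothetical helper: applies each key/value pair in turn, one Regex.Replace per key.
    public static string ReplaceAll(string input, IDictionary<string, string> map)
    {
        var result = input;
        foreach (var pair in map)
        {
            // Match the escaped key only when no word character or dot touches it.
            var pattern = @"(?<![\w.])" + Regex.Escape(pair.Key) + @"(?![\w.])";
            // The evaluator returns the value as-is, so '$' has no special meaning.
            result = Regex.Replace(result, pattern, _ => pair.Value);
        }
        return result;
    }

    public static void Main()
    {
        var map = new Dictionary<string, string>
        {
            { "Price", "A1" },
            { "Discounted_Price", "A2" },
            { "Max.Price", "A3" }
        };
        Console.WriteLine(ReplaceAll("(Price+Discounted_Price)*2-Max.Price", map)); // (A1+A2)*2-A3
    }
}
```

Because each key is applied in a separate pass, a replacement value that is itself a key would be rewritten again by a later pass. The single-pass version in the first example does not have that problem.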