How to convert superscript characters to normal text in a C# string


I have strings with mathematical expressions like 2⁻¹² + 3³ / 4⁽³⁻¹⁾.

I want to convert these strings to the form of 2^-12 + 3^3 / 4^(3-1).

So far I can match each run of superscript characters and prepend a ^, but the characters themselves remain superscript.

Fiddle of code below: https://dotnetfiddle.net/1G9ewP

using System;
using System.Text.RegularExpressions;
                    
public class Program
{
    private static string ConvertSuperscriptToText(Match m){
        string res = m.Groups[1].Value;
            
        res = "^" + res;
        return res;
    }
    public static void Main()
    {
        string expression = "2⁻¹² + 3³ / 4⁽³⁻¹⁾";
        string desiredResult = "2^-12 + 3^3 / 4^(3-1)";
        
        string supChars = "([¹²³⁴⁵⁶⁷⁸⁹⁰⁺⁻⁽⁾]+)";
        string result = Regex.Replace(expression, supChars, ConvertSuperscriptToText);

        Console.WriteLine(result); // Currently prints 2^⁻¹² + 3^³ / 4^⁽³⁻¹⁾
        Console.WriteLine(result == desiredResult); // Currently prints false
    }
}

How can I replace the superscript characters without handling each one individually?

If I do have to replace them one by one, how can I do it with a collection, similar to PHP's str_replace, which accepts arrays as its search and replace arguments?

Bonus question: how can I convert all kinds of superscript characters to normal text and back to superscript?


Solution

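  • A dictionary that maps each superscript character to its plain equivalent is enough. Inside the match evaluator, use LINQ to translate every character of the match and build a new string from the result. Besides the usings from the question, the snippet needs using System.Collections.Generic; and using System.Linq;.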

    private static Dictionary<char, char> scriptMapping = new Dictionary<char, char>()
    {
        ['¹'] = '1',
        ['²'] = '2',
        ['³'] = '3',
        ['⁴'] = '4',
        ['⁵'] = '5',
        ['⁶'] = '6',
        ['⁷'] = '7',
        ['⁸'] = '8',
        ['⁹'] = '9',
        ['⁰'] = '0',
        ['⁺'] = '+',
        ['⁻'] = '-',
        ['⁽'] = '(',
        ['⁾'] = ')',
    };
    
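    private static string ConvertSuperscriptToText(Match m){
        // Group 1 holds one run of consecutive superscript characters.
        string res = m.Groups[1].Value;
    
        // Translate each character through the dictionary and prefix the run with '^'.
        // The indexer throws KeyNotFoundException if the regex matches a character
        // that is missing from scriptMapping, so keep the two in sync.
        res = "^" + new string(res.Select(c => scriptMapping[c]).ToArray());
        return res;
    }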
    

    You could also build the regex from the dictionary keys, so there is only one place to add new superscripts.

    string supChars = "([" + new string(scriptMapping.Keys.ToArray()) + "]+)";
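
Putting the pieces above together, here is a minimal, self-contained sketch that also touches on the bonus question by inverting the dictionary for the way back. The names SuperscriptConverter, ToPlain, ToSuperscript and Translate are made up for illustration and are not part of the original answer. The reverse direction rests on an assumption about the plain-text syntax: an exponent is a ^ followed by either a signed integer or a single parenthesized group without nested parentheses. Other exponent forms would need a different pattern.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

public static class SuperscriptConverter
{
    // Superscript character -> plain character (same table as in the answer).
    private static readonly Dictionary<char, char> ToPlainMap = new Dictionary<char, char>
    {
        ['⁰'] = '0', ['¹'] = '1', ['²'] = '2', ['³'] = '3', ['⁴'] = '4',
        ['⁵'] = '5', ['⁶'] = '6', ['⁷'] = '7', ['⁸'] = '8', ['⁹'] = '9',
        ['⁺'] = '+', ['⁻'] = '-', ['⁽'] = '(', ['⁾'] = ')',
    };

    // Plain character -> superscript character, derived by inverting the table.
    private static readonly Dictionary<char, char> ToSuperMap =
        ToPlainMap.ToDictionary(kv => kv.Value, kv => kv.Key);

    // One or more consecutive superscript characters, built from the dictionary keys.
    private static readonly Regex SuperRun =
        new Regex("[" + new string(ToPlainMap.Keys.ToArray()) + "]+");

    // ASSUMPTION: a plain-text exponent is '^' followed by one parenthesized
    // group without nesting, or by an optionally signed integer.
    private static readonly Regex CaretExponent =
        new Regex(@"\^(\([^()]*\)|[+-]?[0-9]+)");

    // "2⁻¹²" -> "2^-12": prefix each superscript run with '^' and translate it.
    public static string ToPlain(string input) =>
        SuperRun.Replace(input, m => "^" + Translate(m.Value, ToPlainMap));

    // "2^-12" -> "2⁻¹²": drop the '^' and translate the exponent.
    public static string ToSuperscript(string input) =>
        CaretExponent.Replace(input, m => Translate(m.Groups[1].Value, ToSuperMap));

    // Maps each character through the table; characters without an entry pass through unchanged.
    private static string Translate(string text, Dictionary<char, char> map) =>
        new string(text.Select(c => map.TryGetValue(c, out char r) ? r : c).ToArray());
}

public class Program
{
    public static void Main()
    {
        // Some consoles need UTF-8 output to display superscript characters.
        Console.OutputEncoding = System.Text.Encoding.UTF8;

        string plain = SuperscriptConverter.ToPlain("2⁻¹² + 3³ / 4⁽³⁻¹⁾");
        Console.WriteLine(plain); // 2^-12 + 3^3 / 4^(3-1)
        Console.WriteLine(SuperscriptConverter.ToSuperscript(plain)); // 2⁻¹² + 3³ / 4⁽³⁻¹⁾
    }
}
```

Using TryGetValue instead of the indexer means a character without a mapping (a letter inside a parenthesized exponent, say) is left as it is rather than throwing. Whether that is the right behavior depends on your input. Superscript letters and other less common superscripts can be covered by adding entries to the dictionary, and both the regex and the reverse table pick them up automatically.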