C# File Path encoding and decoding


I am looking for a simple method to encode/escape and decode/unescape file paths, that is, to handle the characters that are illegal in file paths: " \ / ? : < > * |

HttpUtility.UrlEncode does its job, except it does not encode the * character.

All I could find was escaping with regex, or just replacing the illegal chars with _.

I want to be able to encode/decode consistently.

I want to know whether there's a predefined way to do that, or whether I just need to write one piece of code to encode and another to decode.

Thanks


Solution

  • I've never tried anything like this before, so I threw this together:

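    using System;
    using System.Text.RegularExpressions;

    static class PathEscaper
    {
        // Characters that are not allowed in the path: " \ / ? : < > * |
        static readonly string invalidChars = @"""\/?:<>*|";
        // The escape character is escaped too, so Unescape can reverse Escape exactly.
        static readonly string escapeChar = "%";

        // Matches the escape character or any single invalid character.
        static readonly Regex escaper = new Regex(
            "[" + Regex.Escape(escapeChar + invalidChars) + "]",
            RegexOptions.Compiled);
        // Matches the escape character followed by exactly four uppercase hex digits.
        static readonly Regex unescaper = new Regex(
            Regex.Escape(escapeChar) + "([0-9A-F]{4})",
            RegexOptions.Compiled);

        public static string Escape(string path)
        {
            // Replace each match with '%' plus its UTF-16 code as four hex digits.
            return escaper.Replace(path,
                m => escapeChar + ((int)m.Value[0]).ToString("X4"));
        }

        public static string Unescape(string path)
        {
            // Turn each %XXXX sequence back into the character it encodes.
            return unescaper.Replace(path,
                m => ((char)Convert.ToInt32(m.Groups[1].Value, 16)).ToString());
        }
    }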
    

    Escape replaces each forbidden character, and the % escape character itself, with a % followed by its 16-bit code written as four hex digits; Unescape turns those sequences back into the original characters. (You could probably get away with an 8-bit representation for the specific characters you have, but I thought I'd err on the safe side.)
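
For comparison, here is a rough sketch of the 8-bit variant mentioned in the parenthetical, with a round-trip check. The names CompactPathEscaper and Demo are made up for illustration. Two hex digits are only enough because every character in this particular set is plain ASCII, so treat it as a sketch under that assumption rather than a general-purpose encoder.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical two-hex-digit variant; assumes every escaped character is below 0x100.
static class CompactPathEscaper
{
    // Same forbidden set as the question, plus '%' itself so escaping stays reversible.
    static readonly Regex escaper = new Regex(@"[%""\\/?:<>*|]");
    // '%' followed by exactly two uppercase hex digits.
    static readonly Regex unescaper = new Regex("%([0-9A-F]{2})");

    public static string Escape(string name)
    {
        return escaper.Replace(name,
            m => "%" + ((int)m.Value[0]).ToString("X2"));
    }

    public static string Unescape(string name)
    {
        return unescaper.Replace(name,
            m => ((char)Convert.ToInt32(m.Groups[1].Value, 16)).ToString());
    }
}

static class Demo
{
    static void Main()
    {
        string original = "report: 50% <draft>?*.txt";
        string escaped = CompactPathEscaper.Escape(original);

        Console.WriteLine(escaped);   // report%3A 50%25 %3Cdraft%3E%3F%2A.txt
        Console.WriteLine(CompactPathEscaper.Unescape(escaped) == original);   // True
    }
}
```

Because % is escaped as well, a name that already contains something like %3A goes out as %253A and comes back unchanged, which is what makes the encode/decode pair consistent.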