ConvertTo-Json and ConvertFrom-Json with special characters


I have a JSON file in which some property values contain escape characters, for example URLs and regex patterns.

When I read the content and convert it back to JSON, the result is not correct, with or without unescaping. If I convert back to JSON with unescaping, some regular expressions break; if I convert without unescaping, URLs and some regular expressions break.

How can I solve the problem?

Minimal Complete Verifiable Example

Here are some simple code blocks that let you reproduce the problem:

Content

$fileContent = 
@"
{
    "something":  "http://domain/?x=1&y=2",
    "pattern":  "^(?!(\\`|\\~|\\!|\\@|\\#|\\$|\\||\\\\|\\'|\\\")).*"
}
"@

With Unescape

If I read the content and then convert it back to JSON using the following command:

$fileContent | ConvertFrom-Json | ConvertTo-Json | %{[regex]::Unescape($_)}

The output (which is wrong) would be:

{
    "something":  "http://domain/?x=1&y=2",
    "pattern":  "^(?!(\|\~|\!|\@|\#|\$|\||\\|\'|\")).*"
}

Without Unescape

If I read the content and then convert it back to JSON using the following command:

$fileContent | ConvertFrom-Json | ConvertTo-Json 

The output (which is wrong) would be:

{
    "something":  "http://domain/?x=1\u0026y=2",
    "pattern":  "^(?!(\\|\\~|\\!|\\@|\\#|\\$|\\||\\\\|\\\u0027|\\\")).*"
}

Expected Result

The expected result should be the same as the input file content.


Solution

  • I decided not to use Unescape and instead replace the Unicode \uxxxx escape sequences with the characters they represent. Now it works properly:

    $fileContent = 
    @"
    {
        "something":  "http://domain/?x=1&y=2",
        "pattern":  "^(?!(\\`|\\~|\\!|\\@|\\#|\\$|\\||\\\\|\\'|\\\")).*"
    }
    "@
    
    $fileContent | ConvertFrom-Json | ConvertTo-Json | ForEach-Object {
        # Replace each \uXXXX escape emitted by ConvertTo-Json with the character it encodes.
        [Regex]::Replace($_,
            "\\u(?<Value>[0-9a-fA-F]{4})", {
                param($m) ([char]([int]::Parse($m.Groups['Value'].Value,
                    [System.Globalization.NumberStyles]::HexNumber))).ToString() } )}
    

    Which generates the expected output:

    {
        "something":  "http://domain/?x=1&y=2",
        "pattern":  "^(?!(\\|\\~|\\!|\\@|\\#|\\$|\\||\\\\|\\'|\\\")).*"
    }
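
The replacement above can be wrapped in a reusable function. The following is only a sketch, not part of the original solution: the name Convert-JsonUnicodeEscape and its parameter are hypothetical. It assumes two extra precautions are wanted. First, a \u that is itself preceded by an escaped backslash (the text \\u0027) is left alone. Second, escapes for control characters, the double quote, and the backslash are kept, because writing those characters literally would make the JSON invalid.

```powershell
# Hypothetical helper (not from the original answer): decodes \uXXXX escapes
# in a JSON string, but only where the literal character is safe to emit.
function Convert-JsonUnicodeEscape {
    param(
        [Parameter(Mandatory, ValueFromPipeline)]
        [string] $Json
    )
    process {
        # 'Pre' captures any run of escaped backslashes (\\) directly before
        # the \u, so the text \\u0027 (a literal backslash followed by
        # "u0027") never matches and stays untouched.
        $pattern = '(?<![\\])(?<Pre>(?:\\\\)*)\\u(?<Value>[0-9a-fA-F]{4})'
        [regex]::Replace($Json, $pattern, {
            param($m)
            $code = [Convert]::ToInt32($m.Groups['Value'].Value, 16)
            # Keep the escape for control characters, the double quote and
            # the backslash: emitting them literally would break the JSON.
            if ($code -lt 0x20 -or $code -eq 0x22 -or $code -eq 0x5C) {
                return $m.Value
            }
            $m.Groups['Pre'].Value + [char]$code
        })
    }
}
```

Under those assumptions it would be used in place of the inline ForEach-Object block:

    $fileContent | ConvertFrom-Json | ConvertTo-Json | Convert-JsonUnicodeEscape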