
PowerShell ConvertFrom-Json Encoding Special Characters Issue


I have this code in my PowerShell script, and it doesn't handle special characters correctly.

 $request = 'http://151.80.109.18:8082/vrageremote/v1/session/players'
 $a = Invoke-WebRequest -ContentType "application/json; charset=utf-8" $request |
 ConvertFrom-Json    |
 Select -expand Data |
 Select -expand players |
 Select displayName, factionTag | Out-file "$scriptPath\getFactionTag.txt"

In my output file I only get '????' for any special characters. Does anyone know how I can get it to show special characters in my output file?


Solution

  • Peter Schneider's helpful answer and Nas' helpful answer both address one problem with your approach. You need to:

    • either: access the .Content property on the response object returned by Invoke-WebRequest to get the actual data returned (as a JSON string), which you can then pass to ConvertFrom-Json.

    • or: use Invoke-RestMethod instead, which returns the data directly and parses it into custom objects, so you can work with these objects without calling ConvertFrom-Json yourself. However, with a character-encoding problem such as this one, that is not an option, because the JSON string must be explicitly re-decoded first - see below.
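
As a rough illustration of the first option, the sketch below feeds the JSON text from a .Content property into ConvertFrom-Json. To keep it runnable without network access, `$response` is a hypothetical stand-in object with made-up sample data; the Data/players shape is assumed from the pipeline in the question, not confirmed against the real service.

```powershell
# Hypothetical stand-in for the object returned by Invoke-WebRequest.
# A real response exposes the body text through the same .Content property.
$response = [pscustomobject]@{
  Content = '{"Data":{"players":[{"displayName":"Alice","factionTag":"ABC"},{"displayName":"Bob","factionTag":"XYZ"}]}}'
}

# Parse the JSON text from .Content, then drill down to the player entries.
$players = $response.Content |
  ConvertFrom-Json |
  Select-Object -ExpandProperty Data |
  Select-Object -ExpandProperty players |
  Select-Object displayName, factionTag

$players
```

With a real request, `$response` would simply be the result of Invoke-WebRequest; the rest of the pipeline stays the same.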

    However, you still have a character-encoding problem: when the response header contains no charset information, PowerShell interprets the UTF-8-encoded JSON string as ISO-8859-1-encoded. This applies to Windows PowerShell as well as to PowerShell (Core) up to v7.3.3, except that v7.0+ defaults to UTF-8 for JSON specifically.
    v7.4+ uses UTF-8 as the general default, i.e. for all media types.
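
The effect of that misinterpretation can be reproduced offline. The following is a minimal sketch with made-up sample data: it encodes a small JSON string as UTF-8 bytes, then decodes those same bytes once as ISO-8859-1 (the problematic default) and once as UTF-8. The non-ASCII character is built from its code point so that the script does not depend on how the script file itself is encoded.

```powershell
# "Zoë", built from a code point (U+00EB) to avoid script-file encoding issues.
$name = "Zo$([char]0x00EB)"
$json = "{`"displayName`":`"$name`"}"

# The bytes a web service would send for this JSON when it uses UTF-8.
$utf8Bytes = [System.Text.Encoding]::UTF8.GetBytes($json)

# Decoding the same bytes with the wrong and with the right encoding.
$latin1  = [System.Text.Encoding]::GetEncoding('iso-8859-1')
$misread = $latin1.GetString($utf8Bytes)
$correct = [System.Text.Encoding]::UTF8.GetString($utf8Bytes)

($misread | ConvertFrom-Json).displayName   # ZoÃ«  (two characters instead of ë)
($correct | ConvertFrom-Json).displayName   # Zoë
```

Each byte of the two-byte UTF-8 sequence for ë is read as a separate ISO-8859-1 character, which is why the text must be decoded from the raw bytes rather than taken from the already-misread string.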

    There are two possible solutions:

    • Preferably, amend the web service to include charset=utf-8 in the response header's Content-Type field.

    • If you can't do that, you must perform your own decoding, based on the raw bytes of the response body, accessible via the .RawContentStream property.

    Here's the implementation of the latter:

    # Note that there's no point in using 
    # -ContentType  "application/json; charset=utf-8" in this case,
    # as -ContentType only applies to data sent *to* the web service.
    $request = 'http://151.80.109.18:8082/vrageremote/v1/session/players'
    $a = Invoke-WebRequest $request
    
    # $a.Content cannot be used, because it contains the *misinterpreted* JSON string,
    # but $a.RawContentStream provides access to the raw bytes,
    # which you can decode into a string with the encoding of choice.
    $jsonCorrected = 
      [Text.Encoding]::UTF8.GetString(
        $a.RawContentStream.ToArray()
      )
    
    # Now process the reinterpreted string.
    $jsonCorrected |
      ConvertFrom-Json    |
      Select -expand Data |
      Select -expand players |
      Select displayName, factionTag | Out-file "$scriptPath\getFactionTag.txt"
    

    Note:

    • A separate answer provides the convenience function ConvertTo-BodyWithEncoding, which wraps the functionality above.
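
For illustration only, the decoding step could be wrapped along the following lines. `ConvertFrom-RawBody` is a hypothetical helper written for this sketch; it is not the ConvertTo-BodyWithEncoding function mentioned above and makes no claim to match it. It assumes the body is available as a MemoryStream (as .RawContentStream is), honors a charset found in an optional Content-Type string via a simple regex, and otherwise falls back to UTF-8.

```powershell
# Hypothetical helper: decodes a response body from its raw bytes.
function ConvertFrom-RawBody {
  param(
    [Parameter(Mandatory)] [System.IO.MemoryStream] $Stream,
    [string] $ContentType,
    [System.Text.Encoding] $DefaultEncoding = [System.Text.Encoding]::UTF8
  )

  $encoding = $DefaultEncoding
  # Use the declared charset if the Content-Type value contains one.
  # Note: GetEncoding() throws for charset names that .NET does not recognize.
  if ($ContentType -match 'charset\s*=\s*"?([^";\s]+)') {
    $encoding = [System.Text.Encoding]::GetEncoding($Matches[1])
  }

  $encoding.GetString($Stream.ToArray())
}

# Usage with made-up sample bytes instead of a live response:
$sampleJson = "{`"displayName`":`"Zo$([char]0x00EB)`"}"
$stream = [System.IO.MemoryStream]::new([System.Text.Encoding]::UTF8.GetBytes($sampleJson))

# No charset in the header value, so UTF-8 is assumed.
$decoded = ConvertFrom-RawBody -Stream $stream -ContentType 'application/json'
($decoded | ConvertFrom-Json).displayName
```

With a real response, the stream argument would be the response's .RawContentStream, and the decoded string can be piped to ConvertFrom-Json as in the solution above.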