I'm working on a config installer for a game. I want to make a menu for the user to choose from different colors for certain settings. To change those colors I use a PowerShell command in a batch file to find and replace the relevant text in a specific file. There is no problem with that alone.
In the replacement process, PowerShell also replaces the newline character found in the config file with a "?". That is not intended and I want to avoid that.
The character that gets replaced with a "?" is the following:
↵
I want to exclude that character from getting replaced in the process.
My code looks like this:
powershell -command "& {($p=gc "path.txt");(gc $p\GameConfig\SpecificFile.txt).replace('<col:Default>','<col:Green>') | sc $p\GameConfig\SpecificFile.txt}"
I have already tried to exclude the character like so:
powershell -command "& {($p=gc "path.txt");(gc $p\GameConfig\SpecificFile.txt).replace[↵]::escape('<col:Default>','<col:Green>') | sc $p\GameConfig\SpecificFile.txt}"
That didn't work.
I also tried to revert the replacement process of the newline character like so:
powershell -command "& {($p=gc "path.txt");(gc $p\GameConfig\SpecificFile.txt).replace('<col:Default>','<col:Green>') | sc $p\GameConfig\SpecificFile.txt}"
powershell -command "& {($p=gc "path.txt");(gc $p\GameConfig\SpecificFile.txt).replace('>?<','>↵<') | sc $p\GameConfig\SpecificFile.txt}"
That didn't work either.
I really need some help. Thanks in advance!
Cheers
tl;dr
Use the -Encoding parameter of Set-Content (whose built-in alias is sc in Windows PowerShell) to specify a Unicode character encoding, so that Unicode characters such as ↵ (DOWNWARDS ARROW WITH CORNER LEFTWARDS, U+21B5) are preserved. To use UTF-8 encoding, for instance, add -Encoding utf8:
powershell -command "$p=gc path.txt; (gc -Encoding utf8 $p\GameConfig\SpecificFile.txt).Replace('<col:Default>','<col:Green>') | sc -Encoding utf8 $p\GameConfig\SpecificFile.txt"
A streamlined reformulation that speeds up processing by reading the file as a whole rather than line by line, using Get-Content's -Raw switch as well as Set-Content's -NoNewLine switch:
powershell -command "$p=(gc path.txt)+'\GameConfig\SpecificFile.txt'; (gc -Raw -Encoding utf8 $p).Replace('<col:Default>','<col:Green>') | sc -Encoding utf8 -NoNewLine $p"
To instead use UTF-16LE ("Unicode") encoding, use -Encoding Unicode (sic).
Note:
In Windows PowerShell, the legacy, ships-with-Windows, Windows-only PowerShell edition you're using, -Encoding utf8 invariably creates a UTF-8 file with a BOM.
If that is undesired, you need a workaround - see this answer.
Note that if / once your input files are BOM-less UTF-8 files, you also need to use -Encoding utf8 for reading them properly with Get-Content (whose built-in alias is gc), as used in the command above; without that, the file would be misinterpreted as ANSI-encoded in Windows PowerShell (see next point).
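As a rough illustration of one common BOM-less workaround (not necessarily the one the linked answer describes), the .NET file APIs can be called directly with a UTF8Encoding instance constructed with $false, which suppresses the BOM. The temp-file name below is hypothetical and only there to make the sketch self-contained; in the installer you would use the full path of the real config file, since .NET resolves relative paths against its own working directory rather than PowerShell's.

```powershell
# Hypothetical demo file in the temp folder (stands in for SpecificFile.txt).
$file = Join-Path ([System.IO.Path]::GetTempPath()) 'SpecificFile-demo.txt'

# UTF-8 encoding object that does NOT emit a BOM when writing.
$utf8NoBom = New-Object System.Text.UTF8Encoding $false

# Create sample content containing U+21B5 between two tags.
$sample = '<col:Default>' + [char]0x21B5 + '<col:Default>'
[System.IO.File]::WriteAllText($file, $sample, $utf8NoBom)

# Read, replace, and write back, all as BOM-less UTF-8.
$text = [System.IO.File]::ReadAllText($file, $utf8NoBom)
$text = $text.Replace('<col:Default>', '<col:Green>')
[System.IO.File]::WriteAllText($file, $text, $utf8NoBom)

# Inspect the result: the first bytes and the decoded text.
$firstBytes = [System.IO.File]::ReadAllBytes($file)[0..2]
$result = [System.IO.File]::ReadAllText($file, $utf8NoBom)
```

If the file had a BOM, $firstBytes would be EF BB BF; here the file should instead start directly with the bytes of the text, and $result should still contain the ↵ character.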
By default, Windows PowerShell's Set-Content cmdlet uses ANSI encoding, i.e. the fixed-width 8-bit character encoding associated with your system's legacy system locale (aka language for non-Unicode programs), such as Windows-1252 on US-English systems. Trying to save a Unicode character such as ↵ that cannot be represented in such an encoding results in an (ASCII-range) ? character getting saved instead, which is what you saw.
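The substitution can be reproduced without touching any file. This is a minimal sketch that assumes Windows-1252 as the ANSI code page (yours may differ) and uses a made-up sample string; it encodes the text the way an ANSI save would and then decodes the bytes again:

```powershell
# Assumption: Windows-1252 stands in for the system's ANSI code page.
$ansi = [System.Text.Encoding]::GetEncoding(1252)

# Made-up sample text with U+21B5 between two tags.
$original = '<a>' + [char]0x21B5 + '<b>'

# Encode to ANSI bytes, as saving with the default encoding would.
$bytes = $ansi.GetBytes($original)

# Show the bytes as hex and the text they decode back to.
$hex = ($bytes | ForEach-Object { $_.ToString('X2') }) -join ' '
$roundTripped = $ansi.GetString($bytes)
$hex
$roundTripped
```

Consistent with the behavior described above, the byte in the middle should come out as 3F, the code of the ASCII ? character, so the original ↵ is no longer recoverable from the saved bytes.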
Note that the PowerShell (Core) 7+ edition now fortunately consistently defaults to (BOM-less) UTF-8.
Generally, note that PowerShell's pipelines are not raw byte conduits: text file contents as well as output from external programs are invariably decoded into .NET strings before further processing, so that a Get-Content ... | Set-Content ... pipeline never preserves the original character encoding and instead uses Set-Content's default encoding on writing (unless the -Encoding parameter is used); see this answer for background information.