PowerShell Set-Content encoding

Part of the script looks like this:

$template = Get-Content "./template/temaplate.htm" -raw
$html = $template.Replace('{{imie}}', $imie).Replace('{{nazwisko}}', $nazwisko).Replace('{{stanowisko}}', $stanowisko).Replace('{{mobile}}', $mobile).Replace('{{kapital}}', $kapital).Replace('{{telefon}}', $telefon)
Set-Content -Encoding UTF8 "output/podpis.htm" -Value $html

temaplate.htm contains, for example, the words "Sąd" and "Wrocław", but after running Set-Content all Polish special characters are garbled: "SÄ…d", "WrocĹ‚aw". I don't really understand why. The template also sets:

<meta charset="UTF-8">

Solution

Your symptom implies:

- Your file is UTF-8-encoded but doesn't have a BOM.
- You're using Windows PowerShell, where Get-Content defaults to the system's active ANSI code page, and therefore misinterprets your file: [1]
  - Note that Get-Content does not try to interpret the content of the file, and therefore the presence of <meta charset="UTF-8"> inside it is irrelevant.
    All that matters is whether the file starts with a Unicode BOM (which unequivocally identifies the character encoding) or not (in which case an encoding must be assumed).
  - Using -Encoding utf8 only with Set-Content is then too late, because the misinterpretation has already happened.
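
If you want to confirm that the template really lacks a BOM, you can inspect its first three bytes. The following is a minimal sketch; Test-Utf8Bom is a hypothetical helper name, not a built-in cmdlet, and it reads the whole file, which is fine for a small template.

```powershell
# Hypothetical helper (not a built-in cmdlet): returns $true if the file
# starts with the UTF-8 BOM bytes EF BB BF.
function Test-Utf8Bom {
    param([Parameter(Mandatory)] [string] $Path)

    # Resolve the path against PowerShell's current location, because .NET
    # methods resolve relative paths against the process working directory.
    $fullPath = Convert-Path -LiteralPath $Path
    $bytes = [System.IO.File]::ReadAllBytes($fullPath)

    return ($bytes.Length -ge 3 -and
            $bytes[0] -eq 0xEF -and
            $bytes[1] -eq 0xBB -and
            $bytes[2] -eq 0xBF)
}
```

For example, Test-Utf8Bom ./template/temaplate.htm should return $false for a BOM-less file, which is the case where Windows PowerShell falls back to the ANSI code page.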

Note that you would not have this problem in PowerShell (Core) 7+, which consistently defaults to (BOM-less) UTF-8.

Therefore, use -Encoding utf8 also in your Get-Content call:

$template = Get-Content -Encoding UTF8 "./template/temaplate.htm" -Raw
# ...
Set-Content -Encoding UTF8 "output/podpis.htm" -Value $html
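
If the script has many Get-Content calls, presetting the parameter may be more convenient than repeating -Encoding each time. The following is a sketch that assumes the $PSDefaultParameterValues preference variable (available in PowerShell 3 and later) is applied to Get-Content's -Encoding parameter; the demo file name is hypothetical.

```powershell
# Assumption: this runs before any Get-Content call in the script.
# Preset -Encoding utf8 for Get-Content in the current scope.
$PSDefaultParameterValues['Get-Content:Encoding'] = 'utf8'

# Hypothetical demo file: the BOM-less UTF-8 bytes of "Sąd" (ą = 0xC4 0x85).
$demoPath = Join-Path ([System.IO.Path]::GetTempPath()) 'utf8-default-demo.htm'
[System.IO.File]::WriteAllBytes($demoPath, [byte[]] (0x53, 0xC4, 0x85, 0x64))

# No -Encoding argument here; the preset default is expected to apply.
$text = Get-Content $demoPath -Raw
$text.Length   # 3 if the file was decoded as UTF-8, 4 if decoded as ANSI

Remove-Item $demoPath
```

The preset only affects the scope in which it is set (and its child scopes), so it does not change the behavior of other sessions.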

Caveat:

In Windows PowerShell, Set-Content -Encoding UTF8 invariably creates a UTF-8 file with BOM. If that is undesired, use New-Item as a workaround:

# Creates a BOM-less UTF-8 file even in Windows PowerShell.
New-Item -Force "output/podpis.htm" -Value $html

(Again, in PowerShell (Core) 7+ you wouldn't have that problem: all cmdlets there create BOM-less UTF-8 files by default; -Encoding utf8bom is needed to explicitly request a BOM.)
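
As an alternative to New-Item, you could call the .NET file API directly, which lets you state the BOM choice explicitly. This is a sketch only: the $html value and the output file name are stand-ins, and a temporary directory is used so that the example does not touch your real output folder.

```powershell
# Stand-in for the $html value built by the script ("ą" given as a code point).
$html = "<p>S$([char]0x0105)d</p>"

# .NET resolves relative paths against the process working directory, not
# PowerShell's current location, so pass a full path.
$outPath = Join-Path ([System.IO.Path]::GetTempPath()) 'podpis-demo.htm'   # hypothetical name

# UTF8Encoding constructed with $false does not emit a BOM.
$utf8NoBom = New-Object System.Text.UTF8Encoding $false
[System.IO.File]::WriteAllText($outPath, $html, $utf8NoBom)
```

Unlike Set-Content, WriteAllText does not append a trailing newline, so the file contains exactly the string you pass.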

See this answer for additional information.

[1] Specifically, each byte in a multi-byte UTF-8 encoding sequence representing a single non-ASCII-range character is misinterpreted as its own character, namely a character from the ANSI character set. You can reproduce this as follows, assuming that Windows-1252 is the active ANSI code page: [Text.Encoding]::GetEncoding(1252).GetString([Text.Encoding]::UTF8.GetBytes('ą')) - this yields Ä…, i.e. two (different) characters, as in your question.
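
The "Ĺ‚" in your "WrocĹ‚aw" output is what the UTF-8 bytes of "ł" look like under Windows-1250 (Central European), so that is probably your active ANSI code page rather than 1252. The sketch below reproduces both garbled words under that assumption. The words are built from code points so that the snippet does not depend on how the script file itself is encoded.

```powershell
# Assumption: code page 1250, inferred from the garbled output in the question.
$ansi = [System.Text.Encoding]::GetEncoding(1250)

# "Sąd" and "Wrocław", with the non-ASCII letters given as code points.
$words = "S$([char]0x0105)d", "Wroc$([char]0x0142)aw"

$garbled = foreach ($word in $words) {
    # Encode as UTF-8, then decode the same bytes as if they were ANSI text.
    $ansi.GetString([System.Text.Encoding]::UTF8.GetBytes($word))
}

$garbled   # SÄ…d and WrocĹ‚aw, matching the output in the question
```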