
PowerShell: REG_MULTI_SZ into String (Export / Import Scenario)


I need your help :). I'm trying to convert Roger Zander's wonderful Reg2PS into my own PowerShell script, which I want to use in Intune for some convenient registry configuration. Greetings at this point to RZ! https://reg2ps.azurewebsites.net/

I found his code here: https://github.com/rzander/Reg2CI/blob/master/Source/REG2PS/RegClass.cs

The relevant part of the code for this topic:

if (DataType == ValueType.MultiString)
{
    //return Encoding.UTF8.GetString(Encoding.Unicode.GetBytes(_value.Replace("hex(7):", "").Replace(" ", "")));
    string[] aHex = _value.Replace("hex(7):", "").Replace(" ", "").Split(',');
    List<byte> bRes = new List<byte>();

    foreach (string sVal in aHex)
    {
        bRes.Add(Convert.ToByte(sVal, 16));
    }

    string sResult = Encoding.Unicode.GetString(bRes.ToArray());
    _svalue = "\"" + string.Join(",", sResult.TrimEnd('\0').Split('\0')) + "\"";
    return sResult.TrimEnd('\0').Split('\0');
}

After seeing other people try to translate it or write their own version, I realized that Roger's code handles REG_MULTI_SZ better overall (whitespace, "", and so on).

I'm fine with every other type but struggle with REG_MULTI_SZ, like so many others.

I export a REG_MULTI_SZ registry value with reg.exe.

Example:

$PlainValue = '"MultiStringProperty"=hex(7):56,00,61,00,6c,00,75,00,65,00,31,00,00,00,56,00,\61,00,6c,00,75,00,65,00,32,00,00,00,56,00,61,00,6c,00,75,00,65,00,33,00,00,\00,00,00'

# Get the value itself
$Value = $PlainValue.Split("=")[1]

# Remove the hex type prefix
$Value = $Value.Replace( "hex(7):","" )

# Learned from a hex-to-UTF-16 converter: the "\" is just reg.exe's way of wrapping long lines in the file -> remove it!
$Value = $Value.Replace("\","")

And now the problem begins. What worked for me is, for example:

$Value1 = $Value -split (",") | % { [INT]"0x$_" }
$Value1 = [Text.Encoding]::Unicode.GetString($Value1)

$Value2 = $Value -split (",") | % { [char][byte]"0x$_" }
$Value2 = $Value2 -join ''

My goal is a final string in this format: @("Value1","Value2","Value3")

Problem: the string I get in $Value1 or $Value2 does not respond to .Split(" ").

If I split beforehand, like this:

# Splitting Hex Spaces
$Separator = ",00,00,00,"
$ArrValues = $Value -split $separator
$TempValues = @()
ForEach( $Val in $ArrValues )
{
     $Temp = $Val -split (",") | % { [INT]"0x$_" }
     [Text.Encoding]::Unicode.GetString($Temp)
     $TempValues += [Text.Encoding]::Unicode.GetString($Temp) 
}

Then PowerShell has problems converting the digits ...

Am I misunderstanding something, or does PowerShell have problems converting the array? In the meantime I'm trying to find a better separator than 00,00,00 ....

Please correct me if I'm wrong, but as far as I know, no whitespace is allowed within a REG_MULTI_SZ value, right?


Solution

  • I suggest a different approach based on -replace, the regular-expression-based string replacement operator, in combination with -split (which is also regex-based by default). It is both more concise and more efficient:

    # Convert the hex string to a byte array.
    [byte[]] $bytes = 
      $plainValue -replace '^.+=hex\(7\):|\\' -split ',' -replace '^', '0x'
    
    # Interpret the byte array as a Unicode string, remove the trailing NULLs,
    # and split the result into individual strings by NULL
    $strings = 
      [Text.Encoding]::Unicode.GetString($bytes).TrimEnd("`0") -split "`0"
    

    Note:

    • -replace '^.+=hex\(7\):|\\' replaces everything up to and including =hex(7): as well as (|) any embedded \ (escaped as \\) from the input string, leaving just the "hex-byte string"

    • -split ',' splits the byte string into individual bytes (represented as two-digit hex strings without a prefix)

    • -replace '^', '0x' effectively prepends '0x' to each byte representation (^ matches the position at the start of each individual byte, which is where 0x, the substitution string is placed).

    • The array of resulting strings (@('0x56', '0x00', '0x61', ...)) is then automatically converted to [byte] values by PowerShell, due to the [byte[]] type constraint on the target variable, $bytes.

    • [Text.Encoding]::Unicode.GetString($bytes) then converts the resulting byte array to the text it represents, based on interpreting the bytes as "Unicode" (UTF-16LE) encoded.

    • .TrimEnd("`0") removes the trailing NULL characters from the resulting text (REG_MULTI_SZ values have two NULLs as terminators, to signal that no more strings are present).

    • -split "`0" then splits the resulting string by the interior NULLs that separate the individual strings inside a REG_MULTI_SZ value, yielding an array of those strings.
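
The steps in the note above can be wrapped into a small helper. This is only a minimal sketch: the function name ConvertFrom-RegMultiSz and the variable names are hypothetical (they are not part of PowerShell or Reg2PS), and the extra \s in the regex assumes the input may still contain whitespace from the .reg file. The last statement builds the @("Value1","Value2","Value3") literal asked for in the question, assuming the values contain no embedded double quotes.

```powershell
# Hypothetical helper: decodes one '"Name"=hex(7):..' line as exported by reg.exe.
function ConvertFrom-RegMultiSz {
    param(
        [Parameter(Mandatory)]
        [string] $RegLine
    )

    # Strip the '"Name"=hex(7):' prefix, the "\" line continuations, and any whitespace.
    $hex = $RegLine -replace '^.+=hex\(7\):|\\|\s'

    # Split into two-digit hex strings, prefix each with 0x, and let the
    # [byte[]] type constraint convert them to bytes.
    [byte[]] $bytes = $hex -split ',' -replace '^', '0x'

    # Decode as UTF-16LE, drop the trailing NULs, then split on the interior NULs.
    [Text.Encoding]::Unicode.GetString($bytes).TrimEnd("`0") -split "`0"
}

$plainValue = '"MultiStringProperty"=hex(7):56,00,61,00,6c,00,75,00,65,00,31,00,00,00,56,00,\61,00,6c,00,75,00,65,00,32,00,00,00,56,00,61,00,6c,00,75,00,65,00,33,00,00,\00,00,00'

# @(...) keeps the result an array even if the value holds a single string.
$strings = @(ConvertFrom-RegMultiSz $plainValue)

# Build the array literal as text (assumes no embedded double quotes in the values).
$literal = '@(' + (($strings | ForEach-Object { '"{0}"' -f $_ }) -join ',') + ')'
$literal   # @("Value1","Value2","Value3")
```

Wrapping the call in @(...) is a deliberate choice here: a function that emits a single string would otherwise return a scalar rather than a one-element array.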


    As for what you tried:

    • Splitting $Value by ,00,00,00, isn't a valid way to break the byte sequences into parts that form valid UTF-16LE byte sequences.

    • ASCII-range Unicode characters, such as V or 1, encoded as UTF-16LE have a 0x0 byte as the second byte in the two-byte sequence that forms a single character, and if you remove this byte (which is what your -split operation effectively does at the end of the substrings), you'll end up with an invalid character, which [Text.Encoding]::Unicode.GetString() represents as , the REPLACEMENT CHARACTER, U+FFFD

      • More fundamentally, the assumption that the second byte in a two-byte sequences is 0x0 only holds for characters in the ISO-8859-1 subrange of Unicode, i.e. only for Unicode code points in the 8-bit subrange, U+0000 - U+00FF. For characters outside that range, say, , that second byte is not 0x0.
    • You can avoid having to manually break the byte sequences by 0x0 0x0 separators along proper character boundaries if you perform decoding via [Text.Encoding]::Unicode.GetString() first, on the entire string, and then split the resulting - properly parsed string - into substrings by NULL characters, U+0000), as shown in the solution at the top.
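
The points above can be checked with a short experiment. This is a sketch under stated assumptions: the shortened hex sample (Value1, Value2, and the terminator) and all variable names are made up for illustration, and € is only one example of a character outside the 8-bit range.

```powershell
# Hypothetical sample: "Value1" NUL "Value2" NUL NUL, as comma-separated hex bytes.
$hex = '56,00,61,00,6c,00,75,00,65,00,31,00,00,00,56,00,61,00,6c,00,75,00,65,00,32,00,00,00,00,00'

# Splitting the hex text first cuts the chunk off right after the low byte of "1",
# so the chunk has an odd number of bytes.
$firstChunk = ($hex -split ',00,00,00,')[0]
[byte[]] $oddBytes = $firstChunk -split ',' -replace '^', '0x'
$oddBytes.Count                    # 11

# The dangling last byte cannot form a UTF-16 code unit and decodes to U+FFFD.
$broken = [Text.Encoding]::Unicode.GetString($oddBytes)
[int] $broken[-1] -eq 0xFFFD       # True

# A character outside U+0000 - U+00FF has a non-zero second byte.
$euroBytes = [BitConverter]::ToString([Text.Encoding]::Unicode.GetBytes([string][char]0x20AC))
$euroBytes                         # AC-20

# Decoding everything first and splitting on NUL afterwards avoids both problems.
[byte[]] $allBytes = $hex -split ',' -replace '^', '0x'
$fixed = [Text.Encoding]::Unicode.GetString($allBytes).TrimEnd("`0") -split "`0"
$fixed                             # Value1, Value2
```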