
Using RegQueryValueEx for a registry value that could be REG_DWORD or REG_SZ


I am using RegQueryValueEx() to read a registry value that may be stored as either REG_SZ or REG_DWORD.

BYTE byteArray[MAX];
DWORD dataSize = sizeof(byteArray);
DWORD type = 0;
RegQueryValueEx(
        hKey,
        subKey,
        nullptr,
        &type,
        reinterpret_cast<BYTE*>(&byteArray),
        &dataSize);

When I get the data of a REG_SZ value (example: "42314"), I get this in response:

byteArray   0x004fe6a8 "4"  unsigned char[100]
    [0] 52 '4'  unsigned char
    [1] 0 '\0'  unsigned char
    [2] 50 '2'  unsigned char
    [3] 0 '\0'  unsigned char
    [4] 51 '3'  unsigned char
    [5] 0 '\0'  unsigned char
    [6] 49 '1'  unsigned char
    [7] 0 '\0'  unsigned char
    [8] 52 '4'  unsigned char
    [9] 0 '\0'  unsigned char
    [10]0 '\0'  unsigned char

Is there any way I could not have the null bytes after every character? I think it is because of RegEnumValue() being called for each character, but I am not sure.


Solution

  • Your issue has nothing to do with RegEnumValue().

    Your app is calling the TCHAR-based RegQueryValueEx(), which is actually a preprocessor macro that maps to either RegQueryValueExA() (ANSI) or RegQueryValueExW() (Unicode), depending on whether UNICODE is defined at compile-time.

    RegQueryValueExW() returns string data as Unicode text in UTF-16LE format, which is exactly what you are seeing in your buffer, so clearly your app is being compiled for Unicode. What you are seeing is perfectly normal behavior.
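To make the byte layout concrete, here is a minimal, portable sketch that pairs up the bytes from the debugger dump in the question, low byte first, into 16-bit code units. The helper name decodeUtf16Le is hypothetical (it is not a Windows API), and char16_t stands in for WCHAR so the snippet compiles without windows.h.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>

// Hypothetical helper (not part of the Windows API): combines each pair of
// bytes, low byte first, into one UTF-16 code unit. A trailing odd byte,
// if any, is ignored.
std::u16string decodeUtf16Le(const std::uint8_t* bytes, std::size_t byteCount)
{
    std::u16string units;
    units.reserve(byteCount / 2);
    for (std::size_t i = 0; i + 1 < byteCount; i += 2)
    {
        units.push_back(static_cast<char16_t>(bytes[i] | (bytes[i + 1] << 8)));
    }
    return units;
}
```

Fed the twelve bytes 52 0 50 0 51 0 49 0 52 0 0 0, this yields the five characters of "42314" followed by one null code unit. The interleaved zero bytes are simply the high halves of each character.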

    So, you need to handle string data in the format it is given to you, e.g.:

    BYTE byteArray[MAX];
    DWORD dataSize = sizeof(byteArray);
    DWORD type = 0;
    if (RegQueryValueEx( // <-- calling the TCHAR version!
        hKey,
        subKey,
        nullptr,
        &type,
        reinterpret_cast<BYTE*>(&byteArray),
        &dataSize) == ERROR_SUCCESS)
    {
        switch (type)
        {
            case REG_DWORD:
            {
                LPDWORD value = reinterpret_cast<LPDWORD>(&byteArray);
                // use *value as needed ...
                break;
            }
    
            case REG_SZ:
            case REG_MULTI_SZ:
            case REG_EXPAND_SZ:
            {
                // note the T in LPTSTR!  That means 'TCHAR' is used...
                LPTSTR text = reinterpret_cast<LPTSTR>(&byteArray);
                // use text as needed, up to (dataSize/sizeof(TCHAR)) number
                // of TCHARs. This is because RegQueryValueEx() does not
                // guarantee the output data has a null terminator.  If you
                // want that, use RegGetValue() instead...
                break;
            }
        }
    }
    

    Or:

    BYTE byteArray[MAX];
    DWORD dataSize = sizeof(byteArray);
    DWORD type = 0;
    if (RegQueryValueExW( // <-- calling the UNICODE version!
        hKey,
        subKey,
        nullptr,
        &type,
        reinterpret_cast<BYTE*>(&byteArray),
        &dataSize) == ERROR_SUCCESS)
    {
        switch (type)
        {
            case REG_DWORD:
            {
                LPDWORD value = reinterpret_cast<LPDWORD>(&byteArray);
                // use *value as needed ...
                break;
            }
    
            case REG_SZ:
            case REG_MULTI_SZ:
            case REG_EXPAND_SZ:
            {
                // note the W in LPWSTR!  That means 'WCHAR' is used...
                LPWSTR text = reinterpret_cast<LPWSTR>(&byteArray);
                // use text as needed, up to (dataSize/sizeof(WCHAR)) number
                // of WCHARs. This is because RegQueryValueExW() does not
                // guarantee the output data has a null terminator.  If you
                // want that, use RegGetValueW() instead...
                break;
            }
        }
    }
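Because the terminator is not guaranteed, it can help to bound every read by dataSize. The following is a rough, portable sketch of that idea, not a definitive implementation. All three helper names are hypothetical, char16_t stands in for WCHAR, and the memcpy assumes a little-endian host (as on Windows). It also separates REG_MULTI_SZ, which holds several null-terminated strings ending with an empty one, from the single-string types.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <string>
#include <vector>

// Hypothetical helper: copies at most dataSize bytes into 16-bit code units.
// memcpy avoids casting the byte buffer to a wider pointer type.
// Assumes a little-endian host, so the bytes map directly onto char16_t.
std::u16string unitsFromBytes(const std::uint8_t* data, std::size_t dataSize)
{
    const std::size_t unitCount = dataSize / sizeof(char16_t);
    std::u16string units(unitCount, u'\0');
    if (unitCount > 0)
    {
        std::memcpy(&units[0], data, unitCount * sizeof(char16_t));
    }
    return units;
}

// REG_SZ / REG_EXPAND_SZ: keep everything up to the first null, or all of
// the data when no terminator was stored.
std::u16string singleString(const std::u16string& units)
{
    return units.substr(0, units.find(u'\0'));
}

// REG_MULTI_SZ: split on nulls and stop at the empty string that marks the
// end of the list (or at the end of the data, whichever comes first).
std::vector<std::u16string> multiString(const std::u16string& units)
{
    std::vector<std::u16string> strings;
    std::size_t start = 0;
    while (start < units.size())
    {
        std::size_t end = units.find(u'\0', start);
        if (end == std::u16string::npos)
        {
            end = units.size();
        }
        if (end == start)
        {
            break;
        }
        strings.push_back(units.substr(start, end - start));
        start = end + 1;
    }
    return strings;
}
```

In the switch above, the REG_SZ and REG_EXPAND_SZ cases would go through something like singleString(unitsFromBytes(byteArray, dataSize)), while REG_MULTI_SZ would get its own case using multiString.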
    

    If you want the text in another format, you will have to either:

    1. convert it after reading it as Unicode, such as with WideCharToMultiByte() or equivalent.

    2. use RegQueryValueExA() (or RegGetValueA()) directly, which will return string data as ANSI text in the user's current locale, per the documentation:

      If the data has the REG_SZ, REG_MULTI_SZ or REG_EXPAND_SZ type, and the ANSI version of this function is used (either by explicitly calling RegQueryValueExA or by not defining UNICODE before including the Windows.h file), this function converts the stored Unicode string to an ANSI string before copying it to the buffer pointed to by lpData.

      BYTE byteArray[MAX];
      DWORD dataSize = sizeof(byteArray);
      DWORD type = 0;
      if (RegQueryValueExA( // <-- calling the ANSI version
          hKey,
          subKey,
          nullptr,
          &type,
          reinterpret_cast<BYTE*>(&byteArray),
          &dataSize) == ERROR_SUCCESS)
      {
          switch (type)
          {
              case REG_DWORD:
              {
                  LPDWORD value = reinterpret_cast<LPDWORD>(&byteArray);
                  // use *value as needed ...
                  break;
              }
      
              case REG_SZ:
              case REG_MULTI_SZ:
              case REG_EXPAND_SZ:
              {
                  // note the lack of T in LPSTR! That means 'char' is used...
                  LPSTR text = reinterpret_cast<LPSTR>(&byteArray);
                  // use text as needed, up to dataSize number of chars. This
                  // is because RegQueryValueExA() does not guarantee the
                  // output data has a null terminator.  If you want that,
                  // use RegGetValueA() instead...
                  break;
              }
          }
      }
      

    Either way, you risk losing any non-ASCII characters that do not exist in the target charset you convert to. So it is better to stick with Unicode and handle the buffer as WCHAR data (which is what TCHAR maps to when UNICODE is defined).
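To illustrate the kind of loss meant here, the sketch below narrows UTF-16 code units to 7-bit ASCII and substitutes '?' for anything that does not fit. The function narrowToAscii is hypothetical and only a stand-in for a real conversion. What WideCharToMultiByte() or RegQueryValueExA() actually produce depends on the code page and flags in use.

```cpp
#include <string>

// Hypothetical helper, for illustration only: keeps code units below 0x80
// and replaces every other code unit with '?'. The original characters
// cannot be recovered from the result.
std::string narrowToAscii(const std::u16string& text)
{
    std::string out;
    out.reserve(text.size());
    for (char16_t unit : text)
    {
        out.push_back(unit < 0x80 ? static_cast<char>(unit) : '?');
    }
    return out;
}
```

A purely numeric value such as "42314" survives this unchanged, but a string like "café" comes back as "caf?", which is why keeping the data as WCHAR is the safer default.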