
How to convert command line arguments from char * argv[] to char16_t * argv[]


I have a main() on Linux that receives its command line arguments as char**:

int main(int argc ,char * argv[]) 

In my cross platform program I want to use the command line arguments as char16_t. Therefore I need to convert char->char16_t. How do I do that? I tried this but once I leave the loop my debugger shows strange characters in the array. What did I do wrong?

//u16String <- string
std::u16string u16string_from_string(std::string const str) {
  std::wstring_convert<std::codecvt_utf8_utf16<char16_t>, char16_t> converter;
  return converter.from_bytes(str);
}


bool convert_argv_to_char16(int argc,char * argv[]){    
  char16_t * argv16[argc];
  int i=0;

  // for each arg
  for (char **arg = argv; *arg; ++arg) { 
      std::string S(*arg);
      std::u16string S16 = u16string_from_string(S);
      const char16_t* cc_16 = S16.c_str();
      char16_t* c_16 = (char16_t*) cc_16;    
      argv16[i] = c_16;
      i++;
  }    
  return true;
}

Solution

  • The pointers look corrupted because each std::u16string S16 is local to the loop body. It is destroyed at the end of every iteration, so the pointer taken from S16.c_str() dangles as soon as it has been stored in argv16. To fix this, keep the converted strings in a container that outlives every pointer into them, for example a std::vector<std::u16string>. The vector owns the strings, and argv16[i].c_str() stays valid for as long as the vector is alive and that element is not modified.

    #include <vector>
    #include <string>
    #include <codecvt>
    #include <locale>   // std::wstring_convert is declared here
    
    // u16String <- string
    std::u16string u16string_from_string(const std::string& str) {
        std::wstring_convert<std::codecvt_utf8_utf16<char16_t>, char16_t> converter;
        return converter.from_bytes(str);
    }
    
    bool convert_argv_to_char16(int argc, char* argv[]) {
        std::vector<std::u16string> argv16;
    
        // Convert each arg and store it in the vector
        for (int i = 0; i < argc; ++i) {
            std::u16string S16 = u16string_from_string(argv[i]);
            argv16.push_back(S16);
        }
    
        // argv16 now owns the converted arguments.
        // argv16[i].c_str() yields a const char16_t* for argument i.
    
        return true;
    }
    

    Because the std::vector<std::u16string> owns the converted arguments, pointers obtained from argv16[i].c_str() remain valid for as long as the vector is in scope and unmodified. When convert_argv_to_char16 returns, the vector and its strings are destroyed, so no pointer into them may be used after that point. If the converted arguments are needed outside the function, return the vector (or keep it in main) instead of handing out raw pointers.
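
If the rest of the program really needs something shaped like char16_t* argv[], one possible approach is to return the owning vector to the caller and build the pointer array from it. The following is only a sketch under some assumptions: the arguments are assumed to be UTF-8 encoded, and the helper names convert_args and make_argv16 are made up for this example. Note also that std::wstring_convert and <codecvt> are deprecated since C++17, so a compiler may warn about them.

```cpp
#include <codecvt>
#include <cstddef>
#include <locale>
#include <string>
#include <vector>

// Convert one string to UTF-16 (same approach as in the answer above).
// Assumption: the input is valid UTF-8; from_bytes throws std::range_error otherwise.
std::u16string u16string_from_string(const std::string& str) {
    std::wstring_convert<std::codecvt_utf8_utf16<char16_t>, char16_t> converter;
    return converter.from_bytes(str);
}

// Hypothetical helper: converts all arguments and returns the owning vector,
// so the strings live as long as the caller keeps the vector.
std::vector<std::u16string> convert_args(int argc, char* argv[]) {
    std::vector<std::u16string> args;
    args.reserve(static_cast<std::size_t>(argc));
    for (int i = 0; i < argc; ++i) {
        args.push_back(u16string_from_string(argv[i]));
    }
    return args;
}

// Hypothetical helper: builds a null-terminated array of pointers into `args`.
// The pointers are only valid while `args` is alive and unmodified.
std::vector<const char16_t*> make_argv16(const std::vector<std::u16string>& args) {
    std::vector<const char16_t*> ptrs;
    ptrs.reserve(args.size() + 1);
    for (const std::u16string& s : args) {
        ptrs.push_back(s.c_str());
    }
    ptrs.push_back(nullptr);  // mirrors the argv[argc] == nullptr convention
    return ptrs;
}
```

In main, the vector returned by convert_args(argc, argv) would be kept in a local variable for the whole run, and make_argv16(args).data() could then be passed to code that expects a const char16_t* const* array. The pointers are const because they point into storage owned by the strings.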