
Convert string in '\\x00\\x00\\x00' format to unsigned char array


Say I have a string like this:

std::string sc = "\\xfc\\xe8\\x82";

How could I convert the sc string into the equivalent of

 unsigned char buf[] = "\xfc\xe8\x82";

I'm trying to convert a string containing shellcode into an unsigned char array.

I have tried the following:

char buf[5120];
strncpy(buf, sc.c_str(), sizeof(buf));
buf[sizeof(buf) - 1] = 0;

This seems to store the text itself in the char array, but I need the char array to hold the bytes that the text represents.

When I print:

//example 1
unsigned char buf[] = "\xfc\xe8\x82";
printf("%s", buf);

The console outputs:

ⁿΦé

When I print:

//example 2
char buf[5120];
strncpy(buf, sc.c_str(), sizeof(buf));
buf[sizeof(buf) - 1] = 0;
printf("%s", buf);

The console outputs:

\xfc\xe8\x82

How do I convert the sc string into an unsigned char array so that printing the array produces the same output as example 1?


Solution

  • In the literal "\\xfc\\xe8\\x82", the backslash is the escape character, so each "\\" is reduced to a single "\", as you would expect. If you print the given std::string, the result is therefore: \xfc\xe8\x82.

    So what you want to do is create a char array containing the hex values given as text in the original std::string.

    Please note: your statement char s[] = "\xfc\xe8\x82"; creates a C-style array of char with size 4, containing:

    s[0]=fc, s[1]=e8, s[2]=82, s[3]=0
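
To make the difference between the two forms concrete, here is a minimal sketch that compares their sizes. The names literal_bytes, literal_size and escaped_length are made up for this illustration and are not part of the question.

```cpp
#include <cstddef>
#include <string>

// Real escape sequences: the compiler stores 3 bytes plus the terminating '\0'.
constexpr unsigned char literal_bytes[] = "\xfc\xe8\x82";

// Size of the array, including the terminator.
constexpr std::size_t literal_size() { return sizeof(literal_bytes); }

// Escaped text form: each "\\" becomes one backslash character, so the
// string holds the 12 characters \xfc\xe8\x82 and no raw bytes.
inline std::size_t escaped_length() {
    const std::string sc = "\\xfc\\xe8\\x82";
    return sc.size();
}
```

So literal_size() is 4 while escaped_length() is 12, which is why copying sc with strncpy only copies the text.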
    

    In the example below I show two proposals for the conversion:

    1. Straightforward conversion
    2. Using C++ standard algorithms

    #include <cstdlib>
    #include <string>
    #include <iostream>
    #include <iomanip>
    #include <regex>
    #include <vector>
    #include <iterator>
    #include <algorithm>
    
    
    // Matches one \xHH group; capture group 1 holds the two hex digits
    std::regex hexValue{R"(\\[xX]([0-9a-fA-F][0-9a-fA-F]))"};
    
    
    int main ()
    {   
        // Source string
        std::string s1 = "\\xfc\\xe8\\x82";
        std::cout << "s 1: " << s1 << "\n";
    
    
        // Proposal 1 ------------------------------------------------------
    
        // Target buffer: one byte per 4-character "\xHH" group
        std::vector<unsigned char> s2(s1.size() / 4);
    
        // Convert each group to a byte (assumes s1 consists only of "\xHH" groups)
        for (std::size_t i = 0; i < s2.size(); ++i) {
    
            // Do the conversion: isolate the two hex digits, then convert them
            s2[i] = static_cast<unsigned char>(std::strtoul(s1.substr(i*4+2,2).c_str(), nullptr, 16));
            // The result is now in s2[i]
    
            // Output the value as a string, in hex, and in decimal
            std::cout << s1.substr(i*4+2,2) << " -> " << std::hex << static_cast <unsigned short>(s2[i]) 
                      << " -> " << std::dec << static_cast <unsigned short>(s2[i]) << "\n";
        }
    
        // Proposal 2 ------------------------------------------------------
    
        // Get the tokens
        std::vector<std::string> vstr(std::sregex_token_iterator(s1.begin(),s1.end(),hexValue, 1), {});
    
        // Convert to unsigned int
        std::vector<unsigned int> vals{};
    
        std::transform(vstr.begin(), vstr.end(), std::back_inserter(vals), 
            [](const std::string &s){ return static_cast<unsigned>(std::strtoul(s.c_str(), nullptr,16)); } );
    
        // Print output on std::cout
        std::copy(vals.begin(), vals.end(), std::ostream_iterator<unsigned>(std::cout,"\n"));
    
        return 0;
    }
    

    The second solution handles any number of \xHH values in the string, while the first assumes that the string consists only of such 4-character groups.
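
If the goal is a reusable buffer of bytes rather than printed values, the same idea can be wrapped in a small function. The following is only a sketch: parse_escaped_hex and hex_digit_value are hypothetical names, the parsing is done by hand instead of with std::regex, and it assumes that anything that is not a complete \xHH group should simply be skipped.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: value of one hex digit, or -1 if c is not a hex digit.
int hex_digit_value(char c) {
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    return -1;
}

// Hypothetical helper: turns text such as \xfc\xe8\x82 into the bytes fc, e8, 82.
// Assumption: characters that are not part of a complete \xHH group are skipped.
std::vector<unsigned char> parse_escaped_hex(const std::string& text) {
    std::vector<unsigned char> bytes;
    std::size_t i = 0;
    while (i + 3 < text.size()) {
        const int hi = hex_digit_value(text[i + 2]);
        const int lo = hex_digit_value(text[i + 3]);
        const bool is_prefix =
            text[i] == '\\' && (text[i + 1] == 'x' || text[i + 1] == 'X');
        if (is_prefix && hi >= 0 && lo >= 0) {
            bytes.push_back(static_cast<unsigned char>(hi * 16 + lo));
            i += 4;   // consume the whole "\xHH" group
        } else {
            ++i;      // not a valid group at this position, move on
        }
    }
    return bytes;
}
```

Calling parse_escaped_hex(sc) returns a std::vector<unsigned char>; its data() and size() members give the unsigned char pointer and length. The vector has no terminating zero, and unlike a C string it can hold 0x00 bytes in the middle of the data.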