Is there a Linux/Unix tool that can be used to convert the hex dump array of a C file (i.e. the output of xxd -i) to the corresponding source code?
The output of xxd -i xyz.c for a source file xyz.c looks like:
unsigned char xyz_c[] = {
0x23, 0x69, 0x6e, 0x63, 0x6c, 0x75, 0x64, 0x65, 0x20, 0x3c, 0x73, 0x74,
0x64, 0x69, 0x6f, 0x2e, 0x68, 0x3e, 0x0a, 0x23, 0x69, 0x6e, 0x63, 0x6c,
0x75, 0x64, 0x65, 0x20, 0x3c, 0x73, 0x74, 0x64, 0x6c, 0x69, 0x62, 0x2e,
0x68, 0x3e, 0x0a, 0x23, 0x69, 0x6e, 0x63, 0x6c, 0x75, 0x64, 0x65, 0x20,
0x3c, 0x73, 0x74, 0x72, 0x69, 0x6e, 0x67, 0x2e, 0x68, 0x3e, 0x0a, 0x0a,
…
0x65, 0x5f, 0x6c, 0x69, 0x73, 0x74, 0x28, 0x73, 0x74, 0x61, 0x72, 0x74,
0x29, 0x3b, 0x0a, 0x20, 0x20, 0x20, 0x20, 0x7d, 0x0a, 0x0a, 0x20, 0x20,
0x20, 0x20, 0x72, 0x65, 0x74, 0x75, 0x72, 0x6e, 0x20, 0x30, 0x3b, 0x0a,
0x7d, 0x0a
};
unsigned int xyz_c_len = 4442;
Assume that is stored in a file xyz.xxd.
In many ways, the easiest way to regenerate the original code is to compile the dump into a small C program that prints the array:
#include <stdio.h>
#include "xyz.xxd"

int main(void)
{
    for (unsigned int i = 0; i < xyz_c_len; i++)
        putchar(xyz_c[i]);
    return 0;
}
With a little more care and some macros, you could turn that into a general-purpose skeleton program for the job; you'd need to supply the file name and the two C variable names to be used.
If you can't (or don't want to) use a C compiler for the job, then writing a tool using Python or Perl is a straightforward exercise. For example, a not necessarily minimal Perl script is:
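#!/usr/bin/perl -na
# ('#!/usr/bin/env perl -na' fails on Linux, where the kernel passes
# "perl -na" to env as a single argument.)
use strict;
use warnings;

# xxd -i drops the final comma - aargh (why?)!
foreach my $word (@F)
{
    # Accept only words that look like 0xHH, with an optional trailing comma
    next unless $word =~ m/^0[Xx][[:xdigit:]]{2},?$/;
    $word =~ s/,//;
    printf "%c", hex($word);
}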
It uses the 'auto-split' option (-a) and the 'automatic read but do not print' option (-n), then processes any words in the input that look like a hex byte value, such as 0x0a (optionally followed by a comma, since xxd -i somewhat unnecessarily omits the comma after the final byte value), and converts each one to the corresponding byte. It being Perl, TMTOWTDI: There's More Than One Way To Do It.