I am making a programming language. I've done it a few times before, but this time I wanted to do it better and use more techniques while developing it. One of them is that I use two preprocessor macros I define myself: _DEBUG (I add the leading _ so it doesn't get mixed up with the DEBUG that Visual Studio already defines) and _DEBUG_VARS. When _DEBUG = 1 I want to do general debugging stuff, and when _DEBUG_VARS = 1 I want to do var dumps and such. One of those is a hex dump. I bet there is already one in the standard library, but I want my own.
The way I want it to work is that I pass in a pointer to any class (I made it work using template <class T_HEX>). It then casts the T_HEX* I pass in to a char*, gets the size of T_HEX, and loops through all bytes from that char* onward (remember, a char* is a position in RAM where the char is). Each byte is then written out as two hexadecimal digits. I know this is really unsafe, so the way I've coded it is that when _DEBUG_VARS = 1 the functions are created, and when _DEBUG_VARS = 0 they are replaced with empty defines, so any use of them expands to nothing at compile time. Because it is unsafe, I will ONLY use it during development; release builds won't have it.
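Roughly, the idea looks like this (just a simplified sketch, not my real macros; DBG_HEXDUMP is only an illustrative name):
// Sketch of the on/off switch, not the real macros
#define _DEBUG      1
#define _DEBUG_VARS 1

#if _DEBUG_VARS
    #define DBG_HEXDUMP(var) HexDump(var)   // debug build: calls the real function
#else
    #define DBG_HEXDUMP(var)                // release build: expands to nothing
#endif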
So, for the code. To try this, I made a class called Test:
class Test
{
public:
    char* Allapoto = "AAaaAAaA\0";
    int Banana = 1025;
    bool Apple = true;
};
Note that I have no functions here; that's because I want to keep it simple while getting HexDump to work. Then the HexDump functions themselves:
#include <iostream>

int DEBUG_HexLineCounter = 0;

#define H_D_CUT 10
#define H_D_L() (DEBUG_HexLineCounter % H_D_CUT > 0)
#define H_D_NL() ((++DEBUG_HexLineCounter % H_D_CUT) == 0)

template <class T_HEX>
void HexDump(T_HEX* var)
{
    char* ptr = reinterpret_cast<char*>(var);
    char* off_ptr = NULL;
    int size = sizeof(*var);
    for (int i = 0; i < size; i++)
    {
        off_ptr = (ptr + i);
        char c = *off_ptr;
        HexDump(&c);
    }
    if (H_D_L())
        std::cout << std::endl;
}

void HexDump(char* c)
{
    char _char = *c;
    char ch = _char / 0x10;
    char cl = _char % 0x10;
    std::cout << std::hex << static_cast<int>(ch) << std::hex << static_cast<int>(cl) << " ";
    if (H_D_NL())
        std::cout << std::endl;
}
I left out the parts with _DEBUG and _DEBUG_VARS because I know they work, so they're not important here. When I run this, I want to get the hexadecimal values of the bytes of the whole class. I run it in the main function of the program (this is a Win32 Console Application) using this code:
Test test;
HexDump<Test>(&test);
This resulted in the output:
18 5e 17 01 01 04 00 00 01 00
For me this wasn't really as clear as I wanted, so I added this code:
HexDump<char>(test.Allapoto);
HexDump<int>(&test.Banana);
HexDump<bool>(&test.Apple);
so that it now looks like this:
Test test;
HexDump<Test>(&test);
HexDump<char>(test.Allapoto);
HexDump<int>(&test.Banana);
HexDump<bool>(&test.Apple);
I did run this, and this time I got more interesting stuff:
18 5e 17 01 01 04 00 00 01 00
00 00
41
01 04 00 00
01
So after this I thought: hmm, some familiar numbers, some I've seen before. First, the 01 04 00 00 and the 01: I've seen them before in the bigger dump (the first test). If I look at the class structure, I put a bool last and an int before that. So the 01 must be my bool, because a bool is 1 byte and it's set to true. Before that I declared an int; an int is 32 bits or 4 bytes, and here I see 01 04 00 00 (1025 in little-endian byte order). Then there's the char. For the other members I passed in a pointer to the variable, but this time I passed in the char pointer itself, so the first char it dumps is 'A'. Looking at the console output, it shows 41. Why 41? It's hexadecimal: 41 in hex is 65 in decimal, which is the ASCII value of 'A'.
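To double-check that reasoning, I could print the sizes involved. A quick sketch (the exact values are implementation-defined; on my 32-bit build a pointer is 4 bytes):
#include <iostream>

// Quick sanity check of the sizes the reasoning relies on.
// Uses the Test class defined above; values differ between compilers/targets.
void PrintSizes()
{
    std::cout << "sizeof(char*) = " << sizeof(char*) << std::endl;
    std::cout << "sizeof(int)   = " << sizeof(int) << std::endl;
    std::cout << "sizeof(bool)  = " << sizeof(bool) << std::endl;
    std::cout << "sizeof(Test)  = " << sizeof(Test) << std::endl; // includes padding
}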
But now to my questions.
1. If we look at the output line before the char value, the 00 00: why aren't those bytes in the first output?
2. If we look at the first output, and think of question 1, why isn't the char written there, in this case 41?
3. Since 41 isn't there, we can also see that the rest of the char* (string) isn't there either. Maybe those 18 5e 17 01 are a pointer to the char* string, am I right?
4. Is there another way to do a hex dump? I'd like both custom code and, if there is one, a function from the standard library.
Thanks
What seems to mess this up is that one call to HexDump may result in multiple lines. If you change your logic to always output a newline at the end of the generic HexDump and never in the single-byte HexDump, you get one line for each call to HexDump.
That would probably clear up a few of your questions.
Without those modifications I get this output:
--- &test:
6f 10 40 00 00 00 00 00 01 04
00 00 01 00 00 00
--- test.Allapoto:
41
--- &test.Banana:
01 04 00
00
--- &test.Apple:
01
With my modified line-break handling I get:
--- &test:
6f 10 40 00 00 00 00 00 01 04 00 00 01 00 00 00
--- test.Allapoto:
41
--- &test.Banana:
01 04 00 00
--- &test.Apple:
01
- If we look at the output line before the char value, the 00 00: why aren't those bytes in the first output?
The 00 00 is still part of the first dump (the trailing padding bytes of Test); your line-break logic simply wrapped them onto a new line, so everything is OK here.
- If we look at the first output, and think of question 1, why isn't the char written there, in this case 41?
41 is the first character of the string, while in the structure you store a pointer, so the struct dump shows the pointer's bytes rather than the characters it points to.
- Since 41 isn't there, we can also see that the rest of the char* (string) isn't there either. Maybe those 18 5e 17 01 are a pointer to the char* string, am I right?
Yes, you're right.
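If you want to see the string contents rather than the pointer value, dereference the member and dump the bytes it points to. A rough sketch (HexDumpCString is just an illustrative helper; it assumes the string is NUL-terminated, as test.Allapoto is):
#include <cstring>
#include <iostream>

// Dump the bytes of the NUL-terminated string a char* member points to,
// instead of the pointer value stored in the struct.
void HexDumpCString(const char* s)
{
    std::size_t len = std::strlen(s) + 1;   // include the terminating NUL
    for (std::size_t i = 0; i < len; ++i)
    {
        unsigned char byte = static_cast<unsigned char>(s[i]);
        std::cout << std::hex << static_cast<int>(byte / 0x10)
                  << static_cast<int>(byte % 0x10) << " ";
    }
    std::cout << std::dec << std::endl;
}
Called as HexDumpCString(test.Allapoto), it would print 41 41 61 61 41 41 61 41 00 instead of the pointer bytes.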
- Is there another way to do a hex dump? I'd like both custom code and, if there is one, a function from the standard library.
You have to realize that any way you perform a hex dump crosses the border of what's defined in the standard. You will have to rely on some implementation-defined behavior, and to some extent on undefined behavior not resulting in nasal demons. Most compilers will behave properly enough here to allow hex dumping.
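There is no dedicated hex dump function in the standard library, but <iomanip> gives you the formatting pieces. A minimal sketch of that route (StdHexDump is just an illustrative name, with the same caveats about reading object representations):
#include <cstddef>
#include <iomanip>
#include <iostream>

// Hex-dump the bytes of an arbitrary object using only standard formatting facilities.
template <class T>
void StdHexDump(const T& obj)
{
    const unsigned char* bytes = reinterpret_cast<const unsigned char*>(&obj);
    for (std::size_t i = 0; i < sizeof(T); ++i)
    {
        std::cout << std::hex << std::setw(2) << std::setfill('0')
                  << static_cast<int>(bytes[i]) << " ";
    }
    std::cout << std::dec << std::endl;
}
The std::setw(2)/std::setfill('0') combination also spares you the manual nibble splitting.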
There are a few improvements you could make. First of all, you should probably use unsigned char instead of char in order to guarantee that you don't get sign extension when converting the byte to hex.
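To illustrate: on a platform where plain char is signed (as with MSVC on x86), a byte value above 0x7F sign-extends when converted to int, and the extra f digits pollute the output. A small example:
#include <iostream>

int main()
{
    char sc = static_cast<char>(0xB5);   // may become negative if char is signed
    unsigned char uc = 0xB5;

    std::cout << std::hex << static_cast<int>(sc) << std::endl;  // prints ffffffb5
    std::cout << std::hex << static_cast<int>(uc) << std::endl;  // prints b5
}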
Second, you should improve the newline logic. You should probably confine it to the generic HexDump function and make the counter a local variable. For example:
// Declared up front so the template below reliably finds this overload
// (MSVC is lenient here, but other compilers need the declaration first).
void HexDump(unsigned char* c);

template <class T_HEX>
void HexDump(T_HEX* var)
{
    unsigned char* ptr = reinterpret_cast<unsigned char*>(var);
    unsigned char* off_ptr = NULL;
    int size = sizeof(*var);
    for (int i = 0; i < size; i++)
    {
        off_ptr = (ptr + i);
        unsigned char c = *off_ptr;
        if (i && i % 8 == 0)       // wrap long dumps after every 8 bytes
            std::cout << std::endl;
        HexDump(&c);
    }
    std::cout << std::endl;        // exactly one newline per call
}

void HexDump(unsigned char* c)
{
    unsigned char _char = *c;
    unsigned char ch = _char / 0x10;   // high nibble
    unsigned char cl = _char % 0x10;   // low nibble
    std::cout << std::hex << static_cast<int>(ch) << static_cast<int>(cl) << " ";
}
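Called the same way as in the question, a main along these lines then gives one labeled dump per call (with the code above, dumps longer than 8 bytes wrap onto extra lines):
int main()
{
    Test test;

    std::cout << "--- &test:" << std::endl;
    HexDump(&test);                 // T_HEX deduced as Test

    std::cout << "--- test.Allapoto:" << std::endl;
    HexDump(test.Allapoto);         // T_HEX deduced as char: dumps the first character only

    std::cout << "--- &test.Banana:" << std::endl;
    HexDump(&test.Banana);

    std::cout << "--- &test.Apple:" << std::endl;
    HexDump(&test.Apple);

    return 0;
}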