So, for a class assignment we are writing a Y86 (toy processor) disassembler in C++. Easy enough; I have almost everything done except for disassembling 8-byte values back into a .quad directive.
The .quad directive takes a numeric or hexadecimal value and assembles it into an 8-byte "instruction" (it's not really an instruction, but .quad is the only thing in the processor's encoding that takes 8 bytes, so if you come across an 8-byte line you automatically know you're looking at a quad) that represents the value. Here's an example below, since my explanation may not be great:
https://image.prntscr.com/image/h5xAoE4YRryl7HSJ13o5Yg.png
It's easy enough to see that the first two quads there are bit shifted 2 to the right on disassembly, but then the next two are bit-shifted 2 to the left. What's the pattern I'm missing here? Here's some more examples of disassembled quads:
0x0a0: 0300000000000000 | value: .quad 3
0x0a8: | list:
0x0a8: ffffffffffffffff | .quad -1
0x0b0: 0300000000000000 | .quad 3
0x0b8: 0500000000000000 | .quad 5
0x0c0: 0900000000000000 | .quad 9
0x0c8: 0300000000000000 | .quad 3
0x0d0: 2800000000000000 | .quad 40
0x0d8: 3000000000000000 | .quad 48
0x0e0: fcffffffffffffff | .quad -4
0x0e8: 0300000000000000 | .quad 3
0x0f0: 0700000000000000 | .quad 7
0x0f8: 0200000000000000 | .quad 2
0x100: 0300000000000000 | .quad 3
0x108: f6ffffffffffffff | .quad -10
0x110: f8ffffffffffffff | .quad -8
Essentially, I'm trying to write an algorithm that will take what's on the left in those screenshots (assembled machine code) and return ".quad 0xblahblah", but I can't figure out what it's doing to the hex values in order to get them like that.
My current C++ code is as follows:
uint64_t x;  // unsigned int is only 4 bytes; a .quad value needs all 8
stringstream oss;
// hexInput holds the raw 16-digit hex string read from the machine code
// (renamed; calling it "hex" collides with the std::hex manipulator)
oss << "0x" << std::uppercase << std::left << std::setw(20) << std::hex << hexInput;
string result = oss.str();
std::istringstream converter(result);
converter >> std::hex >> x;
But when it should be returning the .quads you see in the first screenshot I posted, it's returning this:
0x0d000d000d000000
0xc000c000c0000000
0x000b000b000b0000
0x00a000a000a00000
This is the exact raw value of the assembled machine code; I need to figure out what it's doing to end up with
0x000d000d000d0000
0x00c000c000c00000
0x0b000b000b000000
0xa000a000a0000000
As in the example screenshot.
It's easy enough to see that the first two quads there are bit shifted 2 to the right on disassembly, but then the next two are bit-shifted 2 to the left.
There's no 2-bit shift. There is what appears, if you're not paying close attention, to be a 2-nibble (8-bit, i.e. one byte) shift.
What's the pattern I'm missing here?
It's not bit shifting; it's reversed byte ordering. Y86, like x86, is little-endian: the least significant byte of the 8-byte value is stored first in memory, so to recover the value you read the bytes back in reverse order.
Instead of repetitive patterns such as 000A000A000A, try experimenting with counting patterns such as 0123456789AB.
And pay attention to the most significant word, which is 0x0000 in nearly all of your examples: it appears at the end of the byte sequence, but becomes leading zeros (not even printed) in the decoded value.