Why do these two pointers that should be the same point to different data?


I'm writing a FAT16 driver in GNU C for a hobby operating system, and I have a structure defined as follows:

struct directory_entry {
  uint8_t name[11];
  uint8_t attrib;
  uint8_t name_case;
  uint8_t created_decimal;
  uint16_t created_time;
  uint16_t created_date;
  uint16_t accessed_date;
  uint16_t ignore;
  uint16_t modified_time;
  uint16_t modified_date;
  uint16_t first_cluster;
  uint32_t length;
} __attribute__ ((packed));

I was under the impression that name would be at the same address as the whole struct, and that attrib would be 11 bytes after that. And indeed, (void *)e.name - (void *)&e is 0 and (void *)&e.attrib - (void *)&e is 11, where e is of type struct directory_entry.
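The layout assumption in the previous paragraph can also be checked at compile time rather than by printing pointer differences. The following is a minimal sketch, assuming a C11 compiler such as a recent GCC; it repeats the struct from above and uses only `_Static_assert` and `offsetof`, both of which are available in a freestanding build. The assertion messages are illustrative, not from the original code.

```c
#include <stddef.h>  /* offsetof */
#include <stdint.h>  /* uint8_t, uint16_t, uint32_t */

/* Same definition as in the question. */
struct directory_entry {
  uint8_t name[11];
  uint8_t attrib;
  uint8_t name_case;
  uint8_t created_decimal;
  uint16_t created_time;
  uint16_t created_date;
  uint16_t accessed_date;
  uint16_t ignore;
  uint16_t modified_time;
  uint16_t modified_date;
  uint16_t first_cluster;
  uint32_t length;
} __attribute__ ((packed));

/* The build fails if any of these layout assumptions does not hold. */
_Static_assert(offsetof(struct directory_entry, name) == 0,
               "name must sit at the start of the struct");
_Static_assert(offsetof(struct directory_entry, attrib) == 11,
               "attrib must follow the 11-byte name");
_Static_assert(sizeof(struct directory_entry) == 32,
               "the fields above must add up to 32 bytes");
```

If this compiles, the packing is as expected and any mismatch at run time has to come from somewhere else.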

In my kernel, a void pointer to e is passed to a function which reads its contents from a disk. After this function returns, *(uint8_t *)&e is 80 and *((uint8_t *)&e + 11) is 8, as expected for what's on the disk. However, e.name[0] and e.attrib are both 0.

What gives here? Am I misunderstanding how __attribute__ ((packed)) works? Other structs with the same attribute work how I expect at other parts of my kernel. I can post a link to the full source if needed.

Edit: The full source is in this GitLab repository, on the stack-overflow branch. The relevant part is lines 34 to 52 of src/kernel/main.c. I'm sure that the data is being populated correctly, as I check *(uint8_t *)&e and *((uint8_t *)&e + 11). When I run it, that part outputs the following:

(void *)e.name - *(void *)&e
  => 0
*(uint8_t *)&e
  => 80
e.name[0]
  => 0
(void *)&e.attrib - (void *)&e
  => 11
*((uint8_t *)&e + 11)
  => 8
e.attrib
  => 0

I'm very confused about why e.name[0] would be any different than *(uint8_t *)&e.

Edit 2: I disassembled this part using objdump, to see what the difference was in the compiled code, but now I'm even more confused. u8_dec(*(uint8_t *)&e, nbuf); and u8_dec(e.name[0], nbuf); are both compiled to: (comments mine)

lea   eax, [ebp - 0x30] ;loads address of e from stack into eax
movzx eax, byte [eax]   ;loads byte pointed to by eax into eax, zero-extending
movzx eax, al           ;not sure why this is here, as it's already zero-extended
sub   esp, 0x8
push  0x31ce0           ;nbuf
push  eax               ;the byte we loaded
call  0x3162f           ;u8_dec
add   esp, 0x10

This passes in the first byte of the struct, as expected. I'm sure that u8_dec doesn't modify e, as its first argument is passed by value and not by reference. nbuf is an array declared at file-scope, while e is declared at function scope, so it's not that they overlap or anything. Perhaps u8_dec isn't doing its job right? Here's the source of that:

void u8_dec(uint8_t n, uint8_t *b) {
  if (!n) {
    b[0] = '0'; /* two byte stores instead of one uint16_t store through a uint8_t pointer */
    b[1] = 0;
    return;
  }
  bool zero = false;
  for (uint32_t m = 100; m; m /= 10) {
    uint8_t d = (n / m) % 10;
    if (zero)
      *(b++) = d + '0';
    else if (d) {
      zero = true;
      *(b++) = d + '0';
    }
  }
  *b = 0;
}
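One way to rule u8_dec out is to exercise it in an ordinary hosted build, outside the kernel. The sketch below is only an illustration: u8_dec is copied from above with the zero case written as two byte stores, while u8_dec_gives and check_u8_dec are hypothetical helper names that are not part of the kernel.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* u8_dec as in the question, with the zero case written as two byte stores. */
void u8_dec(uint8_t n, uint8_t *b) {
  if (!n) {
    b[0] = '0';
    b[1] = 0;
    return;
  }
  bool zero = false;  /* becomes true once the first nonzero digit is written */
  for (uint32_t m = 100; m; m /= 10) {
    uint8_t d = (n / m) % 10;
    if (zero)
      *(b++) = d + '0';
    else if (d) {
      zero = true;
      *(b++) = d + '0';
    }
  }
  *b = 0;
}

/* Hypothetical helper: converts n and compares the result with the expected text. */
bool u8_dec_gives(uint8_t n, const char *expected) {
  uint8_t buf[4];  /* up to three digits plus the terminator */
  u8_dec(n, buf);
  return strcmp((const char *)buf, expected) == 0;
}

/* Hypothetical host-side check covering the values seen in the question. */
void check_u8_dec(void) {
  assert(u8_dec_gives(0, "0"));
  assert(u8_dec_gives(8, "8"));
  assert(u8_dec_gives(80, "80"));
  assert(u8_dec_gives(255, "255"));
}
```

If check_u8_dec passes on the host, the function itself is deterministic, and the differing output in the kernel has to come from the value or the memory it is handed there.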

It's pretty clear now that packed structs do work how I think they do, but I'm still not sure what's causing the problem. I'm passing the same value to a function that should be deterministic, but I'm getting different results on different calls.


Solution

  • My kernel uses 32-bit protected mode segmentation. I had my data segment as 0x0000.0000 - 0x000f.ffff and my stack segment as 0x0003.8000 - 0x0003.ffff, to trigger a general protection fault if the stack ever overflowed, rather than allowing it to overflow into other kernel data and code.

    However, when GCC compiles C code, it assumes that the stack and data segments have the same base, as this is almost always the case. That assumption broke here: when I took the address of the local variable, the resulting offset was relative to the stack segment (as local variables are on the stack), but when the called function dereferenced that pointer, the same offset was interpreted relative to the data segment. With the bases 0x38000 apart, the two accesses landed at different linear addresses.

    I have changed my segmentation model so that the stack lives inside the data segment instead of in its own segment, and this has fixed the problem.
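The mismatch described in this answer can be illustrated with a toy model of segment translation, where a linear address is simply the segment base plus the offset. This is a hedged sketch meant for a hosted build, not kernel code: the two bases come from the answer, while E_OFFSET is a made-up offset for the local variable e, and linear_address and check_segment_mismatch are hypothetical names. Segment limits are ignored.

```c
#include <assert.h>
#include <stdint.h>

#define DATA_BASE  0x00000000u  /* base of the data segment, from the answer */
#define STACK_BASE 0x00038000u  /* base of the stack segment, from the answer */

/* Assumption for illustration: the offset of the local variable e
   within the stack segment. The real value is not given. */
#define E_OFFSET 0x00007fd0u

/* Toy model of segmentation: linear address = segment base + offset. */
uint32_t linear_address(uint32_t segment_base, uint32_t offset) {
  return segment_base + offset;
}

/* The same offset resolves to two different linear addresses, depending on
   which segment register the access goes through. */
void check_segment_mismatch(void) {
  uint32_t via_ss = linear_address(STACK_BASE, E_OFFSET); /* e.g. [ebp - 0x30] */
  uint32_t via_ds = linear_address(DATA_BASE, E_OFFSET);  /* e.g. [eax] */
  assert(via_ss == 0x0003ffd0u);
  assert(via_ds == 0x00007fd0u);
  assert(via_ss - via_ds == STACK_BASE - DATA_BASE);
}
```

In this model, a pointer to e that is filled in through the data segment and then read back through the stack segment touches memory 0x38000 bytes apart. Once both segments share a base, the two addresses coincide.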