c gcc undefined-behavior dangling-pointer

Why is a returned pointer to a locally declared variable null instead of an address on the stack?


In the following code snippet, shouldn't str_s point to some location on the stack?

#include <stdio.h>
#include <string.h>
#include <stdlib.h>

char* fun_s(){
    char str[8]="vikash";
    printf("s :%p\n", str);
    return str;
}

char* fun_h(){
    char* str = (char*)malloc(8);
    printf("h :%p\n", str);
    strcpy(str, "vikash");
    return str;
}

int main(){
    char* str_s = fun_s();
    char* str_h = fun_h();
    printf("s :%p\nh :%p\n", str_s, str_h);
    return 0;
}

I understand that returning str from fun_s is a problem and that the contents behind this pointer can't be trusted, but as I understand it, the pointer should still hold some address on the stack, not zero. I get the following output on my console. Can you please explain why the third line prints (nil) rather than 0x7ffce7561220?

s :0x7ffce7561220
h :0x55c49538d670
s :(nil)
h :0x55c49538d670

GCC Version

gcc (Ubuntu 7.4.0-1ubuntu1~18.04.1) 7.4.0
Copyright (C) 2017 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

OS: Ubuntu 18.04.3 LTS


Solution

  • Your compiler is purposely injecting a null return value from that function. I don't have gcc 7.4 available, but I do have 7.3, and I assume the result is similar:

    Compiling fun_s to assembly produces this:

    .LC0:
            .string "s :%p\n"
    fun_s:
            push    rbp
            mov     rbp, rsp
            sub     rsp, 16
            movabs  rax, 114844764957046
            mov     QWORD PTR [rbp-8], rax
            lea     rax, [rbp-8]
            mov     rsi, rax
            mov     edi, OFFSET FLAT:.LC0
            mov     eax, 0
            call    printf
            mov     eax, 0 ; ======= HERE =========
            leave
            ret
    

    Note that eax is hard-set to zero just before the return. On x86-64 a write to eax also clears the upper half of rax, and rax is the register that carries the returned pointer back to the caller.
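
One other detail in the listing above that can look puzzling is the movabs constant 114844764957046. It is simply the initializer "vikash", padded with zero bytes to fill the 8-byte array and packed into a single 64-bit immediate. A minimal sketch that unpacks such an immediate the way a little-endian store would lay it out in memory (decode_imm64 is a hypothetical helper name, not something from the question or from gcc):

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: split a 64-bit immediate into the 8 bytes that a
   little-endian store (as on x86-64) writes to memory, lowest byte first. */
void decode_imm64(uint64_t imm, char out[8]) {
    for (size_t i = 0; i < 8; i++) {
        /* Byte i of the stored value is bits [8*i, 8*i + 7] of the immediate. */
        out[i] = (char)((imm >> (8 * i)) & 0xFF);
    }
}
```

Calling decode_imm64(114844764957046ULL, buf) fills buf with 'v', 'i', 'k', 'a', 's', 'h' followed by two zero bytes, which is exactly the content of char str[8]="vikash".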

    Making str static produces this:

    .LC0:
            .string "s :%p\n"
    fun_s:
            push    rbp
            mov     rbp, rsp
            mov     esi, OFFSET FLAT:str.2943
            mov     edi, OFFSET FLAT:.LC0
            mov     eax, 0
            call    printf
            mov     eax, OFFSET FLAT:str.2943
            pop     rbp
            ret
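
For reference, the C source behind this second listing would look roughly like the sketch below. The function name fun_s_static is made up here to keep it apart from the original fun_s, and the (void *) cast is added because %p formally expects a void pointer; otherwise the only change is the static keyword.

```c
#include <stdio.h>

/* Sketch of the static variant (fun_s_static is a hypothetical name).
   A static array has static storage duration, so it exists for the whole
   run of the program and its address stays valid after the call returns. */
char *fun_s_static(void) {
    static char str[8] = "vikash";
    printf("s :%p\n", (void *)str);
    return str; /* fine: the object outlives the function call */
}
```

The trade-off is that every call returns the same buffer, so the function is not reentrant and a caller that modifies the string changes it for everyone else.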
    

    In short, your compiler detects that the function returns the address of a local variable and rewrites the return value to NULL. In doing so, it prevents any later nefarious use of that address (e.g. a content injection attack).

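    I see no reason the compiler should not be allowed to do this: once fun_s returns, the lifetime of str has ended, and the C standard (C11 6.2.4) says the value of a pointer becomes indeterminate when the object it points to reaches the end of its lifetime. I'm sure a language purist will confirm or reject that suspicion.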
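
If the goal is to hand a string back to the caller without malloc, one common pattern is to let the caller own the storage. The sketch below is only one way to do it; fun_s_into and its parameters are hypothetical names, not part of the original code. GCC normally also flags the original fun_s with a "function returns address of local variable" warning (-Wreturn-local-addr), which is worth treating as an error.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical replacement for fun_s: the caller supplies the buffer, so no
   pointer to a dead local ever leaves the function.
   Returns buf on success, or NULL if buf is missing or too small. */
char *fun_s_into(char *buf, size_t size) {
    static const char text[] = "vikash"; /* 7 bytes including the terminator */
    if (buf == NULL || size < sizeof text) {
        return NULL;
    }
    memcpy(buf, text, sizeof text);      /* copies the terminating '\0' too */
    return buf;
}
```

A caller would declare char str_s[8]; in main and call fun_s_into(str_s, sizeof str_s); the returned pointer then refers to storage whose lifetime the caller controls.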