
Valgrind: how to find the stack variable that is reported as uninitialized?


UPDATE-1: made the example a little more realistic

SUSE Tumbleweed, clang 19.1.4, gcc 14.2.1, valgrind 3.24.0 (built from source)

In my_prog.cpp:

int proc()
{
    int y;
    return y;
}

int main()
{
    int y = proc();
    int x;
    return x+y;
}

using g++ and valgrind

g++ -g -O0 -fno-omit-frame-pointer my_prog.cpp
valgrind --tool=memcheck --leak-check=no --track-origins=yes ./a.out

This gives me only line 2, which is just the start of proc(), not the variable itself:

==469== Memcheck, a memory error detector
==469== Copyright (C) 2002-2024, and GNU GPL'd, by Julian Seward et al.
==469== Using Valgrind-3.24.0 and LibVEX; rerun with -h for copyright info
==469== Command: ./a.out
==469==
==469== Syscall param exit_group(status) contains uninitialised byte(s)
==469==    at 0x4CC909D: _Exit (in /usr/lib64/libc.so.6)
==469==    by 0x4C26EB5: __run_exit_handlers (in /usr/lib64/libc.so.6)
==469==    by 0x4C26FFF: exit (in /usr/lib64/libc.so.6)
==469==    by 0x4C0D2B4: (below main) (in /usr/lib64/libc.so.6)
==469==  Uninitialised value was created by a stack allocation
==469==    at 0x401116: proc() (my_prog.cpp:2)
==469==
==469==
==469== HEAP SUMMARY:
==469==     in use at exit: 0 bytes in 0 blocks
==469==   total heap usage: 1 allocs, 1 frees, 73,728 bytes allocated
==469==
==469== For a detailed leak analysis, rerun with: --leak-check=full
==469==
==469== For lists of detected and suppressed errors, rerun with: -s
==469== ERROR SUMMARY: 1 errors from 1 contexts (suppressed: 0 from 0)

For comparison, using clang++ with MemorySanitizer (I know how much work it is to build all dependencies with MSan as well, which is why I'm using Valgrind):

clang++ -g -O0 -fno-omit-frame-pointer -fsanitize=memory my_prog.cpp

This gives me the exact location of the first use of the uninitialized variable, line 4 (return y;), reached from line 9 (int y = proc();), and then exits:

==480==WARNING: MemorySanitizer: use-of-uninitialized-value
    #0 0x564ffb0f818a in proc() /home/linux/temp/my_prog.cpp:4:5
    #1 0x564ffb0f820a in main /home/linux/temp/my_prog.cpp:9:13
    #2 0x7fc2d66092ad in __libc_start_call_main (/lib64/libc.so.6+0x2a2ad) (BuildId: 03f1631dc9760d3e30311fe62e15cc4baaa89db7)
    #3 0x7fc2d6609378 in __libc_start_main@GLIBC_2.2.5 (/lib64/libc.so.6+0x2a378) (BuildId: 03f1631dc9760d3e30311fe62e15cc4baaa89db7)
    #4 0x564ffb05b104 in _start /home/abuild/rpmbuild/BUILD/glibc-2.40/csu/../sysdeps/x86_64/start.S:115

SUMMARY: MemorySanitizer: use-of-uninitialized-value /home/linux/temp/my_prog.cpp:4:5 in proc()
Exiting

Are there any tricks, compiler settings, or debugger techniques to pinpoint the exact variable from Valgrind's output in this unrealistically small example? (My real scenarios are much bigger and older.)

ANOTHER SCENARIO:

This is a small example, but in reality the uninitialized access is hidden somewhere under 100k+ lines of inlined code, in functions (not developed by me) that are too large to quickly tell which variable the problem comes from.

Building with UBSan (and MSan, as an example) and -Werror -Wall: there are no UBSan warnings and no ASan warnings, only MSan (very detailed) or Valgrind, which reports just the start of main as the place where it happens. In my real scenario there are many more variables scattered over the whole function.

What can I do to get more out of Valgrind's output (additional options, or other tools that complement Valgrind), so that I am better prepared in the future or can extend my CI server to report better information?

MSan is out of scope because 20 third-party libraries are used (and not all of them are available as source), which is the reason for using Valgrind.

I'm only interested in uninitialized-memory findings; ASan has already been in use for years to find other memory-related problems.

union blub_t
{
    short a;
    int b;
};

int main()
{
    blub_t g;
    g.a = 10;
    return g.b;
}

build + valgrind

clang++ -g -O0 -fno-omit-frame-pointer -Werror -Wall my_prog.cpp
valgrind --tool=memcheck --leak-check=no --track-origins=yes ./a.out

gives

==636== Memcheck, a memory error detector
==636== Copyright (C) 2002-2024, and GNU GPL'd, by Julian Seward et al.
==636== Using Valgrind-3.24.0 and LibVEX; rerun with -h for copyright info
==636== Command: ./a.out
==636==
==636== Syscall param exit_group(status) contains uninitialised byte(s)
==636==    at 0x4CC909D: _Exit (in /usr/lib64/libc.so.6)
==636==    by 0x4C26EB5: __run_exit_handlers (in /usr/lib64/libc.so.6)
==636==    by 0x4C26FFF: exit (in /usr/lib64/libc.so.6)
==636==    by 0x4C0D2B4: (below main) (in /usr/lib64/libc.so.6)
==636==  Uninitialised value was created by a stack allocation
==636==    at 0x401116: main (my_prog.cpp:8)
==636==
==636==
==636== HEAP SUMMARY:
==636==     in use at exit: 0 bytes in 0 blocks
==636==   total heap usage: 1 allocs, 1 frees, 73,728 bytes allocated
==636==
==636== For a detailed leak analysis, rerun with: --leak-check=full
==636==
==636== For lists of detected and suppressed errors, rerun with: -s
==636== ERROR SUMMARY: 1 errors from 1 contexts (suppressed: 0 from 0)

build with MSAN

clang++ -g -O0 -fno-omit-frame-pointer -fsanitize=memory,undefined -Werror -Wall my_prog.cpp

gives

==585==WARNING: MemorySanitizer: use-of-uninitialized-value
    #0 0x559d2f9471ec in main /home/linux/temp/my_prog.cpp:11:5
    #1 0x7f47941bf2ad in __libc_start_call_main (/lib64/libc.so.6+0x2a2ad) (BuildId: 03f1631dc9760d3e30311fe62e15cc4baaa89db7)
    #2 0x7f47941bf378 in __libc_start_main@GLIBC_2.2.5 (/lib64/libc.so.6+0x2a378) (BuildId: 03f1631dc9760d3e30311fe62e15cc4baaa89db7)
    #3 0x559d2f8aa104 in _start /home/abuild/rpmbuild/BUILD/glibc-2.40/csu/../sysdeps/x86_64/start.S:115

SUMMARY: MemorySanitizer: use-of-uninitialized-value /home/linux/temp/my_prog.cpp:11:5 in main
Exiting

Solution

  • If I change the code to be

    #include "valgrind/memcheck.h"
    
    int proc()
    {
        int easy_to_find;
        (void)VALGRIND_CHECK_MEM_IS_DEFINED(&easy_to_find, sizeof(easy_to_find));
        return easy_to_find;
    }
    
    int main()
    {
        int y = proc();
        int x;
        return x+y;
    }
    

    then with the right options

    valgrind --track-origins=yes --read-var-info=yes ./so13
    ==395216== Memcheck, a memory error detector
    ==395216== Copyright (C) 2002-2024, and GNU GPL'd, by Julian Seward et al.
    ==395216== Using Valgrind-3.25.0.GIT and LibVEX; rerun with -h for copyright info
    ==395216== Command: ./so13
    ==395216== 
    ==395216== Uninitialised byte(s) found during client check request
    ==395216==    at 0x401378: proc() (so13.cpp:7)
    ==395216==    by 0x401394: main (so13.cpp:13)
    ==395216==  Location 0x1ffefff25c is 0 bytes inside local var "easy_to_find"
    ==395216==  declared at so13.cpp:6, in frame #0 of thread 1
    ==395216==  Uninitialised value was created by a stack allocation
    ==395216==    at 0x401326: proc() (so13.cpp:5)
    ==395216== 
    ==395216== Syscall param exit_group(status) contains uninitialised byte(s)
    ==395216==    at 0x512F336: _Exit (in /usr/lib64/libc-2.28.so)
    ==395216==    by 0x5077DC9: __run_exit_handlers (in /usr/lib64/libc-2.28.so)
    ==395216==    by 0x5077DFF: exit (in /usr/lib64/libc-2.28.so)
    ==395216==    by 0x50617EB: (below main) (in /usr/lib64/libc-2.28.so)
    ==395216==  Uninitialised value was created by a stack allocation
    ==395216==    at 0x401326: proc() (so13.cpp:5)
    ==395216== 
    ==395216== 
    ==395216== HEAP SUMMARY:
    ==395216==     in use at exit: 0 bytes in 0 blocks
    ==395216==   total heap usage: 1 allocs, 1 frees, 73,728 bytes allocated
    ==395216== 
    ==395216== All heap blocks were freed -- no leaks are possible
    ==395216== 
    ==395216== For lists of detected and suppressed errors, rerun with: -s
    ==395216== ERROR SUMMARY: 2 errors from 2 contexts (suppressed: 0 from 0)
    

    I think that the problem is that the error occurs after the return from proc(). The dwarfdump for the exe gives me

    0x000002a2:   DW_TAG_subprogram
                    DW_AT_external  (true)
                    DW_AT_name      ("proc")
                    DW_AT_decl_file ("/path/to/so13.cpp")
                    DW_AT_decl_line (4)
                    DW_AT_decl_column       (0x05)
                    DW_AT_linkage_name      ("_Z4procv")
                    DW_AT_type      (0x0000029b "int")
                    DW_AT_low_pc    (0x0000000000401326)
                    DW_AT_high_pc   (0x0000000000401388)
                    DW_AT_frame_base        (DW_OP_call_frame_cfa)
                    DW_AT_call_all_calls    (true)
                    DW_AT_sibling   (0x00000306)
    
    0x000002c8:     DW_TAG_variable
                      DW_AT_name    ("easy_to_find")
                      DW_AT_decl_file       ("/path/to/so13.cpp")
                      DW_AT_decl_line       (6)
                      DW_AT_decl_column     (0x09)
                      DW_AT_type    (0x0000029b "int")
                      DW_AT_location        (DW_OP_fbreg -20)
    

    I'm not really a DWARF expert, but I think that means that when the instruction pointer is between DW_AT_low_pc and DW_AT_high_pc, it's possible to use the DW_TAG_variable to get the file and line info for the address calculated from the function's DW_AT_frame_base and the variable's location, DW_OP_fbreg -20. There's also a lexical scope for the variable that I left out.

    All that to say that I don't think it is feasible. At the point where the error happens, after main has returned, we no longer have the values of the instruction pointer or frame pointer necessary to get the source file information from the DWARF debuginfo.
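
To make the lookup described above concrete, here is a minimal sketch of the address arithmetic. The structs and function names are hypothetical simplifications, not Valgrind's or any DWARF library's API. The variable size of 4 bytes is an assumption, and the frame base value is back-computed from the address Valgrind reported for easy_to_find rather than read from a register.

```cpp
#include <cstdint>

// Hypothetical, simplified stand-ins for the two DWARF entries shown above.
struct Subprogram {
    const char*   name;
    std::uint64_t low_pc;   // DW_AT_low_pc (inclusive)
    std::uint64_t high_pc;  // DW_AT_high_pc (exclusive)
};

struct LocalVar {
    const char*   name;
    int           decl_line;     // DW_AT_decl_line
    std::int64_t  fbreg_offset;  // operand of DW_OP_fbreg
    std::uint64_t size;          // size of the variable in bytes
};

// The variable entry only applies while the pc is inside the function.
constexpr bool pc_in_function(const Subprogram& f, std::uint64_t pc)
{
    return pc >= f.low_pc && pc < f.high_pc;
}

// DW_OP_fbreg N means "frame base + N"; unsigned arithmetic wraps as intended
// for a negative offset.
constexpr std::uint64_t var_address(const LocalVar& v, std::uint64_t frame_base)
{
    return frame_base + static_cast<std::uint64_t>(v.fbreg_offset);
}

// True if addr lies inside v, given the pc and frame base of a live frame of f.
constexpr bool describes(const Subprogram& f, const LocalVar& v,
                         std::uint64_t pc, std::uint64_t frame_base,
                         std::uint64_t addr)
{
    return pc_in_function(f, pc)
        && addr >= var_address(v, frame_base)
        && addr - var_address(v, frame_base) < v.size;
}

// Values taken from the dwarfdump and Valgrind output above.
constexpr Subprogram proc_fn{"proc", 0x401326, 0x401388};
constexpr LocalVar   easy{"easy_to_find", 6, -20, 4};  // size 4 is assumed
constexpr std::uint64_t reported_addr = 0x1ffefff25c;

// Assumption: frame base back-computed from the reported address.
constexpr std::uint64_t cfa = reported_addr + 20;

// While executing inside proc() (pc from so13.cpp:7) the lookup succeeds.
static_assert(describes(proc_fn, easy, 0x401378, cfa, reported_addr),
              "address resolves to easy_to_find inside proc()");

// With a pc outside proc() (here the one reported for main) it cannot.
static_assert(!describes(proc_fn, easy, 0x401394, cfa, reported_addr),
              "no match once the pc has left proc()");
```

Both checks are evaluated at compile time. The point of the second one is that the same address no longer maps to easy_to_find once the program counter is outside proc(), which is the situation when the exit_group error is reported.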

    I'll take a look at the info that --track-origins=yes retains to see if that could be extended to get more information.
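
For functions with many candidate variables, such as the union scenario in the question, the same client request could be wrapped in a small macro that also prints the expression name and source location. This is only a sketch: CHECK_DEFINED and report_if_undefined are hypothetical names, and the fallback definition assumes that some build machines lack the Valgrind headers. The function returns g.a rather than g.b so that the example itself reads no undefined bytes.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdio>

#if defined(__has_include)
#  if __has_include(<valgrind/memcheck.h>)
#    include <valgrind/memcheck.h>
#    define HAVE_MEMCHECK_H 1
#  endif
#endif
#ifndef HAVE_MEMCHECK_H
// Fallback (assumption: headers may be missing): behave as if every byte were
// defined, which is also what the real macro does outside Valgrind.
#  define VALGRIND_CHECK_MEM_IS_DEFINED(addr, len) ((void)(addr), (void)(len), 0UL)
#endif

// Hypothetical helper: returns true if Memcheck considers all bytes defined
// (or if the program is not running under Valgrind).
inline bool report_if_undefined(const char* expr, const void* p, std::size_t len,
                                const char* file, int line)
{
    assert(p != nullptr && len > 0);
    // The client request yields the address of the first undefined byte, or 0.
    const std::uintptr_t first_bad = VALGRIND_CHECK_MEM_IS_DEFINED(p, len);
    if (first_bad == 0)
        return true;
    const std::uintptr_t offset = first_bad - reinterpret_cast<std::uintptr_t>(p);
    std::fprintf(stderr, "%s:%d: '%s' is undefined from byte offset %lu\n",
                 file, line, expr, static_cast<unsigned long>(offset));
    return false;
}

// Hypothetical convenience macro: checks a variable and records its name.
#define CHECK_DEFINED(var) \
    report_if_undefined(#var, &(var), sizeof(var), __FILE__, __LINE__)

union blub_t
{
    short a;
    int b;
};

int scenario()
{
    blub_t g;
    g.a = 10;
    CHECK_DEFINED(g.a);  // only the bytes of the short were written
    CHECK_DEFINED(g.b);  // the int covers bytes that were never written
    return g.a;
}
```

Run under Memcheck, the second check should flag g.b, with the first undefined byte at offset sizeof(short) on typical platforms, while the first check stays silent. Outside Valgrind both checks do nothing, so the macro can stay in the code while bisecting a large function.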