I have been tracking down an off-by-one bug in a large C++ codebase, and I cannot understand the following Valgrind behavior. Could someone please shed some light on it?
The code is:
% cat foo.cxx
#include <cstring>
#include <string>
#include <vector>
int main() {
    std::vector<char> v;
#ifdef RESIZE9
    v.resize(9);
#endif
    v.resize(10);
    std::string s(10, 'x');
    std::strcpy(&v[0], s.c_str());
    return 0;
}
Here is the expected Valgrind output:
% g++ foo.cxx && valgrind ./a.out
==21886== Memcheck, a memory error detector
==21886== Copyright (C) 2002-2017, and GNU GPL'd, by Julian Seward et al.
==21886== Using Valgrind-3.14.0 and LibVEX; rerun with -h for copyright info
==21886== Command: ./a.out
==21886==
==21886== Invalid write of size 1
==21886== at 0x4838DD7: strcpy (in /usr/lib/x86_64-linux-gnu/valgrind/vgpreload_memcheck-amd64-linux.so)
==21886== by 0x1092CA: main (in /tmp/a.out)
==21886== Address 0x4d84c8a is 0 bytes after a block of size 10 alloc'd
==21886== at 0x4835DEF: operator new(unsigned long) (in /usr/lib/x86_64-linux-gnu/valgrind/vgpreload_memcheck-amd64-linux.so)
==21886== by 0x109BCD: __gnu_cxx::new_allocator<char>::allocate(unsigned long, void const*) (in /tmp/a.out)
==21886== by 0x109AD5: std::allocator_traits<std::allocator<char> >::allocate(std::allocator<char>&, unsigned long) (in /tmp/a.out)
==21886== by 0x109997: std::_Vector_base<char, std::allocator<char> >::_M_allocate(unsigned long) (in /tmp/a.out)
==21886== by 0x1095E8: std::vector<char, std::allocator<char> >::_M_default_append(unsigned long) (in /tmp/a.out)
==21886== by 0x1093CC: std::vector<char, std::allocator<char> >::resize(unsigned long) (in /tmp/a.out)
==21886== by 0x10926A: main (in /tmp/a.out)
==21886==
==21886==
==21886== HEAP SUMMARY:
==21886== in use at exit: 0 bytes in 0 blocks
==21886== total heap usage: 2 allocs, 2 frees, 72,714 bytes allocated
==21886==
==21886== All heap blocks were freed -- no leaks are possible
==21886==
==21886== For counts of detected and suppressed errors, rerun with: -v
==21886== ERROR SUMMARY: 1 errors from 1 contexts (suppressed: 0 from 0)
But now consider the following:
% g++ -DRESIZE9 foo.cxx && valgrind ./a.out
==21904== Memcheck, a memory error detector
==21904== Copyright (C) 2002-2017, and GNU GPL'd, by Julian Seward et al.
==21904== Using Valgrind-3.14.0 and LibVEX; rerun with -h for copyright info
==21904== Command: ./a.out
==21904==
==21904==
==21904== HEAP SUMMARY:
==21904== in use at exit: 0 bytes in 0 blocks
==21904== total heap usage: 3 allocs, 3 frees, 72,731 bytes allocated
==21904==
==21904== All heap blocks were freed -- no leaks are possible
==21904==
==21904== For counts of detected and suppressed errors, rerun with: -v
==21904== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)
The system is Debian 10.9 with:
% g++ --version
g++ (Debian 8.3.0-6) 8.3.0
Copyright (C) 2018 Free Software Foundation, Inc.
This is free software; see the source for copying conditions. There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
and
% valgrind --version
valgrind-3.14.0
While I can't confirm this as a complete (cross-platform) 'solution', adding a line to show the actual capacity of the vector after the resize operation(s) may shed some light:
#include <cstring>
#include <string>
#include <vector>
#include <iostream>
//#define RESIZE9 1
int main()
{
    std::vector<char> v;
#ifdef RESIZE9
    v.resize(9);
#endif
    v.resize(10);
    std::cout << v.capacity() << std::endl; // Show the actual allocated size
    std::string s(10, 'x');
    std::strcpy(&v[0], s.c_str());
    return 0;
}
Running this code as is (Visual Studio, MSVC, Windows 10, 64-bit) shows a capacity of 10 (not unexpected). However, when the #define RESIZE9 1 line is uncommented, the shown capacity (after two resize calls) is 13.
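Adding extra capacity is, I believe, within the requirements of the standard: so long as the newly allocated storage has sufficient capacity for the new size, nothing is broken. Growing the capacity from 9 to 13 (4 extra bytes rather than just the one needed) most likely reduces the number of reallocations when the vector keeps growing. If GCC's library over-allocates in a similar way, that would explain the Valgrind result: the 11th byte written by strcpy (the terminating null character) lands inside the larger heap block, so Memcheck has nothing to report, even though writing past size() is still an error.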