This question is a M(not)WE of this question. I wrote some code that reproduces the error:
#include <cstdlib>
#include <iostream>
#include <vector>

int *watch_errno = __errno_location();

int main(){
    std::vector<double> a(7e8,1); // allocate a big chunk of memory
    std::cout<<std::system(NULL)<<std::endl;
}
It has to be compiled with g++ -ggdb -std=c++11 (g++ 4.9 on Debian). Note that the int *watch_errno is only there to allow gdb to watch errno.
When it is run under gdb, I get this:
(gdb) watch *watch_errno
Hardware watchpoint 1: *watch_errno
(gdb) r
Starting program: /tmp/bug
Hardware watchpoint 1: *watch_errno
Old value = <unreadable>
New value = 0
__static_initialization_and_destruction_0 (__initialize_p=1, __priority=65535) at bug.cpp:10
10 }
(gdb) c
Continuing.
Hardware watchpoint 1: *watch_errno
Old value = 0
New value = 12
0x00007ffff7252421 in do_system (line=line@entry=0x7ffff7372168 "exit 0") at ../sysdeps/posix/system.c:116
116 ../sysdeps/posix/system.c: No such file or directory.
(gdb) bt
#0 0x00007ffff7252421 in do_system (line=line@entry=0x7ffff7372168 "exit 0") at ../sysdeps/posix/system.c:116
#1 0x00007ffff7252510 in __libc_system (line=<optimized out>) at ../sysdeps/posix/system.c:182
#2 0x0000000000400ad8 in main () at bug.cpp:9
(gdb) l
111 in ../sysdeps/posix/system.c
(gdb) c
Continuing.
0
[Inferior 1 (process 5210) exited normally]
For some reason errno is set to ENOMEM at line 9, which corresponds to the system() call. Note that if the vector is smaller (I guess the threshold depends on the machine running the code), the code works fine and system(NULL) returns 1, as it should when a shell is available.
Why is the flag ENOMEM raised? Why isn't the code using the swap memory? Is this a bug? Is there a workaround? Would popen or exec* do the same? (I know, I should only ask one question per post, but all these questions could be summarized as "what is going on?")
As requested, here is the result of ulimit -a:
-t: cpu time (seconds) unlimited
-f: file size (blocks) unlimited
-d: data seg size (kbytes) unlimited
-s: stack size (kbytes) 8192
-c: core file size (blocks) 0
-m: resident set size (kbytes) unlimited
-u: processes 30852
-n: file descriptors 65536
-l: locked-in-memory size (kbytes) 64
-v: address space (kbytes) unlimited
-x: file locks unlimited
-i: pending signals 30852
-q: bytes in POSIX msg queues 819200
-e: max nice 0
-r: max rt priority 0
-N 15: unlimited
and here is the relevant part of strace -f myprog:
mmap(NULL, 5600002048, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7faa98562000
rt_sigaction(SIGINT, {SIG_IGN, [], SA_RESTORER, 0x7fabe622b180}, {SIG_DFL, [], 0}, 8) = 0
rt_sigaction(SIGQUIT, {SIG_IGN, [], SA_RESTORER, 0x7fabe622b180}, {SIG_DFL, [], 0}, 8) = 0
rt_sigprocmask(SIG_BLOCK, [CHLD], [], 8) = 0
clone(child_stack=0, flags=CLONE_PARENT_SETTID|SIGCHLD, parent_tidptr=0x7fff8797635c) = -1 ENOMEM (Cannot allocate memory)
rt_sigaction(SIGINT, {SIG_DFL, [], SA_RESTORER, 0x7fabe622b180}, NULL, 8) = 0
rt_sigaction(SIGQUIT, {SIG_DFL, [], SA_RESTORER, 0x7fabe622b180}, NULL, 8) = 0
rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0
fstat(1, {st_mode=S_IFCHR|0620, st_rdev=makedev(136, 1), ...}) = 0
mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7fabe6fde000
write(1, "0\n", 20
) = 2
write(1, "8\n", 28
) = 2
munmap(0x7faa98562000, 5600002048) = 0
Here is the output of free:
             total       used       free     shared    buffers     cached
Mem:       7915060    1668928    6246132      49576      34668    1135612
-/+ buffers/cache:     498648    7416412
Swap:      2928636          0    2928636
The system() function works by first creating a new copy of the process with fork() or similar (on Linux, this ends up in the clone() system call, as you show) and then, in the child process, calling exec to create a shell running the desired command.
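The fork-then-exec sequence described above can be sketched roughly as follows. This is a simplified illustration, not glibc's actual implementation: the name simple_system is hypothetical, and the signal handling that the real system() performs (visible as the rt_sigaction and rt_sigprocmask calls in the strace output) is left out.

```cpp
#include <cerrno>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Hypothetical helper: a stripped-down approximation of system().
// Returns the wait status, or -1 with errno set if fork() or waitpid() fails.
int simple_system(const char* command) {
    pid_t pid = fork();   // duplicates the calling process; this is the step that can fail with ENOMEM
    if (pid < 0) {
        return -1;
    }
    if (pid == 0) {
        // Child: replace this process image with a shell running the command.
        execl("/bin/sh", "sh", "-c", command, static_cast<char*>(nullptr));
        _exit(127);       // only reached if execl() itself failed
    }
    // Parent: wait for the shell to finish, retrying if interrupted by a signal.
    int status = 0;
    while (waitpid(pid, &status, 0) < 0) {
        if (errno != EINTR) {
            return -1;
        }
    }
    return status;
}
```

If the call returns something other than -1, WIFEXITED(status) and WEXITSTATUS(status) can be used to extract the shell's exit code.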
The fork() call can fail if there is insufficient virtual memory for the new process (even though you intend to immediately replace it with a much smaller one, the kernel can't know that). Some systems let you fork large processes anyway, either by letting the child borrow the parent's memory until it calls exec (vfork()) or through memory overcommit (/proc/sys/vm/overcommit_memory and /proc/sys/vm/overcommit_ratio), which weakens the guarantee that a later page fault will succeed.
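To see which overcommit policy a given Linux machine is using, the two files can be read directly. The sketch below assumes a Linux /proc filesystem; the helper names read_vm_setting and describe_overcommit are made up for illustration.

```cpp
#include <fstream>
#include <string>

// Hypothetical helper: returns the integer stored in a /proc/sys/vm file,
// or -1 if the file cannot be read (for example on a non-Linux system).
long read_vm_setting(const std::string& name) {
    std::ifstream in("/proc/sys/vm/" + name);
    long value = -1;
    if (!(in >> value)) {
        return -1;
    }
    return value;
}

// Hypothetical helper: maps the documented overcommit_memory values to a description.
const char* describe_overcommit(long mode) {
    switch (mode) {
        case 0:  return "heuristic overcommit (the default)";
        case 1:  return "always overcommit";
        case 2:  return "strict accounting (no overcommit)";
        default: return "unknown";
    }
}
```

Calling describe_overcommit(read_vm_setting("overcommit_memory")) gives a readable summary of the current mode. The overcommit_ratio value only matters in mode 2, where it sets the percentage of physical RAM counted toward the commit limit in addition to swap.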
Note that the above applies equally to any library function that may create new processes, e.g. popen(). It does not apply to exec(), though, as that replaces the process rather than cloning it.
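Since popen() also has to create a child process, its failure should be checked in the same way. A minimal sketch, assuming a POSIX system; capture_output is a hypothetical helper name:

```cpp
#include <cerrno>
#include <cstring>
#include <stdexcept>
#include <stdio.h>
#include <string>

// Hypothetical helper: runs a command through popen() and returns its stdout.
// Throws if popen() fails, which includes the case where the underlying
// process creation cannot get enough memory.
std::string capture_output(const char* command) {
    FILE* stream = popen(command, "r");
    if (stream == nullptr) {
        throw std::runtime_error(std::string("popen failed: ") + std::strerror(errno));
    }
    std::string output;
    char buffer[256];
    // Read the child's output chunk by chunk until end of file.
    while (fgets(buffer, sizeof buffer, stream) != nullptr) {
        output += buffer;
    }
    pclose(stream);
    return output;
}
```

In a process as large as the one in the question, the popen() call here would be expected to fail in the same way as system() does.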
If the provided mechanisms are inadequate for your use case, then you may need to implement your own system() replacement. I recommend starting a child process early on (before you allocate lots of memory) whose sole job is to accept NUL-separated command lines on stdin and report exit status on stdout.
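An outline of the latter solution, here using a pair of pipes and with error checks omitted, looks something like:
#include <cstdlib>
#include <cstring>
#include <string>
#include <vector>
#include <unistd.h>

int request_fd[2];
int reply_fd[2];

int my_system_replacement(const char* command) {
    int result = -1;
    // Send the command including its terminating NUL, then wait for the status.
    write(request_fd[1], command, std::strlen(command) + 1);
    read(reply_fd[0], &result, sizeof result);
    return result;
}

int main() {
    pipe(request_fd);
    pipe(reply_fd);
    if (fork()) {
        /* in parent */
        close(request_fd[0]);
        close(reply_fd[1]);
    } else {
        /* in child: run each NUL-terminated command and report its status */
        close(request_fd[1]);
        close(reply_fd[0]);
        std::string command;
        char c;
        while (read(request_fd[0], &c, 1) == 1) {
            if (c != '\0') { command += c; continue; }
            int result = std::system(command.c_str());
            write(reply_fd[1], &result, sizeof result);
            command.clear();
        }
        _exit(0);
    }
    // Important: don't allocate until after the fork()
    std::vector<double> a(7e8, 1); // allocate a big chunk of memory
    return my_system_replacement("exit 0");
}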
You'll want to add appropriate error checks throughout, by reference to the man pages. And you might want to make it more object-oriented, and perhaps use iostreams for your read and write operations, etc.