I have the following piece of code running inside a thread. 'executable' produces a unique string output for each input 'url':
from subprocess import Popen, PIPE

p = Popen(["executable", url], stdout=PIPE, stderr=PIPE, close_fds=True)
output, error = p.communicate()
print(output)
When the above code is executed for multiple input URLs, the output of subprocess p is not consistent. For some of the URLs, the subprocess terminates without producing any output. I tried printing p.returncode for each failed p instance (the failed URLs are not consistent across runs either) and got -11 as the return code, with 'error' being an empty string. Can someone please suggest a way to get consistent behavior/output on each run in a multithreaded environment?
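-11 as a return code means that the child process was killed by signal 11 (on POSIX, Popen reports termination by signal N as a return code of -N), and signal 11 is SIGSEGV on Linux. That might mean that the C program is not fine, e.g., you are starting too many subprocesses and it causes SIGSEGV in the C executable. You can limit the number of concurrent subprocesses using multiprocessing.pool.ThreadPool, concurrent.futures.ThreadPoolExecutor, or threading + Queue-based solutions: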
#!/usr/bin/env python
from multiprocessing.dummy import Pool  # uses threads
from subprocess import Popen, PIPE

def get_url(url):
    # run one subprocess per url and collect its output and exit status
    p = Popen(["executable", url], stdout=PIPE, stderr=PIPE, close_fds=True)
    output, error = p.communicate()
    return url, output, error, p.returncode

urls = ["url1", "url2"]  # placeholder: replace with the real input urls
pool = Pool(20)  # limit number of concurrent subprocesses
for url, output, error, returncode in pool.imap_unordered(get_url, urls):
    print("%s %r %r %d" % (url, output, error, returncode))
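The same limit could also be sketched with concurrent.futures.ThreadPoolExecutor (Python 3). This is only a minimal sketch: run_all, the max_workers value of 20, and the command parameter are illustrative assumptions; by default it runs the question's 'executable'.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed
from subprocess import Popen, PIPE


def get_url(url, command=("executable",)):
    """Run the command once for `url` and collect its output and exit status."""
    p = Popen(list(command) + [url], stdout=PIPE, stderr=PIPE, close_fds=True)
    output, error = p.communicate()
    return url, output, error, p.returncode


def run_all(urls, max_workers=20, command=("executable",)):
    """Yield (url, output, error, returncode) as each subprocess finishes."""
    # max_workers caps how many subprocesses exist at the same time
    with ThreadPoolExecutor(max_workers=max_workers) as executor:
        futures = [executor.submit(get_url, url, command) for url in urls]
        for future in as_completed(futures):
            yield future.result()


# Example use (urls is whatever list of inputs you already have):
# for url, output, error, returncode in run_all(urls):
#     print("%s %r %r %d" % (url, output, error, returncode))
```

As with imap_unordered, results arrive in completion order, so each result carries its url.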
Make sure the executable can be run in parallel, e.g., that it doesn't use some shared resource. To test, you could run in a shell:
$ executable url1 & executable url2
Could you please explain more about "you are starting too many subprocesses and it causes SIGSEGV in the C executable", and possibly a solution to avoid that?
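Possible problem: too many subprocesses are started at once, and that causes SIGSEGV in the C executable.
The solution suggested above is: limit the number of concurrent subprocesses, e.g., with a thread pool.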
Understand what SIGSEGV is (see "What is SIGSEGV run time error in C++?"). In short, your program is killed with that signal if it tries to access memory that it is not supposed to. Here's an example of such a program:
/* try to fail with SIGSEGV sometimes */
#include <stdio.h>
#include <stdlib.h>
#include <time.h>

int main(void) {
    char *null_pointer = NULL;

    srand((unsigned)time(NULL));
    if (rand() < RAND_MAX/2) /* simulate some concurrent condition
                                e.g., memory pressure */
        fprintf(stderr, "%c\n", *null_pointer); /* dereference null pointer */
    return 0;
}
If you run it with the above Python script, it will occasionally return -11.
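To make a value like -11 less cryptic in the logs, the negative return code can be mapped back to a signal name. A minimal sketch, assuming POSIX and Python 3.5+ (for signal.Signals); describe_returncode is a hypothetical helper name, not part of subprocess:

```python
import signal


def describe_returncode(returncode):
    """Return a readable description of a subprocess return code (POSIX)."""
    if returncode is None:
        return "still running"
    if returncode < 0:
        # Negative value -N: the child was terminated by signal N
        try:
            return "killed by %s" % signal.Signals(-returncode).name
        except ValueError:  # not a signal number known on this platform
            return "killed by signal %d" % -returncode
    return "exited with status %d" % returncode


print(describe_returncode(-11))  # signal 11 is SIGSEGV on Linux
```

Printing describe_returncode(p.returncode) next to each url makes it easier to tell a crash from an ordinary non-zero exit.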
Also, p.returncode is not sufficient for debugging purposes. Is there any other option to get more debug info to get to the root cause?
I wouldn't exclude the Python side completely, but it is most likely that the problem is in the C program. You could use gdb to get a backtrace and see where in the call stack the error comes from.
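One way to collect such a backtrace is to rerun a failing url under gdb in batch mode. This is a rough sketch, assuming gdb is installed and the executable was built with debug symbols (e.g., with -g); gdb_command and backtrace_for are hypothetical helper names. Because the crash is intermittent, a single rerun may not reproduce it.

```python
from subprocess import Popen, PIPE, STDOUT


def gdb_command(url, executable="executable"):
    """Build a gdb command line that runs the program and prints a backtrace."""
    return ["gdb", "-batch",   # exit after the commands below are processed
            "-ex", "run",      # start the program under the debugger
            "-ex", "bt",       # print the call stack if it stopped on a signal
            "--args", executable, url]


def backtrace_for(url, executable="executable"):
    """Run one url under gdb and return gdb's combined output as bytes."""
    p = Popen(gdb_command(url, executable), stdout=PIPE, stderr=STDOUT)
    output, _ = p.communicate()
    return output
```

If the program exits normally under gdb, the bt command has no stack to print, so the interesting output appears only for runs that actually crash.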