I have a system that consists of multiple threads. Each thread uses a common function called runCmd to execute a shell command. Everything works most of the time, but once in a while one of the threads (always the Alerts thread) gets stuck inside the runCmd function indefinitely. After adding some logging, I found that it gets stuck in the fgets call.
It is also worth noting that it does not matter what the shell command is; it always gets stuck in the same place.
About the system:
Weird observations:
How can I debug this further? Are there trace logs or something else I can look at? Are there any workarounds?
Here is the runCmd function that gets stuck in that thread:
    bool runCmd(char const* command, char* response, size_t bufferSize) {
        FILE* pFile = popen(command, "r");
        if (pFile == NULL) {
            printf("Failed to run command\n");
            return false;
        }
        if (NULL != response) {
            while (fgets(response, bufferSize, pFile) != NULL) {
                response = response + strlen(response);
            }
        }
        pclose(pFile);
        return true;
    }
I also tried select to wait until the pipe was available before reading, timing out if it took too long, but it made no difference.
If the rest of the code that we can't see is thread-safe (and no other thread is changing the buffers that command and response point at), I see one bug and two possible causes for the hanging fgets.
Bug: The program may fill the response buffer and then write out of bounds, since there is no bounds check. The loop tells fgets to read up to bufferSize characters every time, even though each iteration advances response and so shrinks the memory that is still available. This gives the program undefined behavior, and hanging is one possible outcome.
One possible fix:
    while (bufferSize > 1 && fgets(response, bufferSize, pFile) != NULL) {
        size_t len = strlen(response);
        response += len;
        bufferSize -= len; // shrink what you tell fgets it can use
    }
You could simplify it by using fread instead. The logic is the same, but fread doesn't stop at newlines and returns how many items it read, so you don't need to call strlen afterwards.
    if (bufferSize--) { // leave room for null terminator
        size_t len;
        while (bufferSize && (len = fread(response, 1, bufferSize, pFile))) {
            response += len;
            bufferSize -= len; // shrink what you tell fread it can use
        }
        *response = '\0';
    }
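For reference, here is one way the fread variant could sit inside the complete function. This is only a sketch: runCmdBounded is a hypothetical name chosen so it doesn't clash with the original runCmd, the failure message is left out, and a separate room counter stands in for decrementing bufferSize directly.

```c
#define _POSIX_C_SOURCE 200809L  /* for popen/pclose under strict -std modes */
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical name. Same contract as runCmd, but it never writes
   more than bufferSize bytes (terminator included) into response. */
bool runCmdBounded(char const* command, char* response, size_t bufferSize) {
    FILE* pFile = popen(command, "r");
    if (pFile == NULL) {
        return false;
    }
    if (response != NULL && bufferSize > 0) {
        size_t room = bufferSize - 1;  /* reserve one byte for '\0' */
        size_t len;
        /* Stop when the buffer is full or the command reaches EOF. */
        while (room > 0 && (len = fread(response, 1, room, pFile)) > 0) {
            response += len;
            room -= len;  /* shrink what fread is allowed to use */
        }
        *response = '\0';
    }
    pclose(pFile);  /* still blocks until the command has exited */
    return true;
}
```

With an 8-byte buffer and a command that prints hello world, the buffer ends up holding the first 7 characters followed by the terminator. Note that pclose waits for the command to terminate, so this only removes the out-of-bounds write; it does not help with a command that never finishes.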
The command simply doesn't finish. If command is waiting for something that never happens, it'll hang.
Example: If command executed this bash script, which waits for input from the user, it would hang until the user pressed return. You wouldn't see any output either, since fgets is waiting for a newline that doesn't arrive until after the user has pressed return.
#!/bin/bash
echo -n "Enter: "
read -r var
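If the hang does come from a command waiting on standard input, one possible workaround is to give the command an empty stdin so that reads hit end-of-file immediately instead of blocking. The sketch below assumes the command is plain shell text that can safely be wrapped in a subshell; buildNoStdinCommand is a hypothetical helper, not something from the original code, and it does nothing for commands that block on anything other than stdin.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical helper: wraps command in a subshell whose stdin is
   /dev/null, writing the result to out. Returns false if out is too
   small to hold the wrapped command. */
bool buildNoStdinCommand(char const* command, char* out, size_t outSize) {
    int n = snprintf(out, outSize, "( %s ) </dev/null", command);
    return n >= 0 && (size_t)n < outSize;
}
```

The wrapped string can then be passed to popen in place of the original command. With the script above, read -r var would return right away with an empty var rather than waiting for the user.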