I tried to diagnose a bug in an app written in C on Linux. It turned out that the bug was caused by forgetting fclose in the child process while the FILE * handle is still open in the parent process.
The file is only read. There are no write operations.
Case 1: the app is running on Linux 5.4.0-58-generic. In this case the bug occurred.
Case 2: the app is running on Linux 5.10.0-051000-generic. In this case there is no bug, and this is what I expected.
The parent process performs a random number of fork syscalls if there is no fclose in the child process.
I am fully aware that forgetting fclose will lead to a memory leak, but the child process exits with exit(3), not _exit(2), so how come forgetting fclose in the child process affects the parent process?
I suspect this is a Linux kernel bug that has been fixed in a version after 5.4. I don't have proof yet, but my tests suggest so.
I have been able to fix this bug in the app by calling fclose in the child process before it exits. But I want to know what actually happens in this case. So my question is: how come forgetting fclose in the child process affects the parent process?
test1.c calls fclose in the child process. test2.c does not call fclose in the child process.
test.txt:
123123123
123123123
123123123
123123123
123123123
123123123
test1.c:
#include <stdio.h>
#include <errno.h>
#include <stdlib.h>
#include <stdint.h>
#include <unistd.h>
#include <string.h>
#include <sys/wait.h>

#define TICK do { putchar('.'); fflush(stdout); } while(0)

int main() {
    char buff[1024] = {0};
    FILE *handle = fopen("test.txt", "r");
    uint32_t num_of_forks = 0;
    while (fgets(buff, 1024, handle) != NULL) {
        TICK;
        num_of_forks++;
        pid_t pid = fork();
        if (pid == -1) {
            printf("Fork error: %s\n", strerror(errno));
            continue;
        }
        if (pid == 0) {
            fclose(handle);
            exit(0);
        }
    }
    fclose(handle);
    putchar('\n');
    printf("Number of forks: %d\n", num_of_forks);
    wait(NULL);
}
test2.c:
#include <stdio.h>
#include <errno.h>
#include <stdlib.h>
#include <stdint.h>
#include <unistd.h>
#include <string.h>
#include <sys/wait.h>

#define TICK do { putchar('.'); fflush(stdout); } while(0)

int main() {
    char buff[1024] = {0};
    FILE *handle = fopen("test.txt", "r");
    uint32_t num_of_forks = 0;
    while (fgets(buff, 1024, handle) != NULL) {
        TICK;
        num_of_forks++;
        pid_t pid = fork();
        if (pid == -1) {
            printf("Fork error: %s\n", strerror(errno));
            continue;
        }
        if (pid == 0) {
            // fclose(handle);
            exit(0);
        }
    }
    fclose(handle);
    putchar('\n');
    printf("Number of forks: %d\n", num_of_forks);
    wait(NULL);
}
ammarfaizi2@integral:/tmp$ uname -r
5.4.0-58-generic
ammarfaizi2@integral:/tmp$ gcc --version
gcc (Ubuntu 9.3.0-17ubuntu1~20.04) 9.3.0
Copyright (C) 2019 Free Software Foundation, Inc.
This is free software; see the source for copying conditions. There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
ammarfaizi2@integral:/tmp$ ldd --version
ldd (Ubuntu GLIBC 2.31-0ubuntu9.1) 2.31
Copyright (C) 2020 Free Software Foundation, Inc.
This is free software; see the source for copying conditions. There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
Written by Roland McGrath and Ulrich Drepper.
ammarfaizi2@integral:/tmp$ cat test.txt
123123123
123123123
123123123
123123123
123123123
123123123
ammarfaizi2@integral:/tmp$ diff test1.c test2.c
27c27
< fclose(handle);
---
> // fclose(handle);
ammarfaizi2@integral:/tmp$ gcc test1.c -o test1 && gcc test2.c -o test2
ammarfaizi2@integral:/tmp$ ./test1
......
Number of forks: 6
ammarfaizi2@integral:/tmp$ ./test1
......
Number of forks: 6
ammarfaizi2@integral:/tmp$ ./test1
......
Number of forks: 6
ammarfaizi2@integral:/tmp$ ./test2
..................................................................................................................................................................................
Number of forks: 178
ammarfaizi2@integral:/tmp$ ./test2
............................................................................................................................................................................................................................................................................................................................................................
Number of forks: 348
ammarfaizi2@integral:/tmp$ ./test2
...........................................................................................................................................................................................................................................................................................................................................................................................................................................................................................
Number of forks: 475
ammarfaizi2@integral:/tmp$ md5sum test1 test2
c32d03916b9b72546b966223837fd115 test1
f314d2135092362288a66f53b37ffa4d test2
root@esteh:/tmp# uname -r
5.10.0-051000-generic
root@esteh:/tmp# gcc --version
gcc (Ubuntu 9.3.0-17ubuntu1~20.04) 9.3.0
Copyright (C) 2019 Free Software Foundation, Inc.
This is free software; see the source for copying conditions. There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
root@esteh:/tmp# ldd --version
ldd (Ubuntu GLIBC 2.31-0ubuntu9.1) 2.31
Copyright (C) 2020 Free Software Foundation, Inc.
This is free software; see the source for copying conditions. There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
Written by Roland McGrath and Ulrich Drepper.
root@esteh:/tmp# cat test.txt
123123123
123123123
123123123
123123123
123123123
123123123
root@esteh:/tmp# diff test1.c test2.c
27c27
< fclose(handle);
---
> // fclose(handle);
root@esteh:/tmp# gcc test1.c -o test1 && gcc test2.c -o test2
root@esteh:/tmp# ./test1
......
Number of forks: 6
root@esteh:/tmp# ./test1
......
Number of forks: 6
root@esteh:/tmp# ./test1
......
Number of forks: 6
root@esteh:/tmp# ./test2
......
Number of forks: 6
root@esteh:/tmp# ./test2
......
Number of forks: 6
root@esteh:/tmp# ./test2
......
Number of forks: 6
root@esteh:/tmp# md5sum test1 test2 # Make sure the files are identical with case 1
c32d03916b9b72546b966223837fd115 test1
f314d2135092362288a66f53b37ffa4d test2
Forgetting fclose in the child process on Linux 5.4.0-58-generic makes the fork syscall in the parent process behave strangely.
The bug does not occur on Linux 5.10.0-051000-generic.
Thanks to @Jonathan Leffler!
This problem is a duplicate of "Why does forking my process cause the file to be read infinitely".
The missing piece, why the bug does not occur on Linux 5.10.0-051000-generic, turned out to be unrelated to the kernel: the parent process races against the child processes.
Without fclose(3) in the children, the child processes call lseek(2) as soon as they call exit(3). This causes the parent to re-read from the same offset, because the children call lseek(2) with a negative offset and SEEK_CUR.
(I don't know why it is necessary to call lseek(2) before exit. It may be explained in @Jonathan Leffler's answer, which I did not read carefully.)
If the children do not call lseek(2) before the parent has finished reading, there is no problem at all.
Also, as @iBug has mentioned: "But keep in mind that process scheduling may make the result non-predictable, unless you implement some kind of 'syncing'."
The parent process on the Linux 5.10.0-051000-generic machine I used was just lucky: it always managed to read the entire file before the children called lseek(2).
I tried adding more lines to the file (150 lines in total), so the parent usually takes longer to finish than it does with 6 lines, and then the undefined behavior happens.