I have the following code:
/*//not important
FILE * INFILE;
list_file = optarg;
if( ( INFILE = fopen( list_file, "a+" ) ) == NULL ) {
fprintf( stderr, "Can't open input file\n");
exit(0);
}
*/
pthread_mutex_t input_queue;
pthread_mutex_init(&input_queue, NULL);
for( i = 0 ; i < number_thread; i++)
{
if( pthread_create( &thread_id[i], NULL, &work, NULL) != 0 )
{
i--;
fprintf(stderr, RED "\nError in creating thread\n" NONE);
}
}
for( i = 0 ; i < number_thread; i++)
if( pthread_join( thread_id[i], NULL) != 0 )
{
fprintf(stderr, RED "\nError in joining thread\n" NONE);
}
void * work(void * data)
{
unsigned long line;
char buf[512];
while ( !feof(INFILE) )
{
pthread_mutex_lock(&input_queue);
fgets((char *)&buf, sizeof(buf), INFILE);
if (buf[strlen (buf) - 1] == '\n')
buf[strlen (buf) - 1] = '\0';
line = (unsigned long)buf;
pthread_mutex_unlock(&input_queue);
do_work( line );
}
fclose(INFILE);
return NULL;
}
it reads lines from file but after a while it exits unexpectedly, no error message. I guess I messed up something.
How can I read the file line by line using pthreads but keep as much as possible the code unchanged (I mean not to mess up the whole program) ?
You are closing INFILE
in the first thread that encounters EOF. Afterwards other threads will call feof()
— and possibly fclose()
— on the closed file, which will corrupt the heap and almost certainly lead to a crash. Also, your newline-chopping code can underrun your buffer at EOF, see remark below.
To fix the problem, protect feof()
and fclose()
with the same mutex, and set INFILE to NULL. When the mutex is acquired, check for INFILE being NULL and return immediately if so:
for (;;) {
pthread_mutex_lock(&input_queue);
if (!INFILE) {
pthread_mutex_unlock(&input_queue);
break;
}
if (feof(INFILE)) {
INFILE = NULL;
pthread_mutex_unlock(&input_queue);
break;
}
fgets(buf, sizeof(buf), INFILE);
pthread_mutex_unlock(&input_queue);
// ...strip newline, do_work...
}
Several remarks:
your code writes to buf[strlen(buf) - 1]
without checking whether strlen(buf)
is zero. buf
will be empty at EOF, so this is not a theoretical concern, it will happen exactly once in every execution.
line
is of type unsigned long
, but you are assigning it a pointer value. This will fail on platforms where long
does not contain a pointer, such as Win64. Declare line
and the argument of do_work
as char *
(or void *
if it must accept other pointer types) instead.
avoid calling your mutex a "queue"; in multithreaded programming queue refers to a producer-consumer aware FIFO.
you don't need to protect individual stdio functions like fgets
with mutexes. They are MT-safe, as mandated by POSIX. (However, in my modified code, fgets()
does need to be mutex-protected because INFILE
can become invalidated while the mutex is not being held.)
(char *) &buf
doesn't make sense. Since buf
is a char
array, it already decays to a pointer to its first member, so you can simply send buf
to fgets
. If you insist on using the address-of operator, the correct expression is &buf[0]
.
as Carl Norum hinted, feof()
is probably not what you want, since it only detects EOF condition already encountered by fgets()
. The correct way to check for EOF is by testing whether fgets()
returns an empty string — before stripping the newline.