Tags: c, file, scanf, stdio

Include newline character when reading with fscanf


How can I read an optional newline character when reading a file word by word with fscanf()?

I know I could use fgets() + strtok(), but my program specifically requires fscanf().

I've tried the following:

fscanf(fp, "%s%[\n]+", buf);

But it doesn't work whatsoever.


Solution

  • You can consume and ignore a single newline character with this conversion format: %*1[\n]. It consumes at most one newline and discards it. Note that if you have multiple consecutive newlines, only the first one will be skipped. Note too that fscanf() must read one extra byte to check whether it matches; if it does not, that byte is pushed back into the stream, as if by ungetc().

    If you used %*[\n], fscanf would keep reading the stream until it gets a byte different from newline or reaches the end of file, which would cause surprising behavior when handling interactive input from the terminal.
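
    To make the difference between %*1[\n] and %*[\n] concrete, here is a minimal sketch. It assumes tmpfile() is available to stand in for a real input file, and newlines_left() is a hypothetical helper written only for this illustration: it reads one word with the given format and counts the newlines still waiting in the stream.

    ```c
    #include <stdio.h>

    /* Hypothetical helper: read one word from the text "a\n\n\nb" with the
       given format, then count the newline characters left in the stream.
       Returns -1 if the temporary file cannot be created. */
    static int newlines_left(const char *fmt) {
        FILE *fp = tmpfile();
        char word[10];
        int c, count = 0;

        if (fp == NULL)
            return -1;
        fputs("a\n\n\nb", fp);
        rewind(fp);
        if (fscanf(fp, fmt, word) == 1) {
            /* Count consecutive newlines that the format did not consume. */
            while ((c = fgetc(fp)) == '\n')
                count++;
        }
        fclose(fp);
        return count;
    }

    int main(void) {
        /* %*1[\n] discards at most one newline, so two remain. */
        printf("%%*1[\\n] leaves %d newline(s)\n", newlines_left("%9s%*1[\n]"));
        /* %*[\n] discards the whole run of newlines, so none remain. */
        printf("%%*[\\n]  leaves %d newline(s)\n", newlines_left("%9s%*[\n]"));
        return 0;
    }
    ```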

    Your code fscanf(fp, "%s%[\n]+", buf); has undefined behavior because you do not provide a destination array for the %[\n] conversion. The trailing + is not a repetition operator: fscanf() treats it as a literal character to match. Furthermore, you do not specify the maximum number of bytes to store into buf, which causes undefined behavior on input with long words.

    Try this:

        char buf[100];
        if (fscanf(fp, " %99s%*1[\n]", buf) == 1) {
            printf("read a word: |%s|\n", buf);
        } else {
            printf("no more words\n");
        }
    

    If you want to include the newline in the buffer, you will need to store it into a variable and add it by hand:

    #include <stdio.h>
    #include <string.h>
    
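    int main(void) {
        for (;;) {
            char buf[100];
            char nl[2] = "";   /* stays empty if no newline follows the word */
            /* 98 word bytes + 1 newline + 1 null terminator fit in buf[100] */
            int n = fscanf(stdin, " %98s%1[\n]", buf, nl);
            if (n > 0) {
                strcat(buf, nl);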
                printf("read a word: |%s|\n", buf);
            } else {
                printf("no more words\n");
                break;
            }
        }
        return 0;
    }
    

    Input:

    Hello word
      I     am   ready    
    

    Output:

    read a word: |Hello|
    read a word: |word
    |
    read a word: |I|
    read a word: |am|
    read a word: |ready|
    no more words