What does the format string "%*[^\n]" in a scanf() statement instruct? How do assignment suppressor (*) and negated scanset ([^) work together?


I know that the [ conversion specifier introduces a scanset: the characters that follow it are the ones to match or, if a ^ symbol is placed first, the ones not to match.

On this, ISO/IEC 9899:1999 (C99) states:

The characters between the brackets (the scanlist) compose the scanset, unless the character after the left bracket is a circumflex (^), in which case the scanset contains all characters that do not appear in the scanlist between the circumflex and the right bracket.

So the expression [^\n] means that characters are scanned until a \n is found in the stream, which for scanf() is stdin. The \n itself is not taken from stdin, and scanf() proceeds with the next directive in the format string if any remain; otherwise it returns and execution continues with the next C statement.

Next there is the assignment-suppressing character *:

On this, ISO/IEC 9899:1999 (C99) states:

Unless assignment suppression was indicated by a *, the result of the conversion is placed in the object pointed to by the first argument following the format argument that has not already received a conversion result.

For example, with scanf("%*100s",a); up to 100 non-white-space characters are taken from stdin, stopping early at the first white-space character, but they are not assigned to a, even if a is a properly defined char array of 101 elements (char a[101];).
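As a small illustration of the suppression rule quoted above, here is a minimal sketch using sscanf() so that the input is fixed. The helper name skip_word_then_int is hypothetical and exists only for this example. It shows that a suppressed conversion takes no argument and is not counted in the return value.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: skips the first word of text, then parses an int.
   Returns sscanf()'s result and stores the number in *value. */
int skip_word_then_int(const char *text, int *value)
{
    assert(text != NULL && value != NULL);
    /* %*s reads a word but has no matching argument;
       the only pointer argument belongs to %d. */
    return sscanf(text, "%*s %d", value);
}
```

Called with "width 42", this returns 1 (only the %d assignment is counted) and stores 42 in *value.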


But what does the format string "%*[^\n]" in a scanf() statement achieve? Does \n remain in stdin?

How do the assignment suppressor * and the negated scanset [^ work together?

Does it mean, that:

  1. By using *, all characters matching this format string are taken from stdin but are not assigned, and
  2. \n isn't taken from stdin, but it is used to terminate the scan operation for this format string?

I know what [^ and * each do alone, but not together. The question is what the combination of the two does when \n is the character in the negated scanset.


I know that there is a similar question on Stack Overflow which covers the understanding of %[^\n] only, here: What does %[^\n] mean in a scanf() format string. But the answers there do not help me with my problem.


Solution

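  • %[^\n] reads up to but not including the next \n character. In plain English, it reads a line of text. Normally, the line would be stored in a char array. Without a field width the read is unbounded, so the array must be large enough for the longest line that can arrive.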

    char line[SIZE];
    scanf("%[^\n]", line);
    

    The * modifier suppresses the assignment. The line is read and then simply discarded, so no variable is needed.

    scanf("%*[^\n]");
    

    * doesn't alter how the input is processed. In either case, everything up to but not including the next \n is read from stdin. Assuming no I/O errors, it is guaranteed that the next read from stdin will see either \n or EOF.
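To make the "up to but not including the next \n" behavior concrete, here is a minimal sketch that uses sscanf() on a fixed string and the %n directive to report how many characters were consumed. The helper name consumed_by_suppressed_scanset is hypothetical, and sscanf() stands in for scanf() only so the input is reproducible.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: reports how many characters "%*[^\n]" consumes
   from the start of text. %n stores the count of characters read so far. */
int consumed_by_suppressed_scanset(const char *text)
{
    int n = 0;
    assert(text != NULL);
    if (sscanf(text, "%*[^\n]%n", &n) == EOF) {
        return 0; /* empty input: nothing to consume */
    }
    /* If the scanset matched nothing, %n was never reached and n is still 0. */
    return n;
}
```

For "hello world\nnext" the result is 11, which is exactly the index of the \n, so the newline is the next character left to read. For "\nnext" the result is 0, because a scanset must match at least one character and the directive fails immediately.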

    Which scanf() statement should I use if I want to read and thereafter discard every character in stdin including the \n character?

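    Add %*c to also consume the \n. One caveat: if the line is already empty, so that the next character is \n, %*[^\n] matches nothing, scanf() stops at that directive, and %*c never runs. Issuing the two directives in separate scanf() calls avoids that.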

    scanf("%*[^\n]%*c");
    
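A scanset directive fails when it matches zero characters, so on an empty line the combined format stops before %*c. One possible way around that, sketched here with a hypothetical helper discard_line that takes a FILE * (pass stdin for the case discussed here), is to issue the two directives as separate calls.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: discards the rest of the current line from in,
   including the '\n'. Returns 0 on success, or EOF if end-of-file (or a
   read error) is reached before a newline could be consumed. */
int discard_line(FILE *in)
{
    assert(in != NULL);
    /* Step 1: skip everything before the newline. On an empty line this
       matches nothing and fails, which is harmless in a call of its own. */
    if (fscanf(in, "%*[^\n]") == EOF) {
        return EOF;
    }
    /* Step 2: consume exactly one character, the newline itself. */
    return fscanf(in, "%*c");
}
```

A typical use would be calling discard_line(stdin) after a failed or partial scanf() so that the next read starts on a fresh line.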

    Why %*c instead of just \n? If you used \n it wouldn't just consume a single newline character, it would consume any number of spaces, tabs, and newlines. \n matches any amount of whitespace. It's better to use %*c to consume exactly 1 character.

    // Incorrect
    scanf("%*[^\n]\n");
    
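The difference between %*c and a literal \n in the format can be measured with %n. The following is a sketch under the same assumption as before (sscanf() on a fixed string instead of scanf() on stdin), and both helper names are hypothetical.

```c
#include <assert.h>
#include <stdio.h>

/* Characters consumed when the newline is matched with %*c (exactly one). */
int consumed_with_percent_c(const char *text)
{
    int n = 0;
    assert(text != NULL);
    if (sscanf(text, "%*[^\n]%*c%n", &n) == EOF) {
        return 0;
    }
    return n;
}

/* Characters consumed when a "\n" whitespace directive is used instead. */
int consumed_with_newline_directive(const char *text)
{
    int n = 0;
    assert(text != NULL);
    if (sscanf(text, "%*[^\n]\n%n", &n) == EOF) {
        return 0;
    }
    return n;
}
```

With the input "abc\n\n   next", the %*c version consumes 4 characters (abc plus one newline), while the \n version consumes 8: it also swallows the empty line and the three leading spaces of the following line.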

    Could I use fflush() instead?

    No, don't. The C standard leaves the behavior of fflush() on an input stream such as stdin undefined.

    Isn't the negated scanset [^\n] completely redundant, because scanf() terminates the scan for a conversion at the first white-space character by default?

    With %s, yes, it will stop reading at the first whitespace character. %s only reads a single word. %[^\n], by contrast, reads an entire line. It will not stop at spaces or tabs, only newlines.

    More generally, with square brackets only the exact characters listed are relevant. There is no special behavior for whitespace. Unlike %s it does not skip leading whitespace, nor does it stop processing early if it encounters whitespace.
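The contrast between %s and %[^\n] can be checked on the same input. This is a minimal sketch with a hypothetical helper read_word_and_line; it assumes both output buffers hold at least BUF_SIZE bytes, and it uses field widths so that neither conversion can overflow them.

```c
#include <assert.h>
#include <stdio.h>

enum { BUF_SIZE = 32 }; /* assumed capacity of both output buffers */

/* Hypothetical helper: parses the start of text twice, once with %s and
   once with %[^\n], so the two results can be compared side by side. */
void read_word_and_line(const char *text, char *word, char *line)
{
    assert(text != NULL && word != NULL && line != NULL);
    word[0] = '\0';
    line[0] = '\0';
    /* %s skips leading whitespace and stops at the next whitespace. */
    if (sscanf(text, "%31s", word) != 1) {
        word[0] = '\0';
    }
    /* %[^\n] keeps spaces and tabs and stops only at a newline. */
    if (sscanf(text, "%31[^\n]", line) != 1) {
        line[0] = '\0';
    }
}
```

For the input "  two words\nrest", word ends up as "two" while line ends up as "  two words", leading spaces included.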