c string scanf format-specifiers character-arrays

Does the %[] or %[^] specifier in scanf(), sscanf() or fscanf() store the input as a null-terminated character array?


Here's what Beej's C guide (LINK) says about the %[] format specifier:

It allows you to specify a set of characters to be stored away (likely in an array of chars). Conversion stops when a character that is not in the set is matched.

I would appreciate it if you could clarify some basic questions that arise from this premise:

1) Is the input fetched by these two format specifiers stored in the arguments (of type char*) as a plain character array, or as a character array with a terminating \0 (a string)? If it is not a string, how can I make it store a string in cases like the program below, where I want to fetch a sequence of characters as a string and stop when a particular character (in the negated character set) is encountered?

2) My program seems to suggest that processing stops for the %[^|] specifier when the negated character | is encountered. But when scanning resumes for the next format specifier, does it start from the negated character where it stopped earlier? In my program I intend to ignore the |, hence I used %*c. But I tested and found that if I use %c with an additional variable of type char, the character | is indeed stored in that variable.

3) And lastly, but crucially for me, what is the difference between passing a character array and passing a string (a null-terminated character array) for a %s format specifier in printf()? In my other program, titled "character array vs string", I passed a character array (not null-terminated) for a %s format specifier in printf(), and it gets printed just as a string would. What is the difference?

// Program to illustrate the %[^] specifier

#include <stdio.h>

int main()
{
    char *ptr = "fruit|apple|lemon", type[10], fruit1[10], fruit2[10];

    sscanf(ptr, "%[^|]%*c%[^|]%*c%s", type, fruit1, fruit2);
    printf("%s,%s,%s", type, fruit1, fruit2);
}

// Character array vs string

#include <stdio.h>

int main()
{
    char test[10] = {'J', 'O', 'N'};
    printf("%s", test);
}

Output: JON

// Using %c instead of %*c

#include <stdio.h>

int main()
{
    char *ptr = "fruit|apple|lemon", type[10], fruit1[10], fruit2[10], char_var;

    sscanf(ptr, "%[^|]%c%[^|]%*c%s", type, &char_var, fruit1, fruit2);
    printf("%s,%s,%s,and the character is %c", type, fruit1, fruit2, char_var);
}

Output: fruit,apple,lemon,and the character is |


Solution

    1. The stored input is null-terminated. From the sscanf() documentation:

      The conversion specifiers s and [ always store the null terminator in addition to the matched characters. The size of the destination array must be at least one greater than the specified field width.

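    2. The excluded characters are not consumed by the scan set and remain in the input for the next directive to process. An alternative format string matches each | literally and limits every field width so that the 10-element arrays cannot overflow:

      if (sscanf(ptr, "%9[^|]|%9[^|]|%9s", type, fruit1, fruit2) == 3)
      {
          /* all three fields were read */
      }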
      
    3. The array is in fact null-terminated, because the remaining elements are zero-initialized:

      char test[10]={'J','O','N' /*,0,0,0,0,0,0,0*/ };
      

    If it were not null-terminated, printf() would keep printing until a null character was found somewhere in memory, possibly overrunning the end of the array and causing undefined behaviour. It is possible to print a non-null-terminated array by giving %s a precision:

        char buf[] = { 'a', 'b', 'c' };
        printf("%.*s", 3, buf);