Search code examples
cscanfstdin

Implementing an alternative to scanf


Since almost all C functions designed to get data from stdin are bad / flawed:

  • gets the less said about this the better

  • scanf no checks for buffer overflow and '\n' constantly remaining in stdin, screwing up next scanfs

  • scanf_s pretty much the same, but with buffer overflow checks
  • fgets appends a '\n' to the string
  • gets_s doesn't have the previous problem, but useless for other streams

I decided to write my own function that would at least be usable to read numbers from stdin

#include <stdio.h>
#include <string.h>
#include <stdlib.h>
void *scano(char mode);

int main()
{

  // int *num = (int *) scano(sData, 'd');
   float *num = (float *)scano('f');
   printf("Half: %f", *(num)/2);



    return 0;
}

void *scano(char mode){

    char sData[20];
    fgets(sData, 20, stdin);
    *(sData + strlen(sData) - 1) = '\0'; //get rid of the '\n' before the '\0'
   switch(mode){
       case 'd':{
           int *dataI = (int *)malloc(sizeof(int));
           *dataI = atoi(sData);
           return dataI;
       }
       case 'f':{
           float *dataF = (float *)malloc(sizeof(float));
           *dataF = atof(sData);
           return dataF;
       }
       case 'D':{
           //double...
       }
   }

}

The function is obviously unfinished for other data types, but I have some questions first:

  • How can the algorithm of the function be improved?
  • Don't I need to free() in every case? I'm aware allocated memory needs to be freed, but when working with lists, free() was only used to delete Nodes, when creating Nodes, no free() was called after a malloc().
  • Is it fully safe? If not, how can it be made safe?

Solution

  • The following is a simple example of a couple of functions to return a numeric value from an input source. These examples expect that one of several types of white space characters (space, tab, end of line, return) are used to delimit a set of numeric fields.

    These are to demonstrate an approach and there is definitely room for improvement.

    I suggest you take a look at the discussion of the question and answers in atoi vs atol vs strtol vs strtoul vs sscanf

    The fgetc() function is used to pull one character at a time from the input source and test whether the reading of the input source should continue or stop. By using the fgetc() function we can allow some other function to continue reading from the input source when these functions, scano_l() and scan_d() are used.

    We also eliminate the need for the malloc() and the accompanying free() and memory management by using a local buffer and returning the actual value as a long or a double.

    For instance a simple test of these using a C++ main would be (the _tmain() is because I am using Microsoft Visual Studio 2005 to generate a Windows console application) is below. This can be tested by compiling and then trying out several different scenarios of data entry in which an integer number such as 1234 is entered followed by one or more white space characters (space, tab, new line) then a floating point number such as 45.678 followed by at least one more white space character then some text characters.

    // scan_no.cpp : Defines the entry point for the console application.
    //
    
    #include "stdafx.h"
    
    #include <stdlib.h>
    
    extern "C" {
    long  scano_l (FILE *pFile);
    
    double  scano_d (FILE *pFile);
    };
    
    int _tmain(int argc, _TCHAR* argv[])
    {
        // test by reading a few white space delimited numeric value and then a
        // text string just to demonstrate how it works.
        long l = scano_l (stdin);
        double d = scano_d (stdin);
        char testBuffer[256] = {0};
    
        fgets (testBuffer, 255, stdin);
    
        printf (" value is %ld\n", l);
        printf (" value is %lf\n", d);
        printf (" value is %s\n", testBuffer);
        return 0;
    }
    

    The functions, which in my case are in another source file, csource.c, are:

    #include <stdio.h>
    #include <stdlib.h>
    
    // define the states for our Finite State Machine. It be a simple one with
    // straight transitions from one state to the next.
    enum  StateFSM {StateBegin = 1, StateAccept = 2, StateEnd = 3};
    
    static char *fetchValue (FILE *pFile, char *buffer, int nMaxLen)
    {
        int    iBuffIndex = 0;
        enum StateFSM   iState = StateBegin;
    
        do {
            // until we reach an end state of our state machine or we reach end of file
            // on our input source, lets fetch characters from the input source and decide
            // what to do with the character until our end state is reached.
            // end state is either white space trailing the desired characters or end of file.
            int    iInput = fgetc (pFile);
    
            if (feof(pFile)) break;
            switch (iInput) {
                case ' ':
                case '\t':
                case '\n':
                case '\r':
                    // eat up any leading whitespace
                    if (iState != StateAccept) break;
                    // we have found trailing white space so we are done.
                    iState = StateEnd;
                    break;
                default:
                    if (iBuffIndex < nMaxLen) {
                        // as long as we are not at the max length lets get a character into
                        // the supplied buffer. if we are at max buffer length then we will
                        // just eat any remaining characters until we come to white space.
                        buffer[iBuffIndex++] = (iInput & 0x7f);
                    }
                    iState = StateAccept;
                    break;
            }
        } while (! (iState == StateEnd));
    
        return buffer;    // be a good citizen and return the pointer provided to us. allows chaining.
    }
    
    long  scano_l (FILE *pFile)
    {
        char   buffer[32] = {0};
        long   lValue = 0;
        char   *pTemp;
    
        lValue = strtol (fetchValue(pFile, buffer, 31), &pTemp, 10);  // max characters is 31 to ensure zero terminator.
    
        return lValue;
    }
    
    double  scano_d (FILE *pFile)
    {
        char    buffer[32] = {0};
        double  dValue = 0.0;
        char    *pTemp;
    
        dValue = strtod (fetchValue(pFile, buffer, 31), &pTemp);  // max characters is 31 to ensure zero terminator.
    
        return dValue;
    }
    

    and additional, handy function would be one that will read in a string of characters. The following function reads characters from the input and adds them to a character buffer until either an ending character is read or the max number of characters is read.

    A non-space white space characters (tab, new line, return) are considered to be end of text indicators. A space character is now considered to be a valid text character which is added to the string being constructed from the input. Any leading non-space white space is discarded and the text string is considered to begin at the first character which is something other than a non-space white space character.

    char *  scano_asz(FILE *pFile, char *buffer, int nMaxLen)
    {
    
        int    iBuffIndex = 0;
        enum StateFSM   iState = StateBegin;
    
        do {
            // until we reach an end state of our state machine or we reach end of file
            // on our input source, lets fetch characters from the input source and decide
            // what to do with the character until our end state is reached.
            // end state is either white space trailing the desired characters or end of file.
            int    iInput = fgetc(pFile);
    
            if (feof(pFile)) break;
            switch (iInput) {
            case '\t':
            case '\n':
            case '\r':
                // eat up any leading non-space whitespace. spaces embedded in the string are
                // considered part of the string. delimiters include tab, new line, return.
                if (iState != StateAccept) break;
                // we have found trailing non-space white space so we are done.
                iState = StateEnd;
                break;
            default:
                if (iBuffIndex < nMaxLen) {
                    // as long as we are not at the max length lets get a character into
                    // the supplied buffer. allowable characters include the space character
                    // but not other white space characters such as tab, new line, return.
                    buffer[iBuffIndex++] = (iInput & 0x7f);
                    if (iBuffIndex >= nMaxLen) break;    // once we reach max size then we will break and exit.
                }
                iState = StateAccept;
                break;
            }
        } while (!(iState == StateEnd));
    
        if (iBuffIndex < nMaxLen) buffer[iBuffIndex] = 0;   // terminate the string if there is roome in the buffer.
        return buffer;
    }