c++ visual-studio string-formatting stdstring

Recognize string formatting Debug Assertion



I have a runtime problem with the code below.

The purpose is to "recognize" the formats (%s %d etc) within the input string.
To do this, it returns an integer that matches the data type. Then the extracted types are manipulated/handled in other functions.

I want to clarify that my purpose isn't to write formatted types in a string (snprintf etc.) but only to recognize/extract them.

The problem is that my application crashes with this error:

Debug Assertion Failed!
Program:
...ers\Alex\source\repos\TestProgram\Debug\test.exe
File: minkernel\crts\ucrt\appcrt\convert\isctype.cpp
Line: 36

Expression: c >= -1 && c <= 255

My code:

#include <iostream>
#include <cstring>

enum Formats
{
    TYPE_INT,
    TYPE_FLOAT,
    TYPE_STRING,

    TYPE_NUM
};

typedef struct Format
{
    Formats         Type;
    char            Name[5 + 1];
} SFormat;

SFormat FormatsInfo[TYPE_NUM] =
{
    {TYPE_INT,      "d"},
    {TYPE_FLOAT,    "f"},
    {TYPE_STRING,   "s"},
};


int GetFormatType(const char* formatName)
{
    for (const auto& format : FormatsInfo)
    {
        if (strcmp(format.Name, formatName) == 0)
            return format.Type;
    }

    return -1;
}

bool isValidFormat(const char* formatName)
{
    for (const auto& format : FormatsInfo)
    {
        if (strcmp(format.Name, formatName) == 0)
            return true;
    }

    return false;
}

bool isFindFormat(const char* strBufFormat, size_t stringSize, int& typeFormat)
{
    bool foundFormat = false;
    std::string stringFormat = "";

    for (size_t pos = 0; pos < stringSize; pos++)
    {
        if (!isalpha(strBufFormat[pos]))
            continue;

        if (!isdigit(strBufFormat[pos]))
        {
            stringFormat += strBufFormat[pos];

            if (isValidFormat(stringFormat.c_str()))
            {
                typeFormat = GetFormatType(stringFormat.c_str());
                foundFormat = true;
            }
        }
    }

    return foundFormat;
}

int main()
{
    std::string testString = "some test string with %d arguments";          // crash application
    // std::string testString = "%d some test string with arguments";   // not crash application

    size_t stringSize = testString.size();

    char buf[1024 + 1];
    memcpy(buf, testString.c_str(), stringSize);
    buf[stringSize] = '\0';

    for (size_t pos = 0; pos < stringSize; pos++)
    {
        if (buf[pos] == '%')
        {
            if (buf[pos + 1] == '%')
            {
                pos++;
                continue;
            }
            else
            {
                char bufFormat[1024 + 1];
                memcpy(bufFormat, buf + pos, stringSize);
                bufFormat[stringSize] = '\0';

                int typeFormat;
                if (isFindFormat(bufFormat, stringSize, typeFormat))
                {
                    std::cout << "type = " << typeFormat << "\n";
                    // ...
                }
            }
        }
    }
}

As I commented in the code, the first string (with %d in the middle) crashes the application, while the second (with %d at the start) works.

I also wanted to ask: is there a better or faster way to recognize format types such as "%d", "%s", etc. within a string? (It doesn't necessarily have to return an int to identify them.)

Thanks.


Solution

  • Let's take a look at this else clause:

    char bufFormat[1024 + 1];
    memcpy(bufFormat, buf + pos, stringSize);
    bufFormat[stringSize] = '\0';
    

    The variable stringSize was initialized with the size of the original format string. Let's say it's 30 in this case.

    Let's say you found the %d code at offset 20. You're going to copy 30 characters, starting at offset 20, into bufFormat. That means you're copying 20 characters past the end of the original string. You could possibly read off the end of the original buf, but that doesn't happen here because buf is large. The third line sets a NUL into the buffer at position 30, again past the end of the data, but your memcpy copied the NUL from buf into bufFormat, so that's where the string in bufFormat will end.

    Now bufFormat contains the string "%d arguments". Inside isFindFormat you search for the first isalpha character. Possibly you meant isalnum here? We can only get to the isdigit line if the isalpha check passes, and a character that is isalpha is never isdigit.

    In any case, after isalpha passes, isdigit will definitely return false, so we enter that if block. Your code will find the right type here. But the loop doesn't terminate. Instead, it continues scanning up to stringSize characters, which is the stringSize from main, that is, the size of the original format string. But the string you're passing to isFindFormat only contains the part starting at '%'. So you're going to scan past the end of the string and read whatever's in the buffer, which will probably trigger the assertion error you're seeing. Those leftover bytes are uninitialized, and in an MSVC debug build they are typically filled with 0xCC, which is negative as a signed char. isalpha asserts that its argument is between -1 and 255, so a negative char fails the check.

    There's a lot more going on here. You're mixing and matching std::string and C strings; see if you can use std::string::substr instead of copying. You can use std::string::find to find characters in a string. If you have to use C strings, use strcpy instead of memcpy followed by the addition of a NUL.
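
To illustrate the std::string::find suggestion, here is a minimal sketch of one way to scan for the specifiers without copying into C buffers. The names FormatType, FoundFormat and findFormats are hypothetical and not from the question's code, and the sketch only assumes the three specifiers from the question (d, f, s) with an optional numeric width. Note the cast to unsigned char before calling std::isdigit: the <cctype> functions require a value representable as unsigned char (or EOF), which is what the debug assertion checks.

```cpp
#include <cctype>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for the question's Formats enum.
enum class FormatType { Int, Float, String };

// Hypothetical result record: where the '%' was found and what it denotes.
struct FoundFormat
{
    std::size_t pos;   // offset of the '%' in the input string
    FormatType  type;
};

std::vector<FoundFormat> findFormats(const std::string& text)
{
    std::vector<FoundFormat> found;
    std::size_t pos = 0;

    // find() never reads past the end of the string, so no manual bounds math.
    while ((pos = text.find('%', pos)) != std::string::npos)
    {
        const std::size_t start = pos;
        ++pos;                              // step past the '%'
        if (pos >= text.size())
            break;                          // lone '%' at the very end

        if (text[pos] == '%')
        {
            ++pos;                          // "%%" is a literal percent sign
            continue;
        }

        // Skip an optional width such as the 5 in "%5d".
        // Cast to unsigned char so negative char values are never passed in.
        while (pos < text.size() &&
               std::isdigit(static_cast<unsigned char>(text[pos])))
            ++pos;
        if (pos >= text.size())
            break;

        switch (text[pos])
        {
            case 'd': found.push_back({start, FormatType::Int});    break;
            case 'f': found.push_back({start, FormatType::Float});  break;
            case 's': found.push_back({start, FormatType::String}); break;
            default:  break;                // unknown specifier: ignore it
        }
    }
    return found;
}
```

Calling findFormats("some test string with %d arguments") with this sketch should yield a single entry of type Int. Flags, precision and length modifiers (such as "%-5.2f" or "%ld") are deliberately not handled here and would need extra parsing.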