Search code examples
cpointersmemorylibxml2

How to fix segfault caused by a realloc going out of bounds?


Hello and TIA for your help. As I am new to to posting questions, I welcome any feedback on how this quesiton has been asked. I have researched much in SO without finding what I thought I was looking for.

I'm still working on it, and I'm not really good at C.

My purpose is extracting data from certain specific tags from a given XML and writing it to file. My issue arises because as I try to fill up the data struct I created for this purpose, at a certain point the realloc() function gives me a pointer to an address that's out of bounds.

If you look at this example

#include <stdio.h>

int main() {
    char **arrayString = NULL;
    char *testString;
    testString = malloc(sizeof("1234567890123456789012345678901234567890123456789"));
    strcpy(testString, "1234567890123456789012345678901234567890123456789");
    int numElem = 0;
    while (numElem < 50) {
        numElem++;
        arrayString = realloc(arrayString, numElem * sizeof(char**));
        arrayString[numElem-1] = malloc(strlen(testString)+1);
        strcpy(arrayString[numElem-1], testString);
    }
    printf("done\n");
    return 0;
}

it does a similar, but simplified thing to my code. Basically tries to fill up the char** with c strings but it goes to segfault. (Yes I understand I am using strcpy and not its safer alternatives, but as far as I understand it copies until the '\0', which is automatically included when you write a string between "", and that's all I need)

I'll explain more in dephth below.

In this code i make use of the libxml2, but you don't need to know it to help me.

I have a custom struct declared this way:

 struct List {
    char key[24][15];
    char **value[15];
    int size[15];
 };

struct List *list; //i've tried to make this static after reading that it could make a difference but to no avail

Which is filled up with the necessary key values. list->size[] is initialized with zeros, to keep track of how many values i've inserted in value.

value is delcared this way because for each key, i need an array of char* to store each and every value associated with it. (I thought this through, but it could be a wrong approach and am welcome to suggestions - but that's not the purpose of the question)

I loop through the xml file, and for each node I do a strcmp between the name of the node and each of my keys. When there is a match, the index of that key is used as an index in the value matrix. I then try to extend the allocated memory for the c string matrix and then afterwards for the single char*.

The "broken" code, follows, where

  • read is the index of the key abovementioned.
  • reader is the xmlNode
  • string contained the name of the xmlNode but is then freed so consider it as if its a new char*
  • list is the above declared struct
if (xmlTextReaderNodeType(reader) == 3 && read >= 0)
    {
        /* pull out the node value */
        xmlChar *value;
        value = xmlTextReaderValue(reader);     
        if (value != NULL) {
            free(string);
            string=strdup(value);           
            /*increment array size */
            list->size[read]++;
            /* allocate char** */ list->value[read]=realloc(list->value[read],list->size[read] * sizeof(char**));
            if (list->value[read] == NULL)
                return 16;
            /*allocate string (char*) memory */
            list->value[read][list->size[read]-1] = realloc(list->value[read][list->size[read]-1], sizeof(char*)*sizeof(string));
            if (list->value[read][list->size[read]-1] == NULL)
                return 16;
            /*write string in list */
            strcpy(list->value[read][list->size[read]-1], string);
        }
        /*free memory*/
        xmlFree(value);
    }
    xmlFree(name);
    free(string);

I'd expect this to allocate the char**, and then the char*, but after a few iteration of this code (which is a function wrapped in a while loop) i get a segfault.

Analyzing this with gdb (not an expert with it, just learned it on the fly) I noticed that indeed the code seems to work as expected for 15 iteration. At the 16th iteration, the list->value[read][list->size[read]-1] after the size is incremented, list->value[read][list->size[read]-1] points to a 0x51, marked as address out of bounds. The realloc only brings it to a 0x3730006c6d782e31, still marked as out of bounds. I would expect it to point at the last allocated value.

Here is an image of that: https://i.sstatic.net/WQhXt.jpg

How can I properly allocate the needed memory without going out of bounds?


Solution

  • Your code has quite a few problems:

    1. You are not including all the appropriate headers. How did you get this to compile? If you are using malloc and realloc, you need to #include <stdlib.h>. If you are using strlen and strcpy, you need to #include <string.h>.
    2. Not really a mistake, but unless you are applying sizeof to a type itself you don't have to use enclosing brackets.
    3. Stop using sizeof str to get the length of a string. The correct and safe approach is strlen(str)+1. If you apply sizeof to a pointer someday you will run into trouble.
    4. Don't use sizeof(type) as argument to malloc, calloc or realloc. Instead, use sizeof *ptr. This will avoid your incorrect numElem * sizeof(char**) and instead replace it with numElem * sizeof *arrayString, which correctly translates to numElem * sizeof(char*). This time, though, you were saved by the pure coincidence that sizeof(char**) == sizeof(char*), at least on GCC.
    5. If you are dynamically allocating memory, you must also deallocate it manually when you no longer need it. Use free for this purpose: free(testString);, free(arrayString);.
    6. Not really a mistake, but if you want to cycle through elements, use a for loop, not a while loop. This way your intention is known by every reader.

    This code compiles fine on GCC:

    #include <stdio.h> //NULL, printf
    #include <stdlib.h> //malloc, realloc, free
    #include <string.h> //strlen, strcpy
    
    int main()
    {
        char** arrayString = NULL;
        char* testString;
        testString = malloc(strlen("1234567890123456789012345678901234567890123456789") + 1);
        strcpy(testString, "1234567890123456789012345678901234567890123456789");
        for (int numElem = 1; numElem < 50; numElem++)
        {
            arrayString = realloc(arrayString, numElem * sizeof *arrayString);
            arrayString[numElem - 1] = malloc(strlen(testString) + 1);
            strcpy(arrayString[numElem - 1], testString);
        }
        free(arrayString);
        free(testString);
        printf("done\n");
        return 0;
    }