Hello and TIA for your help. As I am new to to posting questions, I welcome any feedback on how this quesiton has been asked. I have researched much in SO without finding what I thought I was looking for.
I'm still working on it, and I'm not really good at C.
My purpose is extracting data from certain specific tags from a given XML and writing it to file. My issue arises because as I try to fill up the data struct I created for this purpose, at a certain point the realloc()
function gives me a pointer to an address that's out of bounds.
If you look at this example
#include <stdio.h>
int main() {
char **arrayString = NULL;
char *testString;
testString = malloc(sizeof("1234567890123456789012345678901234567890123456789"));
strcpy(testString, "1234567890123456789012345678901234567890123456789");
int numElem = 0;
while (numElem < 50) {
numElem++;
arrayString = realloc(arrayString, numElem * sizeof(char**));
arrayString[numElem-1] = malloc(strlen(testString)+1);
strcpy(arrayString[numElem-1], testString);
}
printf("done\n");
return 0;
}
it does a similar, but simplified thing to my code. Basically tries to fill up the char** with c strings but it goes to segfault. (Yes I understand I am using strcpy and not its safer alternatives, but as far as I understand it copies until the '\0', which is automatically included when you write a string between "", and that's all I need)
I'll explain more in dephth below.
In this code i make use of the libxml2, but you don't need to know it to help me.
I have a custom struct declared this way:
struct List {
char key[24][15];
char **value[15];
int size[15];
};
struct List *list; //i've tried to make this static after reading that it could make a difference but to no avail
Which is filled up with the necessary key values. list->size[]
is initialized with zeros, to keep track of how many values i've inserted in value
.
value
is delcared this way because for each key, i need an array of char* to store each and every value associated with it. (I thought this through, but it could be a wrong approach and am welcome to suggestions - but that's not the purpose of the question)
I loop through the xml file, and for each node I do a strcmp
between the name of the node and each of my keys. When there is a match, the index of that key is used as an index in the value
matrix. I then try to extend the allocated memory for the c string matrix and then afterwards for the single char*.
The "broken" code, follows, where
read
is the index of the key abovementioned.reader
is the xmlNodestring
contained the name of the xmlNode but is then freed so consider it as if its a new char*list
is the above declared structif (xmlTextReaderNodeType(reader) == 3 && read >= 0)
{
/* pull out the node value */
xmlChar *value;
value = xmlTextReaderValue(reader);
if (value != NULL) {
free(string);
string=strdup(value);
/*increment array size */
list->size[read]++;
/* allocate char** */ list->value[read]=realloc(list->value[read],list->size[read] * sizeof(char**));
if (list->value[read] == NULL)
return 16;
/*allocate string (char*) memory */
list->value[read][list->size[read]-1] = realloc(list->value[read][list->size[read]-1], sizeof(char*)*sizeof(string));
if (list->value[read][list->size[read]-1] == NULL)
return 16;
/*write string in list */
strcpy(list->value[read][list->size[read]-1], string);
}
/*free memory*/
xmlFree(value);
}
xmlFree(name);
free(string);
I'd expect this to allocate the char**, and then the char*, but after a few iteration of this code (which is a function wrapped in a while loop) i get a segfault.
Analyzing this with gdb (not an expert with it, just learned it on the fly) I noticed that indeed the code seems to work as expected for 15 iteration. At the 16th iteration, the list->value[read][list->size[read]-1]
after the size is incremented, list->value[read][list->size[read]-1]
points to a 0x51, marked as address out of bounds. The realloc only brings it to a 0x3730006c6d782e31, still marked as out of bounds. I would expect it to point at the last allocated value.
Here is an image of that: https://i.sstatic.net/WQhXt.jpg
How can I properly allocate the needed memory without going out of bounds?
Your code has quite a few problems:
malloc
and realloc
, you need to #include <stdlib.h>
. If you are using strlen
and strcpy
, you need to #include <string.h>
.sizeof
to a type itself you don't have to use enclosing brackets.sizeof str
to get the length of a string. The correct and safe approach is strlen(str)+1
. If you apply sizeof
to a pointer someday you will run into trouble.sizeof(type)
as argument to malloc
, calloc
or realloc
. Instead, use sizeof *ptr
. This will avoid your incorrect numElem * sizeof(char**)
and instead replace it with numElem * sizeof *arrayString
, which correctly translates to numElem * sizeof(char*)
. This time, though, you were saved by the pure coincidence that sizeof(char**) == sizeof(char*)
, at least on GCC.free
for this purpose: free(testString);
, free(arrayString);
.for
loop, not a while
loop. This way your intention is known by every reader.This code compiles fine on GCC:
#include <stdio.h> //NULL, printf
#include <stdlib.h> //malloc, realloc, free
#include <string.h> //strlen, strcpy
int main()
{
char** arrayString = NULL;
char* testString;
testString = malloc(strlen("1234567890123456789012345678901234567890123456789") + 1);
strcpy(testString, "1234567890123456789012345678901234567890123456789");
for (int numElem = 1; numElem < 50; numElem++)
{
arrayString = realloc(arrayString, numElem * sizeof *arrayString);
arrayString[numElem - 1] = malloc(strlen(testString) + 1);
strcpy(arrayString[numElem - 1], testString);
}
free(arrayString);
free(testString);
printf("done\n");
return 0;
}