I'm working with XML for the first time and am having trouble storing the contents of an XML file in an array. I'm using libxml2 to parse the file, and I can read the data and print it. The code is given below:
#include <stdio.h>
#include <string.h>
#include <stdlib.h>
#include <libxml/xmlmemory.h>
#include <libxml/parser.h>
#include <wchar.h>

wchar_t buffer[7][50]={"\0"};

static void parseDoc(const char *docname)
{
    xmlDocPtr doc;
    xmlNodePtr cur;
    xmlChar *key;
    int i=0;

    doc = xmlParseFile(docname);
    if (doc == NULL ) {
        fprintf(stderr,"Document not parsed successfully. \n");
        return;
    }
    cur = xmlDocGetRootElement(doc);
    if (cur == NULL)
    {
        fprintf(stderr,"empty document\n");
        xmlFreeDoc(doc);
        return;
    }
    cur = cur->xmlChildrenNode;
    while (cur != NULL)
    {
        key = xmlNodeListGetString(doc, cur->xmlChildrenNode, 1);
        wmemcpy(buffer[i],(wchar_t*)(key),size(key)); /*segmentation fault at this stage*/
        printf("Content : %s\n", key);
        xmlFree(key);
        i++;
        cur = cur->next;
    }
    xmlFreeDoc(doc);
    return;
}

int main(void)
{
    const char *docname="/home/workspace/TestProject/Text.xml";
    parseDoc (docname);
    return (1);
}
The sample XML file is provided below:
<?xml version="1.0"?>
<story>
<author>John Fleck</author>
<datewritten>June 2, 2002</datewritten>
<keyword>example keyword</keyword>
<headline>This is the headline</headline>
<para>This is the body text.</para>
</story>
The file contents, when printed on the screen, were as follows:
Content : null
Content : John Fleck
Content : null
Content : June 2, 2002
Content : null
Content : example keyword
Content : null
Content : This is the headline
Content : null
Content : This is the body text.
I suspect that the content being null in a few places is what breaks the copy and causes the segmentation fault. Please let me know how to fix the problem, and whether there is a better way to do this. I have done a similar XML file read with the MSXML parser, but this is my first time with Linux APIs.
EDIT: The copying is now performed as below, but the contents of the wchar_t array are garbled. Further help would be appreciated.
while (cur != NULL) {
    key = xmlNodeListGetString(doc, cur->xmlChildrenNode, 1);
    if(key!=NULL)
    {
        wmemcpy(DiscRead[i],(const wchar_t *)key,sizeof(key));
        i++;
    }
    printf("keyword: %s\n", key);
    xmlFree(key);
    cur = cur->next;
}
Your code has multiple problems:
- You're using wchar_t for your string array. This isn't appropriate for the UTF-8 encoded strings you'll get from libxml2. You should stick with xmlChar or use char.
- You're using xmlNodeListGetString to get the text content of nodes, passing cur->xmlChildrenNode as the node list. The latter will be NULL for text nodes, so xmlNodeListGetString will return NULL as an error condition. You should simply call xmlNodeGetContent on the current node, but only if it is an element node.
- Using xmlChildrenNode as a field name is deprecated. You should use children.
- Using wmemcpy is dangerous. I'd suggest something safer like strlcpy.
Try something like this:
char buffer[7][50];

static void parseDoc(const char *docname)
{
    xmlDocPtr doc;
    xmlNodePtr cur;
    xmlChar *key;
    int i = 0;

    doc = xmlParseFile(docname);
    if (doc == NULL) {
        fprintf(stderr, "Document not parsed successfully.\n");
        return;
    }

    cur = xmlDocGetRootElement(doc);
    if (cur == NULL) {
        fprintf(stderr, "empty document\n");
        xmlFreeDoc(doc);
        return;
    }

    for (cur = cur->children; cur != NULL; cur = cur->next) {
        /* Skip the whitespace text nodes that sit between elements. */
        if (cur->type != XML_ELEMENT_NODE)
            continue;
        key = xmlNodeGetContent(cur);
        /* xmlChar is unsigned char, so cast it for the char-based call. */
        strlcpy(buffer[i], (const char *)key, sizeof(buffer[i]));
        printf("Content : %s\n", key);
        xmlFree(key);
        i++;
    }

    xmlFreeDoc(doc);
}
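One portability caveat about strlcpy in the code above: it originated on the BSDs, and as far as I know glibc only gained it in version 2.38, so on an older Linux system it may be missing unless you link against libbsd. A minimal fallback sketch built on snprintf is shown here; copy_bounded is a hypothetical helper name used for illustration, not a libxml2 or libc function.

```c
#include <stdio.h>
#include <string.h>

/*
 * copy_bounded is a hypothetical helper, not part of libxml2 or libc.
 * It copies src into dst, always NUL-terminates when dstsize > 0, and
 * returns 1 if the whole string fit or 0 if it had to be truncated.
 */
int copy_bounded(char *dst, size_t dstsize, const char *src)
{
    int n;

    if (dst == NULL || dstsize == 0)
        return 0;
    if (src == NULL)
        src = "";               /* treat a missing string as empty */

    /* snprintf writes at most dstsize - 1 characters plus the NUL. */
    n = snprintf(dst, dstsize, "%s", src);
    return n >= 0 && (size_t)n < dstsize;
}
```

Inside the loop this would be called as copy_bounded(buffer[i], sizeof(buffer[i]), (const char *)key). The return value tells you whether a node's text was longer than the 49 characters a row can hold.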
You should also check that i doesn't overrun the number of strings in your array.
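One way to make that check hard to forget is to keep the row count next to the array and route every store through a single function. The following is only a sketch under that assumption: string_table, table_add, MAX_ROWS and ROW_LEN are hypothetical names, with the sizes chosen to match buffer[7][50].

```c
#include <stdio.h>
#include <string.h>

#define MAX_ROWS 7   /* assumed to match buffer[7][50] */
#define ROW_LEN  50

/* string_table and table_add are hypothetical names for illustration. */
struct string_table {
    char   rows[MAX_ROWS][ROW_LEN];
    size_t count;               /* number of rows filled so far */
};

/* Append one string. Returns 1 on success, 0 if the table is full
 * or if t or s is NULL. Input longer than ROW_LEN - 1 is truncated. */
int table_add(struct string_table *t, const char *s)
{
    if (t == NULL || s == NULL || t->count >= MAX_ROWS)
        return 0;
    snprintf(t->rows[t->count], ROW_LEN, "%s", s);
    t->count++;
    return 1;
}
```

In the parsing loop you would call table_add(&table, (const char *)key) and break out of the loop when it returns 0, so a document with more than seven child elements cannot write past the end of the array.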