I am creating a compiler, and beginning to create my parse tree structure.
I have this for a 'node' which can contain subnodes, or not.
typedef struct Node {
int node_type;
union {
char* string;
int number;
struct Node* nodes;
} node_data;
} Node;
And these functions assemble/print it
Node* MakeNodeFromString(char* mystring) {
Node* mynode = (Node*) malloc(sizeof(Node));
mynode->node_data.string = strdup(mystring);
mynode->node_type = 0; // @TODO not 3
return mynode;
}
Node* MakeTwoBranchNode(int nodetype, Node* a, Node* b) {
Node* mynode = (Node*) malloc(sizeof(Node));
mynode->node_type = 2; // @TODO not 3
mynode->node_data.nodes = malloc(2 * sizeof(Node*));
mynode->node_data.nodes[0] = *a; mynode->node_data.nodes[1] = *b;
return mynode;
}
void printtree (Node *n, int level) {
if (!n) return;
printf ("start %d\n", n->node_type);
switch (n->node_type) {
case 2:
printf ("%*c2\n", level, ' ');
printtree (&n->node_data.nodes[0], level+1);
printtree (&n->node_data.nodes[1], level+1);
break;
case 0:
printf ("%*c%s\n", level, ' ', n->node_data.string);
break;
}
printf ("end %d\n", n->node_type);
}
And whenever I assemble a tree I get segfaults either printf'ing or strlen'ing my strings. I've tried strdup, strcpy, etc. I'm pretty sure its not MakeTwoBranchNode that's failing because I can create large trees of numbers (code not included). But I'm not sure.
This is a codesample of where it does - and doesnt - segfault on my machine
int main() {
// Works
printtree(
MakeTwoBranchNode(3,
MakeNodeFromString("first string"),
MakeNodeFromString("second string")
),
1
);
// Fails
printtree(
MakeTwoBranchNode(3,
MakeTwoBranchNode(3,
MakeNodeFromString("first string"),
MakeNodeFromString("second string")
),
MakeNodeFromString("third string")
),
1
);
}
If you run this example (and can understand its cryptic output) you'll see it segfaults during printf(n->node_data.string).
You're allocating sizeof-a-pointer, not sizeof-a-node in the following:
Node* MakeTwoBranchNode(int nodetype, Node* a, Node* b) {
Node* mynode = malloc(sizeof(Node));
mynode->node_type = 2; // @TODO not 3
mynode->node_data.nodes = malloc(2 * sizeof(Node*)); // HERE
mynode->node_data.nodes[0] = *a; mynode->node_data.nodes[1] = *b;
return mynode;
}
You could change the commented line above to:
mynode->node_data.nodes = malloc(2 * sizeof(Node));
However, as written your test program as a memory leak. What happens to the memory you allocated for the nodes your passing to MakeTwoBranchNode()
? This is not what I think you really want. You wold be better off using a pointer-array in the first place.
typedef struct Node {
int node_type;
union {
char* string;
int number;
struct Node *nodes[2];
} node_data;
} Node;
Then save the actual pointers passed to MakeTwoBranchNode
. In doing so you're passing ownership of those nodes to the two-branch node (and thus you should also ensure when it is freed it properly cleans up its child nodes):
Node* MakeTwoBranchNode(int nodetype, Node* a, Node* b) {
Node* mynode = malloc(sizeof(Node));
mynode->node_type = 2; // @TODO not 3
mynode->node_data.nodes[0] = a;
mynode->node_data.nodes[1] = b;
return mynode;
}
Now there's no memory leak unless you fail to free()
the pointers in nodes[0]
and nodes[1]
when you're destroying the two-branch node.