I'm trying to break a string str
containing the bar symbol |
into an array of strings(output
), with the delimiter being |
, which will not be included in the array of strings. There will be 20 elements in output
. This code belongs to a function that will return output
pointer, which is why I need dynamic allocation.
I'm trying to do this without using the sscanf()
function.
Example: if str
is "abc|def||jk"
then this is what output
should look like at the end (less than 20 elements for demonstration purpose):
output[0]=="abc"
output[1]=="def"
output[2]==""
output[3]=="jk"
However, I always get an error exit code, something like:
Process finished with exit code -1073740940 (0xC0000374)
When debugging I found out that the first string element is parsed correctly into output
, but the second element is produced correctly sometimes, and other times I ran into trouble.
Code below:
char **output = (char**) calloc(20, 20*sizeof(char));
int begin = 0;
int end = 1;
// index that counts output element
int arrayIndex = 0;
int i;
for(i = 0; i < strlen(str); i++) {
end = i;
bool endOfString = false;
// there is no '|' symbol at the end of the string
if (*(str+i) == '|' || (endOfString = (i+1)==strlen(str))) {
end = endOfString ? (end+1):end;
// problem here. Assembly code poped up when debugging (see image below)
char *target = (char*) calloc(end-begin+1, sizeof(char));
// give proper value to target
strcpy(target, str+begin);
*(target+end-begin) = '\0';
*(output+arrayIndex) = target;
// increase proper indexes
begin = i + 1;
arrayIndex++;
}
}
The worst of all is that I cannot debug it because a window with assembly code pops up the instance I step over the calloc
function when debugging.
I used gdb too, but it didn't work:
56 char target = (char) calloc(length, sizeof(char));
(gdb) n
warning: Critical error detected c0000374
Thread 1 received signal SIGTRAP, Trace/breakpoint trap.
0x00007ffded8191f3 in ?? ()
(gdb) bt
#0 0x00007ffded8191f3 in ?? ()
Backtrace stopped: previous frame identical to this frame (corrupt stack?)
(gdb) continue
Continuing.
gdb: unknown target exception 0xc0000374 at 0x7ffded819269
(edit) indexing was OK. Couple of notes: sizeof(char)
is always 1
-- just use 1
. Further, there is no need to cast the return of malloc
, it is unnecessary. See: Do I cast the result of malloc?. Additionally, how many times do you call strlen(str)
? (hopefully optimization limits the loop condition to a single call, but you could potentially be calling strlen(str)
for every iteration). Same with endOfString = (i+1)==strlen(str)
. Save the length of the string before entering the loop and then just use the saved value to compare against.
While you can count indexes as you have and loop character-by-character looking for delimiters, it is far more efficient to let strchr (pointer, '|')
advance to the next delimiter (or return NULL
indicating no more delimiters remain). Then rather than worrying about indexes, simply keep a pointer p
and end-pointer ep
to advance down the string, e.g.
#define NFIELDS 20
...
char **splitstr (const char *s, const char delim, size_t *n)
{
const char *p = s, /* pointer for parsing */
*ep = s; /* end pointer for parsing */
*n = 0; /* zero string counter */
while (*n < NFIELDS && *p) { /* loop while less than NFIELDS */
size_t len;
if ((ep = strchr (p, delim))) /* get pointer to delim */
len = ep - p;
else
len = strlen (p); /* or get length of final string */
if (!(output[*n] = malloc (len + 1))) { /* allocated for string */
...
memcpy (output[*n], p, len); /* copy chars to output[n] */
output[(*n)++][len] = 0; /* nul-terminate to make string */
if (!ep) /* if delim not found, last */
break;
p = ++ep; /* update p to 1-past delim */
}
...
Adding appropriate error checking and returning the pointer to the allocated strings, you could do:
char **splitstr (const char *s, const char delim, size_t *n)
{
const char *p = s, /* pointer for parsing */
*ep = s; /* end pointer for parsing */
char **output = calloc (NFIELDS, sizeof *output); /* pointer to output */
if (!output) {
perror ("calloc-output");
return NULL;
}
*n = 0; /* zero string counter */
while (*n < NFIELDS && *p) { /* loop while less than NFIELDS */
size_t len;
if ((ep = strchr (p, delim))) /* get pointer to delim */
len = ep - p;
else
len = strlen (p); /* or get length of final string */
if (!(output[*n] = malloc (len + 1))) { /* allocated for string */
perror ("malloc-output[n]");
while ((*n)--) /* free prior allocations on failure */
free (output[*n]);
free(output);
return NULL;
}
memcpy (output[*n], p, len); /* copy chars to output[n] */
output[(*n)++][len] = 0; /* nul-terminate to make string */
if (!ep) /* if delim not found, last */
break;
p = ++ep; /* update p to 1-past delim */
}
return output; /* return pointer to allocated strings */
}
In your case a complete short example would be:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#define NFIELDS 20
#define MAXC 1024
char **splitstr (const char *s, const char delim, size_t *n)
{
const char *p = s, /* pointer for parsing */
*ep = s; /* end pointer for parsing */
char **output = calloc (NFIELDS, sizeof *output); /* pointer to output */
if (!output) {
perror ("calloc-output");
return NULL;
}
*n = 0; /* zero string counter */
while (*n < NFIELDS && *p) { /* loop while less than NFIELDS */
size_t len;
if ((ep = strchr (p, delim))) /* get pointer to delim */
len = ep - p;
else
len = strlen (p); /* or get length of final string */
if (!(output[*n] = malloc (len + 1))) { /* allocated for string */
perror ("malloc-output[n]");
while ((*n)--) /* free prior allocations on failure */
free (output[*n]);
free(output);
return NULL;
}
memcpy (output[*n], p, len); /* copy chars to output[n] */
output[(*n)++][len] = 0; /* nul-terminate to make string */
if (!ep) /* if delim not found, last */
break;
p = ++ep; /* update p to 1-past delim */
}
return output; /* return pointer to allocated strings */
}
int main (void) {
char buf[MAXC], /* buffer for input */
**output = NULL; /* pointer to split/allocated strings */
size_t n = 0; /* number of strings filled */
if (!fgets (buf, MAXC, stdin)) { /* validate input */
fputs ("error: invalid input.\n", stderr);
return 1;
}
buf[strcspn (buf, "\n")] = 0; /* trim newline from buf */
/* split buf into separate strings on '|' */
if (!(output = splitstr (buf, '|', &n))) {
fputs ("error: splitstr() failed.\n", stderr);
return 1;
}
for (size_t i = 0; i < n; i++) { /* loop outputting each & free */
printf ("output[%2zu]: %s\n", i, output[i]);
free (output[i]); /* free strings */
}
free (output); /* free pointers */
}
Example Use/Output
$ echo "abc|def||jk" | ./bin/splitstr
output[ 0]: abc
output[ 1]: def
output[ 2]:
output[ 3]: jk
Memory Use/Error Check
In any code you write that dynamically allocates memory, you have 2 responsibilities regarding any block of memory allocated: (1) always preserve a pointer to the starting address for the block of memory so, (2) it can be freed when it is no longer needed.
It is imperative that you use a memory error checking program to insure you do not attempt to access memory or write beyond/outside the bounds of your allocated block, attempt to read or base a conditional jump on an uninitialized value, and finally, to confirm that you free all the memory you have allocated.
For Linux valgrind
is the normal choice. There are similar memory checkers for every platform. They are all simple to use, just run your program through it.
$ echo "abc|def||jk" | valgrind ./bin/splitstr
==32024== Memcheck, a memory error detector
==32024== Copyright (C) 2002-2015, and GNU GPL'd, by Julian Seward et al.
==32024== Using Valgrind-3.12.0 and LibVEX; rerun with -h for copyright info
==32024== Command: ./bin/splitstr
==32024==
output[ 0]: abc
output[ 1]: def
output[ 2]:
output[ 3]: jk
==32024==
==32024== HEAP SUMMARY:
==32024== in use at exit: 0 bytes in 0 blocks
==32024== total heap usage: 5 allocs, 5 frees, 172 bytes allocated
==32024==
==32024== All heap blocks were freed -- no leaks are possible
==32024==
==32024== For counts of detected and suppressed errors, rerun with: -v
==32024== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)
Always confirm that you have freed all memory you have allocated and that there are no memory errors.
Look things over and let me know if you have further questions.