Can anyone point me to a program that strips string literals from C source code? For example,
#include <stdio.h>
static const char *place = "world";
char * multiline_str = "one \
two \
three\n";
int main(int argc, char *argv[])
{
printf("Hello %s\n", place);
printf("The previous line says \"Hello %s\"\n", place);
return 0;
}
becomes
#include <stdio.h>
static const char *place = ;
char * multiline_str = ;
int main(int argc, char *argv[])
{
printf(, place);
printf(, place);
return 0;
}
What I am looking for is a program very much like stripcmt, except that it strips strings rather than comments.
I am looking for an existing program rather than a handy regular expression because,
once you start considering all the corner cases (quotes within strings, multi-line strings, etc.),
things typically turn out to be (much) more complex than they first appear. There
are also limits to what regular expressions can achieve, and I suspect they are not up to this task.
If you do think you have an extremely robust regular expression, feel free to submit it, but please no naive suggestions
like sed 's/"[^"]*"//g'.
(There is no need for special handling of (possibly unterminated) strings within comments; comments will be removed first.)
Support for strings with embedded raw newlines is not important (they are not legal C), but strings that span multiple lines by ending each line with \ must be supported.
This is almost the same as some other questions, but I found no reference to any tools.
You can download the source code to StripCmt (.tar.gz - 5kB). It's trivially small, and it shouldn't be too difficult to adapt it to strip strings instead (it's released under the GPL).
You might also want to investigate the official lexical language rules for C strings. I found this very quickly, but it might not be definitive. It defines a string as:
stringcon ::= "{ch}", where ch denotes any printable ASCII character (as specified by isprint()) other than " (double quotes) and the newline character.
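Note that the rule quoted above does not cover backslash escapes, so a scanner has to handle \" and a trailing \ itself. As a rough idea of what such an adaptation might look like, here is a minimal state-machine sketch. It is not taken from StripCmt; the function name strip_strings and the state names are made up for illustration. It assumes comments have already been removed, the input uses Unix line endings, there are no trigraphs, and dst has room for at least strlen(src) + 1 bytes. Character constants are copied through unchanged so that '"' is not mistaken for the start of a string.

```c
#include <stddef.h>

/* Hypothetical helper, not part of StripCmt. */
enum state { CODE, STRING, CHARLIT };

/* Copies src to dst with every string literal removed.
 * Returns the number of bytes written, excluding the terminating NUL. */
size_t strip_strings(const char *src, char *dst)
{
    enum state st = CODE;
    size_t n = 0;

    for (size_t i = 0; src[i] != '\0'; i++) {
        char c = src[i];

        switch (st) {
        case CODE:
            if (c == '"') {          /* opening quote: start dropping */
                st = STRING;
                break;
            }
            if (c == '\'')           /* character constant: keep it, but track it */
                st = CHARLIT;
            dst[n++] = c;
            break;

        case STRING:
            if (c == '\\' && src[i + 1] != '\0') {
                i++;                 /* skip escaped char: \" or backslash-newline */
            } else if (c == '"') {
                st = CODE;           /* closing quote */
            } else if (c == '\n') {
                st = CODE;           /* unterminated string: give up at end of line */
                dst[n++] = c;
            }
            break;

        case CHARLIT:
            dst[n++] = c;
            if (c == '\\' && src[i + 1] != '\0')
                dst[n++] = src[++i]; /* keep escaped char, e.g. '\'' */
            else if (c == '\'' || c == '\n')
                st = CODE;
            break;
        }
    }
    dst[n] = '\0';
    return n;
}
```

Wrapping this in a small main that reads a file into a buffer and prints the result would turn it into a filter. Fed the example from the question, it should give the desired output, including collapsing the \-continued string onto a single line.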