I'm writing a little Domain Specific Language for my program, using JUCE::JavascriptEngine as the scripting engine. This takes a string as input and then parses it, but I need to do some pre-processing on the string to adapt it from my DSL to JavaScript. The pre-processing mainly consists of wrapping some terms inside functions, and placing object names in front of functions. So, for instance, I want to do something like this:
take some special string input "~/1/2"
...
wrap it inside a function: "find("~/1/2")"
...
and then attach an object to it: "someObject.find("~/1/2")"
(the object name has to be a variable).
I've been using regex for this (now I have two problems...). The regexes are getting complicated and unreadable, and they miss a lot of special cases. Since what I'm doing is grammatical, I thought I'd upgrade from regex to a proper parser (now I have three problems...). After quite a lot of research, I chose Boost.Spirit. I've been going through the documentation, but it's not taking me in the right direction. Can someone suggest how I might use this library to manipulate strings in the way I am looking for? Given that I am only trying to manipulate a string and am not interested in storing the parsed data, do I need to use karma for the output, or can I output the string with qi or x3 during the parsing process?
If I'm headed down the wrong path here, please feel free to re-direct me.
This seems too broad to answer.
What you're doing is parsing input, and transforming it to something else. What you're not doing is find/replace (otherwise you'd be fine using regular expressions).
Of course you can make Spirit do what regular expressions do, but I'm not sure it buys you anything:
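    #include <boost/spirit/include/qi.hpp>
    #include <boost/spirit/repository/include/qi_seek.hpp>
    #include <iostream>
    #include <iterator>
    #include <string>

    using namespace std::string_literals; // for the "..."s literals below

    template <typename It, typename Out>
    Out preprocess(It f, It l, Out out) {
        namespace qi = boost::spirit::qi;
        using boost::spirit::repository::qi::seek;

        // Copy everything that is not a special literal to the output unchanged.
        auto passthrough = [&out](boost::iterator_range<It> ignored, auto&&...) {
            for (auto ch : ignored) *out++ = ch;
        };

        // Wrap the contents of a "~..." literal in someObject.find("~...").
        auto transform = [&out](std::string const& literal, auto&&...) {
            for (auto ch : "someObject.find(\"~"s) *out++ = ch;
            for (auto ch : literal) *out++ = ch;
            for (auto ch : "\")"s) *out++ = ch;
        };

        // A double-quoted string literal that starts with '~'.
        auto pattern = qi::copy("\"~" >> (*~qi::char_('"')) >> '"');
        // Any run of input that does not start such a literal.
        qi::rule<It> ignore = qi::raw[+(!pattern >> qi::char_)] [passthrough];

        qi::parse(f, l, -qi::as_string[pattern][transform] % ignore);
        return out;
    }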
The nice thing about this way of writing it, is that it will work with any source iterator:
    int main() {
        // Transform in-memory strings, appending to a std::string.
        for (std::string const input : {
                R"(function foo(a, b) { var path = "~/1/2"; })",
            })
        {
            std::cout << "Input: " << input << "\n";
            std::string result;
            preprocess(begin(input), end(input), back_inserter(result));
            std::cout << "Result: " << result << "\n";
        }

        // Same function, now streaming stdin straight to stdout.
        std::cout << "\n -- Or directly transformed stdin to stdout:\n";
        preprocess(
            boost::spirit::istream_iterator(std::cin >> std::noskipws), {},
            std::ostreambuf_iterator<char>(std::cout));
    }
See it Live On Coliru, printing the output:
    Input: function foo(a, b) { var path = "~/1/2"; }
    Result: function foo(a, b) { var path = someObject.find("~/1/2"); }

     -- Or directly transformed stdin to stdout:
    function bar(c, d) { var path = someObject.find("~/1/42"); }
But this is very limited: it will not even do the right thing when such literals appear inside comments, multiline strings, and so on.
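To make that limitation concrete, here is a minimal hand-written sketch in plain C++ (deliberately without Spirit) of what merely skipping comments already takes. The function name `preprocess_skipping_comments` and its `object` parameter are hypothetical, introduced only for illustration; the parameter also shows one way the object name could be a variable, as the question asks. It assumes only `//` and `/* */` comments and double-quoted strings with backslash escapes, which is far from full JavaScript lexing.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper (not part of Boost or JUCE): copies `src` to the result,
// wrapping every double-quoted literal that starts with '~' in
// `<object>.find(...)`, while leaving // and /* */ comments untouched.
std::string preprocess_skipping_comments(std::string const& src,
                                         std::string const& object) {
    std::string out;
    std::size_t i = 0;
    std::size_t const n = src.size();

    while (i < n) {
        if (src.compare(i, 2, "//") == 0) {
            // Line comment: copy verbatim up to (not including) the newline.
            std::size_t end = src.find('\n', i);
            if (end == std::string::npos) end = n;
            out.append(src, i, end - i);
            i = end;
        } else if (src.compare(i, 2, "/*") == 0) {
            // Block comment: copy verbatim through the closing "*/".
            std::size_t end = src.find("*/", i + 2);
            end = (end == std::string::npos) ? n : end + 2;
            out.append(src, i, end - i);
            i = end;
        } else if (src[i] == '"') {
            // String literal: scan to the closing quote, skipping escapes.
            std::size_t end = i + 1;
            while (end < n && src[end] != '"')
                end += (src[end] == '\\' && end + 1 < n) ? 2 : 1;
            bool const closed = end < n;
            if (closed) ++end; // include the closing quote

            std::string const literal = src.substr(i, end - i);
            if (closed && literal.size() >= 3 && literal[1] == '~')
                out += object + ".find(" + literal + ")";
            else
                out += literal; // ordinary or unterminated literal
            i = end;
        } else {
            out += src[i++]; // any other character passes through
        }
    }
    return out;
}
```

Called with the object name `someObject` on `var a = "~/1/2"; // keep "~/3/4"`, this rewrites only the first literal and leaves the one inside the comment alone. It still knows nothing about single-quoted strings, template literals, or regex literals, so the list of special cases keeps growing.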
So instead you probably want a dedicated library that knows how to parse JavaScript, and use it to do your transformation, such as this one (one of the first hits when googling "tooling library preprocess javascript transform"): https://clojurescript.org/reference/javascript-library-preprocessing