I have a file with boolean expressions in the following format:
x0 x3+x4
x1+x2
these correspond to:
x0 AND x3 OR x4
x1 OR x2
Let's assume that we know the number of the expressions (N=2) and the number of x's (M=5). I want to have an array A of size N and an array Y of size M. The array A should contain in:
A[0] = y[0] && y[3] || y[4]
A[1] = y[1] || y[2]
and so on.
So far, I have implemented a python script that generates a .c file with the given array initialized with the expressions. But that means I need to recompile my program for each different instance. I want to avoid this. Any ideas are welcome.
Is there a "dirty" or quick trick that can do this? I would like to use purely C, if that's possible. Thanks in advance.
Here's the essentials of a solution in C. It assumes one-letter variable names, and one-letter operators. (For fun, I included a "not" operator and parentheses for subexpressions.) It processes strings of the form:
a&b|(~c&~a)
(where a..z correspond to OP's X and Y entities) and computes the boolean value of the expression.
This is implemented as a classic recursive descent parser; see Is there an alternative for flex/bison that is usable on 8-bit embedded systems? for more details on how to do this in general.
I coded it to point out just how straightforward this is for expressions. This is untested; OP gets to do his share of the work.
#define syntax_error -1 // result of parser if expression is malformed
bool variable_values[26]; // one for each letter a-z; initialized before parsing starts
char* expression; // pointer to the expression string being processed
int scan; // used to scan across the expression
#define reject_character() scan-- // used to back up scan when lexical error encountered
#define skip_blanks() { while (expression[scan]==' ') scan++; }
int boolean_primitive()
// returns result of a subexpression consisting of just variable names or constants
{ int subexpression_result;
skip_blanks();
switch (expression[scan++])
{ case 'a','b', ... 'z': // variable name
return variable_values[expression[scan-1]-"a"]; // look up value of variable
case '0': // constant for "false"
return 0;
case '1': // constant for "true"
return 1;
default:
return syntax_error;
}
}
int boolean_term()
// returns result of expression involving NOT or (...)
{ int subexpression_result;
skip_blanks();
switch (expression[scan++])
{ case '~': // not operator
subexpression_result=boolean_primitive();
if (subexpression_result==syntax_error)
return syntax_error;
return !subexpression_result;
case '(': // nested expression
subexpression_result=boolean_or_sequence();
if (subexpression_result==syntax_error)
return syntax_error;
skip_blanks();
if (expression[scan++]==')')
return subexpression_result;
else return syntax_error;
default:
reject_character();
return boolean_primitive();
}
}
int boolean_and_sequence()
// returns result of expression of form s1 & s2 & ...
{ int subexpression_result=boolean_term();
if (subexpression_result==syntax_error)
return syntax_error;
skip_blanks();
while (expression[scan++]=='&') // "and" operator?
{ int subexpression2_result=boolean_term();
if (subexpression2_result==syntax_error)
return syntax_error;
subexpression_result&=subexpression2_result;
skip_blanks();
}
reject_character; // undo overscan for '&'
return subexpression_result;
}
int boolean_or_sequence()
// returns result of expression of form of s1 | s2 | ...
{ int subexpression_result=boolean_and_sequence();
if (subexpression_result==syntax_error)
return syntax_error;
skip_blanks();
while (expression[scan++]=='|') // "or" operator?
{ int subexpression2_result=boolean_primitive();
if (subexpression2_result==syntax_error)
return syntax_error;
subexpression_result|=subexpression2_result;
skip_blanks();
}
reject_character; // undo overscan for '|'
return subexpression_result;
}
int calculate_boolean_expression(char* expression_to_evaluate)
// returns int==0 for boolean false;
// int==1 for boolean true
// int==-1 for malformed expression
{ int subexpression_result;
scan=1;
expression=expression_to_evaluate;
subexpression_result=boolean_or_sequence();
if (subexpression_result==syntax_error)
return syntax_error;
skip_blanks();
if (expression[scan]==0) // expression ends without excess junk in string?
return subexpression_result;
else return syntax_error;
}
After initializing the variable values, you invoke this as:
int the_answer=calculate_boolean_expression(&some_expression_string);
Obviously you want to check the answer to see if the parser found a syntax error.
You can avoid the global variables by passing them all as arguments, if you insist.
Returning/checking for syntax_error is awkward in C and clutters the implementation because of the repeated need to check for it as a result from an invoked sub-parser. Implementing this a throwable exception would be nicer but C doesn't allow that. You can fake it with a longjmp.
You'll have to expand it to handle your (multi-character) lexical complications, whatever they are. For quick and dirty, sticking to single-character operators/operands is pretty clean. The secret for handling multi-character lexemes is to test for each lexeme where you expect to encounter it, and simply backup the "scan" pointer to the beginning of a failed lexeme. I used the macro "reject" for this.