For example :
waldo:=fern+alpha/-beta^gamma;
The above arithmetical expression may be abstracted by this BNF(there may be some difference from standard BNF,but lets ignore it for now):
AEXP = AS $AS ;
AS = .ID ':=' EX1 ';' ;
EX1 = EX2 $( '+' EX2 / '-' EX2 ) ;
EX2 = EX3 $( '*' EX3 / '/' EX3 ) ;
EX3 = EX4 $( '^' EX3 ) ;
EX4 = '+' EX5 / '-' EX5 / EX5 ;
EX5 = .ID / .NUMBER / '(' EX1 ')' ;
.END
But the EX1~EX5
abstraction is not so intuitive to me.(I don't quite understand how they are crafted in the first place)
Is there any steps to follow when normalizing such expressions?
You can translate this notation to EBNF directly.
Naming categories EX1 through EX5 is not an uncommon way of specifying operator precedence. In fact it is a good one, IMHO, especially in some languages that have 15 or more precedence levels, like C and C++ do. :)
You can rename them to expression, term, factor, primary, etc. (or whatever terms make sense to you).
ADDENDUM
If you need a translation of the above into more traditional EBNF, here is how I would do it:
AEXP => AS+
AS => id ':=' EX1 ';'
EX1 => EX2 (('+' | '-') EX2)*
EX2 => EX3 (('*' | '/') EX3)*
EX3 => EX4 ('^' EX3)*
EX4 => ('+'|'-')? EX5
EX5 => id | number | '(' EX1 ')'
I use '*' for zero or more, '+' for one or more, and '?' for optional. It is pretty cool how operator precedence is handled here, I think.
ADDENDUM 2:
Please note: It appears that the rule for EX3 is wrong. The way it stands now you can get parse trees like this
EX3
|
+---+----+----+----+---------+
| | | | | | |
EX4 ^ EX3 ^ EX3 ^ EX3
/ | \ / | \
EX4 ^ EX3 EX4 ^ EX3
So writing a^b^c^d^e^f
could mean a^(b^c)^d^(e^f)
. But in fact there are other ways to make this tree. The grammar is ambiguous.
It appears the designer of the grammar wanted to make the ^
operator right-associative. But to do so, the rule should have been
EX3 => EX4 ('^' EX3)?
Now the grammar is no longer ambiguous. Look how the derivation of a^b^c^d^e^f
MUST now proceed:
EX3
/ | \
EX4 ^ EX3
/ | \
EX4 ^ EX3
/ | \
EX4 ^ EX3
/ | \
EX4 ^ EX3
/ | \
EX4 ^ EX3
Now a^b^c^d^e^f
can ONLY parse as a^(b^(c^(d^(e^f))))
An alternative is to rewrite the rule as EX3 => EX4 ('^' EX4)*
and have a side rule saying "OBTW the caret is right associative."