I am writing a toy compiler, which compile a c/c++ like language to c++. I am using bison, but in this structure is hard to handle when variable became out of scope.
In the source lanugage in the whole main function there can be only one variable with the same name, it is good, but there is a problem what I cannot solve.
I cannot make c++ code like this, because c++ compiler throw semantical error: 'var' was not declared in this scope.
int main()
{
if (true)
{
int var = 4;
}
if (true)
{
var = 5;
}
}
The source language has while, if and if/else statements. I have to throw semantical error if a declared variable is assinged in out of scope. For example:
This should be semantical error, so I cannot generetad this code:
int main()
{
while(0)
{
int var = 1;
}
if (1)
{
var = 2;
}
}
This also have to be semantical error:
int main()
{
if (0)
{
int var = 1;
}
else
{
if (1)
{
var = 5;
}
}
}
And this is allowed, I can generate this code:
int main()
{
if (0)
{
}
else
{
int var = 1;
if (1)
{
while (0)
{
while (0)
{
var = 2;
}
}
}
}
}
I tried lot of things, but I cannot solve when there is nested if, if/else or while.
I read tons of tutorials about symbol table, but none of them can explain properly how to manage a variable if it is become out of scope.
If you familiar with this topic and with bison, please do not just give me hints, like "use stack, and mark a variable if it become out of scope". I found lot of article about it. Instead of please give me pseudocode or concrate implementation sketch.
I think it cannot be so much difficult, because in the whole main function there can be one variable with the same name as I wrote.
Symbol table:
struct SymbolType
{
int lineNumber;
std::string identifier;
int identifierValue;
Type type;
int functionArgumentNumber;
Type functionReturnType;
Type markType;
bool outOfScope;
};
class Symbol
{
public:
void AddVariable(int _lineNumber, std::string _identifier, int _identifierValue, Type _type, int _functionArgumentNumber, Type _functionReturnType, Type _markType, bool _outOfScope);
void AddMarker(int _lineNumber, std::string _scopeName, Type _markType);
bool FindVariable(std::string _identifier);
int FindVariableValue(std::string _identifier);
void Remove();
void Print();
std::vector<SymbolType> symbolTable;
private:
int lineNumber;
std::string identifier;
int identifierValue;
Type type;
int functionArgumentNumber;
Type functionReturnType;
Type markType;
bool outOfScope;
};
Now let's assume the following: While you are in a nested scope you cannot add a variable to a parent scope. So we can work e.g. with a stack like structure (push/pop at the end only suffices, but with read access to all entries – the latter requirement disqualifying std::stack
, so we'd operate e.g. on std::vector
instead).
To be able to do 3. you have (at least) two options:
This processing makes your outOfScope
member obsolete.
Your addVariable
function takes too many parameters, by the way – why should a variable need a return type, for instance???
I recommend adding multiple functions for every specific semantic type your language might provide (addVariable
, addFunction
, ...). Each function accepts what actually is necessary to be configurable and sets the rest to appropriate defaults (e.g. Type
to Function
within addFunction
).
Edit – processing the example from your comment:
while(condition)
{ // opens a new scope:
// either place a sentinel or increment the counter (scope depth)
int var = 1; // push an entry onto the stack
// (with current counter value, if not using sentinels)
if (true)
{ // again a new scope, see above
var = 2; // finds 'var' on the stack, so fine
} // closes a scope:
// either you find immediately the sentinel, pop it from the stack
// and are done
//
// or you discover 'var' having a counter value less than current
// counter so you are done as well
} // closing next scope:
// either you find 'var', pop it from the stack, then the sentinel,
// pop it as well and are done
//
// or you discover 'var' having same counter value as current one,
// so pop it, then next variable has lower counter value again or the
// stack is empty, thus you decrement the counter and are done again