I'm using the LLVM IR C++ API to generate IR for my compiler. My question boils down to:
From a CreateLoad instruction, can I get the AllocaInst* it was loaded from so I can store the result of arithmetic instructions in that AllocaInst* without needing to retrieve it from a namedValues table?
My semantic analyzer and IR generator both implement the visitor pattern, where the visitor method is accept. Below, the calls to accept are for the IR generator and translate to a call to llvm::Value* ASTCodegenner::codegen(<subclass of AST>).
I've successfully implemented unary instructions so that my compiler can compile things like:
int a = 1;
int b = 3 + ++a; // b = 5, a = 2
Which translates roughly to (modified for brevity):
%a = alloca i32
%b = alloca i32
store i32 1, i32* %a              ; store 1 in %a
%a1 = load i32, i32* %a           ; load value from %a
%inctmp = add nsw i32 %a1, 1      ; add 1 (unary increment, a + 1)
store i32 %inctmp, i32* %a        ; store in %a (a = a + 1)
%addtmp = add nsw i32 3, %inctmp  ; use incremented value (prefix unary operator, ++a)
store i32 %addtmp, i32* %b        ; store result of 3 + ++a in %b
The above is also equivalent to clang's IR representation of the same code in C.
Unary expressions are parsed into a UnaryExprAST, which receives an operand property of type AST (the base class for all AST nodes). My reasoning is that statements like ++1 should be valid in syntactic analysis but not in semantic analysis (UnaryExprAST.operand should be able to store a VariableAST, NumberAST, etc.).
The solution I have now is an ugly one involving a dynamic_cast from AST down to VariableAST so I can retrieve its AllocaInst* from the namedValues table. Hence my curiosity about whether there is a way to retrieve it from the load instruction instead:
llvm::Value* ASTCodegenner::codegen(UnaryExprAST* ast)
{
// codegen operand. if it's a VariableAST, this returns a load instruction
// (see below for actual method)
llvm::Value* target = ast->operand->accept(*this);
// retrieve AllocaInst* from namedValues table
std::unique_ptr<VariableAST> operand = std::unique_ptr<VariableAST>(dynamic_cast<VariableAST*>(ast->operand->clone()));
llvm::AllocaInst* targetAlloca = namedValues[operand->id];
// this method just returns the result of the unary operation, e.g.
// target+1 or target-1, depending on the unary operator
llvm::Value* res = applyUnaryOperation(target, ast->op);
// store incremented value
builder->CreateStore(res, targetAlloca);
// if prefix unary, return inc/dec value; otherwise, return original value
// before inc/dec
return ast->isPrefix() ? res : target;
}
llvm::Value* ASTCodegenner::codegen(VariableAST* ast)
{
llvm::AllocaInst* val = namedValues[ast->id];
return builder->CreateLoad(val->getAllocatedType(), val, ast->id);
}
I thought about builder->CreateStore(res, target); instead of builder->CreateStore(res, targetAlloca); but that would violate SSA as target is assigned the load operation.
A VariableAST has a ctx property which is a member of an enum:
enum class VarCtx
{
eReference, // referencing a variable (3 * a * 20)
eStore, // storing new value in a variable (a = ...)
eAlloc, // allocating a variable (int a = ...)
eParam, // function parameter (func add(int a, int b))
};
During my semantic analysis phase (or even in the constructor of UnaryExprAST), I could dynamic_cast the UnaryExprAST.operand to VariableAST, check for null, and then fill the ctx with VarCtx::eStore. I could then modify the IR generation of VariableAST to return the AllocaInst* if its ctx is VarCtx::eStore.
Cast the result of IR generation on the operand (Value*) down to LoadInst.
llvm::LoadInst* target = static_cast<llvm::LoadInst*>(ast->operand->accept(*this));
llvm::Value* targetAlloca = target->getPointerOperand();
This works fine and should be OK with a cast from Value* to LoadInst*, as unary operations should only be done on something that needs to be loaded with CreateLoad anyway (correct me if I'm wrong).
Leave the dynamic_cast in the IR generation stage and rely completely on my semantic analyzer to let the right values through. I'm not entirely thrilled with that solution: what if I want to be able to define a unary operation for something other than a variable? It seems like a hacky solution that I will have to fix later.
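To make the first option (returning the address when ctx is VarCtx::eStore) more concrete, here is a rough, LLVM-free sketch. Everything in it is a hypothetical stand-in rather than real LLVM or the real AST classes: Slot plays the role of an AllocaInst, Temp plays the role of an SSA value such as a load or add result, and VariableNode and ToyCodegen are trimmed-down illustrations. The point is only the control flow: when the operand is marked eStore, codegen hands back the slot, and the unary codegen performs the load itself.

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <string>
#include <vector>

// Toy stand-ins, NOT LLVM classes: Slot plays the role of an AllocaInst,
// Temp the role of an SSA value (the result of a load or an add).
struct Value { virtual ~Value() = default; };
struct Slot : Value { int stored = 0; };
struct Temp : Value { int v; explicit Temp(int x) : v(x) {} };

enum class VarCtx { eReference, eStore };

// Hypothetical, trimmed-down variable node.
struct VariableNode {
    std::string id;
    VarCtx ctx;
};

struct ToyCodegen {
    std::map<std::string, Slot> namedValues;
    std::vector<std::unique_ptr<Temp>> temps;  // owns every temporary

    Temp* makeTemp(int v) {
        temps.push_back(std::make_unique<Temp>(v));
        return temps.back().get();
    }

    // eStore: hand back the slot itself (the address).
    // eReference: hand back a freshly loaded value.
    Value* codegen(const VariableNode& var) {
        auto it = namedValues.find(var.id);
        assert(it != namedValues.end() && "semantic analysis should reject unknown names");
        if (var.ctx == VarCtx::eStore) return &it->second;
        return makeTemp(it->second.stored);
    }

    // ++x / x++: the operand is expected to be marked eStore, so codegen
    // yields the slot and the "load" happens here instead.
    Value* codegenIncrement(const VariableNode& operand, bool isPrefix) {
        Slot* slot = dynamic_cast<Slot*>(codegen(operand));
        if (slot == nullptr) return nullptr;    // operand did not yield an address
        Temp* before = makeTemp(slot->stored);  // "load"
        Temp* after = makeTemp(before->v + 1);  // "add"
        slot->stored = after->v;                // "store"
        return isPrefix ? after : before;
    }
};
```

With a slot for a holding 1, codegenIncrement on an eStore-marked node returns a temporary holding 2 for the prefix form and leaves 2 in the slot; an operand left as eReference yields nullptr, which the caller could turn into a diagnostic.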
Maybe I'm going about the IR generation completely wrong? Or maybe it's an XY problem and there's something wrong with my class architecture? I appreciate any insight!
From a CreateLoad instruction, can I get the AllocaInst* it was loaded from so I can store the result of arithmetic instructions in that AllocaInst* without needing to retrieve it from a namedValues table?
IRBuilder::CreateLoad() always returns a LoadInst *, which has a getPointerOperand() method that will return the same Value * that you created the load with, whether it's an alloca or not. If you're loading something simple like a cast of an alloca, you could use V->stripPointerCasts() (note that there is a family of ~8 strip... functions; pick the right one for your purpose). If the load was created as loading something other than an alloca, then no, the load doesn't know how to find which underlying alloca it's really loading. In general, that requires solving pointer analysis (a.k.a. alias analysis), which is a very hard problem.
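As a rough illustration of that lookup, here is a small LLVM-free model. The classes below are hypothetical stand-ins that only mirror the shape of the real ones, and allocaBehind and stripCasts are made-up helper names. In actual LLVM code the equivalent steps would presumably be llvm::dyn_cast<llvm::LoadInst>(value), then getPointerOperand(), optionally stripPointerCasts(), and finally llvm::dyn_cast<llvm::AllocaInst>(...), each of which yields null when the value is not of the requested kind.

```cpp
#include <cassert>

// Toy stand-ins that mirror the shape of the LLVM classes; NOT LLVM itself.
struct Value { virtual ~Value() = default; };
struct Alloca : Value {};
struct Opaque : Value {};  // any other pointer: an argument, a call result, ...
struct Cast : Value {
    Value* source;
    explicit Cast(Value* s) : source(s) {}
};
struct Load : Value {
    Value* pointer;
    explicit Load(Value* p) : pointer(p) {}
    Value* getPointerOperand() const { return pointer; }
};

// Models the idea behind stripPointerCasts(): peel off a chain of casts.
Value* stripCasts(Value* v) {
    while (auto* c = dynamic_cast<Cast*>(v)) v = c->source;
    return v;
}

// Returns the alloca that `v` loads from (looking through casts), or
// nullptr if `v` is not a load or its pointer is not an alloca.
Alloca* allocaBehind(Value* v) {
    assert(v != nullptr);
    auto* load = dynamic_cast<Load*>(v);  // checked, unlike a static_cast
    if (load == nullptr) return nullptr;
    return dynamic_cast<Alloca*>(stripCasts(load->getPointerOperand()));
}
```

The checked cast matters for operands like ++1: a constant is not a load, so the helper returns nullptr instead of invoking undefined behaviour the way a static_cast to LoadInst* would.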