Search code examples
llvmllvm-irllvm-c++-api

Traversal of LLVM Operands


Using a ModulePass, my goal is to traverse a SSA graph upwards: going from one Statement with 0..2 operands (most opcodes fall under that), I want to find out two things:

  1. Is an operand a metadata / constant (easy: just try casting to Constant-Type) or a variable?
  2. If it is a variable, get me the Statement where it is defined (as LLVM IR is in SSA form, this is a well-formed query), so I can recursively continue this traversal.

As an example, assume following LLVM IR:

define i32 @mul_add(i32 %x, i32 %y, i32 %z) {
entry:
  %tmp = mul i32 %x, %y
  %tmp2 = add i32 %tmp, %z
  ret i32 %tmp2
}

I will be starting with the return statement. Now I want to find out what I'm returning:

  1. I will get myself the operands of the return statement
  2. I detect that the operand is a variable called %tmp2
  3. I will get myself the operands of the statement where %tmp2 is defined
  4. I will first traverse the first operand, %tmp
  5. (...)
  6. %x is a function parameter and therefore probably the end of my traversal (or is it?)
  7. (... continue with the other branches in this DFS ...)

How would I use the C++ API to implement those steps?


Solution

  • The solution is simple and I will describe it as generic as possible.

    Each Operand of an llvm::Instruction in LLVM is of supertype llvm::Value. A subtype of Value is llvm::Instruction. This allows recursion via following methods:

    bool runOnInstruction(llvm::Instruction *instruction) {
      bool flag = false;
      for (auto operand = instruction->operands().begin();
                operand != instruction->operands().end(); ++operand) {
        printOperandStats(operand->get());
        flag = runOnOperand(operand->get()) | flag;
      }
      return flag;
    }
    
    bool runOnOperand(llvm::Value *operand) {
      operand->printAsOperand(errs(), true);
      // ... do something else ... 
      auto *instruction = dyn_cast<llvm::Instruction>(operand);
      if (nullptr != instruction) {
        return runOnInstruction(instruction);
      } else {
        return false;
      }
    }
    

    This equals a DFS as required by the question. Each Operand will be cast to an Instruction and will be recursively analyzed. The boolean return value is used as usual for LLVM Passes: a value of true describes a modification to the IR.