Search code examples
compiler-constructionllvm

How does LLVM translate three address LLVM IR `add` into X86 two address `add`?


EDIT:

I found the TwoAddressInstructionPass pass which handles this problem. Now I'm looking into the implementation.

The add of LLVM IR is a three address instruction:

%x = add i32 %y %z

But X86 add is two address:

add eax 2

It seems that the translation requires some kind of:

mov dst src1
add dst src2

So i'm curious about how LLVM does it. I looked into X86InstrArithmetic.td and found the definition of ArithBinOp_RF:

multiclass ArithBinOp_RF<bits<8> BaseOpc, bits<8> BaseOpc2, bits<8> BaseOpc4,
                         string mnemonic, Format RegMRM, Format MemMRM,
                         SDNode opnodeflag, SDNode opnode,
                         bit CommutableRR, bit ConvertibleToThreeAddress,
                         bit ConvertibleToThreeAddressRR> {
  let Defs = [EFLAGS] in {
    let Constraints = "$src1 = $dst" in {
      let isCommutable = CommutableRR in {
        let isConvertibleToThreeAddress = ConvertibleToThreeAddressRR in {
          def NAME#8rr  : BinOpRR_RF<BaseOpc, mnemonic, Xi8 , opnodeflag>;
          def NAME#16rr : BinOpRR_RF<BaseOpc, mnemonic, Xi16, opnodeflag>;
          def NAME#32rr : BinOpRR_RF<BaseOpc, mnemonic, Xi32, opnodeflag>;
          def NAME#64rr : BinOpRR_RF<BaseOpc, mnemonic, Xi64, opnodeflag>;
        } // isConvertibleToThreeAddress
      } // isCommutable

I suspect that Constraints is related to the translation:

  1. If that is the case, how does the Constraints work?
  2. If not, how and where does LLVM handle the translation (except the lea case) ? Does it insert mov? If that is the case, how does LLVM handle redundant movs?

Solution

  • There is a TwoAddressInstructionPass, which translates three-address mode instructions into two-address mode if some of MachineOperands in a MachineInstr are tied (TwoAddressInstructionPass.cpp):

    static bool isTwoAddrUse(MachineInstr &MI, unsigned Reg, unsigned &DstReg) {
      for (unsigned i = 0, NumOps = MI.getNumOperands(); i != NumOps; ++i) {
        const MachineOperand &MO = MI.getOperand(i);
        if (!MO.isReg() || !MO.isUse() || MO.getReg() != Reg)
          continue;
        unsigned ti;
        if (MI.isRegTiedToDefOperand(i, &ti)) {
          DstReg = MI.getOperand(ti).getReg();
          return true;
        }
      }
      return false;
    }
    

    "Tied" means two operands are mapped to the same register. When we write Constraints = "$src1 = $dst" in tablegen, generated code would set $src1 and $dst as "tied" (GlobalISel/Utils.cpp):

    bool llvm::constrainSelectedInstRegOperands(MachineInstr &I,
                                                const TargetInstrInfo &TII,
                                                const TargetRegisterInfo &TRI,
                                                const RegisterBankInfo &RBI) {
      assert(!isPreISelGenericOpcode(I.getOpcode()) &&
             "A selected instruction is expected");
      MachineBasicBlock &MBB = *I.getParent();
      MachineFunction &MF = *MBB.getParent();
      MachineRegisterInfo &MRI = MF.getRegInfo();
    
      for (unsigned OpI = 0, OpE = I.getNumExplicitOperands(); OpI != OpE; ++OpI) {
        MachineOperand &MO = I.getOperand(OpI);
    
        // There's nothing to be done on non-register operands.
        if (!MO.isReg())
          continue;
    
        LLVM_DEBUG(dbgs() << "Converting operand: " << MO << '\n');
        assert(MO.isReg() && "Unsupported non-reg operand");
    
        Register Reg = MO.getReg();
        // Physical registers don't need to be constrained.
        if (Register::isPhysicalRegister(Reg))
          continue;
    
        // Register operands with a value of 0 (e.g. predicate operands) don't need
        // to be constrained.
        if (Reg == 0)
          continue;
    
        // If the operand is a vreg, we should constrain its regclass, and only
        // insert COPYs if that's impossible.
        // constrainOperandRegClass does that for us.
        MO.setReg(constrainOperandRegClass(MF, TRI, MRI, TII, RBI, I, I.getDesc(),
                                           MO, OpI));
    
        // Tie uses to defs as indicated in MCInstrDesc if this hasn't already been
        // done.
        if (MO.isUse()) {
          int DefIdx = I.getDesc().getOperandConstraint(OpI, MCOI::TIED_TO);
          if (DefIdx != -1 && !I.isRegTiedToUseOperand(DefIdx))
            I.tieOperands(DefIdx, OpI);
        }
      }
      return true;
    }