Tags: compiler-construction, llvm, llvm-ir

How labels are expressed in LLVM IR code


Sometimes a label identifier in LLVM IR starts with a semicolon ';', such as ; <label>:6. However, as far as I know, the semicolon introduces a comment. So how can LLVM retrieve the label information from a comment? Am I missing something?
Here is a simple test.
The C source file:

#include <stdio.h>

int main()
{
 int a;
 scanf("%d", &a);
 if ( a > 3)
  a *= 2;
 return 0;
}

The LLVM IR code generated by http://llvm.org/demo/index.cgi (same as clang -c -emit-llvm main.c) is as follows:

; ModuleID = '/tmp/webcompile/_13654_0.bc'

@.str = private unnamed_addr constant [3 x i8] c"%d\00", align 1

define i32 @main() nounwind uwtable {
  %a = alloca i32, align 4
  %1 = call i32 (i8*, ...)* @__isoc99_scanf(i8* getelementptr inbounds ([3 x i8]* @.str, i64 0, i64 0), i32* %a) nounwind
  %2 = load i32* %a, align 4, !tbaa !0
  %3 = icmp sgt i32 %2, 3
  br i1 %3, label %4, label %6

; <label>:4                                       ; preds = %0
  %5 = shl nsw i32 %2, 1
  store i32 %5, i32* %a, align 4, !tbaa !0
  br label %6

; <label>:6                                       ; preds = %4, %0
  ret i32 0
}

Solution

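  • In LLVM IR, a block does not need an explicit label. The same is true of instructions that produce a value, which is why the values here are called %1, %2 and %3. LLVM assigns numbers to unnamed instructions and blocks in increasing order, and the unlabelled entry block implicitly takes %0 (hence preds = %0). The br i1 %3... terminates the first block, and the last number used is 3, so the next block is labelled 4. That block ends with the next br instruction, and the last number used is 5 (store and br produce no value, so they take no number), so the next and final block is labelled 6. The ; <label>:4 lines really are comments: they are printed for the human reader, and the parser ignores them and arrives at the same numbers by counting. At first it might seem weird that blocks and instructions share the same namespace, but remember that blocks are values too.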
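
The counting described above can be reproduced with a small simulation. This is only a rough sketch of the numbering rule, not LLVM's actual implementation: the Inst and Block classes, the number_function helper and the MAIN_BLOCKS data are hypothetical names made up for illustration, and the model assumes a single counter shared by unnamed blocks and unnamed value-producing instructions.

```python
from dataclasses import dataclass


@dataclass
class Inst:
    text: str              # opcode, kept only for readability
    name: str = ""         # explicit name such as "a"; empty means unnamed
    is_void: bool = False  # store/br/ret produce no value and take no number


@dataclass
class Block:
    insts: list
    name: str = ""         # explicit label; empty means unnamed


def number_function(blocks, unnamed_args=0):
    """Assign identifiers with one shared counter (simplified model)."""
    counter = unnamed_args  # unnamed arguments would be numbered first
    block_ids, inst_ids = [], []
    for block in blocks:
        if block.name:
            block_ids.append(block.name)
        else:
            block_ids.append(str(counter))
            counter += 1
        for inst in block.insts:
            if inst.is_void:
                continue  # no result, so no identifier is consumed
            if inst.name:
                inst_ids.append(inst.name)
            else:
                inst_ids.append(str(counter))
                counter += 1
    return block_ids, inst_ids


# Hand-written model of the @main function from the question.
MAIN_BLOCKS = [
    Block([Inst("alloca", name="a"), Inst("call"), Inst("load"),
           Inst("icmp"), Inst("br", is_void=True)]),
    Block([Inst("shl"), Inst("store", is_void=True),
           Inst("br", is_void=True)]),
    Block([Inst("ret", is_void=True)]),
]

block_ids, inst_ids = number_function(MAIN_BLOCKS)
print("blocks:", ["%" + b for b in block_ids])        # %0, %4, %6
print("instructions:", ["%" + i for i in inst_ids])   # %a, %1, %2, %3, %5
```

Giving a block an explicit name in this model (for example Block([...], name="then")) means it no longer consumes a number, so every later unnamed value shifts down by one.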