
How to distinguish between non-polymorphic and polymorphic classes in an LLVM pass?


I have a question about determining whether a StructType is a polymorphic class in an LLVM pass.

I think that in Clang, distinguishing between non-polymorphic and polymorphic classes is easy.

However, I don't know how to do this in an LLVM pass.

I also searched the links below, but I couldn't find any useful functions.

Could you tell me how to determine whether a StructType is a polymorphic class?

For example, in the LLVM pass:

    Type *AI
    .........
    StructType *STy = dyn_cast<StructType>(AI);
    // (question) How do I check whether STy is a polymorphic class?
    .........

Solution

  • TL;DR: You can't. LLVM has no knowledge of classes. Clang lowers them to struct types, at which point they are not really different from a C struct.

    You may be able to pattern match the fact that it has a vtable:

    struct MyClass {
        virtual void foo() {}
    };
    void bar(MyClass &C) { C.foo(); }
    

    The IR contains: %struct.MyClass = type { i32 (...)** }
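
    The first-field pattern above can be sketched roughly as follows. This is only an illustration that works on the textual form of the type, not on the LLVM API: the helper name firstFieldLooksLikeVptr is hypothetical, and it assumes typed-pointer IR as shown here. With opaque pointers (the default since LLVM 15) the field prints as plain ptr, so this check would no longer tell you anything.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper, not an LLVM API: returns true when the first field of a
// textual IR struct definition is spelled like Clang's typed-pointer vptr slot.
bool firstFieldLooksLikeVptr(const std::string &typeDef) {
    static const std::string kVptr = "i32 (...)**";
    std::size_t pos = typeDef.find('{');
    if (pos == std::string::npos)
        return false;  // opaque type, or not a struct definition at all
    ++pos;
    while (pos < typeDef.size() && typeDef[pos] == ' ')
        ++pos;  // skip padding after the opening brace
    if (typeDef.compare(pos, kVptr.size(), kVptr) != 0)
        return false;
    // The match must end the field, so "i32 (...)***" is rejected.
    std::size_t end = pos + kVptr.size();
    while (end < typeDef.size() && typeDef[end] == ' ')
        ++end;
    return end < typeDef.size() && (typeDef[end] == ',' || typeDef[end] == '}');
}
```

    A derived class typically nests its base struct as the first field instead, so a real check would have to recurse into that field, and even then a match is only a hint.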

    Note that since this file does not instantiate MyClass, the vtable is not emitted, so you can't inspect it. If the source is changed this way:

    struct MyClass {
        virtual void foo() {}
    };
    MyClass C;
    void bar() { C.foo(); }
    

    Now you get a vtable:

    %struct.MyClass = type { i32 (...)** }
    
    @C = global %struct.MyClass zeroinitializer, align 8
    @_ZTV7MyClass = linkonce_odr unnamed_addr constant [3 x i8*] [i8* null, i8* bitcast ({ i8*, i8* }* @_ZTI7MyClass to i8*), i8* bitcast (void (%struct.MyClass*)* @_ZN7MyClass3fooEv to i8*)], align 8
    @_ZTVN10__cxxabiv117__class_type_infoE = external global i8*
    @_ZTS7MyClass = linkonce_odr constant [9 x i8] c"7MyClass\00"
    @_ZTI7MyClass = linkonce_odr constant { i8*, i8* } { i8* bitcast (i8** getelementptr inbounds (i8*, i8** @_ZTVN10__cxxabiv117__class_type_infoE, i64 2) to i8*), i8* getelementptr inbounds ([9 x i8], [9 x i8]* @_ZTS7MyClass, i32 0, i32 0) }
    @llvm.global_ctors = appending global [1 x { i32, void ()*, i8* }] [{ i32, void ()*, i8* } { i32 65535, void ()* @_GLOBAL__sub_I_class.cpp, i8* null }]
    

    And later a constructor that initializes it:

    define linkonce_odr void @_ZN7MyClassC2Ev(%struct.MyClass*) unnamed_addr #1 align 2 {
      %2 = alloca %struct.MyClass*, align 8
      store %struct.MyClass* %0, %struct.MyClass** %2, align 8
      %3 = load %struct.MyClass*, %struct.MyClass** %2, align 8
      %4 = bitcast %struct.MyClass* %3 to i32 (...)***
      store i32 (...)** bitcast (i8** getelementptr inbounds ([3 x i8*], [3 x i8*]* @_ZTV7MyClass, i64 0, i64 2) to i32 (...)**), i32 (...)*** %4, align 8
      ret void
    }

    However, with optimizations enabled, this all goes away, leaving:

    @C = global %struct.MyClass { i32 (...)** bitcast (i8** getelementptr inbounds ([3 x i8*], [3 x i8*]* @_ZTV7MyClass, i64 0, i64 2) to i32 (...)**) }, align 8
    @_ZTV7MyClass = linkonce_odr unnamed_addr constant [3 x i8*] [i8* null, i8* bitcast ({ i8*, i8* }* @_ZTI7MyClass to i8*), i8* bitcast (void (%struct.MyClass*)* @_ZN7MyClass3fooEv to i8*)], align 8
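
    In both outputs the vtable is a global named after the class, so one way to look for it is to derive the expected symbol from the struct name. The sketch below is only that: itaniumVtableSymbol is a hypothetical helper, it assumes the Itanium mangling seen here (_ZTV, then the length of the name, then the name), and it only handles plain classes outside any namespace.

```cpp
#include <initializer_list>
#include <string>

// Hypothetical helper, not an LLVM API: maps an IR struct name such as
// "struct.MyClass" to the Itanium-ABI vtable symbol "_ZTV7MyClass".
// Assumes a plain, non-nested, non-template class outside any namespace;
// returns an empty string for anything it does not understand.
std::string itaniumVtableSymbol(const std::string &irStructName) {
    std::string name = irStructName;
    // Clang prefixes the IR type name with the tag keyword used in the source.
    for (const char *prefix : {"struct.", "class."}) {
        const std::string p(prefix);
        if (name.compare(0, p.size(), p) == 0) {
            name.erase(0, p.size());
            break;
        }
    }
    // Reject namespaces, templates, and uniquing suffixes such as ".0".
    if (name.empty() || name.find_first_of(".:<") != std::string::npos)
        return "";
    return "_ZTV" + std::to_string(name.size()) + name;
}
```

    In a pass, the result could be handed to Module::getNamedGlobal. A null result does not prove the class is non-polymorphic, since, as noted above, the vtable may simply not be emitted in this module.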
    

    Note also that this is ABI-specific (it won't look the same on Windows and Linux, for instance).
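
    To make the ABI point concrete, here is a small sketch that classifies a global's name by the vtable prefix it carries. The names VtableAbi and classifyVtableSymbol are hypothetical, and the sketch assumes only the two manglings commonly seen: _ZTV for the Itanium ABI (as in the IR above) and ??_7 for the Microsoft ABI's vftable symbols.

```cpp
#include <string>

// Hypothetical classification, for illustration only.
enum class VtableAbi { None, Itanium, Microsoft };

// Returns which ABI's vtable mangling the symbol name starts with, if any.
VtableAbi classifyVtableSymbol(const std::string &symbol) {
    // rfind(prefix, 0) == 0 is a starts-with test that works before C++20.
    if (symbol.rfind("_ZTV", 0) == 0)
        return VtableAbi::Itanium;
    if (symbol.rfind("??_7", 0) == 0)
        return VtableAbi::Microsoft;
    return VtableAbi::None;
}
```

    A prefix match like this only identifies globals that look like vtables; tying one back to a particular StructType still depends on the name-based guesswork described above.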