clang, libclang

How to get the AST parent of attribute CXCursor


How can I get the parent cursor that is passed to the clang_visitChildren callback?

For instance, consider the following code to be parsed:

__attribute__((visibility("default"))) void func();

If I traverse recursively, I can locate the attribute cursor (visibility), and the parent passed to the visitor is the function declaration. But when I later try to get the parent of this cursor with clang_getCursorSemanticParent or clang_getCursorLexicalParent, both return a cursor of kind CXCursor_InvalidFile.


Solution

  • That is because attributes are "attached" to other entities.

    I suggest running clang with -Xclang -ast-dump -fsyntax-only, as it clearly shows the layout of the AST. For the code above, the output looks something like this.

    |-FunctionDecl 0x1570f22d8 <example_enum.hpp:6:1, col:50> col:45 func 'void ()'
    | `-VisibilityAttr 0x1570f2380 <col:16, col:36> Default
    

    Thus, attributes for a function can be seen by visiting the children of the function itself.

    I currently use this function for a project I'm working on, but there are likely MUCH better ways, since this is the first time I've used libclang directly.

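    #include <cassert>
    #include <string>
    #include <string_view>
    #include <vector>

    #include <clang-c/Index.h>

    // Collects the spellings of selected attribute cursors that are direct
    // children of `cursor` (for example, a function declaration).
    void
    collect_attributes(CXCursor cursor, std::vector<std::string> & attributes)
    {
        auto visitor = [](CXCursor cursor, CXCursor, CXClientData data) {
            assert(data);
            // CXClientData is void *, so static_cast is sufficient.
            auto & attributes = *static_cast<std::vector<std::string> *>(data);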
            switch (clang_getCursorKind(cursor)) {
            case CXCursor_AnnotateAttr:
            case CXCursor_PackedAttr:
            case CXCursor_CXXFinalAttr:
            case CXCursor_CXXOverrideAttr:
            case CXCursor_VisibilityAttr:
            case CXCursor_FlagEnum:
            case CXCursor_WarnUnusedAttr:
            case CXCursor_WarnUnusedResultAttr:
            case CXCursor_AlignedAttr: {
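                // `spelling` is the author's own helper (not shown); it
                // presumably wraps clang_getCursorSpelling and
                // clang_getCursorKindSpelling and returns a std::string.
                auto s = spelling(cursor);
                std::string_view v = s;
                if (s.empty()) {
                    // Fall back to the kind spelling, e.g. "attribute(visibility)".
                    s = spelling(clang_getCursorKind(cursor));
                    v = s;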
                    if (v.starts_with("attribute(")) {
                        v.remove_prefix(10);
                        v.remove_suffix(1);
                    }
                }
                attributes.emplace_back(v.data(), v.size());
            } break;
            default:
                break;
            }
            return CXChildVisit_Continue;
        };
        clang_visitChildren(cursor, visitor, &attributes);
    }
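
The fallback branch above trims a kind spelling such as "attribute(visibility)" down to "visibility". As a rough sketch, that trimming can be pulled out into a standalone function that needs only the standard library. The name strip_attribute_wrapper is made up for illustration, and unlike the code above it also checks for the closing parenthesis before removing it.

```cpp
#include <string>
#include <string_view>

// Hypothetical helper (not part of libclang or the answer's code): turns a
// kind spelling such as "attribute(visibility)" into "visibility".
// Strings that do not have the "attribute(...)" wrapper come back unchanged.
std::string strip_attribute_wrapper(std::string_view v)
{
    constexpr std::string_view prefix = "attribute(";
    const bool wrapped = v.size() > prefix.size()
        && v.substr(0, prefix.size()) == prefix
        && v.back() == ')';
    if (wrapped) {
        v.remove_prefix(prefix.size()); // drop "attribute("
        v.remove_suffix(1);             // drop the trailing ")"
    }
    return std::string(v);
}
```

Keeping this separate from the visitor makes the string handling easy to check without a translation unit at hand.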
    

    EDIT - EDIT - EDIT

    I have given up trying to get better results from just libclang alone. The information just isn't available.

    You can get most of what you want from the C++ API, and here are some diffs I made to the libclang source itself to do what I want. If you use just the C++ API, it should be obvious how to get that information from these diffs.

    Obviously, adding these changes to libclang as-is would break anyone who depends on the current behavior, though almost all of them provide information where there was none before.

    I am NOT suggesting you do this.

    I'm just showing what I did during experimentation to see what information was available for attributes.

    This change is needed because normalizeName will just crash on some things (one being Asm Label attributes).

    diff --git a/clang/lib/Basic/Attributes.cpp b/clang/lib/Basic/Attributes.cpp
    index da339d5b1bab..25d646e7fafe 100644
    --- a/clang/lib/Basic/Attributes.cpp
    +++ b/clang/lib/Basic/Attributes.cpp
    @@ -110,8 +110,11 @@ bool AttributeCommonInfo::isClangScope() const {
     static SmallString<64> normalizeName(const IdentifierInfo *Name,
                                          const IdentifierInfo *Scope,
                                          AttributeCommonInfo::Syntax SyntaxUsed) {
    -  StringRef ScopeName = normalizeAttrScopeName(Scope, SyntaxUsed);
    -  StringRef AttrName = normalizeAttrName(Name, ScopeName, SyntaxUsed);
    +  StringRef ScopeName, AttrName;
    +  if (Scope)
    +    ScopeName = normalizeAttrScopeName(Scope, SyntaxUsed);
    +  if (Name)
    +    AttrName = normalizeAttrName(Name, ScopeName, SyntaxUsed);
    
       SmallString<64> FullName = ScopeName;
       if (!ScopeName.empty()) {
    

    This one makes clang_getCursorSpelling return the attribute's spelling for attribute cursors, instead of just an empty string.

    diff --git a/clang/tools/libclang/CIndex.cpp b/clang/tools/libclang/CIndex.cpp
    index 30416e46ce17..8fdfec4d7e74 100644
    --- a/clang/tools/libclang/CIndex.cpp
    +++ b/clang/tools/libclang/CIndex.cpp
    @@ -5067,6 +5067,11 @@ CXString clang_getCursorSpelling(CXCursor C) {
         llvm_unreachable("unknown visibility type");
       }
    
    +  if (clang_isAttribute(C.kind)) {
    +    const Attr *A = cxcursor::getCursorAttr(C);
    +    return cxstring::createDup(A->getSpelling());
    +  }
    +
       return cxstring::createEmpty();
     }
    

    This one provides an implementation for clang_getCursorPrettyPrinted for attributes.

    diff --git a/clang/tools/libclang/CIndex.cpp b/clang/tools/libclang/CIndex.cpp
    index 30416e46ce17..8fdfec4d7e74 100644
    --- a/clang/tools/libclang/CIndex.cpp
    +++ b/clang/tools/libclang/CIndex.cpp
    @@ -5390,12 +5395,45 @@ CXString clang_getCursorPrettyPrinted(CXCursor C, CXPrintingPolicy cxPolicy) {
                                 : getCursorContext(C).getPrintingPolicy());
    
         return cxstring::createDup(OS.str());
    +  } else if (clang_isAttribute(C.kind)) {
    +    if (const Attr *A = getCursorAttr(C)) {
    +      SmallString<128> Str;
    +      llvm::raw_svector_ostream OS(Str);
    +      PrintingPolicy *UserPolicy = static_cast<PrintingPolicy *>(cxPolicy);
    +      A->printPretty(OS, *UserPolicy);
    +      StringRef SR = OS.str();
    +      return cxstring::createDup(
    +          SR.drop_front(std::min(SR.find_first_not_of(' '), SR.size())).str());
    +    }
       }
    
       return cxstring::createEmpty();
     }
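
The drop_front(std::min(...)) expression in the diff above strips leading spaces from the printed attribute, presumably because the printed text starts with a space. If you cannot patch libclang, the same trimming can be applied to any string you already have. This is only a sketch using std::string_view, and drop_leading_spaces is a made-up name.

```cpp
#include <algorithm>
#include <string>
#include <string_view>

// Hypothetical helper mirroring the drop_front(...) expression in the diff:
// removes leading spaces and leaves the rest of the text untouched.
std::string drop_leading_spaces(std::string_view s)
{
    // find_first_not_of returns npos for an empty or all-space string;
    // std::min clamps that to s.size() so remove_prefix stays in range.
    s.remove_prefix(std::min(s.find_first_not_of(' '), s.size()));
    return std::string(s);
}
```

The std::min clamp is the important detail: without it, an all-space input would ask for more characters to be removed than the string holds.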
    

    It is followed immediately by this change to clang_getCursorDisplayName, which returns a bit more information for attributes (the previous implementation just returned clang_getCursorSpelling).

     CXString clang_getCursorDisplayName(CXCursor C) {
    +  if (clang_isAttribute(C.kind)) {
    +    CXString S = clang_getCursorSpelling(C);
    +    // I know this is wrong, but there are way too many string APIs,
    +    // and simple things like checking if a string is empty or comparing
    +    // two strings is just not obvious at all.
    +    auto CS = static_cast<const char *>(S.data);
    +    if (C.kind == CXCursor_AsmLabelAttr) {
    +      return cxstring::createDup(StringRef(std::string("asm(\"") + CS + "\")"));
    +    }
    +    if (const Attr *A = getCursorAttr(C)) {
    +      std::string NS = A->getNormalizedFullName();
    +      std::string_view NV = NS;
    +      if (auto n = NV.find_last_of(':'); n < NV.size()) {
    +        NV = NV.substr(n + 1);
    +      }
    +      if (CS && NV != CS && !NV.empty()) {
    +          NS = NS + '(' + CS + ')';
    +      }
    +      return cxstring::createDup(StringRef(NS));
    +    }
    +    return S;
    +  }
    +
       if (!clang_isDeclaration(C.kind))
         return clang_getCursorSpelling(C);