Tags: c++, clang, abstract-syntax-tree, libtooling, clang-ast-matchers

How to find the clang::SourceRange of a deleted function?


I am working on a Clang AST generated from the following source code:

struct has_deleted_function_member
{
void deleted_function1() = delete;
//~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^ col:33
};

void deleted_function2() = delete;
//~~~~~~~~~~~~~~~~~~~~~^ col:24

int main()
{
  return 0;
}

I would like to find the complete SourceRange of the FunctionDecls of has_deleted_function_member::deleted_function1 and deleted_function2.

By complete, I mean that I would like these SourceRanges to include the complete declarations of these functions, beginning with and including the leading void and up to and excluding the trailing semicolon.

Calling getSourceRange on deleted_function1's FunctionDecl yields the desired result as expected.

However, calling getSourceRange on deleted_function2's FunctionDecl yields a range that ends at the closing right parenthesis, leaving out the trailing = delete.

The AST dump of these functions:

CXXMethodDecl 0x1299b40 <line:3:1, col:33> col:6 deleted_function1 'void ()' delete
FunctionDecl 0x1299c38 <line:7:1, col:24> col:6 deleted_function2 'void ()' delete

Is the exclusion of deleted_function2's trailing = delete a bug, or is this the intended behavior?

If it is not a bug, is it possible to programmatically find the complete SourceRange of deleted_function2?

Compiler details:

❯ clang++ --version
Ubuntu clang version 14.0.0-1ubuntu1.1
Target: x86_64-pc-linux-gnu
Thread model: posix
InstalledDir: /usr/bin

Solution

  • I'm reasonably confident this is a bug. The desired location isn't available as far as I can tell. Moreover, I think I see where the bug is.

    In clang/lib/Parse/ParseCXXInlineMethods.cpp, we have:

    NamedDecl *Parser::ParseCXXInlineMethodDef(
      [...]
      if (TryConsumeToken(tok::equal)) {
        [...]
        if (TryConsumeToken(tok::kw_delete, KWLoc)) {
          Diag(KWLoc, getLangOpts().CPlusPlus11
                          ? diag::warn_cxx98_compat_defaulted_deleted_function
                          : diag::ext_defaulted_deleted_function)
            << 1 /* deleted */;
          Actions.SetDeclDeleted(FnD, KWLoc);
          Delete = true;
          if (auto *DeclAsFunction = dyn_cast<FunctionDecl>(FnD)) {
            DeclAsFunction->setRangeEnd(KWEndLoc);                  <============
          }
        } else if (TryConsumeToken(tok::kw_default, KWLoc)) {
          Diag(KWLoc, getLangOpts().CPlusPlus11
                          ? diag::warn_cxx98_compat_defaulted_deleted_function
                          : diag::ext_defaulted_deleted_function)
            << 0 /* defaulted */;
          Actions.SetDeclDefaulted(FnD, KWLoc);
          if (auto *DeclAsFunction = dyn_cast<FunctionDecl>(FnD)) {
            DeclAsFunction->setRangeEnd(KWEndLoc);
          }
        } else {
          llvm_unreachable("function definition after = not 'delete' or 'default'");
        }
    

    but in clang/lib/Parse/Parser.cpp, we have what looks like a partial copy+paste:

    Decl *Parser::ParseFunctionDefinition(ParsingDeclarator &D,
      [...]
      if (TryConsumeToken(tok::equal)) {
        assert(getLangOpts().CPlusPlus && "Only C++ function definitions have '='");
    
        if (TryConsumeToken(tok::kw_delete, KWLoc)) {
          Diag(KWLoc, getLangOpts().CPlusPlus11
                          ? diag::warn_cxx98_compat_defaulted_deleted_function
                          : diag::ext_defaulted_deleted_function)
              << 1 /* deleted */;
          BodyKind = Sema::FnBodyKind::Delete;
        } else if (TryConsumeToken(tok::kw_default, KWLoc)) {
          Diag(KWLoc, getLangOpts().CPlusPlus11
                          ? diag::warn_cxx98_compat_defaulted_deleted_function
                          : diag::ext_defaulted_deleted_function)
              << 0 /* defaulted */;
          BodyKind = Sema::FnBodyKind::Default;
        } else {
          llvm_unreachable("function definition after = not 'delete' or 'default'");
        }
    

    The second block never calls setRangeEnd, unlike the first, so a FunctionDecl parsed through this path does not get its range extended to cover the = delete.

    I confirmed that adding the required updates fixes the bug, and filed this as Issue 64805.

    Possible workaround: ad-hoc text search

    In cases where the clang SourceLocation information is missing or inadequate, I've gotten reasonably good results by getting the raw source from SourceManager::getBufferData, turning SourceLocations into byte offsets with SourceManager::getFileOffset, and doing ad-hoc text searches. You could do something like that to recognize the = delete and skip past it in this case.

    Obviously it's not foolproof, since macros will mess things up, and dealing with comments and preprocessor directives is annoying, but the clang SourceLocation also has some issues with macros when doing source-to-source transformations so it's not like driving off a fidelity cliff.
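The text-search idea can be sketched roughly as follows. This is only an illustration under stated assumptions, not code from Clang: `findEndOfDeleteSpecifier` is a hypothetical helper name, the buffer is assumed to be the file text obtained from SourceManager::getBufferData, and the offset is assumed to be what SourceManager::getFileOffset returns for the end of the FunctionDecl's range (the closing paren in the example above). It deliberately ignores comments, macros, and preprocessor directives between the tokens.

```cpp
#include <cctype>
#include <cstddef>
#include <optional>
#include <string_view>

// Hypothetical helper (not part of Clang). Given the raw text of a file and
// the byte offset of the last character covered by the declaration's range
// (the closing ')' in the question's example), return the offset one past a
// trailing "= delete", or std::nullopt if something else follows.
// Assumption: only whitespace separates ')', '=' and 'delete'.
std::optional<std::size_t> findEndOfDeleteSpecifier(std::string_view buffer,
                                                    std::size_t lastCoveredOffset) {
  auto skipSpace = [&](std::size_t pos) {
    while (pos < buffer.size() &&
           std::isspace(static_cast<unsigned char>(buffer[pos])))
      ++pos;
    return pos;
  };

  if (lastCoveredOffset >= buffer.size())
    return std::nullopt;

  // Expect '=' after optional whitespace.
  std::size_t pos = skipSpace(lastCoveredOffset + 1);
  if (pos >= buffer.size() || buffer[pos] != '=')
    return std::nullopt;

  // Expect the keyword 'delete' after optional whitespace.
  pos = skipSpace(pos + 1);
  constexpr std::string_view keyword = "delete";
  if (buffer.substr(pos, keyword.size()) != keyword)
    return std::nullopt;
  pos += keyword.size();

  // Reject identifiers that merely start with "delete", such as "deleted".
  if (pos < buffer.size()) {
    unsigned char next = static_cast<unsigned char>(buffer[pos]);
    if (std::isalnum(next) || next == '_')
      return std::nullopt;
  }
  return pos;
}
```

For the line `void deleted_function2() = delete;` the closing paren sits at byte offset 23 within that line, and the helper returns the offset just past `delete`, which excludes the trailing semicolon as the question asks. Turning that offset back into a SourceLocation, and deciding what to do when the end location comes from a macro expansion, is left out of this sketch.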

    I'll also note that I have not had much luck trying to use the Lexer class for this sort of thing, although it's of course possible I don't know how to use it properly.

    Update: OP has provided a nice example of using Lexer to work around this problem. I stand corrected!