I am working on a Clang AST generated from the following source code:
struct has_deleted_function_member
{
void deleted_function1() = delete;
//~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^ col:33
};
void deleted_function2() = delete;
//~~~~~~~~~~~~~~~~~~~~~^ col:24
int main()
{
return 0;
}
I would like to find the complete SourceRange of the FunctionDecls of has_deleted_function_member::deleted_function1 and deleted_function2.
By complete, I mean that I would like these SourceRanges to include the complete declarations of these functions, beginning with and including the leading void and up to and excluding the trailing semicolon.
Calling getSourceRange on deleted_function1's FunctionDecl yields the desired result as expected.
However, getSourceRange on deleted_function2's FunctionDecl ends at its closing right paren.
The AST dump of these functions:
CXXMethodDecl 0x1299b40 <line:3:1, col:33> col:6 deleted_function1 'void ()' delete
FunctionDecl 0x1299c38 <line:7:1, col:24> col:6 deleted_function2 'void ()' delete
Is the exclusion of deleted_function2's trailing = delete a bug, or is this the intended behavior?
If it is not a bug, is it possible to programmatically find the complete SourceRange of deleted_function2?
Compiler details:
❯ clang++ --version
Ubuntu clang version 14.0.0-1ubuntu1.1
Target: x86_64-pc-linux-gnu
Thread model: posix
InstalledDir: /usr/bin
I'm reasonably confident this is a bug. The desired location isn't available as far as I can tell. Moreover, I think I see where the bug is.
In clang/lib/Parse/ParseCXXInlineMethods.cpp, we have:
NamedDecl *Parser::ParseCXXInlineMethodDef(
[...]
  if (TryConsumeToken(tok::equal)) {
[...]
    if (TryConsumeToken(tok::kw_delete, KWLoc)) {
      Diag(KWLoc, getLangOpts().CPlusPlus11
                      ? diag::warn_cxx98_compat_defaulted_deleted_function
                      : diag::ext_defaulted_deleted_function)
          << 1 /* deleted */;
      Actions.SetDeclDeleted(FnD, KWLoc);
      Delete = true;
      if (auto *DeclAsFunction = dyn_cast<FunctionDecl>(FnD)) {
        DeclAsFunction->setRangeEnd(KWEndLoc); // <============
      }
    } else if (TryConsumeToken(tok::kw_default, KWLoc)) {
      Diag(KWLoc, getLangOpts().CPlusPlus11
                      ? diag::warn_cxx98_compat_defaulted_deleted_function
                      : diag::ext_defaulted_deleted_function)
          << 0 /* defaulted */;
      Actions.SetDeclDefaulted(FnD, KWLoc);
      if (auto *DeclAsFunction = dyn_cast<FunctionDecl>(FnD)) {
        DeclAsFunction->setRangeEnd(KWEndLoc);
      }
    } else {
      llvm_unreachable("function definition after = not 'delete' or 'default'");
    }
but in clang/lib/Parse/Parser.cpp, we have what looks like a partial copy+paste:
Decl *Parser::ParseFunctionDefinition(ParsingDeclarator &D,
[...]
  if (TryConsumeToken(tok::equal)) {
    assert(getLangOpts().CPlusPlus && "Only C++ function definitions have '='");
    if (TryConsumeToken(tok::kw_delete, KWLoc)) {
      Diag(KWLoc, getLangOpts().CPlusPlus11
                      ? diag::warn_cxx98_compat_defaulted_deleted_function
                      : diag::ext_defaulted_deleted_function)
          << 1 /* deleted */;
      BodyKind = Sema::FnBodyKind::Delete;
    } else if (TryConsumeToken(tok::kw_default, KWLoc)) {
      Diag(KWLoc, getLangOpts().CPlusPlus11
                      ? diag::warn_cxx98_compat_defaulted_deleted_function
                      : diag::ext_defaulted_deleted_function)
          << 0 /* defaulted */;
      BodyKind = Sema::FnBodyKind::Default;
    } else {
      llvm_unreachable("function definition after = not 'delete' or 'default'");
    }
The above is missing the setRangeEnd call that the first block has.
I confirmed that adding the required updates fixes the bug, and filed this as Issue 64805.
In cases where the clang SourceLocation information is missing or inadequate, I've gotten reasonably good results by getting the raw source from SourceManager::getBufferData, turning SourceLocations into byte offsets with the SourceManager::getFileOffset method, and doing ad-hoc text searches.
You could do something like that to recognize and skip the =delete in this case.
Obviously it's not foolproof, since macros will mess things up, and dealing with comments and preprocessor directives is annoying, but the clang SourceLocation also has some issues with macros when doing source-to-source transformations, so it's not like driving off a fidelity cliff.
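To make the text-search idea concrete, here is a minimal sketch that deliberately avoids any Clang dependency so it can be tried in isolation. The helper names skipTrivia and extendPastDeleteOrDefault are hypothetical (they are not Clang APIs). It assumes you have already obtained the file contents (for example from SourceManager::getBufferData) and a byte offset one past the closing paren (for example SourceManager::getFileOffset of the range end, plus one). It skips whitespace and comments but knows nothing about macros or preprocessor directives.

```cpp
#include <cctype>
#include <cstddef>
#include <initializer_list>
#include <string_view>

// Hypothetical helper: advance `pos` past whitespace and comments.
static std::size_t skipTrivia(std::string_view src, std::size_t pos) {
    while (pos < src.size()) {
        if (std::isspace(static_cast<unsigned char>(src[pos]))) {
            ++pos;
        } else if (src.compare(pos, 2, "//") == 0) {
            // Line comment: jump to the newline (or the end of the buffer).
            pos = src.find('\n', pos);
            if (pos == std::string_view::npos) return src.size();
        } else if (src.compare(pos, 2, "/*") == 0) {
            // Block comment: jump past the closing "*/".
            pos = src.find("*/", pos + 2);
            if (pos == std::string_view::npos) return src.size();
            pos += 2;
        } else {
            break;
        }
    }
    return pos;
}

// Hypothetical helper: `end` is the offset one past the ')' where the
// reported range stops. Returns the offset one past a trailing
// "= delete" or "= default", or `end` unchanged if neither follows.
std::size_t extendPastDeleteOrDefault(std::string_view src, std::size_t end) {
    std::size_t pos = skipTrivia(src, end);
    if (pos >= src.size() || src[pos] != '=') return end;
    pos = skipTrivia(src, pos + 1);
    for (std::string_view kw : {"delete", "default"}) {
        if (src.compare(pos, kw.size(), kw) != 0) continue;
        std::size_t after = pos + kw.size();
        // Require a keyword boundary so an identifier such as
        // "deleted" is not mistaken for the keyword.
        bool boundary =
            after >= src.size() ||
            !(std::isalnum(static_cast<unsigned char>(src[after])) ||
              src[after] == '_');
        if (boundary) return after;
    }
    return end;
}
```

The returned offset excludes the trailing semicolon, which matches the range asked for in the question. Turning it back into a SourceLocation is left out here, since that part depends on how your tool tracks file IDs.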
I'll also note that I have not had much luck trying to use the Lexer class for this sort of thing, although it's of course possible I don't know how to use it properly.
Update: OP has provided a nice example of using Lexer to work around this problem. I stand corrected!