How to convert a copy initialization into a direct list initialization in clang-tidy check?


I would like to write a clang-tidy check that finds field declarations with copy initialization (ICIS_CopyInit) and can change them into direct list initializations (ICIS_ListInit).

There is a way to find exactly the field declarations with copy initialization:

  • match FieldDecl
  • skip if !hasInClassInitializer()
  • skip if ICIS_CopyInit != getInClassInitStyle()

Then I can get the declaration as a string, as described in this answer: https://stackoverflow.com/a/61510968/2492801, replace the = sign with an opening curly brace, and add a closing curly brace at the end.

But that feels clumsy. It would be okay for int test = 3;. But such an approach would, for example, turn the copy initialization MyStruct m_data = MyStruct(1, 3); into MyStruct m_data{MyStruct(1, 3)};, which is no better because it still allows narrowing conversions of the MyStruct constructor arguments.
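
For illustration, the string-based rewrite can be sketched in plain C++ as below. This is a simplified model, not clang-tidy code: naiveBraceInit is a hypothetical helper, and it assumes the declaration text contains a single " = " and ends with a semicolon.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: replaces " = <init>;" with "{<init>};" purely on text.
// Assumes the first " = " in Decl introduces the initializer.
std::string naiveBraceInit(const std::string &Decl) {
  const std::size_t Eq = Decl.find(" = ");
  const std::size_t Semi = Decl.rfind(';');
  if (Eq == std::string::npos || Semi == std::string::npos || Semi < Eq + 3)
    return Decl; // Not in the expected shape; leave the text unchanged.
  const std::string Init = Decl.substr(Eq + 3, Semi - (Eq + 3));
  return Decl.substr(0, Eq) + "{" + Init + "};";
}
```

Applied to int test = 3; this gives int test{3};, but applied to MyStruct m_data = MyStruct(1, 3); it gives MyStruct m_data{MyStruct(1, 3)};, which keeps the inner parenthesized constructor call.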

Instead I would like to have MyStruct m_data{1, 3};. Is creating a code replacement string for FixItHint::CreateReplacement(...) by doing search & replace on the declaration string the only thing I can do? Or can I use the clang AST methods to create such a code replacement for me?


Solution

  • This answer focuses on how to use the API to do refactoring generally, rather than the specific task of robustly refactoring field initializations, as the latter would be a pretty big job for an SO answer.

    Synthesizing and pretty-printing AST nodes

    can I use the clang AST methods to create such a code replacement for me?

    Yes. AST node classes typically have a static Create method that accepts an ASTContext, since the context "owns" and manages the node objects (allocation, deallocation, etc.). For example, to create a FieldDecl, call FieldDecl::Create.

    Then, once created, AST nodes can be pretty-printed as C++ code:

    • For Decl and subclasses, call print.
    • For Stmt and subclasses, call printPretty.
    • For QualType, call print. If you have a Type, wrap it in a QualType first.

    However, this approach has a couple of problems:

    • The exact requirements for creating well-formed AST are sometimes unclear and often undocumented. Given that you'll only be pretty-printing them (as opposed to, say, generating LLVM assembly code), it probably won't be too difficult to make things work, but some trial and error will be needed.

    • Pretty printing removes comments and whitespace, potentially causing substantial disruption.

    Rearranging and inserting into existing text

    Instead, what is more commonly done by refactoring tools is to examine the AST to find the nodes that you want to keep, extract the original source code for those nodes (the answer you linked to shows how to do that), and then surround the extracted text with new syntax in order to make the desired final code.

    For example, if we are given:

    MyStruct m_data = MyStruct(1, 3);
    

    and want to turn this into:

    MyStruct m_data{1, 3};
    

    then we could extract the text elements:

    • MyStruct as the type specifier,
    • m_data as the field name,
    • 1 as the first argument to the original constructor call (while discarding the call node), and
    • 3 as the second argument.

    Then, concatenate these fragments, inserting braces and commas into the appropriate places to get the desired result.
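
    The concatenation step can be sketched in plain C++ as follows. This is only an illustration under stated assumptions: makeListInitDecl is a hypothetical helper, and the fragments are passed in as strings, whereas in a real check they would be extracted from the original source text for the corresponding AST nodes.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: joins already-extracted source fragments into a
// direct-list-initialized field declaration of the form "Type name{a, b};".
std::string makeListInitDecl(const std::string &TypeText,
                             const std::string &NameText,
                             const std::vector<std::string> &ArgTexts) {
  std::string Result = TypeText + " " + NameText + "{";
  for (std::size_t I = 0; I < ArgTexts.size(); ++I) {
    if (I != 0)
      Result += ", "; // Separator between arguments, none before the first.
    Result += ArgTexts[I];
  }
  Result += "};";
  return Result;
}
```

    Called with "MyStruct", "m_data", and the argument texts "1" and "3", this yields MyStruct m_data{1, 3};. Because each argument is copied verbatim from the source, any comments or formatting inside an argument expression are preserved.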

    The Rewriter class can store all of the changes desired for a particular file, expressed as textual edits, and then write out all the changes at once. In the context of a clang-tidy check, one instead creates a set of FixItHint objects that contain the desired changes.

    This approach also has a couple of drawbacks:

    • When inserting text, there is no automatic mechanism to ensure the result satisfies even basic well-formedness conditions like having balanced delimiters, let alone satisfies all C++ syntax rules.

    • Depending on the transformation, it may be challenging to satisfactorily match the surrounding indentation style.
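
    Regarding the first drawback, a check can run its own sanity test on the replacement text before emitting it. The following is a minimal sketch of such a test, assuming a hypothetical helper hasBalancedDelimiters; it is deliberately simplistic and does not skip delimiters that appear inside string literals, character literals, or comments.

```cpp
#include <string>
#include <vector>

// Hypothetical helper: returns true if (), [] and {} in Text are balanced
// and properly nested. Literals and comments are NOT treated specially.
bool hasBalancedDelimiters(const std::string &Text) {
  std::vector<char> Open; // Stack of currently open delimiters.
  for (char C : Text) {
    switch (C) {
    case '(':
    case '[':
    case '{':
      Open.push_back(C);
      break;
    case ')':
    case ']':
    case '}': {
      const char Expected = (C == ')') ? '(' : (C == ']') ? '[' : '{';
      if (Open.empty() || Open.back() != Expected)
        return false; // Closing delimiter without a matching opener.
      Open.pop_back();
      break;
    }
    default:
      break;
    }
  }
  return Open.empty(); // Anything left open means the text is unbalanced.
}
```

    Such a test catches only the crudest mistakes, for example a replacement like MyStruct m_data{MyStruct(1, 3}; and says nothing about whether the result is valid C++.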

    Even so, in practice, the text-based approach is usually easier to make work adequately.

    Example: UnnecessaryValueParamCheck.cpp

    An example demonstrating some of these ideas is UnnecessaryValueParamCheck.cpp (part of clang-tidy), which has at the end:

    void UnnecessaryValueParamCheck::handleMoveFix(const ParmVarDecl &Param,
                                                   const DeclRefExpr &CopyArgument,
                                                   ASTContext &Context) {
      auto Diag = diag(CopyArgument.getBeginLoc(),
                       "parameter %0 is passed by value and only copied once; "
                       "consider moving it to avoid unnecessary copies")
                  << &Param;
      // Do not propose fixes in macros since we cannot place them correctly.
      if (CopyArgument.getBeginLoc().isMacroID())
        return;
      const auto &SM = Context.getSourceManager();
      auto EndLoc = Lexer::getLocForEndOfToken(CopyArgument.getLocation(), 0, SM,
                                               Context.getLangOpts());
      Diag << FixItHint::CreateInsertion(CopyArgument.getBeginLoc(), "std::move(")
           << FixItHint::CreateInsertion(EndLoc, ")")
           << Inserter.createIncludeInsertion(
                  SM.getFileID(CopyArgument.getBeginLoc()), "<utility>");
    }
    

    This function merely inserts std::move( and ) around an expression without removing anything, but the main point is that the original expression text is retained rather than being re-created from the AST.
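
    To see why insertion-only edits leave the original text intact, the effect of two such insertions can be modeled on a plain string. This is a sketch rather than clang code: Insertion and applyInsertions are hypothetical stand-ins, offsets are byte positions in the original text, and all offsets are assumed to be distinct.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for an insertion-style fix-it: Text is added at
// byte position Offset of the original source.
struct Insertion {
  std::size_t Offset;
  std::string Text;
};

// Applies all insertions to a copy of Source and returns the result.
std::string applyInsertions(std::string Source, std::vector<Insertion> Edits) {
  // Work from the highest offset down so that earlier offsets, which refer
  // to the original text, are not shifted by insertions already applied.
  std::sort(Edits.begin(), Edits.end(),
            [](const Insertion &A, const Insertion &B) {
              return A.Offset > B.Offset;
            });
  for (const Insertion &E : Edits) {
    assert(E.Offset <= Source.size() && "insertion offset out of range");
    Source.insert(E.Offset, E.Text);
  }
  return Source;
}
```

    For the text Field = Arg;, inserting std::move( at the start of Arg and ) just past its end produces Field = std::move(Arg);, with the characters of Arg themselves never rewritten.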

    The clang-tidy sources contain many other examples worth imitating. They can be found by searching for FixItHint under the clang-tidy directory.