Search code examples
c++static-analysisclang-tidyclang-static-analyzer

Writing a specific Clang-tidy check to avoid passing an expression into std::vector::reserve()


In our codebase, we always use std::vector::reserve() to achieve higher performance. Most of the time they work well.

Also, sometimes we might pass an expression into the reserve() function. For example,

const double length = GetLength();
const double step = GetStep();
v.reserve(static_cast<std::size_t>(length / step));

However, due to some algorithm bugs, length might be a negative number, and it's almost impossible to know this problem without running the program. Every time this problem happens, it will through an error and then a crash is inevitable:

terminate called after throwing an instance of 'std::length_error'
terminate called recursively
  what():  vector::reserve

Fortunately, We are now using clang-tidy, so I'm wondering if I can apply some custom rules like avoid passing an expression into std::vector::reserve() function and manually check every variable if it might overflow.

Is there a way to write a clang-tidy rule to do this?


Solution

  • First, you'll have to decide exactly what you want reported. For simplicity, I am interpreting your question as wanting to report any call to std::vector::reserve where the argument is not an identifier.

    Next, the core of any clang-tidy check is the AST matcher expression. The tool clang-query can be used to directly run an AST matcher against a given input program, so I'll use that to show such a matcher. (Incorporating that matcher into a clang-tidy check is then done the like any other, as described in the documentation.)

    So, here is a command that runs clang-query on the file reserve1.cc and reports all calls to std::vector::reserve whose argument is not an identifier:

    clang-query -c='m
      callExpr(
        callee(
          cxxMethodDecl(
            matchesName("std::vector<.*>::reserve")
          )),
        unless(
          hasArgument(0,
            expr(declRefExpr())
          ))
      )' \
      reserve1.cc --
    

    The matcher expression is fairly self-explanatory except for declRefExpr, which is the matcher that matches identifiers, which Clang terms "declaration references".

    When the above command is run on the following input:

    // reserve1.cc
    // Example for a check that the 'reserve' argument is "simple".
    // https://stackoverflow.com/questions/70859738/writing-a-specific-clang-tidy-check-to-avoid-passing-an-expression-into-stdvec
    
    #include <vector>
    
    double GetLength();
    double GetStep();
    void bad()
    {
      const double length = GetLength();
      const double step = GetStep();
      std::vector<int> v;
      v.reserve(static_cast<std::size_t>(length / step));   // reported
    }
    
    void good(std::size_t len)
    {
      std::vector<int> v;
      v.reserve(len);                                       // not reported
    }
    
    struct Other {
      void reserve(int);
    };
    void irrelevant(Other o, int x)
    {
      o.reserve(x + x);                                     // not reported
    }
    
    // EOF
    

    It prints:

    Match #1:
    
    /home/scott/wrk/learn/clang/reserve-argument/reserve1.cc:14:3: note: "root"
          binds here
      v.reserve(static_cast<std::size_t>(length / step));   // reported
      ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    1 match.
    

    I recommend spending some time interactively testing and tweaking the AST matcher expression using clang-query on your program of interest before doing the additional work of embedding it into clang-tidy.

    There is a bit more information about clang-query in this answer of mine that answers a vaguely similar question.