
How to keep track of a variable with Clang's static analyzer?


Suppose I'm working with the following C snippet:

void inc(int *num) {(*num)++;}
void dec(int *num) {(*num)--;}

void f(int var) {
    inc(&var);
    dec(&var);
}

Using a static analyzer, I want to be able to tell whether the value of var is unchanged after the function executes. I know I have to keep its state on my own (that's the point of writing a Clang checker), but I'm having trouble getting a unique reference to this variable.

For example: if I use the following API

void MySimpleChecker::checkPostCall(const CallEvent &Call,
                                    CheckerContext &C) const {
    SymbolRef MyArg = Call.getArgSVal(0).getAsSymbol();
}

I'd expect it to return a pointer to this symbol's representation in my checker's context. However, MyArg is always null when I use it this way. This happens for both the inc and dec functions, in both the pre-call and post-call callbacks.

What am I missing here? What concepts did I get wrong?

Note: I'm currently reading the Clang CFE Internals Manual and I've read the excellent How to Write a Checker in 24 Hours material. I still couldn't find my answer so far.


Solution

    Interpretation of question

    Specifically, you want to count the calls to inc and dec applied to each variable and report when they do not balance for some path in a function.

    Generally, you want to know how to associate an abstract value, here a number, with a program variable, and be able to update and query that value along each execution path.

    High-level answer

    Whereas the tutorial checker SimpleStreamChecker.cpp associates an abstract value with the value stored in a variable, here we want to associate an abstract value with the variable itself. That is what IteratorChecker.cpp does when tracking containers, so I based my solution on it.

    Within the static analyzer's abstract state, each variable is represented by a MemRegion object. So the first step is to make a map where MemRegion is the key:

    REGISTER_MAP_WITH_PROGRAMSTATE(TrackVarMap, MemRegion const *, int)
    

    Next, when we have an SVal that corresponds to a pointer to a variable, we can use SVal::getAsRegion to get the corresponding MemRegion. For instance, given a CallEvent, call, with a first argument that is a pointer, we can do:

        if (MemRegion const *region = call.getArgSVal(0).getAsRegion()) {
    

    to get the region that the pointer points at.

    Then, we can access our map using that region as its key:

          state = state->set<TrackVarMap>(region, newValue);
    

    Finally, in checkDeadSymbols, we use SymbolReaper::isLiveRegion to detect when a region (variable) is going out of scope:

      const TrackVarMapTy &Map = state->get<TrackVarMap>();
      for (auto const &I : Map) {
        MemRegion const *region = I.first;
        int delta = I.second;
        if (SymReaper.isLiveRegion(region) || (delta==0))
          continue;              // Not dead, or unchanged; skip.
    

    Complete example

    To demonstrate, here is a complete checker that reports unbalanced use of inc and dec:

    // TrackVarChecker.cpp
    // https://stackoverflow.com/questions/23448540/how-to-keep-track-of-a-variable-with-clangs-static-analyzer
    
    #include "clang/StaticAnalyzer/Checkers/BuiltinCheckerRegistration.h"
    #include "clang/StaticAnalyzer/Core/BugReporter/BugType.h"
    #include "clang/StaticAnalyzer/Core/Checker.h"
    #include "clang/StaticAnalyzer/Core/CheckerManager.h"
    #include "clang/StaticAnalyzer/Core/PathSensitive/CallEvent.h"
    #include "clang/StaticAnalyzer/Core/PathSensitive/CheckerContext.h"
    #include "clang/StaticAnalyzer/Core/PathSensitive/ProgramState.h"
    #include "clang/StaticAnalyzer/Core/PathSensitive/ProgramStateTrait.h"
    
    using namespace clang;
    using namespace ento;
    
    namespace {
    class TrackVarChecker
      : public Checker< check::PostCall,
                        check::DeadSymbols >
    {
      mutable IdentifierInfo *II_inc, *II_dec;
      mutable std::unique_ptr<BuiltinBug> BT_modified;
    
    public:
      TrackVarChecker() : II_inc(nullptr), II_dec(nullptr) {}
    
      void checkPostCall(CallEvent const &Call, CheckerContext &C) const;
      void checkDeadSymbols(SymbolReaper &SymReaper, CheckerContext &C) const;
    };
    } // end anonymous namespace
    
    // Map from memory region corresponding to a variable (that is, the
    // variable itself, not its current value) to the difference between its
    // current and original value.
    REGISTER_MAP_WITH_PROGRAMSTATE(TrackVarMap, MemRegion const *, int)
    
    void TrackVarChecker::checkPostCall(CallEvent const &call, CheckerContext &C) const
    {
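      // getDecl() may return null (e.g. a call through a function pointer),
      // so use the null-tolerant cast.
      const FunctionDecl *FD = dyn_cast_or_null<FunctionDecl>(call.getDecl());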
      if (!FD || FD->getKind() != Decl::Function) {
        return;
      }
    
      ASTContext &Ctx = C.getASTContext();
      if (!II_inc) {
        II_inc = &Ctx.Idents.get("inc");
      }
      if (!II_dec) {
        II_dec = &Ctx.Idents.get("dec");
      }
    
      if (FD->getIdentifier() == II_inc || FD->getIdentifier() == II_dec) {
        // We expect the argument to be a pointer.  Get the memory region
        // that the pointer points at.
        if (MemRegion const *region = call.getArgSVal(0).getAsRegion()) {
          // Increment the associated value, creating it first if needed.
          ProgramStateRef state = C.getState();
          int delta = (FD->getIdentifier() == II_inc)? +1 : -1;
          int const *curp = state->get<TrackVarMap>(region);
          int newValue = (curp? *curp : 0) + delta;
          state = state->set<TrackVarMap>(region, newValue);
          C.addTransition(state);
        }
      }
    }
    
    void TrackVarChecker::checkDeadSymbols(
      SymbolReaper &SymReaper, CheckerContext &C) const
    {
      ProgramStateRef state = C.getState();
      const TrackVarMapTy &Map = state->get<TrackVarMap>();
      for (auto const &I : Map) {
        // Check for a memory region (variable) going out of scope that has
        // a non-zero delta.
        MemRegion const *region = I.first;
        int delta = I.second;
        if (SymReaper.isLiveRegion(region) || (delta==0)) {
          continue;              // Not dead, or unchanged; skip.
        }
    
        //llvm::errs() << region << " dead with delta " << delta << "\n";
        if (ExplodedNode *N = C.generateNonFatalErrorNode()) {
          if (!BT_modified) {
            BT_modified.reset(
              new BuiltinBug(this, "Delta not zero",
                             "Variable changed from its original value."));
          }
          C.emitReport(llvm::make_unique<BugReport>(
            *BT_modified, BT_modified->getDescription(), N));
        }
      }
    }
    
    void ento::registerTrackVarChecker(CheckerManager &mgr) {
      mgr.registerChecker<TrackVarChecker>();
    }
    
    bool ento::shouldRegisterTrackVarChecker(const LangOptions &LO) {
      return true;
    }
    

    To hook this into the rest of Clang, add entries to the following files (the sample run below assumes the checker is registered as alpha.core.TrackVar):

    • clang/include/clang/StaticAnalyzer/Checkers/Checkers.td and
    • clang/lib/StaticAnalyzer/Checkers/CMakeLists.txt

    Example input to test it:

    // trackvar.c
    // Test for TrackVarChecker.
    
    // The behavior of these functions is hardcoded in the checker.
    void inc(int *num);
    void dec(int *num);
    
    void call_inc(int var) {
      inc(&var);
    } // reported
    
    void call_inc_dec(int var) {
      inc(&var);
      dec(&var);
    } // NOT reported
    
    void if_inc(int var) {
      if (var > 2) {
        inc(&var);
      }
    } // reported
    
    void indirect_inc(int val) {
      int *p = &val;
      inc(p);
    } // reported
    

    Sample run:

    $ gcc -E -o trackvar.i trackvar.c
    $ ~/bld/llvm-project/build/bin/clang -cc1 -analyze -analyzer-checker=alpha.core.TrackVar trackvar.i
    trackvar.c:10:1: warning: Variable changed from its original value
    }
    ^
    trackvar.c:21:1: warning: Variable changed from its original value
    }
    ^
    trackvar.c:26:1: warning: Variable changed from its original value
    }
    ^
    3 warnings generated.