
Vera++ Tcl rule: list all local variables


I am trying to write a rule for the vera++ static analyzer. Since I could not find a group for vera++ here, and vera++ uses Tcl to implement its analysis rules, I am posting to the Tcl forum. I have worked through the vera++ Tcl API documentation (inspirel.com/vera/ce/doc/tclapi.html), but since I do not know Tcl well, I would like advice on how to proceed.

I am a beginner in Tcl programming. What approach should a Tcl program take to list all local variables within a C++ source file, and how can it be achieved?

The issue I am facing is how to detect local variable declarations while parsing C++ source files.


Solution

  • Parsing for local (or any other) variable definitions with vera++ rules is fairly complicated, but doable. The basic C++ parsing and tokenizing is done by vera++ itself.

    The basic approach is to use vera++'s getTokens function in conjunction with a small state machine that detects completed C++ statements. You gather tokens (and, where needed, their values, since you'll need the variable names later to build the list) and concatenate them until you have a complete statement. Once a statement is complete, you can use a regular expression to check whether it is a variable definition and extract the variable name from a submatch. You also need to track whether you're inside a {} block, so that you know whether the definition is a local one.
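
    The steps above can be sketched roughly as follows. This is a minimal illustration, not the rule described in this answer: the proc name collectStatements, the single-space separator, and the hand-written token list are assumptions made for the example. Each token is a four-element list (value, line, column, name), which is the shape the vera++ Tcl API documents for getTokens; in a real rule the list would come from something like `getTokens $fileName 1 0 -1 -1 {}`. The token names used here (identifier, semicolon, leftbrace, and so on) are assumed to match what vera++ reports.

    ```tcl
    # Hypothetical helper: group tokens into statements and record the
    # brace depth at which each statement ended.
    proc collectStatements {tokens} {
        set depth 0        ;# current {} nesting level
        set statement ""   ;# token names gathered for the current statement
        set result {}
        foreach token $tokens {
            lassign $token value line column name
            switch -- $name {
                space - newline - ccomment - cppcomment { continue }
                leftbrace  { incr depth;    set statement "" }
                rightbrace { incr depth -1; set statement "" }
                default {
                    if {$name eq "identifier"} {
                        append statement " identifier#$value#"
                    } else {
                        append statement " $name"
                    }
                    if {$name eq "semicolon"} {
                        lappend result [list $depth $line $statement]
                        set statement ""
                    }
                }
            }
        }
        return $result
    }

    # Hand-written tokens for:  int g;  void f() { int count = 0; }
    set tokens [list \
        [list int   1 0  int] \
        [list g     1 4  identifier] \
        [list ";"   1 5  semicolon] \
        [list void  2 0  void] \
        [list f     2 5  identifier] \
        [list "("   2 6  leftparen] \
        [list ")"   2 7  rightparen] \
        [list "\{"  2 9  leftbrace] \
        [list int   3 4  int] \
        [list count 3 8  identifier] \
        [list "="   3 14 assign] \
        [list 0     3 16 intlit] \
        [list ";"   3 17 semicolon] \
        [list "\}"  4 0  rightbrace]]

    # Print only statements that ended inside at least one {} block.
    foreach entry [collectStatements $tokens] {
        lassign $entry depth line statement
        if {$depth > 0} { puts "line $line:$statement" }
    }
    ```

    Note that a depth greater than zero only means "inside some braces", which also covers class and namespace bodies, and that splitting on every semicolon cuts a for-loop header into pieces. A real rule would need extra states for those cases.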

    As a starting point, vera++'s rule T019, which checks for complete curly-braced blocks of code, contains a sample of a simple state machine that gathers tokens into statements.

    I've done parsing for variable definitions with vera++ (to check various naming conventions), but unfortunately I can't post the complete code since it's proprietary work for my employer. I can, however, give you a snippet showing the regular expressions I'm using to check for variable declarations:

    set isVar false
    # Case 1: user-defined (possibly namespace-qualified) type followed by the variable name.
    # The name is captured by the 9th group.
    if {[regexp {\s+((extern\s+)?(static\s+|mutable\s+|register\s+|volatile\s+)?(const\s+)?)?((identifier#[^#]+#\s+colon_colon\s+)*identifier#[^#]+#)\s+(star\s+|const\s+|and\s+|less.*greater\s+|greater\s+)*(identifier#[^#]+#\s+colon_colon\s+)*identifier#([^#]+)#(\s+leftbracket.*rightbracket)?(\s+assign)?.*semicolon$} $statement m s1 s2 s3 s4 s5 s6 s7 s8 s9 s10]} {
        set locVarname $s9
        set isVar true
        set currentMatch $m
    # Case 2: built-in type keywords followed by the variable name.
    # The name is captured by the 7th group.
    } elseif {[regexp {\s+((extern\s+)?(static\s+|mutable\s+|register\s+|volatile\s+)?(const\s+)?)?(char\s+|int\s+|short\s+|long\s+|void\s+|bool\s+|double\s+|float\s+|unsigned\s+|and\s+|star\s+|unsigned\s+)+(identifier#[^#]+#\s+colon_colon)*\s+identifier#([^#]+)#(\s+leftbracket.*rightbracket)?(\s+assign)?.*semicolon$} $statement m s1 s2 s3 s4 s5 s6 s7 s8]} {
        set locVarname $s7
        set isVar true
        set currentMatch $m
    }
    

    $statement contains the complete statement, built from token names as described above. Note that for identifier tokens I append the token value to the token name in the form identifier#<value>#, and use a regex capture group to extract the value again.
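
    To make the identifier#<value># encoding concrete, here is a small self-contained sketch that can be run in plain tclsh. It does not use the expressions from this answer: the pattern is a deliberately cut-down, hypothetical one that only handles built-in type keywords, and the proc name extractVarName and the single-space token separator are assumptions for illustration.

    ```tcl
    # Hypothetical helper: return the declared variable name, or an empty
    # string if the encoded statement does not look like a declaration.
    proc extractVarName {statement} {
        # One or more built-in type keywords, then identifier#<name>#,
        # optionally an initializer, ending in a semicolon token.
        set pattern {^\s+(?:(?:const|unsigned|char|int|short|long|bool|double|float)\s+)+identifier#([^#]+)#(?:\s+assign)?.*semicolon$}
        if {[regexp $pattern $statement match varName]} {
            return $varName
        }
        return ""
    }

    # Encoded form of:  int count = 0;
    puts [extractVarName " int identifier#count# assign intlit semicolon"]

    # Encoded form of:  foo();   (no type keyword, so no name is returned)
    puts [extractVarName " identifier#foo# leftparen rightparen semicolon"]
    ```

    Because the pattern ends in `.*semicolon$`, a function declaration such as `int f();` would also match, so a real rule would need to exclude statements containing a leftparen directly after the name, or handle them separately.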