Tags: shell, unix, sed, awk, gawk

Use awk to parse source code


I'm looking to create documentation from source code that I have. I've been looking around and something like awk seems like it will work, but I've had no luck so far. The information is split in two files, file1.c and file2.c.

Note: I've set up an automatic build environment for the program. It detects changes in the source and builds it. I would like to generate a text file listing any variables that have been modified since the last successful build. The script I'm looking for would be a post-build step that runs after compilation.

In file1.c I have a list of function calls (all the same function) that have a string name to identify them such as:

newFunction("THIS_IS_THE_STRING_I_WANT", otherVariables, 0, &iAlsoNeedThis);
newFunction("I_WANT_THIS_STRING_TOO", otherVariable, 0, &iAnotherOneINeed);
etc...

The fourth parameter of each call is a variable whose value is assigned in file2.c, and that value is what I want to pair with the string name. For example:

iAlsoNeedThis = 25;
iAnotherOneINeed = 42;
etc...

I'm looking to output the list to a txt file in the following format:

THIS_IS_THE_STRING_I_WANT = 25
I_WANT_THIS_STRING_TOO = 42

Is there any way to do this?

Thanks


Solution

  • Here is a start:

    NR==FNR {                          # Only true while reading the first file
        split($1,s,"\"")               # Get the quoted string from the first field
        gsub(/[^a-zA-Z0-9_]/,"",$4)    # Strip non-identifier chars (&, ), ;) from the fourth field
        m[$4]=s[2]                     # Map the variable name to its string name
        next
    }
    $1 in m {                          # Match field four from file1 with field one from file2
        sub(/;/,"")                    # Get rid of the ;
        print m[$1],$2,$3              # Print output
    }
    

    Saving this as script.awk and running it with your example produces:

    $ awk -f script.awk file1 file2
    THIS_IS_THE_STRING_I_WANT = 25
    I_WANT_THIS_STRING_TOO = 42
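
Since the question asks for the list in a text file, here is a minimal end-to-end sketch of how the script might be wired into a post-build step. It assumes the two sample lines from the question as file contents, keeps digits and underscores when cleaning the fourth field (C identifiers may contain them), and uses a hypothetical output name, variables.txt; none of these names come from a real build setup.

```shell
#!/bin/sh
# Sketch only: sample inputs and the output name "variables.txt" are assumptions.
set -eu

# Work in a scratch directory so nothing in the source tree is touched.
workdir=$(mktemp -d)
cd "$workdir"

# Sample of file1.c: the calls that pair a string name with a variable.
cat > file1.c <<'EOF'
newFunction("THIS_IS_THE_STRING_I_WANT", otherVariables, 0, &iAlsoNeedThis);
newFunction("I_WANT_THIS_STRING_TOO", otherVariable, 0, &iAnotherOneINeed);
EOF

# Sample of file2.c: the assignments that give each variable its value.
cat > file2.c <<'EOF'
iAlsoNeedThis = 25;
iAnotherOneINeed = 42;
EOF

# The awk script: first pass builds the variable -> string-name map,
# second pass prints "STRING_NAME = value" for every known variable.
cat > script.awk <<'EOF'
NR==FNR {
    split($1,s,"\"")
    gsub(/[^a-zA-Z0-9_]/,"",$4)
    m[$4]=s[2]
    next
}
$1 in m {
    sub(/;/,"")
    print m[$1],$2,$3
}
EOF

# Redirect the result into the text file, then show it.
awk -f script.awk file1.c file2.c > variables.txt
cat variables.txt
```

In a real post-build step only the final awk line would be needed, pointed at the actual source files. Note that the script relies on whitespace-separated fields, so it assumes each call and each assignment sits on one line with spaces after the commas and around the `=`.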
    

    Edit:

    The modifications you require affect the first line of the script:

    NR==FNR && $3=="0," && /start here/,/end here/ {