c, c-preprocessor, conditional-compilation

Tool for compiling automatically all ifdef / ifndef directives


My C project uses preprocessor directives to activate / deactivate some features. It's not unusual to find that some of the less common configurations no longer compile because of a change made a few days earlier inside an #ifdef.

We use a script to compile the most common configurations, but I'm looking for a tool to ensure everything is compiled (testing is not a problem in our case; we just want to detect as soon as possible when something stops compiling). Usually the ifdefs / ifndefs are independent, so normally each module has to be compiled just twice (all symbols defined, all undefined). But sometimes the ifdefs are nested, so those modules have to be compiled more times.

Do you know of any tool that searches for all ifdef / ifndef directives (including nested ones) and reports how many times a module has to be compiled (with the set of preprocessor symbols to define each time) to ensure that every single line of source code is analyzed by the compiler?


Solution

  • Here's a Perl script that does a hacky job of parsing #ifdef entries and assembles a list of the symbols used in each file. It then prints out the Cartesian product of all the possible combinations of having each symbol on or off. This works for a C++ project of mine, and might require minor tweaking for your setup.

    #!/usr/bin/perl
    
    use strict;
    use warnings;
    
    use File::Find;
    
    my $path = $ENV{PWD};
    
    my $symbol_map = {};
    find( make_ifdef_processor( $symbol_map ), $path );
    
    foreach my $fn ( keys %$symbol_map ) {
       my @symbols = @{ $symbol_map->{$fn} };
    
       my @options;
       foreach my $symbol (@symbols) {
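          # #ifdef only tests whether a symbol is defined, so "-DSYM=0"
          # would still enable the block; use -U for the "off" case.
          push @options, [
             "-U$symbol",
             "-D$symbol"
          ];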
       }
    
       my @combinations = @{ cartesian( @options ) };
       foreach my $combination (@combinations) {
          print "compile $fn with these symbols defined:\n";
          print "\t", join ' ', ( @$combination );
          print "\n";
       }
    }
    
    sub make_ifdef_processor {
       my $map_symbols = shift;
    
       return sub {
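          my $fn = $_;

          # Only scan regular files, and skip Subversion's pristine copies.
          return unless -f $fn;
          return if $fn =~ /svn-base/;

          # File::Find has chdir'ed into the directory, so open the bare name
          # but record the full path to keep same-named files apart.
          my $full_name = $File::Find::name;
          open my $fh, '<', $fn or die "Error opening file $full_name ($!)";
          while ( my $line = <$fh> ) {
             if ( $line =~ m{^\s*//} ) { # skip C++-style line comments
                next;
             }

             # Match #ifdef and #ifndef (include guards too) and capture
             # only the identifier, not trailing comments or whitespace.
             if ( $line =~ /^\s*#\s*ifn?def\s+(\w+)/ ) {
                my $symbol = $1;
                my $symbols = $map_symbols->{$full_name} ||= [];
                push @$symbols, $symbol
                   unless grep { $_ eq $symbol } @$symbols;
             }
          }
          close $fh;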
       }
    }
    
    sub cartesian {
       my $first_set = shift @_;
       my @product = map { [ $_ ] } @$first_set;
    
       foreach my $set (@_) {
          my @new_product;
          foreach my $s (@$set) {
             foreach my $list (@product) {
                push @new_product, [ @$list, $s ];
             }
          }
    
          @product = @new_product;
       }
    
       return \@product;
    }
    

    This will definitely fail with C-style /* */ comments, as I didn't bother to parse those effectively. The other thing to think about is that it might not make sense for all of the symbol combinations to be tested, and you might build that into the script or your testing server. For example, you might have mutually exclusive symbols for specifying a platform:

    -DMAC
    -DLINUX
    -DWINDOWS
    

    Testing the combinations of having these on and off doesn't really make sense. One quick solution is just to compile all combinations, and be comfortable that some will fail. Your test for correctness can then be that the same combinations always succeed and the same combinations always fail.
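
    If you would rather drop the meaningless combinations up front, a filter along the following lines might work. This is only a minimal sketch: the sub name exactly_one_defined, the @platforms list, and the rule that exactly one platform symbol must be defined are all assumptions made for illustration, not part of the script above.

    ```perl
    use strict;
    use warnings;

    # Hypothetical group of mutually exclusive symbols (from the example above).
    my @platforms = qw(MAC LINUX WINDOWS);

    # Return true if exactly one symbol of @group is defined in the combination.
    # A combination is an array ref of flags such as "-DMAC", "-DMAC=1" or "-UMAC".
    sub exactly_one_defined {
       my ( $combination, @group ) = @_;

       my %defined;
       foreach my $flag (@$combination) {
          $defined{$1} = 1 if $flag =~ /^-D(\w+)/;
       }

       my $count = grep { $defined{$_} } @group;
       return $count == 1;
    }

    # Sample input in the same shape that cartesian() returns.
    my @combinations = (
       [ '-DMAC', '-ULINUX', '-UWINDOWS' ],
       [ '-DMAC', '-DLINUX', '-UWINDOWS' ],
       [ '-UMAC', '-ULINUX', '-UWINDOWS' ],
    );

    foreach my $combination ( grep { exactly_one_defined( $_, @platforms ) } @combinations ) {
       print join( ' ', @$combination ), "\n";
    }
    ```

    Of the three sample combinations only the first survives, because the second defines two platforms and the third defines none.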

    The other thing to remember is that not all combinations need to be tested, because many of the #ifdefs aren't nested: independent blocks are already covered by compiling once with everything defined and once with everything undefined. Compilation is relatively cheap, but the number of combinations doubles with every symbol, so it can grow very quickly if you're not careful. You could make the script parse out which symbols are in the same control structure (nested #ifdefs for example), but that's much harder to implement and I've not done that here.
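
    As a starting point for that nesting analysis, one possible approach is to keep a stack of the conditions that are currently open and record it each time a new block begins. The sketch below is an assumption-laden illustration, not a finished tool: block_conditions is a hypothetical name, #if expressions are kept as opaque strings only so that #endif pairs up correctly, and #elif and comments are not modeled at all.

    ```perl
    use strict;
    use warnings;

    # For every conditional block, return the conditions that must hold for the
    # compiler to see it: "SYM" means defined, "!SYM" means undefined.
    sub block_conditions {
       my (@lines) = @_;
       my ( @stack, @paths );

       foreach my $line (@lines) {
          if ( $line =~ /^\s*#\s*ifdef\s+(\w+)/ ) {
             push @stack, $1;
          }
          elsif ( $line =~ /^\s*#\s*ifndef\s+(\w+)/ ) {
             push @stack, "!$1";
          }
          elsif ( $line =~ /^\s*#\s*if\b\s*(.*?)\s*$/ ) {
             push @stack, "($1)";    # opaque expression, not analyzed
          }
          elsif ( $line =~ /^\s*#\s*else\b/ && @stack ) {
             # The #else branch needs the opposite of the innermost condition.
             my $top = pop @stack;
             push @stack, ( $top =~ /^!(.*)/ ? $1 : "!$top" );
          }
          elsif ( $line =~ /^\s*#\s*endif\b/ ) {
             pop @stack;
             next;
          }
          else {
             next;
          }

          # A new block just opened: remember everything it depends on.
          push @paths, join ' ', @stack;
       }

       return @paths;
    }

    my @source = (
       '#ifdef A',
       '  #ifdef B',
       '  #else',
       '  #endif',
       '#endif',
       '#ifndef C',
       '#endif',
    );
    print "$_\n" for block_conditions(@source);
    ```

    For this sample the sub reports four blocks: "A", "A B", "A !B" and "!C". Two compilations are enough to reach all of them (for instance A and B defined with C undefined, then A defined with B undefined), compared with the eight that the full Cartesian product of three symbols would produce. Merging the reported paths into a minimal set of builds automatically is left out of this sketch.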