Confessions of a Wall Street Programmer

practical ideas (and perhaps some uncommon knowledge) on software architecture, design, construction and testing

cc_driver.pl

This script parses a compilation database, and executes a specified command for each matching file in the database. It generates a command that includes the -I and -D parameters used in the original compilation.
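
To make the input concrete, here is a minimal sketch of what a compilation database entry looks like and how the -I and -D parameters can be pulled out of it. The sample entry, its paths, and the extract_params helper are made up for illustration. The sketch uses the core JSON::PP module for brevity, whereas the script in the Code Listing reads the file line by line.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use JSON::PP;   # core module since Perl 5.14

# A made-up compilation database with a single entry.
my $json = <<'EOF';
[
  {
    "directory": "/home/me/build",
    "command": "g++ -DNDEBUG -I/home/me/include -I ../common -O2 -c /home/me/src/foo.cpp",
    "file": "/home/me/src/foo.cpp"
  }
]
EOF

# Collect -I and -D parameters from a compile command, joining the
# two-token form ("-I dir") into a single token ("-Idir").
sub extract_params {
   my ($command) = @_;
   my @tokens = split(" ", $command);
   my @params;
   while (defined(my $token = shift @tokens)) {
      if ($token eq "-I" || $token eq "-D") {
         push @params, $token . shift(@tokens) if @tokens;
      }
      elsif ($token =~ /^-[ID]./) {
         push @params, $token;
      }
   }
   return @params;
}

# Print the kind of command line that would be generated for each entry.
for my $entry (@{ decode_json($json) }) {
   my @params = extract_params($entry->{command});
   print "cd $entry->{directory}; [command] @params $entry->{file}\n";
}
```

Everything other than the -I and -D parameters (here -O2 and -c) is dropped, which is what lets the same entry drive tools other than the original compiler.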

Usage

cc_driver.pl [-d] [-v] [-s] [-n] [-p build_path] [-i include_pattern] [-x exclude_pattern] command [parameters]

Parameters

| Parameter | Description |
| --- | --- |
| -d | Debug – print generated command to stdout, but don’t execute it. Implies -v (verbose). |
| -v | Be verbose. Prints generated commands to stdout. |
| -s | Generate system include paths. See System Include Files. |
| -n | Don’t include -I or -D parameters in the generated command. This can be useful when using the script to execute “normal” commands (e.g., grep) that don’t take such parameters. |
| -p build_path | Specifies the path to a compilation database in JSON format. It can be either the path to a compilation database file, or a directory (which will be searched for a compile_commands.json file). If omitted, compile_commands.json in the current directory is used. |
| -i include_pattern | File paths matching include_pattern are included in generated commands. May be specified multiple times. If omitted, all files in the compilation database are included. |
| -x exclude_pattern | File paths matching exclude_pattern are not included in generated commands. May be specified multiple times. Exclude patterns are applied after include patterns, so a file that matches both is excluded. |
| command | Specifies the command to run against each file. (See below for details on how the generated command line is constructed). |
| parameters | Any parameters for the specified command. Parameters that themselves contain quotes may need an extra level of quoting, so that the shell’s quote removal doesn’t strip them. |
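
The interaction between -i and -x is easier to see in code. The sketch below mirrors the approach in the Code Listing, where multiple patterns of the same kind are joined with | and matched as a regular expression. The should_run helper and the file names are hypothetical.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Decide whether a file should be processed, given lists of -i and -x patterns.
# Multiple patterns of the same kind are combined with regex alternation.
sub should_run {
   my ($file, $includes, $excludes) = @_;
   my $match   = @$includes ? join("|", @$includes) : undef;
   my $exclude = @$excludes ? join("|", @$excludes) : undef;
   return 0 if defined $match   && $file !~ /$match/;    # not selected by -i
   return 0 if defined $exclude && $file =~ /$exclude/;  # rejected by -x
   return 1;
}

# Hypothetical file names, as they might appear in a compilation database.
my @files = qw(/proj/src/main.cpp /proj/src/util.cpp /proj/test/util_test.cpp);

# Equivalent of: -i src -i test -x _test
for my $file (@files) {
   my $verdict = should_run($file, ["src", "test"], ["_test"]) ? "run " : "skip";
   print "$verdict $file\n";
}
```

Because the patterns are regular expressions matched against the whole file path, a pattern like src matches anywhere in the path, not just a directory name.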

Environment Variables

If the -s option is specified, the value of ${CXX} is used as the name of the compiler. If ${CXX} is not defined, g++ is used.

Notes

The generated command first changes to the original build directory before executing the specified command, so specifying [command] using a relative path is generally a mistake. Instead use an absolute path, or make sure that [command] can be found using the PATH environment variable.
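
One way to sidestep the relative path problem is to resolve the command before handing it to the script. This is a minimal sketch, not part of cc_driver.pl. The portable_command helper and the ./tools/my_check.sh path are hypothetical.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use File::Spec;

# Return a form of the command that still works after "cd {directory}".
sub portable_command {
   my ($command) = @_;
   # A bare name (no directory separator) is looked up via PATH, so leave it alone.
   return $command unless $command =~ m{/};
   # Otherwise anchor it to the current working directory.
   return File::Spec->rel2abs($command);
}

print portable_command("clang-tidy"), "\n";
print portable_command("./tools/my_check.sh"), "\n";
```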

Generated command format

The script generates a command line of the form

cd {directory}; [command] [parameters] -I .. -D .. {file}

The clang tools require a slightly different format. If the name of [command] contains “clang” (e.g., clang-tidy), the generated command format is:

cd {directory}; [command] {file} -- [parameters] -I .. -D ..
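
The two formats can be sketched as a single function. This follows the rule used in the Code Listing (the command name is matched against /clang/), but the build_command name, its parameter order, and the sample values are assumptions made for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Build the command line for one file. $params holds the -I/-D parameters
# recovered from the original compile command; @command is the tool plus
# its own parameters.
sub build_command {
   my ($directory, $file, $params, @command) = @_;
   if ($command[0] =~ /clang/) {
      # clang tools: file first, then "--", then the compiler parameters
      return "cd $directory; @command $file -- $params";
   }
   # everything else: same layout as a normal compiler command
   return "cd $directory; @command $params $file";
}

# Illustrative values only.
my @args = ("/home/me/build", "/home/me/src/foo.cpp", "-I/home/me/include -DNDEBUG");
print build_command(@args, "clang-tidy", "-checks=*"), "\n";
print build_command(@args, "cppcheck", "--enable=all"), "\n";
```

Note that the test is on the command name only, so a non-clang tool installed under a path containing “clang” would also get the clang format.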

System Include Files

Some of the tools that can be scripted with this command work better (sometimes much better) if they can see all the include files used in the original compilation. If the -s flag is specified, the system include paths are appended to any include paths from the original compile command.

The system include paths are:

  • Any paths specified using the -isystem compiler flag. (See System Headers - The C Preprocessor for why you might want to use the -isystem flag). As with the compiler’s -isystem flag, any directories specified are searched after all directories specified using the -I flag, regardless of where they occur on the command line.

  • The compiler’s default search paths, which are appended to the generated command line after any -isystem paths. They are determined by parsing the output of ${CXX} -E -x c++ - -v 2>&1 1>/dev/null </dev/null

See this post for more information.
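
As a rough illustration of that parsing step, the sketch below extracts the search directories from a captured copy of the compiler’s output rather than running the compiler. The sample directories are illustrative and vary by compiler and platform, and parse_search_paths is a hypothetical name (the script’s own version is getCXXIncludes in the Code Listing).

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Extract the include search directories from the text a compiler prints for
#   ${CXX} -E -x c++ - -v
sub parse_search_paths {
   my (@lines) = @_;
   my @dirs;
   my $capture = 0;
   for my $line (@lines) {
      if ($line =~ /#include <\.\.\.> search starts here:/) {
         $capture = 1;                            # directories follow this line
      }
      elsif ($line =~ /End of search list\./) {
         last;                                    # nothing of interest after this
      }
      elsif ($capture) {
         $line =~ s/\(framework directory\)//;    # macOS-specific annotation
         $line =~ s/^\s+|\s+$//g;                 # trim surrounding whitespace
         push @dirs, $line if length $line;
      }
   }
   return @dirs;
}

# Sample output; the directories are illustrative only.
my @sample = split /^/, <<'EOF';
#include "..." search starts here:
#include <...> search starts here:
 /usr/include/c++/11
 /usr/local/include
 /usr/include
End of search list.
EOF

print "-I$_\n" for parse_search_paths(@sample);
```

Only the directories listed between the two marker lines are kept; the quoted-include section that precedes them is ignored.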

Code Listing

(cc_driver.pl) download
#!/usr/bin/perl
#
# Copyright 2016 by Bill Torpey. All Rights Reserved.
# This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivs 3.0 United States License.
# http://creativecommons.org/licenses/by-nc-nd/3.0/us/deed.en
#
use strict;

###############################################################
# get compiler's default include path
sub getCXXIncludes
{
   # get compiler command
   my $compiler = $ENV{'CXX'};
   if (!defined $compiler) {
      $compiler = "g++";
   }

   my @includes;
   my $capture = 0;
   my @lines = `$compiler  -E -x c++ - -v 2>&1 1>/dev/null </dev/null`;
   for my $line (@lines) {
      if ($line =~ /#include <...> search starts here:/) {
         $capture = 1;
      }
      elsif ($line =~ /End of search list./) {
         return join(" ", @includes);
      }
      else {
         if ($capture == 1) {
            $line =~ s/\(framework directory\)//;       # xcode silliness
            my $include = "-I" . trim($line);
            push @includes, $include;
         }
      }
   }
   # no "End of search list." marker seen: return whatever was captured
   return join(" ", @includes);
}

###############################################################
# trims commas, quotes, leading & trailing spaces from a string
sub trim
{
   my @out = @_;
   for (@out) {
      s/^\s+//;
      s/\s+$//;
      s/"//g;
      s/,//g;
   }
   return wantarray ? @out : $out[0];
}

###############################################################
# get cmd line args
use Getopt::Long qw(:config pass_through bundling);
# location and/or name of compile db
my $build_path = "compile_commands.json";
GetOptions('p=s' => \$build_path);
# debug mode
my $debug = 0;
GetOptions('d' => \$debug);
# whether to generate command a la compiler
my $no_params = 0;
GetOptions('n' => \$no_params);
# whether to include system headers
my $include_sys_headers = 0;
GetOptions('s' => \$include_sys_headers);
# verbose?
my $verbose = 0;
GetOptions('v' => \$verbose);
# match file path/name?
my @match;
GetOptions('i=s' => \@match);
my $match;
if ((scalar @match) > 0) {
   $match = join("|", @match);
}
my @exclude;
GetOptions('x=s' => \@exclude);
my $exclude;
if ((scalar @exclude) > 0) {
   $exclude = join("|", @exclude);
}
# debug implies verbose
($debug == 1) && ($verbose = 1);

($verbose == 1) && print "parameters=@ARGV\n";


###############################################################
# main

my $compile_commands;
if (-f $build_path) {
   $compile_commands = $build_path;
}
elsif(-d $build_path) {
    $compile_commands = $build_path . "/compile_commands.json";
}
else {
   die "$build_path doesn't exist!";
}

if (!-e $compile_commands) {
   die "$compile_commands doesn't exist!";
}

my $compiler_includes;
if ($include_sys_headers == 1) {
   $compiler_includes = getCXXIncludes();
   if ($verbose == 1) {
      print "system includes = $compiler_includes\n";
   }
}

open(INFILE, "<:crlf", "$compile_commands") or die "Can't open $compile_commands: $!\n";

my @params;
my @system_includes;
my $directory;
my $file;
while (<INFILE>) {
   my @tokens = split(" ", $_);
   if ($tokens[0] eq '{') {
      # start of an entry
      @params  = ();
      @system_includes  = ();
      $directory = "";
      $file      = "";
   }
   elsif ($tokens[0] eq '"command":') {
      if ($no_params == 0) {
         for my $i (1 .. $#tokens) {
            if ($tokens[$i] eq "-D") {
               push @params, "-D" . $tokens[++$i];
            }
            elsif (substr($tokens[$i], 0, 2) eq "-D") {
               push @params, $tokens[$i];
            }
            elsif ($tokens[$i] eq "-I") {
               push @params, "-I" . $tokens[++$i];
            }
            elsif (substr($tokens[$i], 0, 2) eq "-I") {
               push @params, $tokens[$i];
            }
            elsif (substr($tokens[$i], 0, 8) eq "-isystem") {
               if ($include_sys_headers == 1) {
                  push @system_includes, "-I" . $tokens[++$i];
               }
            }
            elsif ($tokens[$i] eq "-fstack-usage") {
               # skip it -- not material to analysis, and clang errors out
            }
            elsif ($tokens[$i] eq "-frecord-gcc-switches") {
               # skip it -- not material to analysis, and clang errors out
            }
         }
      }
   }
   elsif ($tokens[0] eq '"directory":') {
      $directory = trim($tokens[1]);
   }
   elsif ($tokens[0] eq '"file":') {
      $file = trim($tokens[1]);
   }
   elsif (($tokens[0] eq '},') || ($tokens[0] eq '}')) {
      # end of an entry
      my $params = join(" ", @params);
      my $system_includes = join(" ", @system_includes);
      my $cmd;
      if ($ARGV[0] =~ /clang/) {
         # clang tools use a specific command line format
         $cmd = "cd $directory;@ARGV $file -- $params $system_includes $compiler_includes";
      }
      else {
         # else assume that command format is same as normal compiler command
         $cmd = "cd $directory;@ARGV $params $system_includes $compiler_includes $file";
      }
      # include/exclude file based on -i/-x param
      my $run = 1;
      ((defined $match)   && ($file !~ /$match/))   && ($run = 0);
      ((defined $exclude) && ($file =~ /$exclude/)) && ($run = 0);
      if ($run == 1) {
         my $output;
         ($verbose == 1) && print "$cmd\n";
         if ($debug != 1) {
            $output = `$cmd 2>&1`;
            my $rc = $?;
            if ($rc != 0) {
               die "$cmd returned $rc!";
            }
            print "$output\n";
         }
      }
      # exit on signal
      ($? & 127) && exit;
   }
}

close(INFILE);

0;
