I’ve recently been dipping my toes in the very deep water that is “undefined behavior” in C and C++, and the more I learn about it, the scarier it is.
This was inspired by a rather tricky crash that I needed to track down as part of moving the code-base at my day job to more modern compilers and language standards.
Compiler writers have been getting more aggressive about taking advantage of optimization opportunities presented by undefined behavior. And, while there has been some push-back to that, it doesn’t appear to have changed things much.
“Why should I care?” you may ask yourself. “My code compiles without any warnings, so I must not have any UB”.
Unfortunately, it’s not that simple. Some compilers warn about some UB, but it’s hit-or-miss. Strictly speaking, the compiler is under no obligation to warn you about your use of UB – in fact, the compiler is under no obligation to do anything at all once your code is found to contain UB.
And even if the compiler warns about some construct, it’s easy to ignore the warning, especially since any negative consequences won’t be apparent until running a release build. Why is that? It’s because the worst aspects of UB tend to get triggered in conjunction with code optimization.
In my case, one of my former colleagues had an extreme case of “lazy programmer syndrome” combined with a dose of cleverness, which is a dangerous combination. He needed to write code to generate C++ classes from a text definition, and one of the things the code needed to do was initialize the class members in the ctor.
Rather than generate initializers for the primitive types (non-primitive types have their ctors called by the compiler), he decided to just nuke everything with a call to memset – after all, zeroes are zeroes, right?
If it doesn’t fit, force it. If it still doesn’t fit, get a bigger hammer.
Well, not quite – since the generated classes all had virtual functions, the memset call was also nuking the vtable, causing the app to crash pretty much immediately on launch. That might have deterred others, but not our intrepid coder, who figured out a really “clever” way to get around the problem, using a dummy ctor and placement new. The code he ended up with looked more or less like this:
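Something along these lines – the class and member names are invented, but the shape matches the description, including the cast to void* that (per the footnote below) is what keeps the compiler from warning about it:

#include <cstring>   // memset
#include <new>       // placement new

struct SomeBase      // stand-in for the generated base class
{
    SomeBase() { /* imagine this acquires a resource */ }
    virtual ~SomeBase() {}
};

class GeneratedClass : public SomeBase
{
public:
    GeneratedClass()
    {
        // "initialize" every member in one shot -- this also wipes out the vptr,
        // and overwriting a non-trivial object like this is undefined behavior
        memset((void*)this, 0, sizeof(GeneratedClass));

        // "repair" the vptr by running a do-nothing ctor on top of the zeroed
        // object with placement new (which also runs the base class ctor again)
        new (this) GeneratedClass(0);
    }

    virtual ~GeneratedClass() {}

private:
    // dummy ctor whose only job is to restore the vtable after the memset
    explicit GeneratedClass(int) {}

    int    mCount;
    double mPrice;
};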
And guess what – it worked!
OK, every ctor turned into two calls, which caused two calls to all the base class ctors (but only ONE call to any base class dtors on the way out – sure hope those ctors didn’t allocate memory). But hey, the code is clever, so it must be efficient, right? Anyway, it worked.
Until it didn’t.
As part of bringing this codebase up to more modern standards, I started building it with different compilers, starting with the system compiler (gcc 4.8.5) on our production OS (CentOS 7), then on to gcc 5.3.0, and clang 10. Everything fine – no worries.
Then a couple of things happened – another colleague started working with CentOS 8, whose system compiler is gcc 8.x, and I started using Ubuntu 20.04, where the system compiler is gcc 9.3.0. All of a sudden, nothing worked, but only when built in release mode – in debug mode, everything was fine. No warning messages from the compiler, either1.
This was the first clue that UB might be the culprit, which was confirmed by running the code in the debugger and setting a breakpoint at the call to memset.
Gregory (Scotland Yard detective): “Is there any other point to which you would wish to draw my attention?”
Holmes: “To the curious incident of the dog in the night-time.”
Gregory: “The dog did nothing in the night-time.”
Holmes: “That was the curious incident.”
– “Silver Blaze”, Arthur Conan Doyle
Of course, nothing happened. The call to memset was gone – the compiler, having determined that the code was UB, simply refused to generate any machine code for it at all. So, there was no place for the debugger to put the breakpoint.
Disassembling the generated code provided additional proof that the compiler simply ignored what it rightly determined was an obviously foolish construct.
The definition of undefined behavior (from the C++98 standard, ISO/IEC 14882) is:
behavior, such as might arise upon use of an erroneous program construct or erroneous data, for which this International Standard imposes no requirements. Undefined behavior may also be expected when this International Standard omits the description of any explicit definition of behavior. [Note: permissible undefined behavior ranges from ignoring the situation completely with unpredictable results, to behaving during translation or program execution in a documented manner characteristic of the environment (with or without the issuance of a diagnostic message), to terminating a translation or execution (with the issuance of a diagnostic message). Many erroneous program constructs do not engender undefined behavior; they are required to be diagnosed.]
That’s quite a mouthful, but unfortunately doesn’t say much other than “if the Standard doesn’t specify the behavior of a piece of code, then the behavior of that code is undefined”.
Searching the standard for the word “undefined” yields 191 hits in 110 sections of the document. Some of these are not a whole lot more helpful:
If an argument to a function has an invalid value (such as a value outside the domain of the function, or a pointer invalid for its intended use), the behavior is undefined.
I’m not aware of any definitive, comprehensive list of UB.
So, what are some useful definitions of UB? Well, here’s a list from the UBSAN home page:
Some instances of undefined behavior can be rather surprising – for instance, access to unaligned data (through a pointer), is strictly speaking undefined. Never mind that the x86 processor is perfectly capable of accessing misaligned data, and in fact does so with little or no penalty – according to the Standard it is still UB, and the compiler is free to do pretty much anything it wants with the code (including silently ignoring it).
So, for instance, typical deserialization code like this won’t necessarily do what you think:
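A sketch of the kind of thing I mean – buf and offset stand in for whatever your wire format provides:

#include <cstddef>
#include <cstdint>

// read a 32-bit field straight out of a byte buffer
uint32_t readSeqNum(const char* buf, size_t offset)
{
    // buf + offset is not necessarily aligned for uint32_t -- the cast-and-dereference
    // "works" on x86, but is strictly speaking undefined behavior
    return *(const uint32_t*)(buf + offset);
}

(The well-defined alternative is to memcpy the bytes into a properly-typed local variable and let the compiler figure out the rest.)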
Most people would consider it A Bad Thing if the compiler took the source code they wrote and just decided to generate machine code that had nothing to do with the source code. But that is exactly what the Standard allows – at least according to compiler writers. And compiler writers take advantage of that to implement optimizations that can completely change the meaning of the original source code – a good example can be found here.
There is starting to be some push-back on the “UB allows anything at all” approach, for instance this paper from one of the members of the ISO C Standard Committee. But, for now at least, compiler writers apparently feel free to get creative in the presence of UB.
Probably the worst thing about UB, though, is the part that I discuss at the beginning of this article – UB can harmlessly exist in code for years until “something” changes that triggers it. That “something” could be something “big”, like a new compiler, but it could also be something “small”, like a minor change to the source code that just happens to trigger an optimization that the compiler wasn’t using previously.
Interestingly, searching online for the class-memaccess compiler warning (sometimes) associated with the memset call in the original example returns a bunch of results where project maintainers simply disabled the warning (-Wno-class-memaccess).

This is probably not what you want.
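A better option, if you can change the offending code (or, in this case, the code generator), is to emit real initializers and skip the memset entirely – for example, with C++11 default member initializers. A minimal sketch, with invented names:

class Generated
{
public:
    Generated()          = default;
    virtual ~Generated() = default;

private:
    // one initializer per primitive member; non-primitive members are
    // still initialized by their own ctors, exactly as before
    int    mCount = 0;
    double mPrice = 0.0;
    char   mFlag  = '\0';
};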
Ideally, you would eliminate all instances of potential UB from your code, since whether it “works” today is no guarantee that it will behave similarly in the future. But first, you have to find them, and that’s the tricky bit.
Detecting UB would seem to be a real Catch-22, since the compiler isn’t required to tell you about UB, but on the other hand is allowed to do whatever it wants with it.
The other problem with detecting UB is that it is often, but not always, detectable only at runtime.
The good news is that the clang folks have created a “sanitizer” expressly for UB.
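For example, a minimal way to try it out (the file name is just a placeholder; the flags are the standard UBSAN ones):

clang++ -g -O0 -fsanitize=undefined -fno-sanitize-recover=all myprog.cpp -o myprog
./myprog     # prints a diagnostic (and, with these flags, aborts) at the first piece of UB actually executed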
If you’ve used AddressSanitizer or any of clang’s other sanitizers, using UBSAN is basically straightforward. One point that may be helpful: build the code you want to check without optimization (e.g., -O0) – UBSAN can’t detect UB in code that doesn’t exist, i.e., code that the optimizer has already thrown away.

A lot of really smart people have been writing about UB for a while now – this article just scratches the surface of the topic:
“What Every C Programmer Should Know About Undefined Behavior” , Chris Lattner
“A Guide to Undefined Behavior in C and C++” , John Regehr
“Schrödinger’s Code, Undefined behavior in theory and practice” , Terence Kelly
“Undefined behavior can result in time travel” , Raymond Chen
“Garbage In, Garbage Out: Arguing about Undefined Behavior” , Chandler Carruth
“How ISO C became unusable for operating systems development” , Victor Yodaiken
There are several ways to get an assembler listing of C++ source code, with varying degrees of usefulness.
By far the best, in my opinion, is the listing produced by gdb, which is the example used above. gdb does a great job of marrying up generated code with source file lines, which makes it much easier for people who are not expert in x86 machine code (me!) to see what is happening. Just set a breakpoint at or near the code you want to see and enter

disas /m

Next best is objdump, which does an OK job, but is not nearly as nice as gdb. Use it like so:

objdump -drwCl -Mintel <file>
The least useful format is the intermediate assembler file produced as part of the compilation process. With gcc, you can generate an assembler listing with the -S -fverbose-asm compiler flags. (If you’re using cmake, specify -save-temps instead of -S). This will create .s files with assembler listings alongside the .o files in your build directory.
It turns out that the cast to void* disables the warning message that the compiler would otherwise give on this code.↩
I recently needed to build a Linux development system from scratch, and while I was at it I decided to provide dual-boot capability between CentOS and Ubuntu.
Having used RH/CentOS pretty much exclusively since moving from Unix (Solaris) to Linux many years back, I learned that even though CentOS and Ubuntu are both Linux, they are very different in ways both large and small. I shaved a few yaks along the way, and made lots of notes – hopefully they’ll help if you’re thinking about making a similar transition.
With recent events in CentOS-land this has become even more relevant — read on to see how you can easily move back and forth between CentOS and Ubuntu.
Not too long ago my main Linux development machine, a tiny NUC-style box, stopped booting. On investigation it turned out that it may not have been a great idea to build it with a 1TB mSATA SSD — to get 1TB on an mSATA form-factor it ends up being really dense and prone to overheating. I bought a replacement 1TB SSD in a more capacious 2.5” form-factor, and decided to take the time to revisit the original configuration.
One thing that has changed for me over the past couple of years is that I have spent quite a bit of time at my day job developing a middleware transport based on ZeroMQ. My employer generously agreed to open-source the resulting code (which you can find here), but doing so opened up a bunch of issues. The biggest one was the fact that my employer’s choice of OS has been RedHat, and later CentOS, and while RH/CentOS has been a great choice in terms of stability for our production environment, it has been much less great as a development system. Which resulted in me spending a lot of time over the past several years figuring out things like how to build newer compilers in order to take advantage of improvements in C++ and related tools.
By contrast, most of the “cool kids” working on open-source projects use something other than RH/CentOS, with Ubuntu looking to be the most popular. It’s not reasonable to expect others to spin up a whole new development system just to check out a new open-source project, so being stuck on RH/CentOS would seriously impact any interest we might be hoping to generate in the project.
So, my original plan was to build out the new machine to support at least three OS’s: CentOS 7 (our current production environment), CentOS 8 (which we expected to be our next production environment), and Ubuntu (in order to better support our open-source project). About halfway through building the system RedHat/CentOS dropped the now well-known bombshell that CentOS 8 was no more — at least, not in any form that would be acceptable to us.
The result is that I ended up building just the CentOS 7 and Ubuntu systems, leaving space for a possible third OS at some point (perhaps Rocky?). I’ve come to really appreciate the more modern tools in Ubuntu, which are a boon for development, and the quirks that drove me nuts on CentOS (like not being able to paste text from my Mac) are pretty much gone. But I needed to learn (and un-learn) a lot in the process.
Moving to a new OS is a fiddly business, so if you’re thinking about moving from RH/CentOS to Ubuntu (which I suspect many people are at this point), this guide can definitely help you make that transition.
With that bit of background out of the way, let’s get started.
We’re using Ubuntu 20.04 LTS (long-term support) in this article, since it most closely matches the level of support that we (used to) expect from CentOS. You can grab an installation ISO here.
The Ubuntu install is pretty self-explanatory, (and there’s a nice tutorial here). I chose “Normal Installation” to get as much as possible at one go.
Specifying a user is where things start to get different – when installing CentOS, for instance, you enter a password for the superuser (root) during installation.

Ubuntu installations, however, typically don’t have a root user. Instead, the user you create during installation is automatically given sudo rights to all the things that root would normally be allowed to do.

So with Ubuntu, instead of using root to administer the system directly, like so:
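# CentOS-style root administration (the package here is just an example)
su -
yum install tree
exit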
You would just use sudo instead:
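# the Ubuntu equivalent (again, the package is just an example)
sudo apt install tree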
sudo timeout

One downside of using sudo for administration is that by default Ubuntu will ask for your password every single time. To avoid that, edit the sudoers file:
sudo vi /etc/sudoers
And add the following line (the value is in minutes, so this will cause the system to remember your sudo password for 300 minutes – use 5 if you really only want five minutes):
Defaults timestamp_timeout=300
Even so, it can be a hassle typing sudo over and over again, especially if you have a lot of tasks to perform. To get around that, you can create a root shell like so:
sudo /bin/bash
or, equivalently:
sudo -i
So far, although we can run commands with superuser permissions using sudo, we can’t actually login to the system as root. There are lots of good reasons why this is A Good Idea, and they are well explained here.
So, just to be clear, you should never do what I’m about to tell you how to do…
As several readers have enthusiastically pointed out, you (a) should never need to enable root, and (b) if you do this on a machine that is exposed to the internet you are asking for Big Trouble. You have been warned …
But if you really need to login as root, then you’ll need to activate the root user by supplying a password:
sudo passwd root
sudo usermod -U root
Logging in as root from the console

To enable root login from the console, you need to edit /etc/pam.d/gdm-password and comment out the line containing:
auth required pam_succeed_if.so user != root quiet_success
so that it looks like this:
#auth required pam_succeed_if.so user != root quiet_success
Logging in as root via ssh

One more time – this is A Very Bad Idea, but if you insist …
To enable root login via ssh, edit /etc/ssh/sshd_config
and change
#PermitRootLogin prohibit-password
to
PermitRootLogin yes
On the other hand, if you want to sleep well at night, secure in the knowledge that you are (somewhat) safe from marauding script kiddies, instead change the setting in /etc/ssh/sshd_config
to:
PermitRootLogin no
Unlike CentOS, Ubuntu does not use bash as its default system shell – /bin/sh points to dash instead.
While there are lots of “better” shells out there, I’ve become familiar with bash, and I’ve got lots of scripts that ~~might~~ will break if moved to another shell, and which I just don’t want to futz with. Plus, if things get too hairy for bash, I generally just switch to a real programming language, like Perl.
To reconfigure the default shell on Ubuntu, you can use the following command:
sudo dpkg-reconfigure dash
To change a particular user’s default shell from sh to bash:
sudo chsh -s /bin/bash {user}
Many users, myself included, find SELinux to be a major hassle, and not appropriate for a development (desktop) OS. In addition, there is still some software, typically older programs, that don’t run properly with SELinux.
In CentOS, I disable SELinux, but it’s already disabled in Ubuntu, so nothing needs to be done. The Ubuntu equivalent, AppArmor has so far not interfered with anything in the way that SELinux does on CentOS, and so I haven’t had the need to disable it, or in fact tweak it at all.
In a similar vein, I generally disable iptables in CentOS. With Ubuntu, iptables is enabled, but by default it allows all traffic. So, out-of-the-box everything just works, but you can configure the firewall to be more restrictive if you want to.
Just to be clear, I disable iptables because (a) I’m on a private subnet with statically-assigned non-routable IP addresses that are not accessible other than from the subnet itself, and (b) I develop network middleware software that both connects to and listens at ephemeral ports, so iptables is pretty much out of the question. If you don’t have similar needs, you’re probably better off using iptables the way it was intended – unfortunately I can’t help you with that.
On CentOS, I’ve generally had to explicitly activate any swap partitions, but Ubuntu automatically detects and mounts any swap partitions that it finds on the boot disk.
It’s generally a good idea to keep the OS up-to-date, and with Ubuntu that can be accomplished with one or more of the following commands:
sudo apt update # Fetches the list of available updates
sudo apt upgrade # Installs some updates; does not remove packages
sudo apt full-upgrade # Installs updates; may also remove some packages, if needed
sudo apt autoremove # Removes any old packages that are no longer needed
See this for more on keeping Ubuntu up-to-date.
Even with a “normal” installation, there are some useful packages that don’t get installed initially:
sudo apt install tree
sudo apt install ddd
sudo apt install dwarves
sudo apt install oprofile
sudo apt install linux-tools
sudo apt install linux-tools-generic
sudo apt install linux-tools-`uname -r`
In my case, since I’m dual-booting between CentOS and Ubuntu, I wanted to create a user that can share files with the same user on CentOS.
To do that, create a user with the same username and userid as the CentOS user. In the example below, 8177 is the numeric ID of the CentOS user, referred to as myuser. This user belongs to the group named shared, which also shares the same group ID as the CentOS group.
sudo groupadd -g 8177 shared
sudo useradd -m -g shared -u 8177 myuser
sudo passwd myuser
sudo usermod -a -G users myuser
Another setting that will make it easier to share files between different users and/or OS’s is to make files group-writable by default. To do this, add the following to your .bashrc:
# Set umask to allow group write access.
umask 002
Note that the umask setting applies only to newly-created files – it doesn’t affect existing files.
This step is optional – you could theoretically use sftp or even NFS (ugh!) to share files with other machines on your network.
The commands below will set up a minimal Samba system – again using myuser as the name for the shared user; change that to whatever you choose.
sudo /bin/bash
apt install samba
cd /etc/samba
cp -p smb.conf smb.conf.orig
cat > smb.conf <<EOF
[global]
workgroup = WORKGROUP
server string = Samba server
security = user
passdb backend = tdbsam
[myuser]
path = /home/myuser
browseable = yes
writable = yes
valid users = @shared
[root]
path = /
browseable = yes
writable = no
EOF
smbpasswd -a myuser
systemctl enable smbd.service
systemctl start smbd.service
exit
You’re going to want to be able to login to the system remotely, so the sooner you setup ssh the better.
The ssh daemon may not have been installed – if not, you should install it now:
sudo apt install openssh-client
sudo apt install openssh-server
sudo systemctl start sshd.service
sudo systemctl status sshd.service
Then, from another machine where you have already generated a public/private key-pair:
ssh-copy-id -i ~/.ssh/<identity> myuser@<host>
This will copy the public key associated with <identity> into the ~/.ssh/authorized_keys file for myuser on <host>.
You will likely also want to copy and/or create private keys in your ~/.ssh directory, so you can access other resources like GitHub, Stash, etc.
The short version is that you’ll want to have a private key in ~/.ssh of the system you are connecting from, and the corresponding public key in the ~/.ssh/authorized_keys file of the system you are connecting to. (Certain services, like GitHub, have their own mechanism for storing public keys).
You can read more about ssh here:
While the good old command line is fine for lots/most things, some applications are only available in GUI form, or can do things in GUI mode that they can’t do from the command line.
There are a few options for screen sharing in Ubuntu – the simplest is to activate Screen Sharing via the Settings application. This allows you to require a password, as well as to restrict connections to a particular network adapter.
You can connect to the shared screen using a VNC viewer application by specifying {hostname}:0.
On Mac, you can also choose “Go” > “Connect to Server” from the Finder menu, and specify vnc://{hostname}.
This will give you a GUI into the (one-and-only) console screen. A disadvantage of this approach is that there is only one console screen, and it is a fixed size (matching the size of the physical screen).
Ubuntu defaults to TightVNC, but also provides TigerVNC, which for whatever reason seems to work better for me. To install it:
sudo apt install tigervnc-standalone-server
Once it’s installed, create a password for accessing your desktop:
vncpasswd
There are a bunch of different desktops that you can run with VNC, but I prefer to use Gnome – for that, configure your VNC startup script like so:
cd ~/.vnc
cp xstartup xstartup.orig
cat > xstartup <<EOF
#!/bin/sh
[ -x /etc/vnc/xstartup ] && exec /etc/vnc/xstartup
[ -r $HOME/.Xresources ] && xrdb $HOME/.Xresources
vncconfig -iconic &
dbus-launch --exit-with-session gnome-session &
EOF
chmod +x xstartup
To start the VNC server:
vncserver -localhost no -geometry 1920x1050
(Or whatever geometry you prefer).
If you cannot log in to a VNC session after the screen locks, run sudo loginctl unlock-sessions. (See https://askubuntu.com/questions/1224957/i-cannot-log-in-a-vnc-session-after-the-screen-locks-authentification-error for more).

There are a number of VNC viewers available.
Personally, I find the TigerVNC server and RealVNC viewer to be the best combination, but as always your mileage may vary.
If you’re running other people’s code, you may need to be able to debug core files ;-) By default, Ubuntu won’t create any, so follow these steps to enable core file creation.
First make sure ulimit is set properly (e.g., in your .bash_profile):
ulimit -c unlimited
Ubuntu has its equivalent to CentOS’ ABRT service, called apport, which definitely interferes with creation of core files, so you will need to disable it:
sudo systemctl disable apport.service
Next set the core file pattern used to create core files – I use a pattern of the form “{program name}.core.{pid}” (with the core file written to the process’s current directory), but that is mostly an accident of history. The full documentation for the tokens you can include in the file name can be found here.
To change the current value (in memory):
sudo sysctl -w kernel.core_pattern=%e.core.%p
To make the change permanent, edit /etc/sysctl.conf
(as root) and add the following line:
kernel.core_pattern=%e.core.%p
I work with in-memory databases that store data in shared memory a lot, so a useful tweak for me is to exclude shared memory segments from core files:
echo 0x31 > /proc/self/coredump_filter
There are a number of non-default settings that can make gdb more useful, or just more pleasant to use. I set these in my ~/.gdbinit:
# let gdb load settings from anywhere
set auto-load safe-path /
# allow breakpoints in dynamically loaded modules
set breakpoint pending on
# esp. useful w/set logging
set height 0
# more readable strings w/repeating characters
set print repeats 0
# show libraries as they are loaded
set verbose on
# load pretty-printers for std::
python
# find the printers.py file associated with current compiler
# (typically in <prefix>/share/<compiler-version>/python/libstdcxx/v6/printers.py, installed as part of gcc)
cmd = "echo -n $(dirname $(find $(cd $(dirname $(which gcc))/.. && /bin/pwd) -name printers.py 2>/dev/null))"
import os
tmp = os.popen(cmd).read()
# import the pretty printers
import sys
sys.path.insert(0, tmp)
from printers import register_libstdcxx_printers
register_libstdcxx_printers (None)
end
# if you want to use Ctrl-C w/debugee
#handle SIGINT stop pass
By default, Ubuntu doesn’t let non-child processes attach to another process.
Obviously, this breaks gdb -p ... and related. To disable this feature, edit /etc/sysctl.d/10-ptrace.conf (as root) and change:
kernel.yama.ptrace_scope = 1
to
kernel.yama.ptrace_scope = 0
To change the current value in memory:
sudo sysctl -w kernel.yama.ptrace_scope=0
The perf program and its friends are very useful for seeing where a particular program spends its time. But by default, it has certain restrictions.
To remove those restrictions permanently, edit /etc/sysctl.conf and add:
kernel.perf_event_paranoid = 0
To make a temporary change (until reboot):
sudo sysctl -w kernel.perf_event_paranoid=0
You can determine which compiler was used to build the kernel on Linux – on Ubuntu it shows that the system compiler is gcc 9.3.0 (2019) (vs gcc 4.8.5 (2015) on CentOS 7):
$ cat /proc/version
Linux version 5.8.0-43-generic (buildd@lcy01-amd64-018) (gcc (Ubuntu 9.3.0-17ubuntu1~20.04) 9.3.0, GNU ld (GNU Binutils for Ubuntu) 2.34) #49~20.04.1-Ubuntu SMP Fri Feb 5 09:57:56 UTC 2021
The newer compiler includes a bunch of new features, bug fixes, etc. and also has different default settings for some diagnostics, including:
-fasynchronous-unwind-tables
-fstack-protector-strong
-Wformat
-Wformat-security
-fstack-clash-protection
-fcf-protection
In addition to the above flags, gcc 9.3.0 on Ubuntu includes a default setting for -D_FORTIFY_SOURCE=2, which causes additional checks to be inserted – one of them is a check for buffer overflow, which will cause an executable to abort if an overflow is detected:
*** buffer overflow detected ***: terminated
Aborted (core dumped)
A typical stack trace at the time of the core will look something like this:
#0 __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:50
#1 0x00007f55d1bec859 in __GI_abort () at abort.c:79
#2 0x00007f55d1c573ee in __libc_message (action=action@entry=do_abort, fmt=fmt@entry=0x7f55d1d8107c "*** %s ***: terminated\n") at ../sysdeps/posix/libc_fatal.c:155
#3 0x00007f55d1cf9b4a in __GI___fortify_fail (msg=msg@entry=0x7f55d1d81012 "buffer overflow detected") at fortify_fail.c:26
#4 0x00007f55d1cf83e6 in __GI___chk_fail () at chk_fail.c:28
#5 0x00007f55d1cf7cc6 in __strcpy_chk (dest=dest@entry=0x7f55cd871808 "\001", src=src@entry=0x7f55c0039e0b ".0000000000000001", destlen=destlen@entry=17) at strcpy_chk.c:30
#6 0x00007f55cfa303d3 in strcpy (__src=0x7f55c0039e0b ".0000000000000001", __dest=0x7f55cd871808 "\001") at /usr/include/x86_64-linux-gnu/bits/string_fortified.h:90
...
For more information, see Stackguard internals.
The default Ubuntu settings proved their worth quickly by identifying an “off-by-one” buffer overflow in OZ that had eluded Address Sanitizer, valgrind, glibc, cppcheck, clang-tidy and PVS-Studio.
If you suddenly start getting “unresolved symbol” errors from your builds, one possible reason is that the Ubuntu linker (ld) works differently than on CentOS.

Unlike RedHat/CentOS, the Ubuntu linker only searches a library once, at the point that it is encountered on the command line (https://manpages.ubuntu.com/manpages/focal/man1/ld.1.html):
The linker will search an archive only once, at the location where it is specified on the command line. If the archive defines a symbol which was undefined in some object which appeared before the archive on the command line, the linker will include the appropriate file(s) from the archive. However, an undefined symbol in an object appearing later on the command line will not cause the linker to search the archive again.
This is the documented behavior in the man pages, but the CentOS linker actually behaves as if all the libraries specified on the command line were wrapped in --start-group/--end-group flags – in other words, the order of libraries on CentOS is immaterial.
If you are getting “unresolved” errors at link time, it is most likely because the order of libraries used to build the executable is incorrect. You can either correct the order, add --start-group/--end-group flags, or possibly use a different linker, as discussed here.
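For example, something along these lines (the library names are just placeholders) makes the order within the group irrelevant, because the linker re-scans the grouped archives until no new undefined symbols can be resolved:

g++ main.o -Wl,--start-group -lfoo -lbar -lbaz -Wl,--end-group -o myapp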
Another difference between CentOS and Ubuntu linkers is the way they handle dependencies between shared libraries. You can see these DT_NEEDED dependencies with the readelf --dynamic command.
These differences are caused by different default flags being passed to the linker – you can see these with:
gcc -dumpspecs | less
The output isn’t the easiest thing to understand, but if you look at it you’ll see the template for default parameters following the *link: line – e.g., on CentOS it will look something like this:
*link:
%{!r:--build-id} --no-add-needed ...
On CentOS, the linker defines --no-add-needed (which is a deprecated alias for --no-copy-dt-needed-entries), and does not define --as-needed.
What this means is that the linker:
The second part changed as of CentOS 7, as a result of an upstream change in Fedora.
The short version is you get a DT_NEEDED entry for every library specified on the command line, but not for the libraries that those libraries need.
Ubuntu does things differently – its linker defaults to --as-needed, which means that the linker:
The short version is that you get a DT_NEEDED entry only for libraries that are used to resolve a symbol.
In short, CentOS adds DT_NEEDED entries for all the libraries specified on the command line, but not for any of their dependencies; while Ubuntu adds entries for libraries specified on the command line, as well as their dependencies, but only if those libraries are actually needed.
As always, if you want or need to know more about shared libraries on Linux, you should check out Drepper’s paper, which is still the authoritative source.
clang goes to a lot of trouble to co-exist with gcc – for instance, preferring to use gcc’s libstdc++ for the C++ standard library, enabling code compiled by clang to call and be called by code compiled using gcc.
On Ubuntu this can be a problem though, because sometimes clang thinks it found a real installation of gcc, but in fact the installation is incomplete, and unusable. If your clang builds complain about missing include or library files, it’s likely that clang is trying to use a borked install of gcc.
But, how does clang know where to find those files in the first place? Partly this has to do with how clang is built, since clang is itself typically built using gcc. You can see which gcc installations clang finds at run-time, with the following command:
$ clang++ -v -E
clang version 10.0.0-4ubuntu1
Target: x86_64-pc-linux-gnu
Thread model: posix
InstalledDir: /usr/bin
Found candidate GCC installation: /usr/bin/../lib/gcc/x86_64-linux-gnu/10
Found candidate GCC installation: /usr/bin/../lib/gcc/x86_64-linux-gnu/9
Found candidate GCC installation: /usr/lib/gcc/x86_64-linux-gnu/10
Found candidate GCC installation: /usr/lib/gcc/x86_64-linux-gnu/9
Selected GCC installation: /usr/bin/../lib/gcc/x86_64-linux-gnu/10
Candidate multilib: .;@m64
Selected multilib: .;@m64
In my case, the gcc 10 installation was incomplete, but clang tried to use it anyway. And, since Ubuntu installs all its gcc versions in /usr, passing --gcc-toolchain to clang doesn’t really help. So I had to remove the offending, unusable gcc installations:
sudo apt remove gcc-10
sudo apt remove gcc-10-base
sudo apt remove libgcc-10-dev
Once that was done, clang found the correct version (9) of gcc:
$ clang++ -v -E
clang version 10.0.0-4ubuntu1
Target: x86_64-pc-linux-gnu
Thread model: posix
InstalledDir: /usr/bin
Found candidate GCC installation: /usr/bin/../lib/gcc/x86_64-linux-gnu/10
Found candidate GCC installation: /usr/bin/../lib/gcc/x86_64-linux-gnu/9
Found candidate GCC installation: /usr/lib/gcc/x86_64-linux-gnu/10
Found candidate GCC installation: /usr/lib/gcc/x86_64-linux-gnu/9
Selected GCC installation: /usr/bin/../lib/gcc/x86_64-linux-gnu/9
Candidate multilib: .;@m64
Selected multilib: .;@m64
That’s all I’ve found so far, but I’ll keep updating this post as I run into more differences between CentOS and Ubuntu. As I said above, I’m really enjoying Ubuntu, and I intend to use it almost exclusively for development going forward, booting back to CentOS only to regression-test changes, at least in the short term. In the meantime, I’ll be watching what goes on with Rocky and/or other projects that spring up to fill the void left by IBM/RH/CentOS.
If you have any questions, suggestions, etc. about this article, please leave a comment below, or email me directly.
At my day job, I spend a fair amount of time working on software reliability. One way to make software more reliable is to use memory-checking tools like valgrind’s memcheck and clang’s AddressSanitizer to detect memory errors at runtime.
But these tools are typically not appropriate to use all the time – valgrind causes programs to run much more slowly than normal, and AddressSanitizer needs a special instrumented build of the code to work properly. So neither tool is typically well-suited for production code.
But there’s another memory-checking tool that is “always on”. That tool is plain old malloc, and it is the subject of this article.
The GNU C library (glibc for short) provides implementations for the C standard library functions (e.g., strlen etc.), including functions that interface to the underlying OS (e.g., open et al.). glibc also provides functions to manage memory, including malloc, free and their cousins, and in most code these memory management functions are among the most heavily used.
It’s not possible to be a C programmer and not be reasonably familiar with the memory management functions in glibc. But what is not so well-known is the memory-checking functionality built into the library.
It turns out that glibc contains two separate sets of memory management functions – the core functions do minimal checking, and are significantly faster than the “debug” functions, which provide additional runtime checks.
The memory checking in malloc is controlled by an environment variable, named appropriately enough MALLOC_CHECK_ (note the trailing underscore). You can configure malloc to perform additional checking, and whether to print an error message and/or abort with a core file if it detects an error. You can find full details at http://man7.org/linux/man-pages/man3/mallopt.3.html, but here’s the short version:
Value | Impl | Checking | Message | Backtrace + mappings (since glibc 2.4+) | Abort w/core |
---|---|---|---|---|---|
default (unset) | Fast | Minimal | Detailed | Yes | Yes |
0 | Fast | Minimal | None | No | No |
1 | Slow | Full | Detailed | No | No |
2 | Slow | Full | None | No | Yes |
3 | Slow | Full | Detailed | Yes | Yes |
5 | Slow | Full | Brief | No | No |
7 | Slow | Full | Brief | Yes | Yes |
What may be surprising is that the default behavior is for malloc to do at least minimal checking at runtime, and to abort the executable with a core file if those checks fail.
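For instance, a minimal double free (a from-scratch sketch, not the double-free.c sample mentioned below) will be caught and aborted out of the box, with no special build flags or environment settings:

/* glibc's default (fast) checks catch this and abort with a core file */
#include <stdlib.h>

int main(void)
{
    char* p = malloc(16);
    free(p);
    free(p);    /* glibc detects the double free here and aborts */
    return 0;
}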
This may or may not be what you want. Given that the minimal checks in the fast implementation only detect certain specific errors, and that those errors (e.g., double free) tend not to cause additional problems, you may decide that a “no harm, no foul” approach is more appropriate (for example with production code where aborting with a core file is frowned upon ;-).
The other relevant point here is that setting MALLOC_CHECK_ to any non-zero value causes malloc to use the slower heap functions that perform additional checks. I’ve included a sample benchmark program that shows the additional checking adds about 30% to the overhead of the malloc/free calls. (And while the benchmark program is dumb as dirt, its results are similar to results on “real-world” tests).
If the benchmark code is to be believed, the impact on performance of the extra checking when MALLOC_CHECK_ is set to a non-zero value is much (as in an order of magnitude) greater when multiple threads are accessing the heap concurrently. This would suggest that there is contention on the data structures used for full checking, over and above normal heap contention.
It would be nice if one could get a fast implementation with the option to output an error message and continue execution, but with the current1 implementation of glibc that doesn’t appear to be possible. If you want the fast implementation but you don’t want to abort on errors, the only option is to turn off checking entirely (by explicitly setting MALLOC_CHECK_ to 0).
Note also that the documentation is a bit misleading:
Since glibc 2.3.4, the default value for the M_CHECK_ACTION parameter is 3.
While it’s true that with no value specified for MALLOC_CHECK_ an error will cause a detailed error message with backtrace and mappings to be output, along with an abort with core file, that is NOT the same as explicitly setting MALLOC_CHECK_=3 – that setting also causes malloc to use the slower functions that perform additional checks.
Note that even MALLOC_CHECK_=0 is “minimal” – the checks are still performed, but errors are simply not reported.
The checks also apply only to memory allocated and freed with malloc and free. And, of course, the built-in checking in glibc can’t detect a lot of errors that can be found with more robust tools, like valgrind and AddressSanitizer. Nevertheless, MALLOC_CHECK_ can be a useful adjunct to those tools for everyday use in development.
- In development and test builds, consider setting MALLOC_CHECK_=3. This provides additional checking over and above the default setting, at the cost of somewhat poorer performance.
- Where aborting is not acceptable, MALLOC_CHECK_=1 will allow execution to continue after an error, but will at least provide a message that can be logged2 to provide a warning that things are not quite right, and trigger additional troubleshooting, but at the cost of somewhat poorer performance.
- If you need the fast implementation, you can keep it with MALLOC_CHECK_=0, but any errors detected will be silently ignored.

The code for this article is available here. There’s a benchmark program, which requires Google Benchmark. There are also sample programs which demonstrate a double-free error that can be caught even with minimal checking (double-free.c), and which cannot (double-free2.c), and a simple script that ties everything together.
Current for RedHat/CentOS 7 in any case, which is glibc 2.17.↩
The error message from glibc is written directly to the console (tty device), not to stderr, which means that it will not be redirected. If you need the message to appear on stderr, you will need to set another environment variable:

export LIBC_FATAL_STDERR_=1 ↩
I’ve written before about static analysis, but in those earlier posts I wasn’t able to give specific examples of real-world code where static analysis is able to discover latent errors.
In the earlier articles I used a synthetic code-base from ITC Research to test clang, cppcheck and PVS-Studio. I also ran all three tools on the code-bases that I’m responsible for maintaining at my “day job”, but I wasn’t able to share detailed results from that analysis, given that the code is not public.
In this article, I want to expand the discussion of static analysis by diving into a real-world, open-source code base that I’ve been working with lately, with specific examples of the kinds of problems static analysis can expose.
For this example, I’ll be using the OpenMAMA source code. OpenMAMA is an open-source messaging middleware framework that provides a high-level API for a bunch of messaging transports, including open-source (Qpid/AMQP, ZeroMQ) and commercial (DataFabric, Rendezvous, Solace, etc).
OpenMAMA is an interesting project – it started back in 2004 with Wombat Financial Software, which was attempting to sell its market-data software, but found it to be tough sledding. While Wombat’s software performed better and was less expensive than Tibco’s Rendezvous (the de-facto standard at the time), no one wanted to rewrite their applications to target an API from a small company that might not be around in a couple of years.
So Wombat developed an open API which could sit on top of any messaging middleware, and they called it MAMA, for Middleware Agnostic Messaging API. They also developed bindings for Rendezvous, in addition to their own software, so that prospective customers would have a warm and fuzzy feeling that they could write their applications once, and seamlessly switch out the underlying middleware software with little or no changes to their code.
That strategy worked well enough that in 2008 Wombat was acquired by the New York Stock Exchange, which renamed the software “Data Fabric” and used it as the backbone of their market-data network (SuperFeed).
When the company I was working for was also acquired by NYSE in 2009 I was tasked with replacing our existing middleware layer with the Mama/Wombat middleware, and in the process I came to appreciate the “pluggable” architecture of MAMA – it doesn’t make the issues related to different messaging systems go away, but it does provide a framework for dealing with them.
In 2011 NYSE Technologies donated OpenMAMA to the Linux Foundation. Then, in 2014, the Wombat business was sold by NYSE to Vela Trading Technologies (née SR Labs), which provides the proprietary Data Fabric middleware, and is also the primary maintainer for OpenMAMA. There are a number of different open-source and commercial implementations of OpenMAMA.
Which brings us to the present day – I’ve recently started working with OpenMAMA again, so it seemed like a good idea to use that code as an example of how to use static analysis tools to identify latent bugs.
And, just to be clear, this is not a criticism of OpenMAMA – it’s an impressive piece of work, and has proven itself in demanding real-world situations.
The analysis presented here is based on OpenMAMA release 6.2.1, which can be found here.
I used cppcheck version 1.80 and clang version 5.0.0.
Check out the earlier articles in this series for more on building and running the various tools, including a bunch of helper scripts in the GitHub repo.
For the OpenMAMA analysis, I first built OpenMAMA using Bear to create a compilation database from the scons build:
bear scons blddir=build product=mama with_unittest=n \
middleware=qpid with_testtools=n with_docs=n
With the compilation database in place, I ran the following scripts1, redirecting their output to create the result files:
cc_cppcheck.sh -i common/c_cpp/src/c/ -i mama/c_cpp/src/c/ -c
cc_clangcheck.sh -i common/c_cpp/src/c/ -i mama/c_cpp/src/c/ -c
cc_clangtidy.sh -i common/c_cpp/src/c/ -i mama/c_cpp/src/c/ -c
The results from running the tools on OpenMAMA can also be found in the repo, along with a compile_commands.json file that can be used without the need to build OpenMAMA from source2. To do that, use the following commands:
cd [directory]
git clone https://github.com/OpenMAMA/OpenMAMA.git
git clone https://github.com/btorpey/static.git
export PATH=$(/bin/pwd)/static/scripts:$PATH
cp static/openmama/* OpenMAMA
cd OpenMAMA
cc_cppcheck.sh -i common/c_cpp/src/c/ -i mama/c_cpp/src/c/ -c
I use the wonderful Beyond Compare to, well, compare the results from different tools.
Before we do anything else, let’s deal with the elephant in the room – false positives. As in, warning messages for code that is actually perfectly fine. Apparently, a lot of people have been burned by “lint”-type programs with terrible signal-to-noise ratios. I know – I’ve been there too.
Well, let me be clear – these are not your father’s lints. I’ve been running these tools on a lot of real-world code for a while now, and there are essentially NO false positives. If one of these tools complains about some code, there’s something wrong with it, and you really want to fix it.
cppcheck includes a lot of “style” checks, although the term can be misleading – there are a number of “style” issues that can have a significant impact on quality.
One of them crops up all over the place in OpenMAMA code, and that is the “The scope of the variable ‘<name>’ can be reduced” messages. The reason for these is OpenMAMA’s insistence on K&R-style variable declarations (i.e., all block-local variables must be declared before any executable statements). Which, in turn, is caused by OpenMAMA’s decision to support several old and broken Microsoft compilers3.
The consensus has come to favor declaring variables as close to first use as possible, and that is part of the C++ Core Guidelines. The only possible down-side to this approach is that it makes it easier to inadvertently declare “shadow” variables (i.e., variables with the same name in both inner and outer scopes), but modern compilers can flag shadow variables, which mitigates this potential problem (see my earlier article “Who Knows What Evil Lurks…” for more).
Some other “style” warnings produced by cppcheck include:
[mama/c_cpp/src/c/bridge/qpid/transport.c:1413]: (style) Consecutive return, break, continue, goto or throw statements are unnecessary.
These are mostly benign, but reflect a lack of understanding of what statements like continue and return do, and can be confusing.
[common/c_cpp/src/c/list.c:295 -> common/c_cpp/src/c/list.c:298]: (style) Variable ‘rval’ is reassigned a value before the old one has been used.
There are a lot of these in OpenMAMA, and most of them are probably caused by the unfortunate decision to standardize on K&R-style local variable declarations, but in other cases this can point to a potential logic problem. (Another good reason to avoid K&R-style declarations).
Similar, but potentially more serious is this one:
[mama/c_cpp/src/c/bridge/qpid/transport.c:275]: (style) Variable ‘next’ is assigned a value that is never used.
Maybe the variable was used in an earlier version of the code, but is no longer needed. Or maybe we ended up using the wrong variable when we meant to use next.
There are also cases where the analyzer can determine that the code as written is meaningless:
[mama/c_cpp/src/c/bridge/qpid/subscription.c:179]: (style) A pointer can not be negative so it is either pointless or an error to check if it is.
If something cannot happen, there is little point to testing for it – so testing for impossible conditions is almost always a sign that something is wrong with the code.
Here are a few more of the same ilk:
[mama/c_cpp/src/c/dictionary.c:323]: (style) Checking if unsigned variable ‘*size’ is less than zero.
[mama/c_cpp/src/c/statslogger.c:731]: (style) Condition ‘status!=MAMA_STATUS_OK’ is always false
[mama/c_cpp/src/c/dqstrategy.c:543]: (style) Redundant condition: If ‘EXPR == 3’, the comparison ‘EXPR != 2’ is always true.
Whether these warnings represent real bugs is a question that needs to be answered on a case-by-case basis, but I hope we can agree that they at the very least represent a “code smell”, and the fewer of these in our code, the better.
There are bugs, and there are bugs, but bugs that have a “delayed reaction”, are arguably the worst, partly because they can be so hard to track down. Buffer overflows are a major cause of these kinds of bugs – a buffer overflow can trash return addresses on the stack causing a crash, or worse they can alter the program’s flow in ways that seem completely random.
Here’s an example of a buffer overflow in OpenMAMA that was detected by cppcheck:
[common/c_cpp/src/c/strutils.c:632]: (error) Array ‘version.mExtra[16]’ accessed at index 16, which is out of bounds.
Here’s the offending line of code:
version->mExtra[VERSION_INFO_EXTRA_MAX] = '\0';
And here’s the declaration:
char mExtra[VERSION_INFO_EXTRA_MAX];
It turns out that this particular bug was fixed subsequent to the release – the bug report is here. Interestingly, the bug report mentions that the bug was found using clang’s Address Sanitizer, which means that code must have been executed to expose the bug. Static analyzers like cppcheck can detect this bug without the need to run the code, which is a big advantage of static analysis. In this example, cppcheck can tell at compile-time that the access is out-of-bounds, since it knows the size of mExtra.
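The fix is presumably just to keep the terminating NUL inside the array (a one-line sketch – I haven’t checked the actual patch):

version->mExtra[VERSION_INFO_EXTRA_MAX - 1] = '\0';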
Of course, a static analyzer like cppcheck can’t detect all buffer overflows – just the ones that can be evaluated at compile-time. So, we still need Address Sanitizer, or valgrind, or some other run-time analyzer, to detect overflows that depend on the run-time behavior of the program. But I’ll take all the help I can get, and detecting at least some of these nasty bugs at compile-time is a win.
In contrast to the buffer overflow type of problem, dereferencing a NULL pointer is not mysterious at all – you’re going down hard, right now.
So, reasonable programmers insert checks for NULL pointers, but reasonable is not the same as perfect, and sometimes we get it wrong.
[mama/c_cpp/src/c/msg.c:3619] -> [mama/c_cpp/src/c/msg.c:3617]: (warning, inconclusive) Either the condition ‘!impl’ is redundant or there is possible null pointer dereference: impl.
Here’s a snip of the code in question – see if you can spot the problem:
3613 mamaMsgField
3614 mamaMsgIterator_next (mamaMsgIterator iterator)
3615 {
3616 mamaMsgIteratorImpl* impl = (mamaMsgIteratorImpl*)iterator;
3617 mamaMsgFieldImpl* currentField = (mamaMsgFieldImpl*) impl->mCurrentField;
3618
3619 if (!impl)
3620 return (NULL);
cppcheck works similarly to other static analyzers when checking for possible NULL pointer dereference – it looks to see if a pointer is checked for NULL, and if it is, looks for code that dereferences the pointer outside the scope of that check.
In this case, the code checks for impl being NULL, but not until it has already dereferenced the pointer. cppcheck even helpfully ties together the check for NULL and the (earlier) dereference. (Ahem – yet another reason to avoid K&R-style declarations).
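The obvious fix (a sketch, not necessarily the actual upstream change) is simply to defer the dereference until after the check:

mamaMsgIteratorImpl* impl         = (mamaMsgIteratorImpl*) iterator;
mamaMsgFieldImpl*    currentField = NULL;

if (!impl)
    return (NULL);

currentField = (mamaMsgFieldImpl*) impl->mCurrentField;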
Similarly to checking for NULL pointers, detecting leaks is more of a job for valgrind, Address Sanitizer or some other run-time analysis tool. However, that doesn’t mean that static analysis can’t give us a head-start on getting rid of our leaks.
For instance, cppcheck has gotten quite clever about being able to infer run-time behavior at compile-time, as in this example:
[mama/c_cpp/src/c/transport.c:269]: (error) Memory leak: transport
[mama/c_cpp/src/c/transport.c:278]: (error) Memory leak: transport
Here’s the code:
253 mama_status
254 mamaTransport_allocate (mamaTransport* result)
255 {
256 transportImpl* transport = NULL;
257 mama_status status = MAMA_STATUS_OK;
258
259
260 transport = (transportImpl*)calloc (1, sizeof (transportImpl ) );
261 if (transport == NULL) return MAMA_STATUS_NOMEM;
262
263 /*We need to create the throttle here as properties may be set
264 before the transport is actually created.*/
265 if (MAMA_STATUS_OK!=(status=wombatThrottle_allocate (&self->mThrottle)))
266 {
267 mama_log (MAMA_LOG_LEVEL_ERROR, "mamaTransport_allocate (): Could not"
268 " create throttle.");
269 return status;
270 }
271
272 wombatThrottle_setRate (self->mThrottle,
273 MAMA_DEFAULT_THROTTLE_RATE);
274
275 if (MAMA_STATUS_OK !=
276 (status = wombatThrottle_allocate (&self->mRecapThrottle)))
277 {
278 return status;
279 }
280
281 wombatThrottle_setRate (self->mRecapThrottle,
282 MAMA_DEFAULT_RECAP_THROTTLE_RATE);
283
284 self->mDescription = NULL;
285 self->mLoadBalanceCb = NULL;
286 self->mLoadBalanceInitialCb = NULL;
287 self->mLoadBalanceHandle = NULL;
288 self->mCurTransportIndex = 0;
289 self->mDeactivateSubscriptionOnError = 1;
290 self->mGroupSizeHint = DEFAULT_GROUP_SIZE_HINT;
291 *result = (mamaTransport)transport;
292
293 self->mName[0] = '\0';
294
295 return MAMA_STATUS_OK;
296 }
cppcheck is able to determine that the local variable transport is never assigned in the two early returns, and thus can never be freed.
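One straightforward fix (again, a sketch rather than the actual upstream patch) is to release the allocation on each early-return path, e.g.:

if (MAMA_STATUS_OK != (status = wombatThrottle_allocate (&self->mThrottle)))
{
    mama_log (MAMA_LOG_LEVEL_ERROR, "mamaTransport_allocate (): Could not"
              " create throttle.");
    free (transport);   /* don't leak the calloc'd transportImpl on the error path */
    return status;
}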
Not to be outdone, clang-tidy is doing some kind of flow analysis that allows it to catch this one:
[mama/c_cpp/src/c/queue.c:778]: warning: Use of memory after it is freed
Here’s a snip of the code that clang-tidy is complaining about:
651 mama_status
652 mamaQueue_destroy (mamaQueue queue)
653 {
654 mamaQueueImpl* impl = (mamaQueueImpl*)queue;
655 mama_status status = MAMA_STATUS_OK;
...
776 free (impl);
777
778 mama_log (MAMA_LOG_LEVEL_FINEST, "Leaving mamaQueue_destroy for queue 0x%X.", queue);
779 status = MAMA_STATUS_OK;
780 }
781
782 return status;
783 }
clang-tidy understands that queue and impl are aliases for the same variable, and thus knows that it is illegal to access queue after impl has been freed. In this case, the access causes no problems, because we’re only printing the address, but clang-tidy can’t know that4.
I’ve ~~ranted~~ written before on how much I hate void*’s. For better or worse, the core OpenMAMA code is written in C, so there are a whole bunch of casts between void*s and “real” pointers that have the purpose of encapsulating the internal workings of the objects managed by the code.
In C this is about the best that can be done, but it can be hard to keep things straight, which can be a source of errors (like this one):
[mama/c_cpp/src/c/fielddesc.c:76]: (warning) Assignment of function parameter has no effect outside the function. Did you forget dereferencing it?
And here’s the code:
65 mama_status
66 mamaFieldDescriptor_destroy (mamaFieldDescriptor descriptor)
67 {
68 mamaFieldDescriptorImpl* impl = (mamaFieldDescriptorImpl*) descriptor;
69
70 if (impl == NULL)
71 return MAMA_STATUS_OK;
72
73 free (impl->mName);
74 free (impl);
75
76 descriptor = NULL;
77 return MAMA_STATUS_OK;
78 }
Of course mamaFieldDescriptor is defined as a void*, so it’s perfectly OK to set it to NULL, but since it’s passed by value, the assignment has no effect other than to zero out the copy of the parameter on the stack.
The preceding sections go into detail about specific examples of serious errors detected by cppcheck and clang. But, these are very much the tip of the iceberg.
Some of the other problems detected include:
- use of functions that are not thread-safe (e.g., strtok) in multi-threaded code;
- use of obsolete functions (e.g., gethostbyname);
- format-string problems with printf-style functions;
- misuse of strcpy-style functions (e.g., leaving strings without terminating NULL characters).

Some of these are nastier than others, but they are all legitimate problems and should be fixed.
The full results for both tools are available in the GitHub repo, so it’s easy to compare the warnings against the code.
The state of the art in static analysis keeps advancing, thanks to people like Daniel Marjamäki and the rest of the cppcheck team, and Gábor Horváth and the team supporting clang.
In particular, the latest releases of cppcheck and clang-tidy are detecting errors that previously could only be found by run-time analyzers like valgrind and Address Sanitizer. This is great stuff, especially given how easy it is to run static analysis on your code.
The benefits of using one (or more) static analysis tools just keep getting more and more compelling – if you aren’t using one of these tools, I hope this will encourage you to do so.
If you found this article interesting or helpful, you might want to also check out the other posts in this series. And please leave a comment below or drop me a line with any questions, suggestions, etc.
Simply clone the GitHub repo to any directory, and then add the scripts directory to your PATH.↩
OpenMAMA has its share of prerequisites – you can get a full list here.↩
The list of supported platforms for OpenMAMA is here. You can also find a lot of griping on the intertubes about Microsoft’s refusal to support C99, including this rather weak response from Herb Sutter. Happily, VS 2013 ended up supporting (most of) C99. ↩
Unless it knows what mama_log does. It turns out that clang-tidy can do inter-procedural analysis, but only within a single translation unit. There is some work ongoing to add support for analysis across translation units by Gábor Horvath et al. – for more see “Cross Translational Unit Analysis in Clang Static Analyzer: Prototype and Measurements”.↩
I’ve been working on performance analysis recently, and a large part of that is scraping log files to capture interesting events and chart them.
I’m continually surprised by the things that you can do using plain old bash and its friends, but this latest one took the cake for me.
Did you know that Linux includes a utility named join
? Me neither. Can you guess what it does? Yup, that’s right – it does the equivalent of a database join across plain text files.
Let me clarify that with a real-world example – one of the datasets I’ve been analyzing counts the number of messages sent and received in a format roughly like this:
Timestamp | Recv |
---|---|
HH:MM:SS | x |
Complicating matters is that sent and received messages are parsed out separately, so we also have a separate file that looks like this:
Timestamp | Send |
---|---|
HH:MM:SS | y |
But what we really want is something like this:
Timestamp | Recv | Send |
---|---|---|
HH:MM:SS | x | y |
Here are snips from the two files:
$ cat recv.txt
Timestamp Recv
2016/10/25-16:04:58 7
2016/10/25-16:04:59 1
2016/10/25-16:05:00 7
2016/10/25-16:05:01 9
2016/10/25-16:05:28 3
2016/10/25-16:05:31 9
2016/10/25-16:05:58 3
2016/10/25-16:06:01 9
2016/10/25-16:06:28 3
$ cat send.txt
Timestamp Send
2016/10/25-16:04:58 6
2016/10/25-16:05:01 18
2016/10/25-16:05:28 3
2016/10/25-16:05:31 9
2016/10/25-16:05:58 3
2016/10/25-16:06:01 9
2016/10/25-16:06:28 3
2016/10/25-16:06:31 9
2016/10/25-16:06:58 3
I had stumbled across the join
command and thought it would be a good way to combine the two files.
Doing a simple join with no parameters gives this:
$ join recv.txt send.txt
Timestamp Recv Send
2016/10/25-16:04:58 7 6
2016/10/25-16:05:01 9 18
2016/10/25-16:05:28 3 3
2016/10/25-16:05:31 9 9
2016/10/25-16:05:58 3 3
2016/10/25-16:06:01 9 9
2016/10/25-16:06:28 3 3
As you can see, we’re missing some of the measurements. This is because by default join
does an inner join of the two files (the intersection, in set theory).
That’s OK, but not really what we want. We really need to be able to reflect each value from both datasets, and for that we need an outer join, or union.
It turns out that join
can do that too, although the syntax is a bit more complicated:
$ join -t $'\t' -o 0,1.2,2.2 -a 1 -a 2 recv.txt send.txt
Timestamp Recv Send
2016/10/25-16:04:58 7 6
2016/10/25-16:04:59 1
2016/10/25-16:05:00 7
2016/10/25-16:05:01 9 18
2016/10/25-16:05:28 3 3
2016/10/25-16:05:31 9 9
2016/10/25-16:05:58 3 3
2016/10/25-16:06:01 9 9
2016/10/25-16:06:28 3 3
2016/10/25-16:06:31 9
2016/10/25-16:06:58 3
A brief run-down of the parameters is probably in order:
Parameter | Description |
---|---|
-t $'\t' | The -t parameter tells join what to use as the separator between fields. The tab character is the best choice, as most Unix utilities assume that by default, and both Excel and Numbers can work with tab-delimited files. The leading dollar-sign is a trick used to pass a literal tab character on the command line. |
-o 0,1.2,2.2 | Specifies which fields to output. In this case, we want the “join field” (the first field from both files), then the second field from file #1, then the second field from file #2. |
-a 1 | Tells join that we want all the lines from file #1 (regardless of whether they have a matching line in file #2). |
-a 2 | Ditto for file #2. |
As you can probably see, you can also get fancy and do things like left outer joins and right outer joins, depending on the parameters passed.
Of course, you could easily import these text files into a “real” database and generate reports that way. But, you can accomplish a surprising amount of data manipulation and reporting with Linux’s built-in utilities and plain old text files.
I couldn’t remember where I had originally seen the join
command, but recently found it again in a nice post by Alexander Blagoev. Check it out for even more obscure commands! And, thanks Alexander!
And thanks also to Igor for his own very nice post that led me back to Alexander’s.
A while back I wrote an article that compared cppcheck and clang’s static analyzers (clang-check and clang-tidy). The folks who make PVS-Studio (the guys with the unicorn mascot that you’ve probably been seeing a lot of lately) saw the article, and suggested that I take a look at their Linux port, which was then in beta test, and write about it.
So I did. Read on for an overview of PVS-Studio, and how it compared to cppcheck.
In the earlier article, I used a benchmark suite developed by Toyota ITC, and written about by John Regehr, who is a professor of Computer Science at the University of Utah. The ITC suite consists of code that is specially written to exhibit certain errors that can be detected by static analysis, so that the code can be used to evaluate the performance of different tools.
In this article, I am going to use the same test suite to evaluate PVS-Studio, and to compare it against cppcheck. I’ll also talk about my experience using both tools to analyze two relatively large real-world codebases that I help maintain as part of my day job.
Using any static analysis tool is better than using none, and in general the more the merrier. Each tool has its own design philosophy, and corresponding strengths and weaknesses.
Daniel Marjamäki1 and the maintainers of cppcheck have done a terrific job creating a free tool that can go head-to-head with expensive commercial offerings. You can’t go wrong with cppcheck, either as a gentle introduction to static analysis, or as the one-and-only tool for the budget-conscious. But don’t take my word for it – the Debian project uses cppcheck as part of its Debian Automated Code Analysis project to check over 100GB of C++ source code.
PVS-Studio is also a terrific tool, but it is definitely not free. (When a product doesn’t have published prices, you know it’s going to cost serious money).
Whether PVS-Studio is worth the price is a judgement call, but if it can find just one bug that would have triggered a crash in production it will have paid for itself many times over.
And while PVS-Studio doesn’t appear to have been adopted by a high-profile project like Debian, the folks who make it are certainly not shy about running various open-source projects through their tool and reporting the results.
So, if your budget can handle it, use both. If money is a concern, then you may want to start out with cppcheck and use that to help build a case for spending the additional coin that it will take to include commercial tools like PVS-Studio in your toolbox.
Note also that PVS-Studio offers a trial version2, so you can give it a go on your own code, which is, after all, the best way to see what the tool can do. And, if you use the provided helper scripts (repo here), your results will be in a format that makes it easy to compare the tools.
In comparing cppcheck and PVS-Studio, I used the ITC test suite that I wrote about in an earlier article. I also used both tools to analyze real-world code bases which I deal with on a day-to-day basis and that I am intimately familiar with.
The ITC test suite that I’ve been using to compare static analyzers is intended to provide a common set of source files that can be used as input to various static analysis tools. It includes both real errors, as well as “false positives” intended to trick the tools into flagging legitimate code as an error.
So far, so good, and it’s certainly very helpful to know where the errors are (and are not) when evaluating a static analysis tool.
In my email discussion with Andrey Karpov of PVS, he made the point that not all bugs are equal, and that a “checklist” approach to comparing static analyzers may not be the best. I agree, but being able to compare analyzers on the same code-base can be very helpful, not least for getting a feel for how the tools work.
Your mileage can, and will, vary, so it makes sense to get comfortable with different tools and learn what each does best. And there’s no substitute for running the tools on your own code. (The helper scripts (repo here) may, well, help).
The ITC test suite includes some tests for certain categories of errors that are more likely to manifest themselves at run-time, as opposed to compile-time.
For instance, the ITC suite includes a relatively large number of test cases designed to expose memory-related problems. These include problems like leaks, double-free’s, dangling pointers, etc.
That’s all very nice, but in the real world memory errors are often not that clear-cut, and depend on the dynamic behavior of the program at run-time. Both valgrind’s memcheck and clang’s Address Sanitizer do an excellent job of detecting memory errors at run-time, and I use both regularly.
But run-time analyzers can only analyze code that actually runs, and memory errors can hide for quite a long time in code that is rarely executed (e.g., error & exception handlers). So, even though not all memory errors can be caught at compile-time, the ability to detect at least some of them can be very helpful.
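A classic example is a leak on a rarely-taken error path, something like this (illustrative only):

```c
#include <stdio.h>
#include <stdlib.h>

int load (const char* path)
{
    char* buf = malloc (1024);
    FILE* f   = fopen (path, "r");

    if (f == NULL)
        return -1;          /* rarely-taken error path: leaks buf, and testing may never hit it */

    /* ... read the file into buf ... */

    fclose (f);
    free (buf);
    return 0;
}
```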
A similar situation exists with regard to concurrency (threading) errors – though in this case neither tool detects any of the concurrency-related errors seeded in the ITC code. This is, I think, a reasonable design decision – the subset of threading errors that can be detected at compile-time is so small that it’s not really worth doing (and could give users of the tool a false sense of security). For concurrency errors, you again will probably be better off with something like clang’s Thread Sanitizer or valgrind’s Data Race Detector.
Also, in the interest of full disclosure, I have spot-checked some of the ITC code, but by no means all, to assure myself that its diagnostics were reasonable.
With those caveats out of the way, though, the ITC test suite does provide at least a good starting point towards a comprehensive set of test cases that can be used to exercise different static analyzers.
The results of running PVS-Studio (and other tools) against the ITC code can be found in the samples directory of the repo.
I also ran both cppcheck and PVS-Studio on the code bases that I maintain as part of my day job, to get an idea of how the tools compare in more of a real-world situation. While I can’t share the detailed comparisons, following are some of the major points.
For the most part, both cppcheck and PVS-Studio reported similar warnings on the same code, with a few exceptions (listed following).
cppcheck arguably does a better job of flagging “style” issues – and while some of these warnings are perhaps a bit nit-picky, many are not:

- single-argument constructors that should be declared `explicit`;
- member functions and variables that could be declared `static` or `const`.
PVS-Studio, on the other hand, appears to include more checks for issues that aren’t necessarily problems with the use of C++ per se, but things that would be a bug, or at least a “code smell”, in any language.
A good example of that is PVS-Studio’s warning on similar or identical code sequences (potentially indicating use of the copy-paste anti-pattern – I’ve written about that before).
Some other PVS-Studio “exclusives” include:

- classes that implement a copy constructor but not `operator=`, and vice-versa;
- comparing floating-point values with `==`;
- empty or duplicated `catch` clauses.

Both tools did a good job of identifying potentially suspect code, as well as areas where the code could be improved.
False positives (warnings on code that is actually correct) are not really a problem with either cppcheck or PVS-Studio. The few warnings that could be classified as false positives indicate code that is at the very least suspect – in most cases you’re going to want to change the code anyway, if only to make it clearer.
If you still get more false positives than you can comfortably deal with, or if you want to stick with a particular construct even though it may be suspect, both tools have mechanisms to suppress individual warnings, or whole classes of errors. Both tools are also able to silence warnings either globally, or down to the individual line of code, based on inline comments.
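For example, cppcheck honors inline `cppcheck-suppress` comments when run with `--inline-suppr`, and PVS-Studio has its own comment-based syntax (along the lines of `//-V` plus the diagnostic number – check the PVS-Studio docs for the exact form). A minimal sketch of the cppcheck flavor:

```c
void example (void)
{
    /* We really do want this placeholder, so silence just this one warning. */
    /* cppcheck-suppress unusedVariable */
    int placeholder;
}
```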
If you care about building robust, reliable code in C++ then you would be well-rewarded to include static analysis as part of your development work-flow.
Both PVS-Studio and cppcheck do an excellent job of identifying potential problems in your code. It’s almost like having another set of eyeballs to do code review, but with the patience to trace through all the possible control paths, and with a voluminous knowledge of the language, particularly the edge cases and “tricky bits”.
Having said that, I want to be clear that static analysis is not a substitute for the dynamic analysis provided by tools like valgrind’s memcheck and Data Race Detector, or clang’s Address Sanitizer and Thread Sanitizer. You’ll want to use them too, as there are certain classes of bugs that can only be detected at run-time.
I hope you’ve found this information helpful. If you have, you may want to check out some of my earlier articles, including:
Last but not least, please feel free to contact me directly, or post a comment below, if you have questions or something to add to the discussion.
I’ve posted the helper scripts I used to run PVS-Studio, as well as the results of running those scripts on the ITC code, in the repo.
The following sections describe a subset of the tests in the ITC code and how both tools respond to them.
For the most part, PVS-Studio and cppcheck both do a good job of detecting errors related to bit shifts. Neither tool detects all the errors seeded in the benchmark code, although they miss different errors.
cppcheck appears to do a more complete job than PVS-Studio of detecting buffer overrun and underrun errors, although it is sometimes a bit “off” – reporting errors on lines that are in the vicinity of the actual error, rather than on the actual line. cppcheck also reports calls to functions that generate buffer errors, which is arguably redundant, but does no harm.
PVS-Studio catches some of the seeded errors, but misses several that cppcheck detects.
While not strictly speaking an overrun error, cppcheck can also detect some errors where code overwrites the last byte in a null-terminated string.
Both cppcheck and PVS-Studio do a good job of detecting conditionals that always evaluate to either true or false, with PVS-Studio being a bit better at detecting complicated conditions composed of constants.
On the other hand, cppcheck flags redundant conditions (e.g., if (i<5 && i<10)
), which PVS-Studio doesn’t do.
Surprisingly, neither tool does a particularly good job of detecting loss of integer precision (the proverbial “ten pounds of bologna in a five-pound sack” problem ;-)
I say surprisingly because these kinds of errors would seem to be relatively easy to detect. Where both tools seem to fall short is in assuming that the assignment is valid just because a value fits in the target data type – they fail to take into account that such an assignment can lose precision.
I wanted to convince myself that the ITC code was correct, so I pasted some of the code into a small test program:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 |
|
When you run this program, you’ll get the following output:
$ gcc test1.c && ./a.out
Value of sink=-128
So, `a` has the value 128, but when `a` is assigned to the (signed) char `ret`, the bit pattern 0x80 is interpreted in the context of a (signed) char, so the value becomes -128 and the sign is lost. If `ret` had been declared as an unsigned char, then the assignment would not lose the sign of `a`.
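If you’d like to reproduce this yourself, a minimal version of that test program (my reconstruction based on the output above, not the exact ITC code) would be something like:

```c
#include <stdio.h>

int main (void)
{
    unsigned char a = 128;   /* bit pattern 0x80 */
    signed char   ret;
    int           sink;

    ret  = a;                /* 128 doesn't fit in a signed char: implementation-defined result */
    sink = ret;              /* on typical platforms ret is now -128 */

    printf ("Value of sink=%d\n", sink);
    return 0;
}
```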
cppcheck does do a slightly better job of detecting integer overflow and underflow in arithmetic expressions compared to PVS, but still misses a number of seeded errors.
Both PVS-Studio and cppcheck do a good job of catching potential divide-by-zero errors, with cppcheck having a slight edge.
PVS-Studio tends to do a somewhat better job than cppcheck at detecting various types of dead code, such as for
loops and if
statements where the condition will never be true.
PVS-Studio also very helpfully flags any unconditional break
statements in a loop – these are almost always going to be a mistake.
As mentioned above, neither tool detects any of the concurrency-related errors seeded in the ITC code. Again, I regard that as a reasonable design choice, given the relatively small percentage of such errors that can be detected at compile-time.
As discussed earlier, not all memory errors can be detected at compile-time, so the lack of any error output certainly doesn’t mean that the code doesn’t have memory errors – it just means that they can’t be detected by the tools. But while many memory errors cannot be detected at compile-time, for those that can be, detecting them is a big win.
cppcheck does an excellent job of detecting double-free errors (11 out of 12), while PVS-Studio only flags one of the seeded errors.
On the other hand, PVS-Studio does a better job of detecting attempts to free memory that was not allocated dynamically (e.g., local variables).
Neither tool does a particularly good job of catching these. Perhaps that is because freeing a NULL pointer is actually not an error, but doing so is certainly a clue that the code may have other problems.
cppcheck does a somewhat better job of detecting the use of dangling pointers (where the pointed-to object has already been freed).
If you’re writing code for an embedded system, then checking for and handling allocation failures can be important, because your application is likely written to expect them, and do something about them. But more commonly, running out of memory simply means that you’re screwed, and attempting to deal with the problem is unlikely to make things better.
Neither tool detects code that doesn’t handle allocation failures, but cppcheck does flag some allocation-related problems (as leaks, which is not correct, but it is a clue that there is a problem lurking).
Typically, memory leaks are only evident at run-time, but there are some cases where they can be detected at compile-time, and in those cases cppcheck does a pretty good job.
Both PVS-Studio and cppcheck do a good job of flagging code that dereferences a NULL pointer, although neither tool catches all the errors in the benchmark code.
Both PVS-Studio and cppcheck detect returning a pointer to a local variable that is allocated on the stack.
PVS-Studio does a somewhat better job than cppcheck of flagging accesses to uninitialized memory.
Both cppcheck and PVS-Studio detect some infinite loop errors, but miss several others. It could be that this is by design, since the code that is not flagged tends to resemble some idioms (e.g., ` while (true)`) that are often used deliberately.
PVS-Studio is quite clever here – it will complain about an unused return value from a function, if it can determine that the function has no side effects. It also knows about some common STL functions that do not have side effects, and will warn if their return values are ignored.
cppcheck doesn’t check for return values per se, but it will detect an assignment that is never referenced. This makes some sense, since warning on ignored return values could result in a large number of false positives.
Both tools detect certain cases of empty blocks (e.g., if (...);
– note the trailing semi-colon).
What neither tool does is warn about “short” blocks – where a conditional block is not enclosed in braces, and so it’s not 100% clear whether the conditional is meant to cover more than one statement:
if (...)
statement1();
statement2();
If you’ve adopted a convention that even single-statement blocks need to be enclosed in braces, then this situation may not pertain (and good for you!). Still, I think this would be a worthwhile addition – at least in the “style” category.
cppcheck does a particularly good job of detecting dead stores (where an assignment is never subsequently used). PVS-Studio, on the other hand, flags two or more consecutive assignments to a variable, without an intervening reference. PVS-Studio will also flag assignment of a variable to itself (which is unlikely to be what was intended).
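To make that concrete, these are the sorts of patterns being flagged (an illustrative snippet, not taken from the benchmark):

```c
int compute (int x)
{
    int result = x * 2;   /* dead store: overwritten below before it is ever read */
    result = x * 4;       /* two consecutive assignments with no intervening reference */

    x = x;                /* self-assignment: almost certainly not what was intended */

    return result;
}
```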
The folks at PVS-Studio asked me to mention that they’ve also recently introduced a free version of their software for educational purposes. The free version does have some strings attached, see this post for details.↩
See here and here for an explanation of how floating-point arithmetic can produce unexpected results if you’re not careful.↩
As developers, we seem to take a special delight in personalizing the virtual worlds in which we work – from color palettes to keyboards, fonts, macros, you name it. “Off-the-rack” is never good enough, we want Savile Row tailoring for our environments.
And a lot of the tools we use support and encourage that customization, giving us control over every little option.
But not every tool we use does so – read on to learn a very simple trick for taking control even when your tool doesn’t make that easy.
In Linux, we have a couple of common ways to customize the way our tools work – by defining environment variables, and by using configuration files. Sometimes these two mechanisms work well together, and we can include environment variables in configuration files to make them flexible in different situations.
Not every tool can expand environment variables in its configuration files, however. In that case, you can use this simple Perl one-liner to substitute values from the environment into any plain-text file.
perl -pe '$_=qx"/bin/echo -n \"$_\"";' < sample.ini
What’s happening here is this:
The -p
switch tells Perl to read every line of input and print it.
The -e
switch tells Perl to execute the supplied Perl code against every line of input.
The code snippet replaces the value of the input line ($_
) with the results of the shell command specified by the qx
function. That shell command simply echos1 the value of the line ($_
), but it does so inside double quotes (the \"
), which causes the shell to replace any environment variable with its value.
And that’s it! Since the substitution is being done by the shell itself, you can use either form for the environment variable (either $VARIABLE
or ${VARIABLE}
), and the replacement is always done using the rules for the current shell.
Here’s an example – let’s create a simple .ini type file, like so:
username=$USER
host=$HOSTNAME
home-directory=$HOME
current-directory=$PWD
When we run this file through our Perl one-liner, we get:
perl -pe '$_=qx"/bin/echo -n \"$_\"";' < sample.ini
username=btorpey
host=btmac
home-directory=/Users/btorpey
current-directory=/Users/btorpey/blog/code/tailor
One thing to watch out for is that things can get a little hinky if your input file contains quotes, since the shell will interpret those, and probably not in the way you intend. At least in my experience, that would be pretty rare – but if you do get peculiar output that would be something to check.
Note that we use /bin/echo here, instead of just plain echo, to get around an issue with the echo command in BSD (i.e., OSX).↩
In my day job, one of my main focuses is software reliability and correctness, so it makes sense that I would be a big fan of static analysis.
I’ve written previously about the static analysis provided by clang. Today, I want to take a bit of a “deep-dive” into the whole subject by putting both clang and cppcheck through their paces, using them to analyze a benchmark suite designed to exercise static analysis tools. In the course of doing that, I’ll also provide some helper scripts that make working with the tools easier.
And what is good Phaedrus, and what is not good – need we ask anyone to tell us these things? 1
Obviously, the ultimate goal is to be able to run static analysis tools against our own codebase(s) to help detect and fix problems. But how do we know if a particular tool is actually finding problems? And, how do we know if we’re running the tool properly?
The perfect static analyzer would find all the latent bugs in our code, while not reporting any false positives2. Since there are no perfect analyzers, any tool we use is going to miss some errors, and/or wrongly flag correct code. So, the only way to evaluate an analyzer is to know where all the bugs are in our code – but if we knew that, we wouldn’t need an analyzer.
That’s a dilemma. To resolve it, we’re going to be using a codebase specifically designed to trigger static analysis warnings. The code was originally developed by Toyota ITC, and is available on John Regehr’s excellent blog.
The ITC benchmarks attempt to resolve our dilemma by providing both a set of code that contains errors which should trigger warnings, as well as a second set of code, similar to the first, but which doesn’t contain errors. Each source file is annotated with comments documenting where the errors are (and aren’t). And that lets us create a catalog of both real errors and potential false positives3.
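To give a flavor of what that looks like, a seeded error in the style of the benchmark code might look like this (an illustrative snippet, not copied from the suite):

```c
/* Illustrative only -- written in the style of the ITC sources, not copied from them */
void bit_shift_example (void)
{
    unsigned int value = 1;
    unsigned int shifted;

    shifted = value << 32; /*ERROR:Bit shift error*/   /* shifting by >= the type's width is undefined */
    (void) shifted;
}
```

It’s those ERROR: annotations that we’ll extract below to build the list of expected diagnostics.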
To get started, download the code from its GitHub repository, and set the ITCBENCH_ROOT
environment variable (which will come in handy later):
$ git clone https://github.com/regehr/itc-benchmarks
$ export ITCBENCH_ROOT=$(pwd)/itc-benchmarks
The remainder of this article goes step-by-step through the process of creating a compilation database from the ITC benchmark code, running clang’s static analysis tools against that compilation database, building and installing cppcheck and running it against the compilation database, and analyzing the results.
This is all good stuff, especially if you’re going to be using these tools going forward. But, there’s a certain amount of unavoidable yak-shaving4 going on to get to that point. So if you prefer to skip all that, I’ve included the results of running the different tools in the samples directory of the repo. The samples include all the files we’re going to be generating the hard way, so you can follow along without all the requisite busy-work. Hopefully, when we’re done you’ll want to go back and use these tools on your own codebase.
To run both clang and cppcheck we first need to create a “compilation database” to supply them with required build settings. The compilation database format was developed as part of the clang project, to provide a way for different tools to query the actual options used to build each file.
A good overview of how the compilation database works with clang-based tools can be found at Eli Bendersky’s excellent site. His article illustrates the importance of making sure that code analysis tools are looking at the same (pre-processed) source that the actual compiler sees, in order to generate meaningful diagnostics with a minimum of false positives.
If you are using cmake to drive your builds, creating a compilation database couldn’t be easier – simply add the -DCMAKE_EXPORT_COMPILE_COMMANDS=ON
parameter to the cmake build command, or add the following to your main CMakeLists.txt file:
set(CMAKE_EXPORT_COMPILE_COMMANDS ON)
If you’re not using cmake, you can still create a compilation database using plain old make by front-ending make with Bear5, like so:
bear make
In either case, the end result should be the creation of a compile_commands.json
file in the current directory.
Sadly, the ITC benchmark suite is stuck in the past using autotools, and worse yet, a version that needs to be installed from source (on RH6, at least).
So, in the interest of immediate gratification, I’ve included the compile_commands.json file here – simply save it to the directory where you’ve cloned the ITC code. (The compile_commands.json file is also contained in the samples directory of the repo for this article).
If you prefer to generate the compile_commands.json file yourself using Bear, you can do so like this:
$ cd ${ITCBENCH_ROOT}
$ ./bootstrap
$ ./configure
$ bear make
To make it possible to compare results from different analyzers, we first need to establish a baseline using the ITC benchmarks, and for that we’re going to need this set of helper scripts, which can be downloaded from this GitHub repo.
$ git clone https://github.com/btorpey/static
Once you’ve done that, you need to add the directory to your PATH:
$ export PATH=$(pwd)/static/scripts:$PATH
Enter the following command from the ITC source directory to create a csv file with the error annotations from the ITC code:
$ cd ${ITCBENCH_ROOT}
$ cc_driver.pl -n grep -Hni ERROR: |
itc2csv.pl -r ${ITCBENCH_ROOT}/ |
sort -u > itc.csv
The command will create a file named itc.csv
in the source directory that looks like this:
$ cat itc.csv
"01.w_Defects/bit_shift.c:106","/*ERROR:Bit shift error*/"
"01.w_Defects/bit_shift.c:120","/*ERROR:Bit shift error*/"
"01.w_Defects/bit_shift.c:133","/*ERROR:Bit shift error*/"
"01.w_Defects/bit_shift.c:146","/*ERROR:Bit shift error*/"
"01.w_Defects/bit_shift.c:163","/*ERROR:Bit shift error*/"
"01.w_Defects/bit_shift.c:175","/*ERROR:Bit shift error*/"
...
The format of the csv file is really simple – just an entry for file and line number, and another with the error annotation munged from the source file. This will give us a baseline against which to compare both clang and cppcheck.
In a couple of previous posts, I wrote about static analysis with clang, and how to build clang. This next bit assumes that you’ve got clang ready-to-go, but if that’s not the case, there can be a fair amount of work required to get to that point, so you may want to skip ahead to the section on using cppcheck.
We’re going to use a similar approach to the one we used above to generate the list of expected errors from the ITC code. The command below will run clang-check against all the files in compile_commands.json, filter the results, and reformat the output in csv format:
$ cd ${ITCBENCH_ROOT}
$ cc_driver.pl clang-check -analyze 2>&1 |
clang2csv.pl -r ${ITCBENCH_ROOT}/ |
sort -u > clangcheck.csv
This gives us the diagnostic messages produced by clang, in the same csv format as we used for the list of errors, above:
$ cat clangcheck.csv
"01.w_Defects/bit_shift.c:106","warning: The result of the '<<' expression is undefined"
"01.w_Defects/bit_shift.c:133","warning: The result of the '<<' expression is undefined"
"01.w_Defects/bit_shift.c:146","warning: The result of the '<<' expression is undefined"
"01.w_Defects/bit_shift.c:163","warning: The result of the '<<' expression is undefined"
"01.w_Defects/bit_shift.c:175","warning: The result of the '<<' expression is undefined"
...
We can already see that there are some differences: the ITC code expects to see a diagnostic at 01.w_Defects/bit_shift.c:120, but clang doesn’t output a warning for that line.
What I like to do at this point is fire up my all-time favorite tool, Beyond Compare, to generate a visual diff of the two files:
This view shows the expected diagnostics extracted from the ITC source files on the left, alongside the diagnostics generated by clang on the right. We can see that clang catches some of the bugs in the source file, but misses others. If we continue to read down the two files, we’ll also see some potential “false positives” – i.e., diagnostics issued by clang that are not marked as expected errors in the source files.
The visual approach using Beyond Compare works well for me, but with a csv-formatted datafile, other approaches are possible as well. We could import the diagnostic messages into a spreadsheet program, or even a DBMS, for archiving, tracking and comparison.
clang actually has two tools for doing static analysis – in the example above we ran clang-check -analyze
, but now we’re going to use clang-tidy
instead.
$ cd ${ITCBENCH_ROOT}
$ cc_driver.pl clang-tidy 2>&1 |
clang2csv.pl -r ${ITCBENCH_ROOT}/ |
sort -u > clangtidy.csv
If you compare the results from clang-check and clang-tidy, you’ll notice that clang-tidy generally reports more warnings than clang-check. Some of them are not necessarily defects, but are arguably bad practice (e.g., using strcpy
).
clang-tidy also outputs a slightly different format, including the name of the check in brackets. (The name can also be used to suppress the warning).
The choice of which to use is up to you – my preference is to use clang-check first, and follow up with clang-tidy, simply because the warnings produced by clang-tidy either duplicate those from clang-check, or are not as serious.
Note that you can get a list of available checks from clang with the following command:
$ clang -cc1 -analyzer-checker-help
...
core.DivideZero Check for division by zero
core.DynamicTypePropagation Generate dynamic type information
core.NonNullParamChecker Check for null pointers passed as arguments to a function whose arguments are references or marked with the 'nonnull' attribute
core.NullDereference Check for dereferences of null pointers
core.StackAddressEscape Check that addresses to stack memory do not escape the function
There’s another static analysis tool that can provide results comparable to clang. cppcheck has been around for a while, and I had tried to get it working in the past, but had given up after bumping into a few problems.
I kept hearing good things about cppcheck in articles and presentations by others, though, so I finally decided it would be worth the trouble to get it working.
It turns out the problems were not that difficult to solve, given a combination of documentation and experimentation. And the benefits were significant, so I’m quite happy to have added cppcheck to my tool box.
While cppcheck is available bundled with some distros, it’s often an older version, so we’re going to build and install it from source. As is more and more often the case, cppcheck has started using features of C++1x, so we’re going to need a C++1x-capable compiler to build it.
If you’re on an older distro (in my case, RH6) where the system compiler is not C++1x-capable, see my earlier post about how to build clang (and/or gcc) to get a C++1x-capable compiler. (Basically, it uses an older version of gcc to build a newer version, and the newer version to build clang).
It took some trial-and-error to get the cppcheck build parameters right, but the supplied build script should get the job done6.
$ ./build_cppcheck.sh 2>&1 | tee build_cppcheck.out
You’ll need to add the cppcheck directory to your PATH (assuming the install location from the build script):
$ export PATH=/build/share/cppcheck/1.73/bin:$PATH
If the build and install process worked, you should be able to invoke cppcheck from the command line, like so:
$ cppcheck --version
Cppcheck 1.73
If you see the message below instead, there’s a problem with the RPATH setting:
$ cppcheck --version
cppcheck: /usr/lib64/libstdc++.so.6: version `GLIBCXX_3.4.15' not found (required by cppcheck)
The problem is typically either that the RPATH setting in the build is incorrect, or that the directory referenced by the RPATH setting does not exist.
Now we’re ready to run cppcheck, using the same approach we used with clang:
$ cd ${ITCBENCH_ROOT}
$ cc_driver.pl cppcheck.sh 2>&1 |
cppcheck2csv.pl -r ${ITCBENCH_ROOT} |
sort -u > cppcheck.csv
Note that instead of invoking cppcheck directly, we’re invoking it via the cppcheck.sh helper script, which supplies needed parameters to cppcheck. It also creates an include file with the compiler’s pre-defined macros, so those definitions will be visible to cppcheck. This turns out to be particularly important with cppcheck, especially if the code you’re trying to analyze uses #ifdef
’s to control what code actually gets compiled (or seen by cppcheck)7.
One of the settings in the helper script enables what cppcheck calls “inconclusive” results. These are exactly what the name implies – cppcheck isn’t positive that the code is wrong, but it is at least suspicious. Including these inconclusive results should tend to increase the number of false positives in theory, but in practice I haven’t found false positives to be a big problem with either cppcheck or clang.
One of the first things you notice with cppcheck is that it includes more checks than clang. Some of the additional warnings are for constructs that are not exactly wrong, but are either non-optimal, or indicators of potential problems. For instance, cppcheck will warn when a variable is defined in a broader scope than is actually required (“scope … can be reduced”).
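For instance, cppcheck will point out that the scope of `doubled` can be reduced in something like this (an illustrative snippet, assuming style checks are enabled):

```c
#include <stdio.h>

void report (int value)
{
    int doubled;              /* cppcheck: the scope of 'doubled' can be reduced... */

    if (value > 0)
    {
        doubled = value * 2;  /* ...because it is only used inside this block */
        printf ("%d\n", doubled);
    }
}
```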
You can get a list of all the checks cppcheck is performing like so:
$ cppcheck --doc
...
## Other ##
Other checks
- division with zero
- scoped object destroyed immediately after construction
- assignment in an assert statement
- free() or delete of an invalid memory location
- bitwise operation with negative right operand
- provide wrong dimensioned array to pipe() system command (--std=posix)
You can also generate a list of error ID’s with this command:
$ cppcheck --errorlist
<error id="stringLiteralWrite" severity="error" msg="Modifying string literal directly or indirectly is undefined behaviour."/>
<error id="sprintfOverlappingData" severity="error" msg="Undefined behavior: Variable 'varname' is used as parameter and destination in s[n]printf()."/>
<error id="strPlusChar" severity="error" msg="Unusual pointer arithmetic. A value of type 'char' is added to a string literal."/>
<error id="incorrectStringCompare" severity="style" msg="String literal "Hello World" doesn't match length argument for substr()."/>
<error id="literalWithCharPtrCompare" severity="style" msg="String literal compared with variable 'foo'. Did you intend to use strcmp() instead?"/>
<error id="charLiteralWithCharPtrCompare" severity="style" msg="Char literal compared with pointer 'foo'. Did you intend to dereference it?"/>
<error id="incorrectStringBooleanError" severity="style" msg="Conversion of string literal "Hello World" to bool always evaluates to true."/>
You can suppress any errors you don’t care to see by passing its id in the --suppress=
flag.
There’s a school of thought that says you should use as many compilers as possible to build your code, because each one will find different problems. That’s still a good idea, and even more so with static analysis tools.
There’s a certain amount of overlap between clang and cppcheck, but there are also significant differences. In my experience, if clang reports something as a problem, it almost certainly is one, but clang also misses a lot of problems that it could detect.
cppcheck can generate more warnings, and some of them are more stylistic issues, but it does detect certain classes of problems, like dead code and arithmetic over/underflow, that clang doesn’t.
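The kinds of constructs I have in mind look something like this (illustrative only, not from either test suite):

```c
#include <stdio.h>

void check (unsigned char level)
{
    if (level > 300)                    /* always false: an unsigned char can never exceed 255... */
        printf ("impossible\n");        /* ...so this branch is dead code */

    unsigned char doubled = level * 2;  /* silently truncates for level > 127 */
    printf ("%d\n", doubled);
}
```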
As I mentioned earlier, I haven’t found false positives to be a major problem with either clang or cppcheck.
So, each tool has its place, and I like to use both.
Static analysis tools can add real value to the software development process by detecting errors, especially errors in code that is never or almost never executed.
Commercial tools can be expensive (although still cheap compared to the money they save), and open-source tools can sometimes be hard to use (or at least hard to learn how to use).
The provided helper scripts (repo here) should make it much easier to use these tools, and to keep track of warnings and compare the outputs of different tools by using a common format.
They can also be useful for before-and-after comparisons of different versions of a single codebase – for example, as changes are being made to address issues detected by the tools.
In addition to the people, projects and organizations mentioned earlier, the people at the NIST have been very helpful, and maintain an incredible collection of resources on the topic of static analysis for a number of languages, not just C++. Some of those resources include the following, and are well worth checking out:
https://samate.nist.gov/index.php/SAMATE_Publications.html
https://samate.nist.gov/SARD/
If you’ve read any of my other posts, you may have noticed that the contents sidebar at the beginning of the article is a new thing. Especially for longer-format articles, that TOC would seem to be very helpful. Many thanks to Robert Riemann for taking the trouble to explain how to do it.
I’ve been using the very nice MacDown editor to create these posts – thanks, Tzu-Ping!
Some helpful references that I ran across while researching this article:
Static Code Analysis, John Carmack
CppCon 2015: Jason Turner “The Current State of (free) Static Analysis”
CppCon 2015: Neil MacIntosh “Static Analysis and C++: More Than Lint”
Robert Pirsig, “Zen and the Art of Motorcycle Maintenance”↩
A “false positive” is when a tool reports an error that is not actually an error.↩
Full disclaimer: I have not taken the time to review all of the ITC source to verify that the annotations are accurate and/or complete. For the purpose of this exercise, we’ll agree to assume that they are – but if you’d like to suggest any improvements, I’m guessing the best place to do that would be on the repo.↩
See https://en.wiktionary.org/wiki/yak_shaving for a description of this colorful term.↩
Building and installing Bear from source is relatively straightforward – just keep in mind that you need python >= 2.7.↩
As usual, I prefer installing external packages in a non-standard location, so the build script is set up to do that. See this post for an explanation and rationale of this approach).↩
Note that cppcheck does not particularly like it when you include system include directories using -I
. Accordingly, we don’t pass the -s
switch to cc_driver.pl when running cppcheck.↩
Nowadays it’s pretty common for applications to be distributed across multiple machines, which can be good for scalability and resilience.
But it does mean that we have more machines to monitor – sometimes a LOT more!
Read on for a handy tip that will let you do a lot of those tasks from any old session (and maybe lose some of those screens)!
For really simple tasks, remote shell access using ssh is fine. But oftentimes the tasks we need to perform on these systems are complicated enough that they really should be scripted.
And especially when under pressure, (e.g., troubleshooting a problem in a
production system) it’s good for these tasks to be automated. For one thing,
that means they can be tested ahead of time, so you don’t end up doing the
dreaded rm -rf *
by mistake. (Don’t laugh – I’ve actually seen that happen).
Now, I’ve seen people do this by copying scripts to a known location on the remote machines so they can be executed. That works, but has some disadvantages: it clutters up the remote system(s), and it creates one more artifact that needs to be distributed and managed (e.g., updated when it changes).
If you’ve got a bunch of related scripts, then you’re going to have to bite the bullet and manage them (perhaps with something like Puppet).
But for simple tasks, the following trick can come in very handy:
ssh HOST 'bash -s' < local_script.sh
What we’re doing here is running bash remotely and telling bash to get its input from stdin. We’re also redirecting local_script.sh to the stdin of ssh, which is what the remote bash will end up reading.
As long as local_script.sh is completely self-contained, this works like a charm.
For instance, to login to a remote machine and see if hyper-threading is enabled on that machine:
ssh HOST 'bash -s' < ht.sh
Where ht.sh looks like this:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 |
|
(The script above was cribbed from http://unix.stackexchange.com/a/33509 – thanks, Nils!)
Of course, all the normal redirection rules apply – you just have to keep in mind that you’re redirecting to ssh, which is then redirecting to bash on the input side. On the output side, it’s reversed.
Give this a try the next time you need to do some quick tasks over ssh and you’ll be able to get rid of a few of those monitors!
I keep singing the praises of clang, and with good reason – the clang project has been advancing the state of C/C++ compiler technology on Linux and OS X for quite a while now.
The modular design of the compiler has also enabled the creation of a set of ancillary tools, including run-time “sanitizers” (which I wrote about earlier), as well as pretty-printers, and a tool to automatically upgrade code to C++11.
Today I want to talk about clang’s static analysis engine, which can do a deep-dive on your code and find problems that are hard for a human to detect, but that are amenable to a brute-force approach that models the run-time behavior of a piece of code, but at compile-time.
This is very different from dynamic analysis tools, like valgrind and clang’s own sanitizers, which instrument the code at run-time to detect actual errors (e.g., reading uninitialized memory) that happen while the code is running. With dynamic analysis, the only errors that can be detected are in code that is actually executed, so a latent bug that only manifests under unusual conditions1, can go un-detected. By contrast, static analysis can potentially find bugs in code that is never (or almost never) actually executed.
Sounds good, no? Who wouldn’t want to find bugs “automagically”, without even needing to do any testing? (Cause we all know how much programmers love testing ;-)
For example, running clang’s static analyzer on some sample code turns up warnings similar to the following:
Some of the above warnings (e.g., value stored is never read) are most likely harmless, and just sloppy coding (perhaps because of copy-paste syndrome, about which I have more to say here). Others (e.g., called pointer is null), might be false positives, given the algorithms the analyzer uses2. Or, they could be real bugs that you just haven’t hit yet, because the code is (almost) never executed.
Those are the really scary bugs, along with the ones where you can “get lucky” most of the time … except when you don’t. The “garbage value” and “uninitialized value” warnings fall into that category, and can be very hard to eyeball. Again, dynamic analysis tools like valgrind can help find these bugs, but only if you hit that code in testing.
So, static analysis is good, but it’s not magic. Static analyzers can only find bugs that they are programmed to find, and they certainly don’t find all bugs. For instance, here’s a bug that clang’s static analysis doesn’t find:
1 2 3 4 5 6 7 |
|
But the fact is that static analysis will find bugs, and it will find bugs that you most likely wouldn’t find on your own, so it’s a good tool to have in your toolbox. So, let’s take a look at how to do that using clang.
The first step is to install clang. If you’re on OS X or Ubuntu, you should already have it, but if you’re on RedHat this can be a bit tricky, so see my previous post on how to get clang working. (I’ve updated that post to add instructions for installing some of the static analysis tools that don’t normally get installed with clang).
It turns out that there are three (3) different ways to run clang’s static analyzer on your code, each with its own benefits and drawbacks. We’ll consider each of these in turn:
If you use reasonably normal-looking makefiles to build your code, you can get static analysis going with a minimum of fuss. If you’re using cmake to create your makefiles, the same approach will work fine, so long as you’re not overriding the values of CMAKE_C_COMPILER etc. (And, as usual, if you’re using autotools, you’re on your own;-).
In this approach, you export
some environment variables to invoke the analyzer instead of the compiler, like the following:
Variable | Value | Meaning |
---|---|---|
CC | ccc-analyzer | C compiler is redirected to clang analyzer (which in turn invokes the compiler, using the value of CCC_CC, below). |
CXX | c++-analyzer | Similar to above, but for C++. |
CCC_CC | clang | This environment variable is used by ccc-analyzer to invoke the actual C compiler. |
CCC_CXX | clang++ | ditto |
LD | clang++ | Specifies that the actual compiler should be used for linking. |
CCC_ANALYZER_VERBOSE | 1 | (Optional) Set this flag to get verbose output from the analyzer, including the list of errors checked. |
With those variables set, you should just be able to invoke make
to build your project, and Bob’s your uncle.
One nice thing about this approach is that you get both compiler warnings and analyzer warnings together – first, the analyzer invokes the compiler on the source file, and then performs the static analysis.
In a similar fashion as ccc-analyzer (above) front-ends make, you can use clang’s scan-build tool to front-end ccc-analyzer. In addition to invoking the compiler and analyzer, scan-build also collects the analyzer reports, including the control flow that the analyzer used to infer any errors, and presents that using a set of html pages that are written by default to the /tmp directory, and that look like this:
Personally, I find this fascinating. Not only does the analyzer tell you what it thinks is a problem, but also why it thinks so.
In the example above, you can see the steps that the analyzer follows to figure out that there is a problem with the code. If you are wondering whether a particular warning is a false positive or not, this presentation can help you figure that out. 3 It can also sometimes provide unexpected insights into the code that you might not come up with on your own.
To use this approach, you set your environment variables the same as described above, but instead of running make, you run scan-build -V make
. This will run your build and then launch a browser to view the results of the build.
Unfortunately, scan-build (and its scan-view companion) are not installed by default with clang. I’ve updated the build script from my earlier post on building clang on RedHat to install these files, but if you want to do it manually, run the following from the source tree you used to build and install clang:
1 2 3 4 5 6 7 |
|
In an earlier post, I talked about how to use the -isystem
flag to prevent selected headers from generating warnings. Unfortunately, the clang analyzer chokes on that flag – so if you’re using it, you will need to apply the patch below to successfully run the analyzer.
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 |
|
Last but not least, you can also use a “compilation database” to invoke the static analyzer directly. So, what is a compilation database, you ask? This is a simple format introduced by clang that records the actual commands used to generate intermediate build products from source files, along with their parameters.
The analyzer needs this information to reproduce the environment used by the compiler, including pre-processor definitions and include file search paths.
If you are using cmake to drive your builds, creating a compilation database couldn’t be easier – simply add the -DCMAKE_EXPORT_COMPILE_COMMANDS=ON
parameter to the cmake build command, or add the following to your main CMakeLists.txt file:
set(CMAKE_EXPORT_COMPILE_COMMANDS ON)
If you’re not using cmake, you can still create a compilation database using plain old make by front-ending make with Bear4, like so:
bear make
This will use Bear to drive the make process, leaving a compile_commands.json
file in the current directory.
Once you’ve got the compilation database, invoking the analyzer can be done with a command like the following:
1 2 3 4 5 6 7 8 9 |
|
(There are simpler ways to invoke the analyzer, but the approach shown here will visit each source file in the same order that it was originally built, which can be handy).
As we said earlier, static analysis is not magic, and it certainly won’t find all your bugs. But it will probably find some, and the ones it finds are likely to be nasty, so it’s worth a certain amount of trouble.
Last but not least, this is by no means a complete explanation of clang’s analyzer. Like much of clang, the documentation lags the code, sometimes by a lot, so much of this information was obtained by trial-and-error, and/or by reading the code. So, if you find something interesting, please drop me a line, or leave a note in the comments section.
http://clang-analyzer.llvm.org/index.html
http://clang.llvm.org/docs/ClangCheck.html
These are often called “Heisenbugs”, in a nerd-humor pun on the Heisenberg Uncertainty Principle.↩
For instance, clang’s analayzer attempts to figure out if a pointer can possibly be NULL by seeing if there is any code that checks for that condition. If there is, then clang complains about any code that dereferences the pointer outside an if (x != NULL)
block. This algorithm isn’t perfect, but it’s about the best that can be done, especially since the analyzer only looks at a single file at a time.↩
At least in my experience, many of the warnings that appear at first to be false positives turn out to be real bugs, especially if you follow through the control flow the analyzer uses.↩
Building and installing Bear from source is relatively straightforward – just keep in mind that you need python >= 2.7.↩
Pity the poor Shadow! Even with the recent glut of super-heroes in movies, games and TV, the Shadow is nowhere to be seen.
But I guess that’s the whole point of being the Shadow.
According to this, the Shadow had “the mysterious power to cloud men’s minds, so they could not see him”. Hmmm, that sounds like more than a few bugs I’ve known.
Read on to learn how to get your compiler to help you find and eliminate these “shadow bugs” from your code.
Recently I was cleaning up the code for one of our test programs, and I suddenly started getting a crash at shutdown that I hadn’t seen before. The stack trace looked more or less like I expected (except for the SEGV, of course), and I spent several minutes staring at the code before the light bulb came on.
As is often the case, once the light bulb did come on, my first reaction was “Duh!”. It was a dumb mistake, but then I started to think: if it’s such a dumb mistake, why didn’t the compiler warn me about it? Answering that question got me looking into the state of compiler diagnostics, and taught me a few things I hadn’t known (or had forgotten).
First, let’s take a look at the bug that was a bit of a head-scratcher, and that prompted this post. I’ve distilled it down to just a few lines of code — take a look and see if you can spot the bug:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 |
|
The bug is in C’s constructor, where, instead of initializing the member variable (_pD), the code creates a local variable with the same name. The local variable goes out of scope on return (the allocation it points to is simply leaked), but the member variable of the same name remains uninitialized. The problem comes when we delete c, since C’s dtor deletes a pointer that is just a bunch of random bits1. The fix, of course, is to omit the type declaration on the assignment, which causes the compiler to assign to the member variable, rather than creating and then assigning to a local (stack) variable.
(I can already hear the howls of outrage at this code – see 2, 3 and 4 for a discussion if you’re so inclined).
Granted that there are ways to avoid this problem by writing the code “correctly” (perfectly?) in the first place. But still, if it’s such a dumb mistake, why didn’t the compiler warn about it?
That was what puzzled me, especially since I thought our “diagnostic hygiene” was pretty good. All our code is built with “-Wall -Wextra”, which is not quite “everything but the kitchen sink”, but close.
But when we build with those flags, the compiler is perfectly happy:
$ clang++ -g -Wall -Wextra shadow.cpp
$
But running – that’s another story:
$ ./a.out
*** glibc detected *** ./a.out: free(): invalid pointer: 0x0000000000400600 ***
...
Aborted (core dumped)
$
When we load the core file into the debugger, we see that the offending instruction is the delete of _pD in C’s destructor:
$ gdb a.out core.897
(gdb) bt
#0  0x0000003a86032925 in raise () from /lib64/libc.so.6
#1  0x0000003a86034105 in abort () from /lib64/libc.so.6
#2  0x0000003a86070837 in __libc_message () from /lib64/libc.so.6
#3  0x0000003a86076166 in malloc_printerr () from /lib64/libc.so.6
#4  0x0000000000400720 in C::~C (this=0x7fffe6dbdec8) at shadow.cpp:23
#5  0x00000000004006a9 in main () at shadow.cpp:37
(gdb)
The result above is just one of three possible results. Let’s take a look at each of these in turn:
You may get no message at all - the code (appears to) work fine.
This is the result we get if we use gcc to compile the code. With gcc, the allocation is (presumably) being satisfied by the operating system (e.g., by calling sbrk). Typically, the OS will zero-fill any memory that it allocates as a security precaution (see here and here for details).
So, in this case, we’re deleting a nullptr, and that is perfectly kosher according to the standard. (Why that is may be a cause for debate, but it is).5
You may get a message similar to *** glibc detected *** ./a.out: free(): invalid pointer
followed by a stack trace.
This happens when glibc can detect that the pointer being freed was not previously allocated.
The memory management functions in glibc contain runtime checks to catch error conditions. Some of these checks are enabled in all cases (because they are relatively inexpensive), while others must be specifically enabled.6 In this particular case, the code in free is checking to see if the address being passed in has previously been allocated using e.g. malloc. If not, the code signals an error.
This is the result we get when using clang – with clang, the allocation request is being satisfied from memory that was previously allocated and freed, so the bits that make up the member variable _pD have already been scribbled on (i.e., they are non-zero), but glibc can tell the address is not one that was previously allocated.
You may get a message similar to Segmentation fault (core dumped).
In our case, C’s destructor is pretty minimal – it just deletes the _pD member variable. In other cases, though, C’s destructor may attempt to do more complicated processing before returning, and in those cases it’s quite possible that that processing will trigger a crash on its own. (For instance, if _pD is defined as a shared_ptr as opposed to a raw pointer, you would likely see a segmentation violation in the code that manipulates the shared_ptr).
This all could have been avoided if the compiler recognized that the declaration of _pD in C’s constructor hides the member variable, and with “-Wshadow” enabled in the compile, that is exactly what happens:
$ clang++ -Wall -Wextra -Wshadow shadow.cpp shadow.cpp:17:10: warning: declaration shadows a field of 'C' [-Wshadow] D* _pD = new D; ^ shadow.cpp:27:7: note: previous declaration is here D* _pD; ^ 1 warning generated. $
gcc supports the flag also, although the message is slightly different:
$ g++ -Wall -Wextra -Wshadow shadow.cpp shadow.cpp: In constructor ‘C::C()’: shadow.cpp:17: warning: declaration of ‘_pD’ shadows a member of 'this' $
While both gcc and clang support the “-Wshadow” flag, the implementations are very different.
gcc appears to strive for completeness, and in the process produces so many warnings as to render the use of “-Wshadow” pretty much useless. That was certainly Linus’ opinion when he wrote this, and it’s hard to disagree.
The good news is that the clang developers have come up with a much more useful implementation of “-Wshadow”, which avoids a lot of the problems Linus talks about. For example, on one legacy-ish code base, gcc reports over 1100 shadow warnings vs. just three for clang. There’s a terrific explanation here about how the clang team decides what category a particular diagnostic should belong to.
But, what if we use third-party libraries in our programs? While clang does a very good job of filtering out “false positive” shadow warnings, they can still crop up in some libraries, including Boost. One possible solution is to have wrapper includes that use #pragma’s to suppress (or enable) certain warnings, prior to including the real library headers. That is in fact the approach suggested by the Boost maintainer when someone posted a bug report about the shadow warnings in Boost.
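A wrapper along these lines might look something like this – a sketch only, with a made-up file name and boost/asio.hpp standing in for whichever headers are noisy (the diagnostic pragmas themselves are the real gcc/clang ones):

// boost_wrapper.h -- hypothetical wrapper header, for illustration only
#pragma once

#if defined(__clang__)
#pragma clang diagnostic push
#pragma clang diagnostic ignored "-Wshadow"
#elif defined(__GNUC__)
#pragma GCC diagnostic push
#pragma GCC diagnostic ignored "-Wshadow"
#endif

#include <boost/asio.hpp>   // the "real" header(s) we actually want

#if defined(__clang__)
#pragma clang diagnostic pop
#elif defined(__GNUC__)
#pragma GCC diagnostic pop
#endif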
But, that’s tedious, error-prone, inconvenient and expensive. Is there a better way?
It turns out that there is – both gcc and clang provide the -isystem compiler flag to include header files subject to special rules that effectively eliminate warnings, even in macros that are expanded at compile-time.
Note that if you’re using cmake, the way to enable -isystem is to use the SYSTEM flag to include_directories, like so:
include_directories(SYSTEM ${Boost_INCLUDE_DIRS})
There are three types of shadow warnings, each with a different cause and potential to cause trouble. The different types are distinguished by what follows after the “warning: declaration shadows a” message:
Type | Explanation
---|---
local variable | These are typically the least likely to be real problems, since local variables have limited lifetimes by definition. Even if these warnings don’t indicate a genuine problem, it is best to eliminate them by changing one or the other variable name, if only to prevent future confusion.
field of “X” | As in the example above, these shadow warnings often point to a potential problem, given how easy it is to inadvertently include a type prefix in the statement meant to initialize the member variable, thereby declaring (and initializing) a new local variable instead.
variable in the global namespace | This is perhaps the most dangerous of the three types, since accidentally introducing a new variable into scope can easily go unnoticed. Everything appears to be working correctly, until at some point the newly introduced variable goes out of scope, exposing the un-initialized global variable.
I originally thought this would be a quick post about a somewhat obscure compiler warning – maybe a “tidbit”, but certainly nothing more than that. But, as Tolkien said about “Lord of the Rings”, “the tale grew in the telling”.
Let’s see what we’ve covered: a subtle shadowing bug and the three different ways it can show up at run-time; the “-Wshadow” flag, and how differently gcc and clang implement it; using -isystem (and cmake’s SYSTEM flag) to keep warnings from third-party headers at bay; and the three flavors of shadow warning, along with how much trouble each one is likely to cause.
And that doesn’t even include one of my original goals, which was to talk about compiler warnings in general, and which ones you want to make sure you use in all your builds. That will have to wait for next time.
At least according to the standard. Different implementations can, and do, behave differently. Or, as the old saying goes: “In theory, there is no difference between theory and practice. In practice, there is”.↩
The first point is that we should be initializing the member variable in the ctor, rather than assigning it, which would make this mistake impossible. That’s a valid point, mostly, but there are times when you can make a case for assignment being a simpler approach — for example, when you have multiple member variables, and when the order of assignment matters. Remember that member variables are initialized in the order of their declaration, not the order in which the initializers appear. Given that the order of declaration is often not obvious, it’s easy to see why one might prefer to use assignment to enforce the order of assignment in the body of the constructor.↩
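As a contrived sketch of the ordering trap (the class and member names here are made up):

#include <iostream>

struct Tracer
{
    Tracer(const char* name) { std::cout << name << " initialized\n"; }
};

class Widget
{
public:
    // The initializer list is written "_first, _second", but members are
    // initialized in declaration order, so the output is
    // "second initialized" followed by "first initialized".
    Widget() : _first("first"), _second("second") {}

private:
    Tracer _second;   // declared first, so initialized first
    Tracer _first;    // declared second, so initialized second
};

int main()
{
    Widget w;
    return 0;
}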
Another mostly valid point is that, if we are going to assign in the body of the ctor, we should at least initialize the members to some value before entering the constructor. The only defense to that is a misguided attempt to optimize out the initialization code, since we know we’re assigning to the member variable anyway. That’s arguably wrong, but not terribly so, and in any event is pretty common, at least in my experience. (OK, smarty-pants, do YOU always initialize ALL your member variables in EVERY constructor you write? Even if you’re going to assign to them in the body of the constructor? Really? Do you want your merit badge now, or at the jamboree?)↩
Last but not least, we could use value initialization to ensure that even POD types in the class are zero-initialized.↩
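For a POD type with no user-written constructor, the difference is literally a pair of parentheses – a minimal sketch:

#include <iostream>

struct D
{
    int    count;
    double total;
};

int main()
{
    D* a = new D;     // default-initialized: count and total hold indeterminate values
    D* b = new D();   // value-initialized: count and total are zero
    std::cout << b->count << " " << b->total << "\n";   // prints "0 0"
    delete a;
    delete b;
    return 0;
}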
Another possibility is that the bits are non-NULL, but the call to free doesn’t immediately crash. Instead, it may leave the data structures used to manage the heap in an inconsistent state, in such a way that it will cause a crash later. This is the worst possible scenario, since this problem is almost impossible to debug. In an upcoming column we’re going to look at this situation in more detail, and talk about ways to avoid it.↩
We’ll be discussing how to use glibc’s error-checking in a future column.↩
clang is a great compiler, with a boatload of extremely helpful tools, including static analysis, run-time memory and data race analysis, and many others. And it’s apparently pretty easy to get those benefits on one of the supported platforms – basically Ubuntu and Mac (via XCode).
That’s fine, but if you get paid to write software, there’s a good chance it’s going to be deployed on RedHat, or one of its variants. And, getting clang working on RedHat is a huge pain in the neck. The good news is that I did the dirty work for you (ouch!), so you don’t have to.
Like almost all compilers, clang is written in a high-level language (in this case C++), so building clang requires a host compiler to do the actual compilation. On Linux this is almost always gcc, since it is ubiquitous on Linux machines.
There’s a hitch, though – as of version 3.3 some parts of clang are written in C++11, so the compiler used to compile clang needs to support the C++11 standard.
This is a real problem with RedHat, since the system compiler supplied with RedHat 6 (the most recent version that is in wide use) is gcc 4.4.7. That compiler does not support C++11, and so is not able to compile clang. So, the first step is getting a C++11-compliant compiler so we can compile clang. For this example, we’re going to choose gcc 4.8.2, for a variety of reasons1. The good news is that gcc 4.8.2 is written in C++98, so we can build it using the system compiler (gcc 4.4.7).
The next thing we have to decide is where to install gcc 4.8.2, and we basically have these choices:
We could install in /usr, where the new compiler would replace the system compiler. Once we do that, though, we’ve effectively created a custom OS that will be required on all our development/QA/production machines going forward. If “all our development/QA/production machines” == 1, this may not be a problem, but as the number increases things can get out of hand quickly. This approach also does not lend itself to being able to have more than one version of a particular package on a single machine, which is often helpful.
We could install in /usr/local (the default for gcc, and many other packages when built from source), so the new compiler would coexist with the system compiler. The problem with this approach is that /usr/local can (and in practice often does) rapidly turn into a dumping-ground for miscellaneous executables and libraries. Which wouldn’t be so bad if we were diligent about keeping track of what they were and where they came from, but if we’re going to do that we might as well …
Install somewhere else – it doesn’t really matter where, as long as there’s a convention. In this case, we’re going to use the convention that any software that is not bundled with the OS gets installed in /build/share/<package>/<version>. This approach makes it easy to know exactly what versions of what software we’re running, since we need to specify its install directory explicitly in PATH and/or LD_LIBRARY_PATH. It also makes it much easier to keep track of what everything is and where it came from.
Here’s a script that will download gcc 4.8.2 along with its prerequisites, build it and install it as per the convention we just discussed:
(The script itself runs to about 50 lines – it downloads the gcc 4.8.2 sources and their prerequisites, builds them, and installs the result under /build/share/gcc/4.8.2.)
To run the script, change to an empty directory and then simply invoke the script. If you want to keep track of all the commands and output related to the build, you can invoke the script using the trick I wrote about in an earlier post.
Now that we’ve built gcc, we can get started building clang2. By default, clang is built to use the C++ standard library (libstdc++) that is included with gcc. That’s the good news, since that means code generated using clang can be intermixed freely with code generated with gcc – which is almost all the code on a typical Linux machine.
The libstdc++.so that is part of gcc is “versioned”, which means that different library versions can have different symbols defined. Since we chose to install gcc 4.8.2 in a non-standard location, there are several settings that need to be tweaked to have code find and use that version of libstdc++3.
Let’s start with a recap of how that works.
With a default installation of gcc, everything is easy: gcc itself is in /usr/bin, include files are in /usr/include (sort of), and library files are in /usr/lib and/or /usr/lib64. In cases where files are not installed in these locations, gcc itself keeps track of where it should look for dependencies, and the following command will show these locations:
> g++ -E -x c++ - -v < /dev/null
...
#include "..." search starts here:
#include <...> search starts here:
/usr/lib/gcc/x86_64-redhat-linux/4.4.7/../../../../include/c++/4.4.7
/usr/lib/gcc/x86_64-redhat-linux/4.4.7/../../../../include/c++/4.4.7/x86_64-redhat-linux
/usr/lib/gcc/x86_64-redhat-linux/4.4.7/../../../../include/c++/4.4.7/backward
/usr/lib/gcc/x86_64-redhat-linux/4.4.7/include
/usr/include
End of search list.
...
LIBRARY_PATH=/usr/lib/gcc/x86_64-redhat-linux/4.4.7/:/usr/lib/gcc/x86_64-redhat-linux/4.4.7/:/usr/lib/gcc/x86_64-redhat-linux/4.4.7/../../../../lib64/:/lib/../lib64/:/usr/lib/../lib64/:/usr/lib/gcc/x86_64-redhat-linux/4.4.7/../../../:/lib/:/usr/lib/
With our non-standard installation of gcc 4.8.2, the same command shows the values appropriate for that version of the compiler:
> /build/share/gcc/4.8.2/bin/g++ -E -x c++ - -v < /dev/null
...
#include "..." search starts here:
#include <...> search starts here:
/shared/build/share/gcc/4.8.2/bin/../lib/gcc/x86_64-unknown-linux-gnu/4.8.2/../../../../include/c++/4.8.2
/shared/build/share/gcc/4.8.2/bin/../lib/gcc/x86_64-unknown-linux-gnu/4.8.2/../../../../include/c++/4.8.2/x86_64-unknown-linux-gnu
/shared/build/share/gcc/4.8.2/bin/../lib/gcc/x86_64-unknown-linux-gnu/4.8.2/../../../../include/c++/4.8.2/backward
/shared/build/share/gcc/4.8.2/bin/../lib/gcc/x86_64-unknown-linux-gnu/4.8.2/include
/shared/build/share/gcc/4.8.2/bin/../lib/gcc/x86_64-unknown-linux-gnu/4.8.2/include-fixed
/shared/build/share/gcc/4.8.2/bin/../lib/gcc/../../include
/usr/include
...
LIBRARY_PATH=/shared/build/share/gcc/4.8.2/bin/../lib/gcc/x86_64-unknown-linux-gnu/4.8.2/:/shared/build/share/gcc/4.8.2/bin/../lib/gcc/:/shared/build/share/gcc/4.8.2/bin/../lib/gcc/x86_64-unknown-linux-gnu/4.8.2/../../../../lib64/:/lib/../lib64/:/usr/lib/../lib64/:/shared/build/share/gcc/4.8.2/bin/../lib/gcc/x86_64-unknown-linux-gnu/4.8.2/../../../:/lib/:/usr/lib/
The situation with clang is a bit (!) more complicated. Not only does clang need to be able to find its own include files and libraries, but it also needs to be able to find the files for the compiler that clang is built with. In order to successfully build clang with a non-standard compiler, we are going to need to specify the following parameters to the clang build:
Parameter | Explanation
---|---
CMAKE_C_COMPILER | The location of the C compiler to use.
CMAKE_CXX_COMPILER | The location of the C++ compiler to use.
CMAKE_INSTALL_PREFIX | The location where the compiler should be installed.
CMAKE_CXX_LINK_FLAGS | Additional flags to be passed to the linker for C++ programs. See below for more information.
GCC_INSTALL_PREFIX | Tells clang where the gcc installation it should use is located. See below for more information.
While all these settings are documented in one place or another, as far as I know there is no single place that mentions them all. (The clang developers apparently prefer writing code to writing documentation ;-) So, these settings have been cobbled together from a number of sources (listed at the end of this article), and tested by much trial and error.
The first three settings are plain-vanilla cmake settings, but the last two need some additional discussion:
In the clang build script this is set to "-L${HOST_GCC}/lib64 -Wl,-rpath,${HOST_GCC}/lib64". What this does is two-fold:
The -L parameter adds the directory that follows it to the linker’s library search path. This is needed so the linker can locate the libraries installed with gcc 4.8.2.
The -Wl,-rpath, parameter installs a “run path” into any executables (including shared libraries) created during the build. This is needed so any executables created can find their dependent libraries at run-time.
Note that you can display the run path for any executable (including shared libraries) with the following command:
> objdump -x /build/share/clang/trunk/bin/clang++ | grep RPATH
RPATH /build/share/gcc/4.8.2/lib64:$ORIGIN/../lib
Unfortunately, by default, clang looks for include and library files in the standard system locations (e.g., /usr), regardless of what compiler was used to build clang. (I filed a bug report for this behavior, but the clang developers apparently feel this is reasonable behavior. Reasonable people may disagree ;-)
The work-around for this is to specify GCC_INSTALL_PREFIX when building clang – this tells the clang build where the gcc that is being used to build clang is located. Among other things, this determines where the clang compiler will look for system include and library files at compile and link time.
Now that we have that out of the way, we can build clang. The following script will download clang source from svn, build and install it.
(The script runs to about 50 lines – it checks out the llvm/clang sources from svn, runs cmake with the parameters described above, and then builds and installs the result following the /build/share/&lt;package&gt;/&lt;version&gt; convention.)
Note that you can specify a parameter to the script (e.g., -r 224019) to get a specific version of clang from svn.
Since this article was originally published, there have been some changes to the prerequisites for building clang: you will need cmake 2.8.12.2 or later, and python 2.7 or later.
At this point, we should have a working clang compiler that we can use to build and run our own code. But once again, because the “host” gcc (and libstdc++) are installed in a non-standard location, we need to tweak a couple of build settings to get a successful build.
There are a bunch of ways to specify the compiler, depending on what build system you’re using – I’ll mention a couple of them here.
If you’re using make, you can prefix the make command as follows:
CC=clang CXX=clang++ make ...
If you’re using cmake you can specify the compiler to use on the cmake command line, as follows:
cmake -DCMAKE_C_COMPILER=clang -DCMAKE_CXX_COMPILER=clang++ ...
Personally, I find that ridiculously inconvenient, so in my CMakeLists.txt file I specify the compiler directly:
# cmake doc says this is naughty, but their suggestions are even worse...
if("$ENV{COMPILER}" STREQUAL "gcc")
set(CMAKE_C_COMPILER gcc)
set(CMAKE_CXX_COMPILER g++)
elseif("$ENV{COMPILER}" STREQUAL "clang")
set(CMAKE_C_COMPILER clang)
set(CMAKE_CXX_COMPILER clang++)
endif()
In any of the above, you can either specify the full path to the compiler, or just specify the name of the compiler executable (as above), and make sure that the executable is on your PATH.
Last but not least, if you’re using GNU autotools – you’re on your own, good luck! The only thing I want to say about autotools is that I agree with this guy.
Any code generated using clang is also going to need to be able to find the libraries that clang was built with at run-time. There are a couple of ways of doing that:
Similar to what we did above when building clang, you can specify the -Wl,-rpath, parameter to the linker to set a run path for your executables.
Note that if you’re using cmake, it will automatically strip the rpath from all files when running make install, so you may need to disable that by setting CMAKE_SKIP_INSTALL_RPATH to false in your build.
Alternatively, you will need to make sure that the proper library directory is on your LD_LIBRARY_PATH at run-time4.
If you’ve followed the directions above, you should be good to go, but be warned that, just like in “Harry Potter”, messing up any part of the spell can cause things to go spectacularly wrong. Here are a few examples:
If you try to configure the build with the stock system compiler (gcc 4.4.7), cmake stops right away:
CMake Error at cmake/modules/HandleLLVMOptions.cmake:17 (message):
Host GCC version must be at least 4.7!
If the build can’t find the gcc 4.8.2 runtime libraries (for example, because the rpath was left out of CMAKE_CXX_LINK_FLAGS), it fails partway through, when it tries to run the freshly built llvm-tblgen:
Linking CXX static library ../../../../lib/libclangAnalysis.a
[ 51%] Built target clangAnalysis
[ 51%] Building CXX object tools/clang/lib/Sema/CMakeFiles/clangSema.dir/SemaConsumer.cpp.o
[ 51%] Building CXX object tools/clang/lib/ARCMigrate/CMakeFiles/clangARCMigrate.dir/TransAutoreleasePool.cpp.o
[ 51%] Building CXX object tools/clang/lib/AST/CMakeFiles/clangAST.dir/ExprConstant.cpp.o
Scanning dependencies of target ClangDriverOptions
[ 51%] Building Options.inc...
../../../../../bin/llvm-tblgen: /usr/lib64/libstdc++.so.6: version `GLIBCXX_3.4.15' not found (required by ../../../../../bin/llvm-tblgen)
If you leave out the -Wl,-rpath parameter, clang won’t be able to find the libraries it needs at compile-time:
> clang++ $* hello.cpp && ./a.out
clang++: /usr/lib64/libstdc++.so.6: version `GLIBCXX_3.4.14' not found (required by clang++)
clang++: /usr/lib64/libstdc++.so.6: version `GLIBCXX_3.4.18' not found (required by clang++)
clang++: /usr/lib64/libstdc++.so.6: version `GLIBCXX_3.4.15' not found (required by clang++)
If you don’t set GCC_INSTALL_PREFIX, clang falls back to the system gcc (4.4.7) headers, and compilation fails:
> clang++ $* hello.cpp && ./a.out
In file included from hello.cpp:1:
In file included from /usr/lib/gcc/x86_64-redhat-linux/4.4.7/../../../../include/c++/4.4.7/iostream:40:
In file included from /usr/lib/gcc/x86_64-redhat-linux/4.4.7/../../../../include/c++/4.4.7/ostream:40:
In file included from /usr/lib/gcc/x86_64-redhat-linux/4.4.7/../../../../include/c++/4.4.7/ios:40:
In file included from /usr/lib/gcc/x86_64-redhat-linux/4.4.7/../../../../include/c++/4.4.7/exception:148:
/usr/lib/gcc/x86_64-redhat-linux/4.4.7/../../../../include/c++/4.4.7/exception_ptr.h:143:13: error: unknown type name 'type_info'
const type_info*
^
1 error generated.
I’ve probably missed a couple, but you get the idea.
There may be another way to build clang successfully on a RH-based system, but if there is I’ve yet to discover it. As mentioned earlier, bits and pieces of this information have been found in other sources, including the following:
http://llvm.org/docs/GettingStarted.html#getting-a-modern-host-c-toolchain
http://clang-developers.42468.n3.nabble.com/getting-clang-to-find-non-default-libstdc-td3945163.html
https://github.com/google/sanitizers/wiki/MemorySanitizerBuildingClangOnOlderSystems
http://llvm.org/docs/CMake.html
One reason is that gcc 4.9.0 can’t compile libc++, the llvm version of the C++ standard library – see http://lists.llvm.org/pipermail/cfe-dev/2014-April/036650.html for more detail. While we’re not going to discuss using libc++ in this post, we may get into that later on.↩
You will need at least version 2.8.12.2 of cmake to do the build, which is not native on RH/CentOS 6. That version can be installed using “Add/Remove Software” or yum. (Or, of course, you can build it from source). You will also need python 2.7 or later, which is probably better built from source, since the RH repos apparently use the non-standard name “python27” for the executable.↩
I may go into more detail on this in a later post, but in the meantime if you’re interested you should consult Ulrich Drepper’s “How to Write Shared Libraries” at http://www.akkadia.org/drepper/dsohowto.pdf.↩
This is the approach we use in my shop – we have a hard-and-fast rule that application code cannot contain a run path, and we deliberately strip any existing RPATH entries from code that is being deployed to QA and production as a security measure.↩
I keep reading talk of the sort “I don’t know why anyone bothers with C++ — real programmers use C. C++ is for wussies”, or words to that effect.
Well, a while ago I had to go back to C from working exclusively in C++ for a while, and I have to say that I think the C fanboys are just nuts.
The project I’m referring to involved packaging up NYSE’s (now SR Labs’) “Mama” middleware so it could be released as open source, as well as implementing a new transport adapter for OpenMama using the open-source Avis transport1.
Mama is a high-level API that provides access to a number of middleware transports, including Tibco Rendezvous, Informatica/29 West LBM and NYSE’s own Data Fabric middleware. Mama and Data Fabric are almost exclusively C code, written back in the days when people avoided C++ because of issues with the various compilers. (Does anyone remember the fun we used to have with gcc 2.95 and templates?)
So, at the time using C may have been the right choice, but it’s far from ideal.
Like a lot of C code, what Mama does is encapsulate functionality by using pointers to opaque structs. These “handles” are created by calling API functions, and then later passed to other API functions to perform actions on the underlying objects represented by the handles.
This is a very popular idiom, and with good reason — hiding the inner details of the implementation insulates applications from changes in the implementation. It’s called “Bridge” by the GOF, and the more colorful “pImpl” by Herb Sutter.
Of course, in C the typical way to accomplish this is with void pointers, so the implementation spends a lot of time casting back and forth between void*’s and “real” pointers. With, of course, absolutely no error checking by the compiler.
For example, in the Avis protocol bridge that I implemented for the initial release of OpenMama, there are a bunch of macros that look like this:
#define avisPublisher(publisher) ((avisPublisherBridge*) publisher)
Elsewhere, the code that uses the macro:
mamaMsg_updateString(msg, SUBJECT_FIELD_NAME, 0, avisPublisher(publisher)->mSubject);
Gee, wouldn’t it be nice to be able to define these handles in such a way that they would be opaque to the applications using the API, but the compiler could still enforce type-checking? Not to mention not having to cast back and forth between void*’s and actual types?
Never mind virtual functions, forget streams (please!) and the STL, ditto templates and operator overloading — if there’s one overriding reason to prefer C++ over C, it’s the compiler’s support for separating interface from implementation that is completely lacking in C.
You see this same “handle” pattern everywhere in C, and it’s “good” C code just because it’s the best that can be done, but if a programmer wrote that code in C++ he’d be laughed out of the building (and rightly so).
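For comparison, here’s roughly what the same handle looks like in C++ – a sketch with made-up names, not the actual OpenMama API:

// publisher.h -- what the application sees
class Publisher
{
public:
    Publisher();
    ~Publisher();
    void updateSubject(const char* subject);

private:
    struct Impl;    // defined only in publisher.cpp
    Impl* mImpl;    // a typed, opaque handle -- no void* casts anywhere
};

// publisher.cpp -- the implementation, invisible to callers
#include <string>

struct Publisher::Impl
{
    std::string mSubject;
};

Publisher::Publisher() : mImpl(new Impl) { }

Publisher::~Publisher() { delete mImpl; }

void Publisher::updateSubject(const char* subject)
{
    mImpl->mSubject = subject;
}

Callers never see Publisher::Impl, but if they pass the wrong kind of pointer the compiler complains immediately – no casts, no void*.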
Has C++ become big and complicated? Sure. Is the syntax sometimes capricious and counter-intuitive? Absolutely.
But, at least for me, if I never see another void* as long as I live, that won’t be too long for me.
When I was a kid I went to Catholic school, and back in those days the nuns would indeed rap your knuckles with a ruler if you misbehaved. That doesn’t happen so much any more, but when I see someone making use of the copy-paste anti-pattern, I’m tempted to reach for a ruler myself. (I know, probably not a good career move ;-)
Short of rapping someone’s knuckles with a ruler, though, how do you show some poor sinner the error of his ways?
Enter CPD, or copy-paste detector. This does pretty much what you would guess from its name – it spins through all the code you give it, and analyzes it for repeated sequences.1
Here’s an example of running the GUI version against the code I used in an earlier post on smart pointers.
(Note that the “Ignore literals” and “Ignore identifiers” checkboxes are disabled if you select C++ as the language - these options are only implemented for Java currently).
The site has several more examples, but this one just blew my mind – hard to imagine how anyone could write this code in the first place, much less be so confident that it is correct that they just copy and paste it into two different files (with nary a comment to tie the two together).
===================================================================== Found a 19 line (329 tokens) duplication in the following files: Starting at line 685 of /usr/local/java/src/java/util/BitSet.java Starting at line 2270 of /usr/local/java/src/java/math/BigInteger.java static int bitLen(int w) { // Binary search - decision tree (5 tests, rarely 6) return (w < 1<<15 ? (w < 1<<7 ? (w < 1<<3 ? (w < 1<<1 ? (w < 1<<0 ? (w<0 ? 32 : 0) : 1) : (w < 1<<2 ? 2 : 3)) : (w < 1<<5 ? (w < 1<<4 ? 4 : 5) : (w < 1<<6 ? 6 : 7))) : (w < 1<<11 ? (w < 1<<9 ? (w < 1<<8 ? 8 : 9) : (w < 1<<10 ? 10 : 11)) : (w < 1<<13 ? (w < 1<<12 ? 12 : 13) : (w < 1<<14 ? 14 : 15)))) : (w < 1<<23 ? (w < 1<<19 ? (w < 1<<17 ? (w < 1<<16 ? 16 : 17) : (w < 1<<18 ? 18 : 19)) : (w < 1<<21 ? (w < 1<<20 ? 20 : 21) : (w < 1<<22 ? 22 : 23))) : (w < 1<<27 ? (w < 1<<25 ? (w < 1<<24 ? 24 : 25) : (w < 1<<26 ? 26 : 27)) : (w < 1<<29 ? (w < 1<<28 ? 28 : 29) : (w < 1<<30 ? 30 : 31))))); }
So, if you need to lead someone to the light, try PMD’s copy-paste detector. It may hurt a bit, but a lot less than a sharp rap on the knuckles!
One last caveat about CPD: it does not like symlinks at all – you must give it the real path names for any source files, or you will just get a bunch of “Skipping … since it appears to be a symlink” messages.
One of the banes of corporate life is the status meeting. It would be nice to get rid of them, but then it would be nice to get rid of all the lawyers too1, and I don’t see that happening either.
So, how do we make them better? Well, for starters we could make them shorter. Here’s a way to do that.
I’ve had two really good managers in my career (although I have no idea whether that means I’ve been blessed or cursed ;-) I learned different things from each of my managers – some of them pragmatic and some more in the “touchy-feely” category. This one is purely pragmatic (which is not to say that is all I learned from him, but those stories are for another day).
My manager (I’ll call him Mike, because that’s his name) set up this structure for a trading system project I was working on under his direction. We would have a status meeting on Monday mornings, which would set the agenda for the coming week, and with the following structure in place that meeting would typically take no more than 30 minutes. At the end, everybody knew what they needed to do, and we all got to work with a clear purpose.
In preparation for this meeting, each person would create a status report, which consisted of the following categories, no more and no less:
This is where you list the tasks that you planned to get done during the period, and that were actually completed. And, by completed we’re not talking about 90% – once a task gets into this category, you don’t expect to see it again – if you do, then it wasn’t really complete in the first place.
This sort of implies that tasks can be accomplished in one iteration, and so naturally forces you to think in terms of discrete quanta of accomplishment. To paraphrase Woody Allen, “A project is like a shark. It has to constantly move forward or it dies”.
Of course, in the real world things never go as smoothly as they are supposed to. (Another way of saying this is: “In theory there is no difference between theory and practice. In practice there is.”) So, this is where you list those items that came up “out of the blue”, but which you had to deal with. An example might be production problems, customer engagements, or other things that are unpredictable.
You would like this category to be empty much of the time, but that doesn’t always happen. Certain tasks may have dependencies on external agents over which you have no control, for instance – so even if you’re the world’s best planner, you’re going to get tasks in here from time to time. Typically you would also include a brief explanation of why something didn’t happen that was supposed to, which might feed into the “Issues for Management Attention” section later.
This is where you list the tasks that you intend to complete during the next iteration. Some may be new, and if there’s anything in the “Planned and Not Accomplished” section, those would generally show up here as well until they are completed. Similarly, everything that appears here will end up in either “Planned and Accomplished” or “Planned and Not Accomplished” next time.
Here’s where you discuss any “blockers” that are preventing you from accomplishing your goals. In many cases, this will include dependencies on external entities, and these provide an opportunity for your manager to exercise his or her persuasive powers on other parts of the organization.
And that’s it – just five categories, with a real focus on what’s truly important.
After learning this technique, I’ve used it with great success in all sorts of situations, and have also passed it along to others, who have all been quite happy with it. It strikes just the right balance between too much and too little information, and also focuses the information in a way that is most useful.
On Mike’s project, we had these meetings on Monday mornings, but Friday also works – that way you get it out of the way and come into the new week rarin’ to go.
If you’re a manager, give this a try and see if it doesn’t streamline one of your more onerous tasks. If you’re not a manager, then you probably have one, and you can suggest using this for your status reports. Or, you can just use this template to help you focus yourself on what you need to be doing.
Actually, some of my best friends are lawyers, so this doesn’t apply to them ;-)↩
No, not that – it’s Perl day. (Well, actually it’s just Wednesday, but you get the idea).
Sometimes it seems that everybody likes to hate on Perl, but I think their animus is misdirected. It’s not Perl that’s the problem, it’s those \^\$(.#!)?$ regular expressions.
Or, as Jamie Zawinski once said: “Some people, when confronted with a problem, think ‘I know, I’ll use regular expressions.’ Now they have two problems.”
Well, I’m here to tell you that it’s possible to write whole Perl programs that actually accomplish useful work, without any regular expressions at all! And, if you do that, you can actually read the code!
It turns out that Perl is a dandy scripting language, and while some may take issue with its flexibility (“There’s more than one way to do it”), others (including me) find that flexibility very useful.
One example of that flexibility is how easy it is to create a Perl program that can read input either from stdin, or from a file specified on the command line.
local *INFILE;
if (defined($ARGV[0])) {
open(INFILE, "<:crlf", "$ARGV[0]") or die "Cant open $ARGV[0]\n";
}
else {
*INFILE = *STDIN;
}
while (<INFILE>) {
    # process each input line (in $_) here
}
close(INFILE);
The above snippet does just that, and also works well with command-line parsers (e.g., GetOpt) that eat their parameters by removing them from the ARGV array.
From Robinson Crusoe to Gilligan’s Island to Lost, tales of being stranded on a desert island seem to resonate with people in a special way. Some of that likely has to do with the exotic locales, and the practical challenges of getting water, food and shelter.
But an even more basic part is the unanswered question: “Where am I?” that makes things so – well, mysterious.
Shell scripting can be pretty mysterious too at times, but in this installment we’ll learn how to answer that basic question of “Where am I?” to make shell scripting a little less mysterious.
One of the tenets of the Unix way is brevity, and one consequence of that is that well-behaved programs should be able to find whatever other resources they need without having to be told where they are. Windows attempts to solve this problem with the (gack!) registry, but Unix tends to use a simpler approach: needed resources are placed either in well-known locations (e.g., /etc for system programs), or where they can be found relative to the location of the program itself.
Another attribute of a well-behaved Unix program is that it should be able to run from any location, whether it’s invoked with a full path, or found via the PATH variable.
So, how do we reconcile those two requirements? And specifically, how do we do that in shell scripts? Because, regardless of what your “main” language is, if you’re programming in Unix/Linux, you’re probably writing a boatload of shell scripts too.
It turns out that, at least in bash, there is a simple but non-obvious way to get the location of the script file itself, which goes something like this:
SCRIPT_DIR=$(cd $(dirname ${BASH_SOURCE}) && /bin/pwd)
Let’s follow this through and see how it works:
The $( ... ) construct invokes a sub-shell. This is handy since it allows us to change the environment of the sub-shell (e.g., current directory) without affecting the current environment.
$BASH_SOURCE is a builtin variable that gives us the path to the shell script itself. For instance, if we invoke a script with ./scriptname.sh, then that’s what will end up in ${BASH_SOURCE}.
To get the full path, we first extract just the directory part with dirname, again in a sub-shell.
We then cd into that directory, and if successful get the full pathname with /bin/pwd.
Note that we use /bin/pwd to get the path. This version resolves any symbolic links to return the actual physical path. There is also a pwd built-in to bash, but that one does not expand symbolic links by default.
We now have the full path of the script file itself, and can use that to locate any other resources needed by the script. For a real-world example, you can check out these scripts from my earlier post on visualizing latency.
I’m a visual thinker (I think I may have mentioned that before), so when I’m analyzing performance, latency, etc. I find it really helpful to be able to visualize what is going on on the machine.
As a result, I had gotten reasonably good at using Excel to produce charts, which sometimes helped to correlate observed behaviors like latency spikes with other events on the machine.
For a bunch of reasons I wanted to move away from Excel, though, and find another tool that would give me the same or better functionality.
For one thing, a little over a year ago I switched to a Mac as my main machine after years of using Windows. There was a certain amount of adjustment, but for the most part it’s been smooth sailing. More than that, I was actually able to recapture some of the fun and excitement I remember from my first Apple (an Apple ][).
I also wanted something that would run on both the Mac and Linux, where I do most of my testing. Last but not least, I wanted something that would be scriptable so I could easily produce consistent charts for multiple test runs.
I looked briefly at R, but ditched it when it used up all the 8GB in my laptop, plus the entire hard disk as swap, for a single dataset of 100,000 points. Probably my bad, but I didn’t have the patience to figure out what I might be doing wrong.
At that point I turned to venerable (some would say crusty) gnuplot. It’s a bit long in the tooth, but I just wanted to plot latency over time, so how hard could that be? Well, I guess it’s pretty easy if you already know how, but starting from scratch is another story.
Which brings me to my rant of the day, directed at techies in general, and to the (us?) Linux/Unix techies in particular.
Short version: I don’t want to learn gnuplot. I don’t even want to have learned gnuplot – even if I could do that by just taking a pill. What I want is to be able to produce decent-looking charts without knowing anything about gnuplot.
To be fair, the gnuplot docs did have some examples – more anyway than you would find in a typical man page, although that’s admittedly a low bar. And while my google-fu is usually pretty good, I just couldn’t find anything on the intertubes that would work for me, so I had to learn just a little gnuplot.
When all else fails, read the instructions.
It turns out that gnuplot works pretty well, and will probably work even better once I learn (sigh) how to use it better.
But you don’t have to learn diddly if you don’t want to. Here is the first in what will hopefully be a series of recipes that you can use with little or no modification. Once you’ve downloaded the repo, enter the following at the command prompt:
./tsd.sh ping.csv x11
Which should result in something like this:
It’s primitive, but that very primitiveness has its own appeal, especially for those of us for whom “UI” means bash, vi or emacs.
A couple of points about the gnuplot command files:
Sometimes you care about the actual time that an event took place, so you can correlate it with some other event; sometimes you don’t. Accordingly, I’ve created two different files: one which displays actual time (ts.gp), the other which calculates and displays deltaT (tsd.gp).
I’ve been programming in C (and later C++) for many years, but I don’t think I’ve ever purposely used the comma operator before. Well, expressions in gnuplot follow C language rules for operators, precedence, etc. and that comma operator turns out to be handy – in this case it lets us update the origin in the same expression that calculates deltaT. (The return value of the comma operator is the right-hand expression).
(Note that the above requires something like gnuplot 4.6.)
gnuplot -e "set terminal"
.Comments, suggestions, pull requests, etc. welcome.
Well, I took one of these tests a while back that actually told me something about myself – it was the “Learning-Style Inventory” test, and what it said about me is that I’m waaaayyy over at the end of the scale when it comes to visual thinking. That gave me an insight into the way my brain works that I’ve found really helpful ever since. So, this next bit was right up my alley, but I’m guessing you’ll like it too.
We read a lot lately about NUMA architecture and how it presents a fundamental change in the way we approach writing efficient code: it’s no longer about the CPU, it’s all about RAM. We all nod and say “Sure, I get that!” Well, I thought I got it too, but until I saw this web page, I really didn’t.
See the full discussion at http://overbyte.com.au/index.php/overbyte-blog/entry/optimisation-lesson-3-the-memory-bottleneck.
Valgrind has been an indispensable tool for C/C++ programmers for a long time, and I’ve used it quite happily – it’s a tremendous tool for doing dynamic analysis of program behavior at run time. valgrind1 can detect reads of uninitialized memory, heap buffer overruns, memory leaks, and other errors that can be difficult or impossible to find by eyeballing the code, or by static analysis tools. But that comes with a price, which in some cases can be quite steep, and some new tools promise to provide some or all of the functionality valgrind provides without the drawbacks.
For one thing, valgrind can be extremely slow. That is an unavoidable side-effect of one of valgrind’s strengths, which is that it doesn’t require that the program under test be instrumented beforehand – it can analyze any executable (including shared objects) “right out of the box”. That works because valgrind effectively emulates the hardware the program runs on, but that leads to a potential problem: valgrind instruments all the code, including shared objects – and that includes third-party code (e.g., libraries, etc.) that you may not have any control over.
In my case, that ended up being a real problem. The main reason being that a significant portion of the application I work with is hosted in a JVM (because it runs in-proc to a Java-based FIX engine, using a thin JNI layer). The valgrind folks say that the slowdown using their tool can be up to 20x, but it seemed like more, because the entire JVM was being emulated.
And, because valgrind emulates everything, it also detects and reports problems in the JVM itself. Well, it turns out that the JVM plays a lot of tricks that valgrind doesn’t like, and the result is a flood of complaints that overwhelm any potential issues in the application itself.
So, I was very interested in learning about a similar technology that promised to address some of these problems. Address Sanitizer (Asan from here on) was originally developed as part of the clang project, and largely by folks at Google. They took a different approach: while valgrind emulates the machine at run-time, Asan works by instrumenting the code at compile-time.
That helps to solve the two big problems that I was having with valgrind: its slowness, and the difficulty of excluding third-party libraries from the analysis.
Since I was already building the application using clang for its excellent diagnostics and static analysis features, I thought it would be relatively straightforward to introduce the Asan feature into the build. Turns out there is a bump in that road: clang’s version of Asan is supplied only as a static library that is linked into the main executable. And while it should be possible to re-jigger things to make it work as a shared library, that would turn into a bit of science project. That, and the fact that the wiki page discussing it (https://github.com/google/sanitizers/wiki/AddressSanitizerAsDso) didn’t sound particularly encouraging (“however the devil is in the detail” – uhh, thanks, no).
Rats! However, the wiki page did mention that there was a version of Asan that worked with gcc, and that version apparently did support deployment as a shared object. So, I decided to give that a try…
It turns out that the gcc developers haven’t been sitting still – in fact, it looks like there is a bit of a healthy rivalry between the clang and gcc folks, and that’s a good thing for you and me. Starting with version 4.8 of the gcc collection, Asan is available with gcc as well.2
Getting the latest gcc version (4.8.2 as of this writing), building and installing it was relatively straight-forward. By default, the source build installs into /usr/local, so it can co-exist nicely with the native gcc for the platform (in the case of Red Hat/CentOS 6.5, that is the relatively ancient gcc 4.4 branch).
Including support for Asan in your build is pretty simple – just include the -fsanitize=address flag in both the compile and link step. (Note that this means you need to invoke the linker via the compiler driver, rather than directly. In practice, this means that the executable you specify for the link step should be g++ (or gcc), not ld).
While not strictly required, it’s also a very good idea to include the -fno-omit-frame-pointer flag in the compile step. This will prevent the compiler from optimizing away the frame pointer (ebp) register. While disabling any optimization might seem like a bad idea, in this case the performance benefit is likely minimal at best3, but the inability to get accurate stack frames is a show-stopper.
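To see it in action, here’s a tiny made-up program with a heap buffer overrun – built with the flags above, Asan aborts at the bad write and prints a report with full stack traces:

// overflow.cpp -- hypothetical example, not from the application discussed here
// build: g++ -g -fno-omit-frame-pointer -fsanitize=address overflow.cpp
#include <cstring>

int main()
{
    char* buf = new char[8];
    std::memset(buf, 'x', 9);   // writes one byte past the end of the allocation
    delete [] buf;
    return 0;
}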
If you’re checking an executable that you build yourself, the prior steps are all you need – libasan.so will get linked into your executable by virtue of the -fsanitize=address flag.
In my case, though, the goal was to instrument code running in the JVM, so I had to force libasan.so into the executable at runtime using LD_PRELOAD, like so:
LD_PRELOAD=/usr/local/lib64/libasan.so.0 java ...
And that’s it!
There are a bunch of options available to tailor the way Asan works: at compile-time you can supply a “blacklist” of functions that Asan should NOT instrument, and at run-time you can further customize Asan using the ASAN_OPTIONS environment variable, which is discussed here.
By default, Asan is silent, so you may not be certain that it’s actually working unless it aborts with an error, which would look like one of these.
You can check that Asan is linked in to your executable using ldd:
$ ldd a.out linux-vdso.so.1 => (0x00007fff749ff000) libasan.so.0 => /usr/local/lib64/libasan.so.0 (0x00007f57065f7000) libstdc++.so.6 => /usr/local/lib64/libstdc++.so.6 (0x00007f57062ed000) libm.so.6 => /lib64/libm.so.6 (0x0000003dacc00000) libgcc_s.so.1 => /usr/local/lib64/libgcc_s.so.1 (0x00007f57060bd000) libc.so.6 => /lib64/libc.so.6 (0x0000003dad000000) libpthread.so.0 => /lib64/libpthread.so.0 (0x0000003dad800000) libdl.so.2 => /lib64/libdl.so.2 (0x0000003dad400000) /lib64/ld-linux-x86-64.so.2 (0x0000003dac800000)
You can also up the default verbosity level of Asan to get an idea of what is going on at run-time:
export ASAN_OPTIONS="verbosity=1:..."
If you’re using LD_PRELOAD to inject Asan into an executable that was not built using Asan, you may see output that looks like the following:
==25140== AddressSanitizer: failed to intercept 'memset' ==25140== AddressSanitizer: failed to intercept 'strcat' ==25140== AddressSanitizer: failed to intercept 'strchr' ==25140== AddressSanitizer: failed to intercept 'strcmp' ==25140== AddressSanitizer: failed to intercept 'strcpy' ==25140== AddressSanitizer: failed to intercept 'strlen' ==25140== AddressSanitizer: failed to intercept 'strncmp' ==25140== AddressSanitizer: failed to intercept 'strncpy' ==25140== AddressSanitizer: failed to intercept 'pthread_create' ==25140== AddressSanitizer: libc interceptors initialized
Don’t worry – it turns out that is a bogus warning related to running Asan as a shared object. Unfortunately, the Asan developers don’t seem to want to fix this (http://gcc.gnu.org/bugzilla/show_bug.cgi?id=58680).
So, how did this all turn out? Well, it’s pretty early in the process, but Asan has already caught a memory corruption problem that would have been extremely difficult to track down otherwise. (Short version is that due to some unintended name collisions between shared libraries, we were trying to put 10 pounds of bologna in a 5 pound sack. Or, as one of my colleagues more accurately pointed out, 8 pounds of bologna in a 4 pound sack ;-)
valgrind is still an extremely valuable tool, especially because of its convenience and versatility; but in certain edge cases Asan can bring things to the table, like speed and selectivity, that make it the better choice.
Before closing there are a few more things I want to mention about Asan in comparison to valgrind:
If you look at the processes using Asan with top, etc. you may be a bit shocked at first to see they are using 4TB (or more) of memory. Relax – it’s not real memory, it’s virtual memory (i.e., address space). The algorithm used by Asan to track memory “shadows” actual memory (one bit for every byte), so it needs that whole address space. Actual memory use is greater with Asan as well, but not nearly as bad as it appears at first glance. Even so, Asan disables core files by default, at least in 64-bit mode.
As hoped, Asan is way faster than valgrind, especially in my “worst-case” scenario with the JVM, since the only code that’s paying the price of tracking memory accesses is the code that is deliberately instrumented. That also eliminates false positives from the JVM, which is a very good thing.
As for false positives, the Asan folks apparently don’t believe in them, because there is no “suppression” mechanism like there is in valgrind. Instead, the Asan folks ask that if you find what you think is a false positive, you file a bug report with them. In fact, when Asan finds a memory error it immediately aborts – the rationale being that allowing Asan to continue after a memory error would be much more work, and would make Asan much slower. Let’s hope they’re right about the absence of false positives, but even so this “feature” is bound to make the debug cycle longer, so there are probably cases where valgrind is a better choice – at least for initial debugging.
Asan and valgrind have slightly different capabilities, too:
Asan can find stack corruption errors, while valgrind only tracks heap allocations.
Both valgrind and Asan can detect memory leaks (although Asan’s leak checking support is “still experimental” - see https://github.com/google/sanitizers/wiki/AddressSanitizerLeakSanitizer).
valgrind also detects reads of un-initialized memory, which Asan does not.
A detailed comparison of Asan, valgrind and other tools can be found here.
http://en.wikipedia.org/wiki/AddressSanitizer
https://github.com/google/sanitizers/wiki/AddressSanitizer
http://clang.llvm.org/docs/AddressSanitizer.html
In this paper, I use the term valgrind, but I really mean valgrind with the memcheck tool. valgrind includes a bunch of other tools as well – see http://valgrind.org for details.↩
As is another tool, the Thread Sanitizer, which detects data races between threads at run-time. More on that in an upcoming post.↩
Omitting the frame pointer makes another register (ebp) available to the compiler, but since there are already at least a dozen other registers for the compiler to use, this extra register is unlikely to be critical. The compiler can also omit the code that saves and restores the register, but that’s a couple of instructions moving data between registers and L1 cache. ↩