Strace, Truss & Dtrace
At work, while porting a large program from Linux to Solaris, I hit a snag. It compiled and ran fine as a 32-bit build with the Sun C compiler, but it also needed to run as a 64-bit build. The 64-bit compile went fine once I had passed the right compiler flags and found the correct libraries; unfortunately, running the result gave an instant segmentation fault.
A segmentation fault is raised when a program tries to use memory it does not have permission to use, whether by reading from it or writing to it. Basically it is a total pain in the ass, and debugging this type of problem can be tricky.
In C, one of the most common causes, and usually the first one people run into, is the scanf() function. For instance, the following code reads numbers from standard input and writes them back out:
#include <stdio.h>

int main(void)
{
    int n;

    while (scanf("%d", &n) == 1) {
        printf("%d\n", n);
    }
    return 0;
}
Running this should work fine. Now let's edit the code so that, instead of passing &n (a pointer to n) as the argument to scanf(), we just pass the int n:
#include <stdio.h>

int main(void)
{
    int n;

    /* Deliberate bug: n is passed where scanf() expects &n. */
    while (scanf("%d", n) == 1)
        printf("%d\n", n);
    return 0;
}
Running this will almost certainly give us a segmentation fault: scanf() treats the uninitialised value of n as an address and tries to write to it. (Strictly speaking this is undefined behaviour, so a crash is likely rather than guaranteed.) In a large program this can be insanely difficult to debug, as a bad pointer may be set up a long way from the function call where it finally gets used. So how can you start to debug this?
Well, the tool I used for the first time was truss (the Linux equivalent is strace, and on OS X the closest is dtruss, which is built on DTrace). It lets you view the system calls a process makes as it runs, and which system call it had reached when it crashed. In other words, if you are trying to open a file that can't be opened, or touching memory you aren't allowed to touch, you will see the failing call or the fatal signal here:
read(0, "1\n", 1024) = 2
--- SIGSEGV (Segmentation fault) @ 0 (0) ---
+++ killed by SIGSEGV +++
Here are the final lines of output after running strace on my faulty scanf() code. They show clearly that my program dies while handling the line it has just read from standard input: the read() call returns, and the very next event is the SIGSEGV. The other important thing is that I didn't have to recompile my code with a debugging flag to work out where my bug was. strace and friends record system calls and signals, so they tell you when and where you touch files and what the process was doing when it died; remember that Unix treats files, devices and standard input as essentially the same thing.
There is a huge amount to these programs that I haven't talked about here. This is a very simple use of the command, but one I wanted to document (as much for my own future reference as for anyone else!) because it is an essential tool for figuring out why a program is dying; you don't even need the source code to do this kind of debugging. I think it's also important to be aware that these tools can help when checking that your code is secure: the effects of integer wrapping and similar vulnerabilities can show up in the arguments a program passes to its system calls.