Friday, 17 May 2013

Kernel tracing using lttng

LTTng (Linux Trace Toolkit - next generation) is a highly efficient system tracer that allows tracing of the kernel and userspace. It also provides tools to view and analyse the gathered trace data.  So let's see how to install and use LTTng kernel tracing in Ubuntu. First, one has to install the LTTng userspace tools:
 sudo apt-get update  
 sudo apt-get install lttng-tools  
LTTng was already recently added into the Ubuntu 13.10 Saucy kernel, however, with earlier releases one needs to install the LTTng kernel driver using lttng-modules-dkms as follows:

 sudo apt-get install lttng-modules-dkms  
It is a good idea to sanity check to see if the tools and driver are installed correctly, so first check to see the available kernel events on your machine:
 sudo lttng list -k  
And you should get a list similar to the following:
 Kernel events:  
 -------------  
    mm_vmscan_kswapd_sleep (loglevel: TRACE_EMERG (0)) (type: tracepoint)  
    mm_vmscan_kswapd_wake (loglevel: TRACE_EMERG (0)) (type: tracepoint)  
    mm_vmscan_wakeup_kswapd (loglevel: TRACE_EMERG (0)) (type: tracepoint)  
    mm_vmscan_direct_reclaim_begin (loglevel: TRACE_EMERG (0)) (type: tracepoint)  
    mm_vmscan_memcg_reclaim_begin (loglevel: TRACE_EMERG (0)) (type: tracepoint)  
 ..  
Next, we need to create a tracing session:
 sudo lttng create examplesession  
..and enable events to be traced using:
 sudo lttng enable-event sched_process_exec -k  
One can also specify multiple events as a comma separated list. Next, start the tracing using:
 sudo lttng start  
and to stop and complete the tracing use:
 sudo lttng stop  
 sudo lttng destroy  
and the trace data will be saved in the directory ~/lttng-traces/examplesession-[date]-[time]/.  One can examine the trace data using the babeltrace tool, for example:
 sudo babeltrace ~/lttng-traces/examplesession-20130517-125533  
And you should get a list similar to the following:
 [12:56:04.490960303] (+?.?????????) x220i sched_process_exec: { cpu_id = 2 }, { filename = "/usr/bin/firefox", tid = 4892, old_tid = 4892 }  
 [12:56:04.493116594] (+0.002156291) x220i sched_process_exec: { cpu_id = 0 }, { filename = "/usr/bin/which", tid = 4895, old_tid = 4895 }  
 [12:56:04.496291224] (+0.003174630) x220i sched_process_exec: { cpu_id = 2 }, { filename = "/usr/lib/firefox/firefox", tid = 4892, old_tid = 4892 }  
 [12:56:05.472770438] (+0.976479214) x220i sched_process_exec: { cpu_id = 2 }, { filename = "/usr/lib/libunity-webapps/unity-webapps-service", tid = 4910, old_tid = 4910 }  
 [12:56:05.478117340] (+0.005346902) x220i sched_process_exec: { cpu_id = 2 }, { filename = "/usr/bin/ubuntu-webapps-update-index", tid = 4912, old_tid = 4912 }  
 [12:56:10.834043409] (+5.355926069) x220i sched_process_exec: { cpu_id = 3 }, { filename = "/usr/bin/top", tid = 4937, old_tid = 4937 }  
 [12:56:13.668306764] (+2.834263355) x220i sched_process_exec: { cpu_id = 3 }, { filename = "/bin/ps", tid = 4938, old_tid = 4938 }  
 [12:56:16.047191671] (+2.378884907) x220i sched_process_exec: { cpu_id = 3 }, { filename = "/usr/bin/sudo", tid = 4939, old_tid = 4939 }  
 [12:56:16.059363974] (+0.012172303) x220i sched_process_exec: { cpu_id = 3 }, { filename = "/usr/bin/lttng", tid = 4940, old_tid = 4940 }  
The LTTng wiki contains many useful worked examples and is well worth exploring.

As it stands, LTTng is relatively light weight.   Research by Romik Guha Anjoy and Soumya Kanti Chakraborty shows that LTTng describes how the CPU overhead is ~1.6% on a Intel® CoreTM 2 Quad with four 64 bit Q9550 cores.  With measurements I've made with oprofile on a Nexus 4 with 1.5 GHz quad-core Snapdragon S4 Pro processor shows a CPU overhead of < 1% for kernel tracing.  In flight recorder mode, one can generate a lot of trace data. For example, with all tracing enabled running multiple stress tests I was able to generate ~850K second of trace data, so this will obviously impact disk I/O.

Wednesday, 8 May 2013

Getting started with oprofile on Ubuntu

Oprofile is a powerful system wide profiler for Linux.  It can profile all running code on a system with minimal overhead.   Running oprofile requires the uncompressed vmlinux image, so one has to also install the kernel .ddeb images.

To install oprofile:
 sudo apt-get update && sudo apt-get install oprofile
..and then install the kernel .ddebs:
 echo "deb http://ddebs.ubuntu.com $(lsb_release -cs) main restricted universe multiverse" | \  
 sudo tee -a /etc/apt/sources.list.d/ddebs.list  
 sudo apt-key adv --keyserver keyserver.ubuntu.com --recv-keys 428D7C01  
 sudo apt-get update  
 sudo apt-get install linux-image-$(uname -r)-dbgsym  
 ..the installed vmlinux image can be found in /usr/lib/debug/boot/vmlinux-$(uname-r)

Oprofile is now ready to be used.  Let's assume we want to profile the following command:
 dd if=/dev/urandom of=/dev/null bs=4K  
First, before running opcontrol, one may have to stop the NMI watchdog to free up counter 0 using the following:
 echo "0" | sudo tee /proc/sys/kernel/watchdog  
Next, we tell opcontrol the location of vmlinux, separate out kernel samples, initialize, reset profiling and start profiling:
 sudo opcontrol --vmlinux=/usr/lib/debug/boot/vmlinux-$(uname -r)  
 sudo opcontrol --separate=kernel  
 sudo opcontrol --init  
 sudo opcontrol --reset  
 sudo opcontrol --start  
 ..and run the command we want to profile for the desired duration. Next we stop profiling, generate a report for the executable we are interested in and de-initialize oprofile using:
 sudo opcontrol --stop  
 sudo opreport image:/bin/dd -gl  
 sudo opcontrol --deinit  
The resulting output from opreport is as follows:
 Using /var/lib/oprofile/samples/ for samples directory.  
 warning: /kvm could not be found.  
 CPU: Intel Ivy Bridge microarchitecture, speed 2.501e+06 MHz (estimated)  
 Counted CPU_CLK_UNHALTED events (Clock cycles when not halted) with a unit mask of 0x00 (No unit mask) count 100000  
 samples %    image name        symbol name  
 55868  59.8973 vmlinux-3.9.0-0-generic sha_transform  
 14942  16.0196 vmlinux-3.9.0-0-generic random_poll  
 10971  11.7622 vmlinux-3.9.0-0-generic ftrace_define_fields_random__mix_pool_bytes  
 3977   4.2638 vmlinux-3.9.0-0-generic extract_buf  
 1905   2.0424 vmlinux-3.9.0-0-generic __mix_pool_bytes  
 1596   1.7111 vmlinux-3.9.0-0-generic _mix_pool_bytes  
 900    0.9649 vmlinux-3.9.0-0-generic __ticket_spin_lock  
 737    0.7902 vmlinux-3.9.0-0-generic copy_user_enhanced_fast_string  
 574    0.6154 vmlinux-3.9.0-0-generic perf_trace_random__extract_entropy  
 419    0.4492 vmlinux-3.9.0-0-generic extract_entropy_user  
 336    0.3602 vmlinux-3.9.0-0-generic random_fasync  
 146    0.1565 vmlinux-3.9.0-0-generic sha_init  
 133    0.1426 vmlinux-3.9.0-0-generic wait_for_completion  
 129    0.1383 vmlinux-3.9.0-0-generic __ticket_spin_unlock  
 72     0.0772 vmlinux-3.9.0-0-generic default_spin_lock_flags  
 69     0.0740 vmlinux-3.9.0-0-generic _copy_to_user  
 35     0.0375 dd            /bin/dd  
 23     0.0247 vmlinux-3.9.0-0-generic __srcu_read_lock  
 22     0.0236 vmlinux-3.9.0-0-generic account  
 15     0.0161 vmlinux-3.9.0-0-generic fsnotify
...  
This example just scratches the surface of the capabilities of oprofile. For further reading I recommend reading the oprofile manual as it contains some excellent examples.

Friday, 26 April 2013

Firmware Test Suite New Features in Ubuntu Raring 13.04

The Firmware Test Suite (fwts) is a tool containing a large set of tests to exercise and diagnose firmware related bugs in x86 PC firmware.  So what new shiny features have appeared in the new Ubuntu Raring 13.04 release?

UEFI specific tests to exercise and stress test various UEFI run time services:
 
  * Stress test for miscellaneous run time service interfaces.
  * Test get/set time interfaces.
  * Test get/set wakeup time interfaces.
  * Test get variable interface.
  * Test get next variable name interface.
  * Test set variable interface.
  * Test query variable info interface. 
  * Set variable interface stress test.
  * Query variable info interface stress test.
  * Test Miscellaneous runtime service interfaces.

These use a new kernel driver to allow fwts to access the kernel UEFI run time interfaces.  The driver is built and installed using DKMS.

ACPI specific improvements:

  * Improved ACPI 5.0 support
  * Annotated ACPI _CRS (Current Resource Settings) dumping.

Kernel log scanning (finds and diagnoses errors as reported by the kernel):

  * Improved kernel log scanning with an additional 450 tests.

This release also includes many small bug fixes as well as minor improvements to the layout of the output of some of the tests.

Many thanks to Alex Hung, Ivan Hu, Keng-Yu Lin and Matt Fleming for all the improvements to fwts for this release.

Sunday, 21 April 2013

Valgrind stack traces

Sometimes when debugging an application it is useful to generate a stack dump when a specific code path is being executed.  The valgrind tool provides a very useful and easy to use mechanism to do this:

1. Add in the following to the source file:
 #include <valgrind/valgrind.h>  
2. Generate the stack trace at the point you desire (and print a specific message) using VALGRIND_PRINTF_BACKTRACE(), for example:
 VALGRIND_PRINTF_BACKTRACE("Stack trace @ %s(), %d", __func__, __LINE__);  
3. Run the program with valgrind.  You may wish to use the --tool=none option to make valgrind run a little faster:
  valgrind --tool=none ./generate/unix/bin64/acpiexec *.dat  
4. Observe the strack trace. For example, I added this to the ACPICA acpiexec in AcpiDsInitOneObject() and got stack traces such as:
 ACPI: SSDT 0x563a480 00249 (v01 LENOVO TP-SSDT2 00000200 INTL 20061109)  
 **7129** Stack trace @ AcpiDsInitOneObject(), 174  at 0x416041: VALGRIND_PRINTF_BACKTRACE (in /home/king/repos/acpica/generate/unix/bin64/acpiexec)  
 ==7129==  by 0x4160A6: AcpiDsInitOneObject (in /home/king/repos/acpica/generate/unix/bin64/acpiexec)  
 ==7129==  by 0x441F76: AcpiNsWalkNamespace (in /home/king/repos/acpica/generate/unix/bin64/acpiexec)  
 ==7129==  by 0x416312: AcpiDsInitializeObjects (in /home/king/repos/acpica/generate/unix/bin64/acpiexec)  
 ==7129==  by 0x43D84D: AcpiNsLoadTable (in /home/king/repos/acpica/generate/unix/bin64/acpiexec)  
 ==7129==  by 0x450448: AcpiTbLoadNamespace (in /home/king/repos/acpica/generate/unix/bin64/acpiexec)  
 ==7129==  by 0x4502F6: AcpiLoadTables (in /home/king/repos/acpica/generate/unix/bin64/acpiexec)  
 ==7129==  by 0x405D1A: AeInstallTables (in /home/king/repos/acpica/generate/unix/bin64/acpiexec)  
 ==7129==  by 0x4052E8: main (in /home/king/repos/acpica/generate/unix/bin64/acpiexec)  

There are a collection of very useful tricks to be found in the Valgrind online manual which I recommend perusing at your leisure.

Friday, 1 March 2013

Pragmatic Graphing

Over the past few days I have been analysing various issues and also doing some background research, so I have been collecting some rather large sets of data to process.   Normally I filter, re-format and process the data using a bunch of simple tools such as awk, tr, cut, sort, uniq and grep to get the data into some form where it can be plotted using gnuplot. 

The UNIX philosophy of piping together a bunch of tools to produce the final output normally works fine, however, graphing the data with gnuplot always ends up with me digging around in the online gnuplot documentation or reading old gnuplot files to remind myself exactly how to plot the data just the way I want.   This is fine for occasions where I gather lots of identical logs and want to compare results from multiple tests, the investment in time to automate this with gnuplot is well worth the hassle.   However, some times I just have a handful of samples and want to plot a graph and then quickly re-jig the data and perhaps calculate some statistical information such a trend lines.  In this case, I fall back to shoving the samples into LibreOffice Calc and slamming out some quick graphs.

This makes me choke a bit.  Using LibreOffice Calc starts to make me feel like I'm an accountant rather than a software engineer.  However, once I have swallowed my pride I have come to the conclusion that one has to be pragmatic and use the right tool for the job.  To turn around small amounts of data quickly, LibreOffice Calc does seem to be quite useful.   For processing huge datasets and automated graph plotting, gnuplot does the trick (as long as I can remember how to use it).   I am a command line junkie and really don't like using GUI based power tools, but there does seem to be a place where I can mix the two quite happily.

Wednesday, 19 December 2012

..and now something seasonal

The Christmas holidays are almost here and to celebrate I've mildly obfuscated some C that prints the lyrics to a traditional English carol. The source code can be found here and compiled and run using:

#include <stdio.h>
#include <stdbool.h>
#include <malloc.h>
#define H malloc (1237) ;
#define c while
#define W >> /* fallen right tree */
#define w << /* fallen left tree */
#define B(wish,times) ((_ &wish) times 2)
#define L {o Q=q;p(3[Z]),P(false[Z],q),\
p(q>2?"th":N),p(Z[2]);c(true+Q)p(q|Q?Q?\
N:4[Z]:"a "),P(1[Z],Q),p(Q>1?", ":N),--\
Q;p(".\n\n"),++q;}
char typedef o;void typedef d

                         ;
                        o*N
                       ="";o
                      q;o*O(o
                       *p){o
                      *P,_;o*
                     f=P=H;c(_
                    =*p)_^=1,*f
                   ++=B(1,w)|B(4
                    ,W)|B(2,w)|
                   B(8,W)|(_&240
                  ),++p,*f=false;
                 return P;}d p(o*f
                ){fputs(f,stdout);}
                 d P(o*s,o o){c( o
                --){c(122-*s)s++; s
               ++;}c(122-*s)putchar(
              *s++);} o main(void){o*
             Z[]={O("hgy}p{}dmnj`{pcg"
              "y`{hny{hgh{}gs{}dxdj{"
             "dglc{jgj{pdj{dbdxdj{p|d"
            "bh{"),O("qeypyg`ld!gj!e!q"
           "dey!pydd{p|n!ptypbd!`nxd}{p"
          "cydd!Hydjmc!Cdj}{hnty!mnbbw!i"
                       "gy`"
                       "}{h"
                       "gxd"
                       "!ln"
                       "b`d"
                 "j!ygjl}{}gs!ldd}"
                 "d&e&bewgjl{}dxdj"
                  "!}|ej}&e&}|gff"
                  "gjl{dglcp!feg`"
"}&e&fgbogjl{jgjd!be`gd}!`ejmgjl{pdj!bny`}&e&bdeqgjl{"
"dbdxdj!qgqdy}&qgqgjl{p|dbxd!`ytffdy}!`ytffgjl{") ,O (
"!`ew!nh!Mcyg}pfe}!fw!pytd!bnxd!lexd!pn!fd!"),O("Nj!p"
"cd!"),O("!ej`!e!")};L L L L L L L L L L L L /*Xmas*/}