Tuesday, 30 June 2009

Linux Filesystem Benchmarks

Yesterday Phoronix published an article benchmarking the ext3, ext4, xfs, btrfs and NILFS2 filesystems based upon the 21/06/2009 daily build of Ubuntu 9.10 (Linux 2.6.30). The tests used a 7200RPM SATA 2.0 Seagate ST3250310AS drive and unfortunately didn't use any fast Solid State Drives.

Kudos to Phoronix for producing a wide set of benchmarks. The results show ext4 proves to be a good all-rounder choice. However, a few things to remember are:

1) Ubuntu 9.10 will be 2.6.31 based, so expect some improvements with btrfs in the newer kernel.
2) Ubuntu will have some SSD based optimisations, such as aligning the partitions to natural flash block boundaries to squeeze more performance out of SSDs.

SSDs are becoming more popular on netbooks and laptops, and with the benefit of excellent random seek times and fast parallel block reads/writes (assume 2, 4 or 8 way native striping on the SSD) the landscape may change with respect to benchmarkable filesystem performance.

I suspect we will see btrfs get tweaked and tuned more over the next 6-12 months, and perhaps ext4 will lose its crown on the Phoronix tests.

Whatever happens, the move to SSD and the focus on ext4 vs btrfs will lead to faster booting and more efficient filesystem performance. Onward and upward!

Monday, 29 June 2009

Ubuntu Kernel Team Knowledge Base

The Ubuntu Kernel Team Knowledge Base is a wiki page containing all sorts of kernel related know-how, tricks, tips and runes. We try to keep it up to date, and we hope it will help anyone who wants to build their own kernels or generally wants to poke around at all things kernel related.

Please feel free to visit this wiki page and look around, read and contribute too! We believe that good know-how should be documented and shared around, so we welcome contributions and corrections! Enjoy!

Sunday, 28 June 2009

So Much Complexity, So Little Time

There used to be a time when one could get to understand how things worked. For example, back in the late 80's I had a Commodore 64 and after spending many hours poring over Kernal ROM disassembly books I got pretty familiar with how it ticked. I even had a hefty 1541 drive disassembly reference book as I was trying to figure out how to write turbo loaders. OK, so I didn't have the original source, but I had time, and there were only tens of thousands of lines of assembler to understand.

Nowadays, things are radically different. The Linux Kernel is huge: millions of lines of C and some assembler, with tens of thousands of lines changing between versions, but we at least have the source (and documentation!). There just is not enough time in the day to get to understand all of it well enough!

Fortunately there are well defined interfaces and well documented sub-systems, but I fear we are in an age where no mere mortal can fully understand exactly what is going on in the kernel. Complexity reigns.

A typical week may see me dipping in and out of different parts of the kernel: looking at wifi, audio and ACPI, figuring out BIOS interactions and trying to work out the best kernel tweaks to sort out issues. It can be like an intellectual game; figuring out exactly how some parts work or interact with hardware for the first time can be a challenge but, equally, very rewarding too. Some bugs seem like they just cannot happen; then you get some different insight into how a system is behaving, you get that "aha!" moment, everything becomes very clear and you get a solution, and this is very satisfying indeed.

So, in a way, it's good that the kernel is so large, complex and mysterious in places. There is plenty of scope to keep my gray cells busy; always new features to learn about, new twists to negotiate, new things to discover.

I may get nostalgic about the good old days where life was simpler, code was smaller and there was less of it to comprehend, but let's face it, I'd rather be working on the Linux kernel than writing 8 bit 6510 assembler any day :-)

Increasing Wifi Beacon Interval to Save Power

I was reading "802.11 Wireless Networks" (Matthew S Gast, O'Reilly, ISBN 0-596-00183-5) and stumbled upon the fact that increasing the Beacon Interval on an Access Point (AP) can be a good thing. Firstly, the default setting on my AP was 50ms which means all my Wifi enabled kit in the house is being interrupted 20 times a second, which is a tad excessive. By increasing the interval to 1000ms my laptop is interrupted less frequently and hence saves some power. Secondly, increasing the beacon interval means I get a little more available channel capacity for my data.

The downside is that passive scans of the network take a little longer and also mobile devices on my network cannot move about so rapidly while maintaining network capacity - but this is not a big deal for me since my laptops generally stay fixed on my desk, dining room table or lap.

So now that I've tweaked this setting, powertop is reporting that iwl3945 wifi interrupts have dropped significantly from ~20/sec to ~1/sec on an idle system.
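The arithmetic behind those figures is simply one wakeup per beacon. A tiny sketch in C, treating the configured interval as plain milliseconds (the function names are made up for illustration):

```c
#include <stdio.h>

/* Beacons (and hence wifi wakeups) per second for a given beacon
 * interval in milliseconds. Hypothetical helper for illustration. */
double beacons_per_second(double interval_ms)
{
    return 1000.0 / interval_ms;
}

/* Print the rates for the old and new AP settings discussed above. */
void show_beacon_rates(void)
{
    printf("50ms interval:   %.0f beacons/sec\n", beacons_per_second(50.0));
    printf("1000ms interval: %.0f beacons/sec\n", beacons_per_second(1000.0));
}
```

Calling show_beacon_rates() reproduces the 20/sec and 1/sec rates that powertop reported.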

The beacon intervals section of the Less Watts Wifi power management page also concurs with this as being a good way to reduce power consumption. Scale this up in an office environment when tens of machines are connected to an AP, and you start to see some valid reasons for making this change to the default setting.

Saturday, 27 June 2009

Getting more out of your kernel oops message.

A Kernel Oops message provides a range of debugging information: register dump, process state, and stack trace too. Sometimes the stack dump can be fairly long and the first lines can scroll off the top of a 25 line Virtual Console. To capture more of the oops message, try the following:

chvt 1
consolechars -f /usr/share/consolefonts/Uni1-VGA8.psf.gz

This switches to an 8 pixel high font, doubling the number of rows to 50.

Sometimes this is not sufficient; a stack dump may still scroll the top of the Oops message off the console. One trick I use is to rebuild the kernel with the stack dump removed, just to capture the initial Oops information. To do this, modify dump_stack() in arch/x86/kernel/dumpstack_*.c and comment out the call to show_trace(). It's a hack, but it works!

Ultra Drinkable Dell Laptop?!

While on my travels to UDS my colleague spotted this sign in an airport technology shop which I could not resist taking a photo of. Is Dell making an Ultra Drinkable Laptop now? :-)

Friday, 26 June 2009

Flash and battery management

In order to reduce power consumption in my laptop, I've been using powertop, which is a nifty tool from Intel. If you've not used it already, you can install and run it using:

sudo apt-get install powertop

sudo powertop

Powertop is a great tool to see which processes are causing the most wakeups on a so-called "idle" system. In most cases I see power being sucked away by unnecessary wakeups with Flash running in my web browser. Every little animated advert is causing wakeups; scale that up with multiple Flash instances on a busy page and lots of tabs open in the browser and soon you see battery life being reduced.

So just how much power is being used? A couple of Watts per laptop? Scale this up across the globe and we soon see a massive carbon footprint caused by Flash adverts and the like. And I'm just as hypocritical; this blog has Google ads embedded in it which are sucking your battery life away as you are reading this!

For netbooks that chug along on Atom processors at 1.3 to 1.6GHz this can be a significant chunk of CPU cycles consumed by Flash animation. So next time you wonder why your battery lasted only 2 hours instead of the promised 3 or 4 hours don't blame the OS - blame the glorious Web experience! :-)

Ubuntu Power Saving Improvements

Yesterday I compared power consumption between two different versions of Ubuntu. I dual boot installed Ubuntu Intrepid and Jaunty on a Dell Inspiron 6400 laptop (Dual core 1.6GHz, 1GB RAM) and installed powertop and also DVD reading software using packages from Medibuntu. I then removed the battery from the laptop and plugged the mains power adapter into a power meter to measure power consumption.

My aim was to benchmark the power consumption for three scenarios:

1) Idle for 10 minutes
2) Running Firefox on 3 Flash enabled sites (slashdot.com, news.bbc.co.uk and digg.com) for 10 minutes
3) Playing 4 minutes from a DVD

I ran powertop to look at how much time the system was in the C0 state (CPU running) and to observe the number of Wakeups-from-idle per second. I also installed sensors-applet to monitor CPU and HDD temperature changes during the benchmarking.

In the Idle case, Intrepid was waking up ~60.5 times a second and drawing 30.0 Watts while Jaunty was waking up ~56.8 times a second and drawing 29.2 Watts, so there was some power saving improvement with Jaunty. CPU temperatures were comparable at 42 degrees C (which is quite hot!). There was very little difference between times in C0 (running) states between Intrepid and Jaunty.

For the Firefox test it was difficult to get a stable measurement of the number of wakeups, since the Flash was generating a lot of irregular activity, and so it was difficult to compare like for like. Intrepid was drawing ~30.6 Watts and the CPU was ~43.5 degrees C, whereas for Jaunty I observed ~29.3 Watts and a higher temperature of ~45.4 degrees C, which is a little perplexing.

Finally, for the VLC DVD playing test, Intrepid drew 35.7 Watts and the CPU warmed to 49.3 degrees C while Jaunty drew 35.0 Watts and warmed to 51.8 degrees C (again a little perplexing why it's warmer and drew less power overall).

From my crude set of tests we can see that Jaunty does seem to show some power improvements in all three test scenarios. As yet, I'm perplexed why Jaunty is drawing less power but the CPU is getting slightly warmer in the non-idle tests; this needs further investigation (any ideas why?)
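To put the savings into proportion, the readings above can be turned into percentages. A small sketch in C using only the Watt figures quoted in this post (the function names are made up for illustration):

```c
#include <stdio.h>

/* Percentage reduction in power draw going from 'before' to 'after' Watts. */
double power_saving_percent(double before, double after)
{
    return 100.0 * (before - after) / before;
}

/* Intrepid vs Jaunty readings quoted above, in Watts. */
void show_power_savings(void)
{
    printf("idle:    %.1f%%\n", power_saving_percent(30.0, 29.2));
    printf("firefox: %.1f%%\n", power_saving_percent(30.6, 29.3));
    printf("dvd:     %.1f%%\n", power_saving_percent(35.7, 35.0));
}
```

That works out at roughly 2.7% for idle, 4.2% for the Firefox test and 2.0% for DVD playback, so the improvement is real but modest.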

Wednesday, 24 June 2009

Installing Jaunty UNR in VirtualBox

Sometimes it's useful to be able to run Ubuntu as a virtual machine inside VirtualBox for testing and debugging purposes. The problem with the basic version of VirtualBox is that it does not support installing from USB, which is a problem when faced with installing from USB images, such as Ubuntu Netbook Remix (UNR). To get around this, install UNR as follows:

If you've not already installed VirtualBox, you can install it using:

sudo apt-get install virtualbox-ose

Convert the USB bootable image into a bootable VirtualBox VDI image:

VBoxManage convertfromraw ubuntu-9.04-netbook-remix-i386.img ubuntu-9.04-netbook-remix-bootable-i386.vdi

Create an 8GB empty disc image to install this into:

VBoxManage createhd -filename ubuntu-9.04-netbook-remix-i386.vdi -size 8192 -register

Create a new VM using the ubuntu-9.04-netbook-remix-bootable-i386.vdi (at the Virtual Hard Disk dialogue box, select "Existing" and browse the filesystem to select the .vdi file). Once you have created the VM, add the ubuntu-9.04-netbook-remix-i386.vdi as an IDE primary slave drive. Then boot and install UNR to the IDE primary slave.

After the installation is complete, remove the ubuntu-9.04-netbook-remix-bootable-i386.vdi primary drive from your virtual machine and set ubuntu-9.04-netbook-remix-i386.vdi to be the primary IDE drive.

OK, so it's a little bit of messing around, but it does the trick.

Jaunty UNR installed in VirtualBox

Tuesday, 23 June 2009

Debugging the Linux Kernel over a Serial Console.

There are times when the luxury of a text based console just does not exist when debugging kernels on a PC and one has to dump debug out over a serial console. This used to be relatively straightforward a few years ago since every PC had a DB9 serial port and bit banging data over this was fairly low tech and just worked.

However, the modern PC does not have such legacy ports anymore, so one has to fall back to using a USB serial dongle instead. One can purchase such kit quite easily; I use a PL2303 based USB serial dongle, which is fairly inexpensive (about £3.00), and Linux provides a serial console tty driver for it. I attach one dongle to the PC that needs debugging and connect this via a null modem cable to my host PC, which captures the debug output using a serial terminal program such as minicom.

Install minicom on your host debugging machine:

sudo apt-get install minicom

and build a debug kernel for the target machine with the following config options:


and enable the appropriate driver, e.g.:


Install this on the target machine to debug. Configure minicom to run at 115200 baud, 8 bits per char, no parity, 1 stop bit and no flow control, and start it up on the host. Boot the target machine with the following kernel boot parameters:

console=tty console=ttyUSB0,115200n8

..and hopefully you will see all the console text appear in the minicom terminal, albeit rather slowly(!) You may see some dropped characters as flow control has been turned off.
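minicom sets the line up through its menus, but if you would rather capture the output from a small program of your own, the same settings can be expressed with termios. A minimal sketch in C; the function name is hypothetical, and the device name mentioned in the note below is only an assumption about the host:

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE    /* glibc needs this for CRTSCTS in strict modes */
#endif
#include <termios.h>

/* Fill in 't' for 115200 baud, 8 data bits, no parity, 1 stop bit and
 * no flow control. Returns 0 on success, -1 on failure. */
int setup_115200_8n1(struct termios *t)
{
    t->c_cflag &= ~(CSIZE | PARENB | CSTOPB | CRTSCTS); /* no parity, 1 stop bit, no RTS/CTS */
    t->c_cflag |= CS8 | CLOCAL | CREAD;                 /* 8 bit chars, ignore modem lines */
    t->c_iflag &= ~(IXON | IXOFF | IXANY);              /* no XON/XOFF flow control */
    if (cfsetispeed(t, B115200) < 0 || cfsetospeed(t, B115200) < 0)
        return -1;
    return 0;
}
```

In real use one would open the host's serial device (for example /dev/ttyUSB0, if that is where the dongle appears), fetch the current settings with tcgetattr(), pass them through this function and apply them with tcsetattr(fd, TCSANOW, &t).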

This kind of debug is especially useful when you just cannot get normal VGA console debug output. However, because it's over USB, it can be a little useless for debug in the late stages of suspend or early stages of resume or boot.

I've used this technique to dump scheduling state over the console during X and network hangs. All in all, rather basic and crude, but it's another tool in my box for sorting out problematic kernel issues.

USB persist, Webcams and Suspend/Resume

The other day I was looking at a problem with a laptop running Ubuntu Hardy that had an integrated USB webcam that failed to work over a suspend/resume cycle. A webcam viewer application such as Cheese or UCView was being run during the suspend/resume and had locked up after the resume. A workaround was to enable USB persist on this device as described in Documentation/usb/persist.txt in the kernel source.

From my understanding, if a USB host controller loses power (e.g. during a system suspend) then it is treated as if it has been unplugged. For devices like integrated webcams we know it cannot be unplugged, so we can enable USB persist.

USB host controllers can get reset after suspend to RAM on some systems, and you will see kernel messages such as "root hub lost power or was reset" (use dmesg to see this). In such cases, USB persist can save the day.

USB persist keeps the USB device's core data structures persistent when a power session disruption occurs. Essentially, on resume the kernel checks if persist is set on the device; if so, it does a USB port reset, re-enumerates and checks that the device is the same as the one before (e.g. by checking descriptors and information such as product and vendor IDs). It then re-uses the original device data structures.

USB persist for a device can be enabled by echo'ing 1 to the device's persist file as root:

echo 1 >/sys/bus/usb/devices/.../power/persist

Below is a shell script to set USB persist for a device with a given IDVENDOR and IDPRODUCT USB ID.

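# IDVENDOR and IDPRODUCT must be set to the USB IDs of the device
for I in /sys/bus/usb/devices/*/*
do
    if [ -e "$I/idVendor" -a -e "$I/idProduct" ]; then
        idvendor=`cat "$I/idVendor"`
        idproduct=`cat "$I/idProduct"`
        if [ "x$idvendor" = "x$IDVENDOR" -a "x$idproduct" = "x$IDPRODUCT" ]; then
            # the persist file may be in this directory or in the parent
            if [ -e "$I/power/persist" ]; then
                echo 1 > "$I/power/persist"
            fi
            if [ -e "$I/../power/persist" ]; then
                echo 1 > "$I/../power/persist"
            fi
        fi
    fi
done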

Note that doing USB persist on pluggable USB devices is NOT recommended as it can cause kernel panics!

Monday, 22 June 2009

Version Control with Git

At last, my long awaited Amazon pre-ordered book "Version Control with Git" (ISBN: 978-0-596-52012-0) arrived a couple of days ago. I quickly read the first couple of chapters over the weekend. It's very readable, has plenty of worked examples and has over 300 pages of goodness in it.

It has little side notes about traps one can fall into, which are helpful to git novice and git guru alike.

Looks like a worthwhile addition to my ever growing O'Reilly book collection!

Sunday, 21 June 2009

Fun with my ESKY Lama Helicopter

The weekend is here and it's time to relax a little!

My brother sent me an ESKY Lama remote controlled Helicopter for my 40th birthday a few weeks ago (very generous!) and I've been waiting for the right weather conditions to fly it outdoors. Fortunately this weekend there was almost no breeze at all, so I gave it a spin in the back garden.

David, my 6 year old, got a little over excited about it all (he kind of lost the ability to speak coherently in English and then threw his chuck glider at it!)

As you can see from the video, I'm still in newbie learning mode! (Actually, just out of shot is my apple tree which I was trying to avoid..). I need to do a whole load more training on a simulator! I've already put in an order for 12 pairs of new rotor blades as my landings in the rose bushes and close encounters with the apple and pear trees have wrecked the first set(!) Fortunately they are not too expensive to replace and I'm waiting for some to be delivered by Wednesday!


Anyhow, this is my first flying radio controlled toy and dealing with 4 channels of control as a newbie is keeping my hand/eye co-ordination busy! Perhaps by the end of the Summer I will have figured out how to do something a little more impressive than take off, hover and land!

References: ESKY

Saturday, 20 June 2009

FIBMAP ioctl example - get the file system block number of a file

The FIBMAP ioctl() is an obscure and rarely used ioctl() that returns the file system block number of a file.

To find the Nth block of a file, one uses the FIBMAP ioctl() as follows:

int block = N;
int ret = ioctl(fd, FIBMAP, &block);


where fd is the opened file descriptor of the file being examined,
N is the Nth block,
block is the returned file system block number of the Nth block.

A program that interrogates all the blocks of a file and dumps them out to stdout shows whether a large file is contiguously allocated on your filesystem. One needs to have super user privileges to be able to perform the FIBMAP ioctl, so run it using sudo. Any ideas for a good use of this ioctl() are most welcome!
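A minimal sketch of that idea follows. It is not the original listing: the function names are made up for illustration, and it assumes the FIGETBSZ ioctl() (also from linux/fs.h) is used to find the file system block size.

```c
#include <stdio.h>
#include <sys/ioctl.h>
#include <sys/stat.h>
#include <linux/fs.h>    /* FIBMAP, FIGETBSZ */

/* Number of blocksize-byte blocks needed to hold size bytes. */
long blocks_needed(long long size, int blocksize)
{
    return (long)((size + blocksize - 1) / blocksize);
}

/* Print the file system block number of every block of the open file fd.
 * Returns 0 on success, -1 on error (FIBMAP normally needs root). */
int dump_block_map(int fd)
{
    struct stat st;
    int blocksize;
    long i, n;

    if (fstat(fd, &st) < 0 || ioctl(fd, FIGETBSZ, &blocksize) < 0)
        return -1;
    n = blocks_needed(st.st_size, blocksize);
    for (i = 0; i < n; i++) {
        int block = (int)i;              /* in: Nth block of the file */
        if (ioctl(fd, FIBMAP, &block) < 0)
            return -1;
        printf("%ld\t%d\n", i, block);   /* out: file system block number */
    }
    return 0;
}
```

Open the file with open(path, O_RDONLY) and pass the descriptor to dump_block_map(); consecutive numbers in the second column indicate a contiguously allocated run.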


An alternative way to observe the output from fibmap is to use hdparm:

sudo hdparm --fibmap /initrd.img

/initrd.img: underlying filesystem: blocksize 4096, begins at LBA 63; assuming 512 byte sectors
byte_offset  begin_LBA    end_LBA    sectors
          0     860223     868414       8192
    4194304   14114879   14122334       7456

Also check out the fiemap ioctl() for getting extent information from a file's inode.

Seven Ways to Reboot a PC

There are many ways to reboot a PC; some methods are well documented, others are a little more obscure.

Method 1. Via the Keyboard (Embedded Controller) port.

Writing 0xfe to the keyboard (Embedded Controller) port 0x64 does the trick. This pulses the reset line to low and forces a reboot. To do so under Linux (as super user) in C do:

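#include <sys/io.h>    /* ioperm() and outb() */
ioperm(0x64, 1, 1);    /* request access to port 0x64 */
outb(0xfe, 0x64);      /* pulse the reset line */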

..make sure every filesystem is sync()'d and/or unmounted first! This method can be selected in Linux by booting with the reboot=k kernel boot parameter and then rebooting.

Method 2. Resetting PCI

This is more ugly and apparently works on most Intel PCI chipsets. Write 0x02 and then 0x04 to the PCI port 0xcf9, example C code as follows (again run as super user):

ioperm(0xcf9, 1, 1);
outb(0x02, 0xcf9);
usleep(10); /* a very small delay is required, this is plenty */
outb(0x04, 0xcf9);

Alternatively, boot Linux using the reboot=p kernel boot parameter and reboot. Note that the delay can be very small - as short as doing another port write.

Method 3. Triple faulting the CPU.

This is an Intel undocumented feature; one basically forces a triple fault and the processor just reboots. The IDT is loaded with an empty descriptor pointer and an int3 (trap to the debugger) instruction is executed. It's quite a brutal thing to do, but always seems to work.

Boot Linux with the reboot=t kernel boot parameter to select this mode of rebooting.

Method 4. Reboot by jumping to the BIOS (32 bit CPUs only!)

By flipping back to real mode and jumping to ffff:0000 (linear address 0xffff0) using a ljmp $0xffff,$0x000, the CPU executes the BIOS reset. Who knows how the BIOS reboots, but it should work, as long as your BIOS is written correctly!

Boot Linux with the reboot=b kernel boot parameter to do this on a reboot.

Method 5. Executing a reset on the BSP or another CPU (32 bit CPUs only!)

Quite frankly, I've not figured out how this method works yet, but it can be achieved in Linux with the reboot=s kernel boot parameter and rebooting.

Method 6. Via the ACPI

The ACPI spec describes RESET_VALUE and RESET_REG, which can be configured in the FADT. The OS writes the value in RESET_VALUE to the register RESET_REG to perform a reboot. I've seen this achieved in various ways; for example, one BIOS has implemented this as the PCI reset method, by writing 0x06 into register 0xcf9. This works 95% of the time, but one really does need to write 0x02, delay and then write 0x04 for this to work reliably 100% of the time. So BIOS writers beware!

In Linux, this can be configured by using the reboot=a kernel boot parameter and rebooting.

Method 7. Using the EFI (Extensible Firmware Interface) reset_system runtime service.

If your Linux PC supports EFI firmware (e.g. Intel based Apple Macs), you can reboot via EFI by using the reboot=e kernel boot parameter. Basically this makes the kernel call the EFI reset_system runtime service.
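As a quick reference, the seven reboot= letters above can be summarised in a small lookup. This is only a sketch in C that restates the list in this article; the function name and the description strings are my own:

```c
#include <stddef.h>

/* Map the letter given to the reboot= kernel boot parameter to the
 * reboot method described above; NULL for letters not covered here. */
const char *reboot_method(char letter)
{
    switch (letter) {
    case 'k': return "keyboard controller reset (port 0x64)";
    case 'p': return "PCI reset (port 0xcf9)";
    case 't': return "triple fault";
    case 'b': return "jump to the BIOS";
    case 's': return "reset via the BSP or another CPU";
    case 'a': return "ACPI RESET_REG";
    case 'e': return "EFI reset_system runtime service";
    default:  return NULL;
    }
}
```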

Using reboot() system call..

Rebooting from userspace with super user privileges can be achieved in Linux using the reboot() system call:

You need to include unistd.h and linux/reboot.h and then call:
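A minimal sketch of such a call, assuming a plain restart is wanted. On glibc the reboot() prototype lives in sys/reboot.h, so this sketch includes that header as well; restart_now() is a hypothetical name:

```c
#include <unistd.h>
#include <sys/reboot.h>     /* glibc declares reboot() here */
#include <linux/reboot.h>   /* LINUX_REBOOT_CMD_* constants */

/* Flush dirty data and restart the machine immediately. Needs super user
 * privileges; returns -1 with errno set if the kernel refuses. */
int restart_now(void)
{
    sync();
    return reboot(LINUX_REBOOT_CMD_RESTART);
}
```

Note that this restarts straight away without running any shutdown scripts, hence the sync() beforehand.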


Hopefully this article will solve most of your Linux rebooting issues. Have fun!

Post Script, using kexec()

I have not mentioned kexec() which allows one to execute a new kernel image, allowing one to effectively reboot very quickly without involving a hard reset and executing BIOS code... that's for another blog entry! The beauty of kexec() is it allows one to jump to a pre-loaded kernel image and reboot with this, avoiding the need to do a BIOS reboot. This gives Linux the ability to reboot seamlessly and at speed.

Friday, 19 June 2009

The Linux-Ready Firmware Developer Kit

The Linux-Ready Firmware Developer Kit is a useful tool to check if important firmware features on your PC are correct or not.

One can download a live CD which will run the tests without the need to install any software on your machine whatsoever. Alternatively you can grab the source and build it yourself.

To build and run it on my Ubuntu server I first had to install xutils-dev, bison, flex and iasl. I simply downloaded the source tar ball, gunzip'd and untar'd it and ran make. To run it from the build directory one has to run as root and do:



It runs a bunch of tests, such as checking the ethernet functionality, CPU frequency scaling, fan testing, HPET configuration checking, MTRR and BIOS checks to name but a few. Hopefully with this kit one can determine if a BIOS is configured correctly for Linux before shipping the BIOS on PCs. We can only hope :-)

Hacking the Chumby

A few weeks ago I was given a Chumby. What is a Chumby? Well it's a nifty little Linux based internet enabled device that plays Flash Lite widgets. The Chumby has a small 3.5" 320x240 colour touch screen, a couple of USB 2.0 ports, stereo 2W speakers and a headphone socket. The processor is a 350MHz Freescale MX21ADS (ARM926EJ-Sid(wb) rev 4 (v5l)) and on board is 64MB SDRAM and 64MB NAND flash.

It also has a squeeze sensor and a motion sensor (accelerometer) - the latter is used for interaction, such as games Widgets.

Hacking this device is fairly straightforward: there is a hidden button on a configuration screen that selects a hidden options menu. This has an option to allow one to enable SSH, and once enabled one can SSH in as root and start playing! To keep SSH enabled even after a reboot one needs to touch the file /psp/start_sshd.
Another hidden option enables a simple Web Server. I've hacked this to allow me to check the system status, it's rather crude, but it works!

The Chumby Website allows one to select from over 1000 widgets - you simply add these to one of your personal Chumby channels and the device downloads these Flash Lite widgets which are free to use. There is a huge range of widgets, including internet radio players, clocks, RSS news feed viewers, webcam viewers, games, photo viewers and more besides!

As for hacking, there is a Wiki with plenty of information on how to download the GCC toolchain and kernel source - with which one can start porting apps. It's early days for me - I've only rebuilt the kernel and ported bash, but I'm looking to do some hoopy things, such as get a C64 emulator ported - there may be enough horsepower on this device for it to work. Watch this space!

Chumby References:

Chumby Industries


Thursday, 18 June 2009

Join in the Ubuntu IRC discussions at Freenode!

I'd like to encourage anyone using Ubuntu to join in the community discussions using Internet Relay Chat (IRC). IRC allows group discussion in forums called channels, but one can also chat privately in 1-to-1 chat in private channels too.

There are many IRC Ubuntu channels hosted at Freenode. As a Ubuntu Kernel Team member I'm usually in channel #ubuntu-kernel under the IRC nickname of cking.

It's a great way of informally asking and answering questions - generally somebody will reply to you. Please hang around in a channel, as sometimes people cannot get back to you immediately. Please remember to be polite! :-)

For more details, check out the Ubuntu InternetRelayChat wiki page.

Linux Wifi - What's going on?

When one cannot associate with a Wifi Access Point it can be helpful to find out exactly what is going on. Below is the recipe I use to see what NetworkManager is doing:

Start a bash shell as super user:

$ sudo -i

Kill the current NetworkManager:

$ killall NetworkManager

Start the script command to capture terminal output:

$ script

Start NetworkManager to run in non-daemon mode:

$ NetworkManager --no-daemon

Try to associate to your Access Point to get some debug data. Then exit the script session:

$ exit

And then look through the generated typescript file to see what NetworkManager is doing.

Alternatively, one can get some idea of what is happening using iwevent:

$ iwevent
Waiting for Wireless Events from interfaces...
07:04:25.174275 wlan0 New Access Point/Cell address:Not-Associated
07:04:57.908360 wlan0 Scan request completed
07:04:57.910007 wlan0 Set Mode:Managed
07:04:57.910038 wlan0 Set Frequency:2.412 GHz (Channel 1)

Hopefully using these methods can give you an inkling as to what could be causing the problem!

Wednesday, 17 June 2009


Well, you may think I'm selling out if I mention Windows, but sometimes we have to admit the PC ecosystem has other operating systems than just Linux :-)

Wubi is an installer that allows Windows users to install and un-install Ubuntu just like a Windows application. Wubi does not require Ubuntu partitions to be added and does not install a new boot loader. However, it does allow Ubuntu to be dual-booted on the Windows PC and run just like a conventionally installed Ubuntu system.

There is a little magic going on. Wubi creates a large file on the Windows NTFS partition which contains Ubuntu in the form of an ext3 filesystem. When Ubuntu boots up, it mounts the NTFS partition using NTFS-3G (via fuse, "file system in user space") and then loop mounts the file on the NTFS drive that contains the ext3 Ubuntu file system.

To make sure dirty pages are written back to disk for file system consistency Wubi does some vm tweaks. This is required just because Wubi uses stacked file systems (NTFS-3G+fuse, loop) and dirty pages sometimes hang around a while in memory.

The Wubi installer is on the Ubuntu ISO image, so a Windows user can load a Ubuntu Live CD and Windows will auto-run the Wubi installer. This allows Windows users to install Ubuntu and give it a try!

Postscript: I've added more Wubi notes in this newer blog article.

Debugging Intel X Hangs

While I was at UDS I learnt some useful X debugging tricks for Intel based chipsets. One can debug the driver using debugfs as follows:

1. ssh into the machine with locked-up X and then mount debugfs:

$ sudo mount -t debugfs none /sys/kernel/debug

2. Find the dri debug directory:

$ cd /sys/kernel/debug/dri/0

3. Check for a system lock up by looking to see if sequence numbers are advancing or not:

$ cat i915_gem_seqno
Current sequence: 1732368
Waiter sequence: 0
IRQ sequence: 1732364

If the sequence numbers are not increasing then we know that the GPU has locked up for some reason.

4. If that's working, then check to see what the X server is doing:

$ cat /proc/pid-of-X-server/wchan

(where 'pid-of-X-server' is the process id of the X server)

This will show you what it is waiting for. If you see it changing then it's not an X hang.

5. Look at the interrupt activity

$ cat i915_gem_interrupt

Check that the masks are restored correctly after a resume - interrupts may be masked and hence the driver is not able to respond to them.

The IRQ sequence generally is a little behind Waiter sequence - if IRQ sequence does not increment it's a GPU hang. The Current Sequence *SHOULD NOT* be zero. Waiter Sequence is zero when there is nothing queued up to process.
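The sequence number check can be automated by sampling i915_gem_seqno twice, a little apart, and comparing the IRQ sequence values. A rough sketch in C, assuming the file contents look like the example output above; the function names are made up for illustration, and an idle GPU with nothing queued may also show no advance, so treat the result as a hint rather than proof:

```c
#include <stdlib.h>
#include <string.h>

/* Return the number following 'label' (e.g. "IRQ sequence:") in the text
 * read from i915_gem_seqno, or -1 if the label is not present. */
long seqno_field(const char *text, const char *label)
{
    const char *p = strstr(text, label);

    return p ? strtol(p + strlen(label), NULL, 10) : -1;
}

/* Given two samples taken a little apart, report 1 if the GPU looks hung,
 * that is the IRQ sequence is missing or has not advanced, else 0. */
int gpu_looks_hung(const char *earlier, const char *later)
{
    long before = seqno_field(earlier, "IRQ sequence:");
    long after = seqno_field(later, "IRQ sequence:");

    return before < 0 || after < 0 || after <= before;
}
```

Both arguments are simply the complete text read from the file at two different times.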

Also, check out http://intellinuxgraphics.org/documentation.html

Monday, 15 June 2009

GCC built in #defines

The following rune dumps out the #define values in GCC:

gcc -E -dM -xc /dev/null

Quite handy sometimes!
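These predefined macros are what one tests with #ifdef in portable code. A small sketch in C that uses a few of the common ones (the function names are made up, and which macros are defined depends on your compiler and target):

```c
#include <stdio.h>

/* Major version of a GCC compatible compiler, or 0 if __GNUC__ is not set. */
int gcc_major_version(void)
{
#ifdef __GNUC__
    return __GNUC__;
#else
    return 0;
#endif
}

/* Print a few of the predefined macros listed by the rune above. */
void show_predefined_macros(void)
{
    printf("GCC major version: %d\n", gcc_major_version());
#ifdef __x86_64__
    printf("target: x86-64\n");
#endif
#ifdef __linux__
    printf("target OS: Linux\n");
#endif
#ifdef __SIZEOF_LONG__
    printf("sizeof(long): %d bytes\n", __SIZEOF_LONG__);
#endif
}
```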

The Debugfs Interface

Last week I was tweaking the rt73usb driver and doing some debugging using the debugfs kernel interface (Greg Kroah-Hartman wrote the debugfs interface back in 2004). Debugfs is a virtual filesystem devoted to debugging information, and makes it easier for kernel developers to tweak and twiddle drivers and hardware using a relatively simple API. Debugfs can be compiled in for debug purposes, and compiled out for final kernel delivery.

One generally mounts the debugfs as follows:

sudo mount -t debugfs debug /sys/kernel/debug

and then drivers populate this mount point according to their own whim.

To use debugfs in your drivers start with:

struct dentry *debugfs_create_dir(const char *name, struct dentry *parent);

name being the name of the directory you want to create, and parent typically is NULL (causing the directory to be created in debugfs root).

There are some helper functions to allow one to read/write values as follows:

struct dentry *debugfs_create_u8(const char *name, mode_t mode, struct dentry *parent, u8 *value);
struct dentry *debugfs_create_u16(const char *name, mode_t mode, struct dentry *parent, u16 *value);
struct dentry *debugfs_create_u32(const char *name, mode_t mode, struct dentry *parent, u32 *value);
struct dentry *debugfs_create_bool(const char *name, mode_t mode, struct dentry *parent, u32 *value);

The above functions create debugfs files that can be read/written to modify the u8,u16,u32 and bool values.
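From user space these files are just small text files. A minimal sketch in C of reading and writing one, assuming the u8/u16/u32 files present their value as a decimal number; the helper names are hypothetical, and so is any path you pass in (it depends entirely on what the driver created):

```c
#include <stdio.h>

/* Read the decimal value exposed by a debugfs u8/u16/u32 file.
 * Returns 0 on success, -1 on failure. */
int read_debug_value(const char *path, unsigned long *value)
{
    FILE *fp = fopen(path, "r");
    int ok;

    if (!fp)
        return -1;
    ok = (fscanf(fp, "%lu", value) == 1);
    fclose(fp);
    return ok ? 0 : -1;
}

/* Write a new value to such a file. Returns 0 on success, -1 on failure. */
int write_debug_value(const char *path, unsigned long value)
{
    FILE *fp = fopen(path, "w");
    int ok;

    if (!fp)
        return -1;
    ok = (fprintf(fp, "%lu\n", value) > 0);
    return (fclose(fp) == 0 && ok) ? 0 : -1;
}
```

With debugfs mounted on /sys/kernel/debug, the path would be somewhere below that mount point, and the program needs to run as root.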

Debugfs does not tidy up these files when a module is removed, so one has to do this for each file using:

void debugfs_remove(struct dentry *dentry);

It's quite a handy little interface, and can replace printk()s as a debug interface. As I said earlier in this article, I've used it now for tweaking and twiddling values in the rt73usb WiFi driver with success.



Sunday, 14 June 2009

Does pre-linking speed up boot times?

Pre-linking binaries is a method of modifying ELF libraries so that the relocation overhead of linking is not incurred at load time, hence theoretically speeding up program start time (see the Wikipedia entry for more details).

To test this, I installed Karmic 9.10 Alpha 2 on a Dell Inspiron 6400 laptop using ext4 as my default file system and added some instrumentation to measure the time from when the first rc scripts are started to the time the desktop completes loading when auto-logging in.

First, I measured the startup time 5 times without prelinking; this came to an average of 16.397 seconds with a standard deviation of 0.21 seconds.

Next I installed prelink and pre-linked all the system libraries (which takes ~5 minutes) using the following recipe:

  1. Use apt-get or synaptic to install prelink, for example: sudo apt-get install prelink
  2. Open /etc/default/prelink with your favorite editor, using sudo
  3. Change PRELINKING=unknown to PRELINKING=yes
  4. Start the first prelink by running: sudo /etc/cron.daily/prelink

Then I repeated the boot time measurements with prelinking enabled; this came to an average of 16.343 seconds with a standard deviation of 0.09 seconds.

So I am seeing a tiny ~0.33% speed up of 0.054 seconds, which is within the margin of measurement error. All in all this took me ~1.5 hours to set up and measure, which means that if I boot my laptop once a day it will take me 274 years before I start saving time :-)
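The sums behind those figures are easy to reproduce. A small Python sketch, using only the two averages and the ~1.5 hour setup time quoted above, and assuming one boot per day and 365 days per year:

```python
SECONDS_PER_HOUR = 3600


def prelink_payoff(before_s, after_s, setup_hours, boots_per_day=1):
    """Return (saving in seconds, saving in percent, years to break even)."""
    saving = before_s - after_s
    percent = 100.0 * saving / before_s
    # Number of boots needed before the time saved equals the time invested.
    boots_needed = setup_hours * SECONDS_PER_HOUR / saving
    years = boots_needed / boots_per_day / 365
    return saving, percent, years


saving, percent, years = prelink_payoff(16.397, 16.343, 1.5)
print("saving %.3fs (%.2f%%), break-even after %.0f years"
      % (saving, percent, years))
```

With the measurements above this reports a saving of 0.054s (0.33%) and a break-even point of about 274 years.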

All in all it's not really worth enabling prelinking. At least I now know!

(Note: pre-linking may be considered a security issue, as Ubuntu randomizes the addresses at which code is loaded to make it more difficult for rogue programs to exploit, and prelinking fixes library addresses in advance.)

I suspect that if the processor were significantly slower and there were less I/O delay (perhaps using SSD rather than HDD) I might actually see a more significant speed up, since prelinking saves some CPU cycles. I speculate that prelinking may therefore be worth considering on lower speed or less powerful platforms such as ARM or Intel Atom based machines.

The next test will be to see if large applications that use tens of libraries start up faster...

Saturday, 13 June 2009

I Have No Tomatoes

So I've been squishing bugs all week, how about squishing some tomatoes? "I Have No Tomatoes" is an amusing little game that basically involves steering your little red tomato around 10 grid mazes and in the process one squishes a whole load of other tomatoes.

Your tomato has the ability to drop bombs which blow any tomato in its path to smithereens, and you pick up power-ups by doing this. My technique is to pick up a Willow-the-Wisp which starts picking up power-ups for you. Then try to get a Potato Man which then automatically goes squishing more tomatoes for you. There are a bunch of other power-ups, ranging from lightning zaps and traps to more powerful bombs.

It's all light relief from tracking down and squishing kernel bugs! My scores are around the 420-450 mark, and any suggestions on how to get better are gratefully received!

To install, use: apt-get install tomatoes

Go on.. squish a bunch of tomatoes today! :-)

Brother 2170W Laser Printer

A little while ago my old cranky HP InkJet printer finally died and it was time to look for a new printer. This time around I was looking for a simple black and white laser printer as these work out cheaper per page than colour Inkjets. Lack of colour is a bonus in my books as it stops the kids wanting to print out lots of pictures from Tuxpaint and Gcompris :-)

As a luxury option I wanted to see if I could find a reasonably priced laser printer that also had Wifi connectivity to give me a little more flexibility in where I put the box.

After some searching around I ended up buying a Brother HL2170W, which can print up to 22 pages a minute and has very good driver support in Ubuntu. My Ubuntu Intrepid and Jaunty laptops found the printer on the network with no effort at all, and configuring it was a no-brainer. Once configured, one can easily check the printer's status and do further configuration using the printer's web based admin tool.

The downside was that when it associates with my ancient and flakey 3COM Office Connect router/Access Point the router occasionally crashes when using WPA2 PSK(!). I don't believe this is a fault with the Brother's Wifi at all. Until I'm brave enough to reflash my router with a firmware upgrade or buy a better one I won't be using my wireless laser printer wirelessly :-(

The Brother so far is reliable and just works as it should, with zero hassles. The only downside is that the printer has 32MB of memory, so it's a little slow at printing very large graphics.

The full specifications of the printer can be found here.

UPDATE: 24th Aug 2009. I've re-flashed my Wireless Access Point and I still cannot get the printer to associate. When I have time I will experiment with another bit of wireless kit and see what's going on. Hmph.

Friday, 12 June 2009

Bonnie++, a useful benchmarking tool

When it comes to figuring out how a file system behaves on a HDD or SSD I generally first turn to Bonnie++ as a straightforward and easy to use disk benchmarking tool.

Bonnie performs a series of tests, covering character and block read/write I/O, database style I/O operations as well as tests for multiple file creation/deletion and some random access I/O operations.

To install bonnie for Ubuntu, simply do:

sudo apt-get install bonnie

To get a fair I/O benchmark (without the memory cache interfering with results) Bonnie generates test files that are at least twice the size of the memory of your machine. This can be a problem if you have servers with a lot of memory as the generated files can be huge and testing can therefore take quite a while. I generally boot a system and specify ~1GB of memory using the mem=1024M kernel parameter just so that Bonnie only tests with a 2GB file to speed my testing up.

My rule of thumb is to run Bonnie at least 3 times and take an average of the results. If you see a large variation in your results double check that there isn't some service running that is interfering with the benchmarking.
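That rule of thumb can be sketched in a few lines of Python. This is only an illustration: it assumes you have already copied one figure per run (say, the block write rate) out of Bonnie's report by hand, and the 10% spread limit is an arbitrary threshold of my own choosing, not something Bonnie defines.

```python
def summarise_runs(rates, max_spread=0.10):
    """Return (average, relative spread, suspicious) for repeated runs.

    rates      -- one figure per benchmark run, all in the same unit
    max_spread -- assumed limit on (max - min) / average before the
                  results are flagged as suspiciously variable
    """
    if len(rates) < 3:
        raise ValueError("run the benchmark at least 3 times")
    average = sum(rates) / len(rates)
    spread = (max(rates) - min(rates)) / average
    return average, spread, spread > max_spread
```

If the third value comes back True, it is worth checking for a background service before trusting the average.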


Weird kernel issues - is your BIOS broken?

A buggy BIOS can cause many different and subtle problems for Linux, ranging from reboot problems and incorrect battery power readings to suspend/resume not working correctly and weird ACPI issues.

A broken BIOS usually ends up being reported as a Linux kernel bug, which is unfortunate, since most of the time the problem is with the closed proprietary BIOS code. Troubleshooting this can be difficult, as sometimes it really can be a genuine kernel bug(!) We've added some BIOS troubleshooting Ubuntu Wiki pages which may be helpful in diagnosing a lot of these issues.

Since a lot of BIOS issues come down to broken ACPI, it's worth looking at the Debugging ACPI wiki page first.

Next, if you want to get your fingers really dirty with looking at the Differentiated System Description Table (DSDT), have a look at the BIOS and Ubuntu wiki page.

Hopefully these will guide you to a solution.

Karmic Koala Alpha 2 ready for testing.

Ubuntu 9.10 Karmic Koala Alpha two is now ready to download for testing. This is an Alpha release, so do NOT install it on production machines! The final stable version will be released on October 29th, 2009.

So what's new?
  • Linux 2.6.30-5.6 kernel based on 2.6.30-rc5, with Kernel Mode Setting enabled for Intel graphics. Note that LRM is now deprecated in favour of DKMS packages.
  • GNOME 2.27.1
  • GRUB 2 by default
  • GCC-4.4
Changes to video:

The new Intel video driver architecture is available for testing. In later Alphas there will probably be a switch from the current "EXA" acceleration method to the new "UXA". This will solve major performance problems of Ubuntu 9.04, but is still not as stable as EXA, which is why it is not yet enabled by default. To help test UXA, please check out the instructions and testing webpage.

Feedback about the new Kernel Mode Setting (KMS) feature is also greatly appreciated. This will reduce video mode switching flicker at boot, and dramatically speed up suspend/resume. To help test this, check out the KMS instructions and feedback webpage.

For more details visit http://www.ubuntu.com/testing/karmic/alpha2

Thursday, 11 June 2009

Zim, the Handy Desktop Wiki

In my day-to-day work I need to make plenty of notes to help me remember the plethora of obscure details or steps required in solving a problem. (When I was much younger I could commit this to memory, but nowadays I need to use the computer to help!).

I use Zim, a desktop Wiki (written by Jaap G Karssenberg), to help me make such notes. It supports simple wiki formatting and also allows me to create a tree hierarchy of wiki pages.

Zim does not require any Wiki formatting know-how; it's a simple GUI based Wiki editor and just does the job for me. It generates ASCII text pages with Wiki formatting, so no fancy binary formatting gets in the way of later editing the text with vi, emacs or Zim itself.

To install, simply use: sudo apt-get install zim

Wednesday, 10 June 2009

Kernel Debugging.

Kernel debugging may seem a daunting task but it's possible to solve a whole load of deep issues with some very simple tools. In fact, my number one debugging tool is printk().

Believe it or not, printing out the state of the kernel at specific key areas can reveal a whole load of useful information which can then help corner a bug.

Of course, using printk() has its issues; it adds extra code to the kernel which can possibly move the bug around or make the bug behave differently if there is a timing or race condition.

Occasionally, one needs to be able to dump a whole load of console messages over a serial line to enable one to capture the state of the machine when the PC console is not available. However, printk() still does the trick.

If you want to know more, I've started the Kernel Debugging Tricks wiki page with some of my debug hacks on it. Feel free to contribute if you have any helpful debugging tricks!

GRUB 2 now default for new installations of Ubuntu

The default boot loader for Karmic Koala (Ubuntu 9.10) will be grub2, as announced by Colin Watson on Monday this week.

The existing boot loader (grub 0.97) will not be changed or upgraded on previous installations as changing the boot loader can be inherently risky.

If you would like to check if your BIOS works with grub2 I suggest following the instructions on the Grub2 testing wiki page.

Tuesday, 9 June 2009

Suspend and Resume - always fun to debug

Debugging suspend/resume issues is always non-trivial when a machine hangs up during the resume. Generally, one would like to save some debug state during the resume stage that survives the reboot of a hung machine. However, the only real place one can shove state is in the real time clock (RTC), which gives one only ~24 bits of debug state.

Ubuntu kernels have PM_TRACE debug enabled, allowing one to turn on debug tracing via:

echo 1 > /sys/power/pm_trace

When a machine then hangs in the resume stage some magic is stashed in the RTC and then one has a ~3 minute window to reboot the machine. On the reboot, the hashed magic stored in the RTC is converted back into a debug message which can be read using the dmesg command and looking for the "Magic number" message:

$ dmesg | less

Magic number is: 13:773:539
hash matches device usbdev2.1_ep00

This usually provides enough information to allow one to start cornering the issue.
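Rather than paging through dmesg by eye, the two interesting lines can be fished out with a short script. A rough Python sketch, assuming the messages look like the sample above (the exact wording may differ between kernel versions, so the patterns are deliberately loose); parse_pm_trace is a name I made up for illustration:

```python
import re

# Loose patterns based on the sample output shown above.
MAGIC_RE = re.compile(r"Magic number(?: is)?:\s*(\d+):(\d+):(\d+)")
MATCH_RE = re.compile(r"hash matches (?:device )?(\S+)")


def parse_pm_trace(dmesg_text):
    """Return (magic, matches) found in dmesg output.

    magic   -- tuple of the three magic numbers, or None if absent
    matches -- list of names reported on 'hash matches' lines
    """
    magic = None
    matches = []
    for line in dmesg_text.splitlines():
        m = MAGIC_RE.search(line)
        if m:
            magic = tuple(int(x) for x in m.groups())
            continue
        m = MATCH_RE.search(line)
        if m:
            matches.append(m.group(1))
    return magic, matches
```

Feeding it the saved output of dmesg after the reboot returns the magic triple and the list of suspect devices, here usbdev2.1_ep00.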

Sunday, 7 June 2009

How about freezing new features and instead improve existing code quality?

Software has the tendency to get more and more features added over time. This leads to more bloat and new bugs. Don't get me wrong, it's good to get new features as it makes a system more usable. Perhaps we should occasionally stop pushing in new code and instead focus on:

1) Fixing outstanding critical or high priority bugs
2) Removing unwanted or old crufty code
3) Optimizing code for speed
4) Reducing unwanted bloat

It's a bit like going to one's doctor and getting a periodic health check. We humans put on weight, get unfit and occasionally need to get back into trim before taking on new challenges in life. Maybe we need to do the same with our software...

Which I/O scheduler is best for SSD?

Linux provides several I/O schedulers that allow one to tune performance for one's usage patterns. Traditionally disk scheduling algorithms have been designed for rotational drives where rotational latency and drive head movement latency need to be taken into consideration, hence the need for complex I/O schedulers. However, an SSD does not suffer from the HDD latency issues - does this mean the default I/O scheduler choice needs to be re-thought?

I decided to benchmark a SanDisk pSSD to see the behaviour of the different I/O schedulers using the following tests:

a) untar kernel source
b) sync
c) copy kernel source tree
d) sync
e) copy tar file
f) sync
g) rm -rf kernel source tree
h) rm -rf copy of source tree
i) sync

For each of the I/O schedulers I ran the above sequence of tests 3 times and took an average of the total run time. My results are as follows:

cfq: 3m59s,
noop: 3m25s,
anticipatory: 3m51s,
deadline: 3m42s

So it appears the default cfq (Completely Fair Queuing) scheduler is least optimal and the noop scheduler behaves best. According to the noop wikipedia entry the noop scheduler is the best choice for solid state drives, so maybe my test results can be trusted after all :-)
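To put the differences into perspective, here is a small Python sketch that converts the four timings above into seconds and ranks them relative to cfq. It uses only the figures quoted in this post.

```python
import re


def to_seconds(text):
    """Convert a duration such as '3m59s' into seconds."""
    m = re.fullmatch(r"(\d+)m(\d+)s", text)
    if m is None:
        raise ValueError("expected something like 3m59s, got %r" % text)
    return int(m.group(1)) * 60 + int(m.group(2))


# Average total run times from the results above.
results = {
    "cfq": "3m59s",
    "noop": "3m25s",
    "anticipatory": "3m51s",
    "deadline": "3m42s",
}

baseline = to_seconds(results["cfq"])
for name in sorted(results, key=lambda n: to_seconds(results[n])):
    secs = to_seconds(results[name])
    gain = 100.0 * (baseline - secs) / baseline
    print("%-12s %4ds  %5.1f%% faster than cfq" % (name, secs, gain))
```

On these numbers noop finishes 34 seconds ahead of cfq, which is roughly a 14% improvement.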

SSDs do not require multiple I/O requests to be re-ordered since they do not suffer from traditional random-access latencies, hence the noop scheduler is optimal.
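The scheduler can be inspected per device through sysfs: /sys/block/<device>/queue/scheduler lists the available schedulers with the active one in square brackets, and writing a name to the same file as root switches to it. A minimal Python sketch for reading it, assuming the device is called sda (yours may well differ); the function names are my own:

```python
def parse_schedulers(text):
    """Split the sysfs scheduler line into (available, active)."""
    available = []
    active = None
    for word in text.split():
        # The active scheduler is shown in square brackets, e.g. [cfq].
        if word.startswith("[") and word.endswith("]"):
            word = word[1:-1]
            active = word
        available.append(word)
    return available, active


def read_schedulers(device="sda"):
    """Read the scheduler list for a block device (device name assumed)."""
    with open("/sys/block/%s/queue/scheduler" % device) as f:
        return parse_schedulers(f.read())
```

On a machine like the one tested here, a line such as "noop anticipatory deadline [cfq]" would parse to the four scheduler names with cfq marked as active.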

Saturday, 6 June 2009

RAID0 + SSD = speed!

I've been messing around with Ubuntu Jaunty Desktop installed on two SanDisk 32GB pSSD devices today with a RAID0 configuration and I'm really impressed with the performance. (Thanks to SanDisk for providing me with some early samples to test!)

RAID0 stripes data across the two SSDs which should theoretically double I/O rates. It also allows me to effectively create a larger disk volume by treating both drives as one larger combined device. The downside is that if one SSD fails, I have zero redundancy (unlike RAID1) and hence can lose my data.
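To make the striping idea concrete, here is a Python sketch of how a byte offset in the combined volume maps onto the individual drives. The 64K chunk size is purely an assumption for illustration; the real value depends on how the array was created.

```python
def raid0_locate(offset, chunk_size=64 * 1024, disks=2):
    """Map a byte offset in a RAID0 volume to (disk index, offset on disk)."""
    chunk = offset // chunk_size    # which chunk of the volume this is
    within = offset % chunk_size    # position inside that chunk
    disk = chunk % disks            # chunks alternate round-robin over disks
    stripe = chunk // disks         # full stripes that precede this chunk
    return disk, stripe * chunk_size + within
```

Consecutive chunks land on alternating drives, so a large sequential transfer keeps both SSDs busy at once, which is where the theoretical doubling comes from.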

To avoid controller saturation, I configured the SSDs so that each SSD used a different SATA controller; I believe this helps, but I've not done any serious benchmarking to verify this configuration assumption.

I used the SoftwareRAID wiki page as a quick start guide to get my head around how to configure my system; I created a small /boot partition and then 30GB and ~2GB RAID0 volumes for / (using ext4) and swap respectively, and then installed the 64 bit Jaunty Desktop using the alternative installer ISO image.

Performance just rocked! I benchmarked block I/O performance using bonnie and got block write rates of ~73MB/s and reads of ~114MB/s.

A quick test with dd writing 8GB of zeros in 4K blocks on my machine hit a ~102MB/s write rate - quite astounding. I managed to get a sustained read rate of ~119MB/s in 4K blocks reading a 16GB file - that's just awesome!
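For anyone wanting to script a similar write test, here is a rough Python equivalent. It is only a sketch: it is not what produced the numbers above (those came from dd), Python adds some overhead of its own, and the file should be comfortably larger than RAM if the page cache is not to flatter the result.

```python
import os
import time


def write_throughput(path, total_bytes, block_size=4096):
    """Write total_bytes of zeros in block_size chunks; return MB/s."""
    block = b"\0" * block_size
    written = 0
    start = time.monotonic()
    with open(path, "wb") as f:
        while written < total_bytes:
            chunk = block[: min(block_size, total_bytes - written)]
            f.write(chunk)
            written += len(chunk)
        f.flush()
        os.fsync(f.fileno())  # make sure the data has reached the device
    elapsed = max(time.monotonic() - start, 1e-9)  # avoid division by zero
    return written / elapsed / 1e6
```

Calling write_throughput("/mnt/test/zeros.bin", 8 * 1024**3) would mimic the 8GB test; the path is hypothetical and should point at the filesystem under test.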

Boot time was impressive too. On my dual core 2.0GHz desktop I booted in ~12-14 seconds after grub stage without any optimization tweaks.

If I get some more spare time I will try and figure out how to get the partitions aligned to a 128K block boundary to see how much more performance I can squeeze out of these SSDs. Watch this space!
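Checking alignment is simple arithmetic on the partition's start sector. A small Python sketch, assuming 512 byte sectors and the 128K boundary mentioned above (the right boundary for a given SSD is an assumption until the vendor confirms it):

```python
SECTOR_SIZE = 512  # bytes; assumed logical sector size


def is_aligned(start_sector, boundary=128 * 1024):
    """True if a partition starting at start_sector sits on the boundary."""
    return (start_sector * SECTOR_SIZE) % boundary == 0


def next_aligned_sector(start_sector, boundary=128 * 1024):
    """Return the first sector at or after start_sector that is aligned."""
    sectors_per_boundary = boundary // SECTOR_SIZE
    # Round up to the next multiple of sectors_per_boundary.
    return -(-start_sector // sectors_per_boundary) * sectors_per_boundary
```

A partition starting at sector 63, the traditional DOS-style default, is not aligned; the next 128K aligned start is sector 256.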

Thursday, 4 June 2009

htop - an alternative to top?

At times it is useful to see which process is burning up all those CPU cycles. Normally I use the top command, which does the trick. I've recently stumbled upon the htop command, which is a little like top, but has some neat interactive features which make it a little more powerful than top in my opinion.

To install it, simply do:

sudo apt-get install htop

htop allows one to see CPU usage per processor, and allows one to scroll up and down the process list. If you've not used it yet, I recommend installing it and giving it a try.

Wednesday, 3 June 2009

Getting more done with moreutils

I was introduced to moreutils a few weeks ago and I now wonder how I managed without it. Installing is straightforward enough:

sudo apt-get install moreutils

This adds the following commands:

- sponge: soak up standard input and write to a file
- ifdata: get network interface info without parsing ifconfig output
- ifne: run a program if the standard input is not empty
- vidir: edit a directory in your text editor
- vipe: insert a text editor into a pipe
- ts: timestamp standard input
- combine: combine the lines in two files using boolean operations
- pee: tee standard input to pipes
- zrun: automatically uncompress arguments to command
- mispipe: pipe two commands, returning the exit status of the first
- isutf8: check if a file or standard input is utf-8
- lckdo: execute a program with a lock held

sponge is particularly useful:

sort file.txt | sponge file.txt # sort file.txt and dump it back to file.txt without clobbering the file.
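The reason sponge is needed is that a plain "sort file.txt > file.txt" truncates the file before sort has read it. The idea can be sketched in Python as follows; this is an illustration of the principle, not how sponge itself is implemented:

```python
def sponge(lines, path):
    """Soak up all of the input first, then write it to path."""
    data = list(lines)            # consume every line before touching output
    with open(path, "w") as out:  # only now is the file truncated
        out.writelines(data)
```

Because the whole input is held in memory until the end, the source and destination can safely be the same file.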

The vipe command is a useful way to edit data in a pipeline, especially if you cannot be bothered to figure out smart sed logic to edit data in the pipeline.

I've found ts useful for capturing data out of a log and time stamping it - kind of handy for debugging purposes.
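What ts does is easy to mimic when you are already processing a log in a script. A minimal Python sketch; the default format string is, as far as I recall, the one ts uses, and the clock parameter is my own addition to make the function easy to test:

```python
import time


def timestamp_lines(lines, clock=time.time, fmt="%b %d %H:%M:%S"):
    """Yield each input line prefixed with the time it was seen."""
    for line in lines:
        stamp = time.strftime(fmt, time.localtime(clock()))
        yield "%s %s" % (stamp, line)
```

Wrapping sys.stdin in timestamp_lines() and writing the results to sys.stdout gives a rough stand-in for piping a log through ts.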

vidir is great for editing filenames in a directory, but can be a little too powerful in my opinion as one can really screw up a load of files if one is not careful!

Tuesday, 2 June 2009

Suspend/Resume - 100% reliable?

Sometimes "good" is just not good enough. Take Suspend/Resume as an example. Does it work on your laptop? Tried it once and it's OK? How about 300 times? Now is it OK? Does your Wifi now work? :-)

You may think it is insane to do Suspend/Resume 300 times, but that's a good test of reliability. Over the past few months I've been looking at improving the reliability of Suspend/Resume on various netbooks and it is very surprising to see how varied the results are across different platforms.

I've seen Wifi drivers crash while they try and associate during a suspend. I've seen BIOS bugs that cause no end of weirdness. Debugging these issues can be a pain, hence my fellow Ubuntu Kernel Developers have created a wiki page to help debug suspend/resume issues: https://wiki.ubuntu.com/DebuggingKernelSuspendHibernateResume

Hopefully we can iron out these bugs. My hope is that Suspend/Resume will work correctly each and every time it is used.

Disabling CTS Protection Mode

I've been twiddling my Access Point (AP) settings and discovered that disabling CTS protection mode significantly increases Wifi throughput in a low transmission error network. CTS (Clear-To-Send) protection mode should be enabled when your Wireless-G devices are experiencing problems, for example, not being able to transmit to the Router in an environment with heavy 802.11b traffic.

The theory is as follows: When there are many devices connected to an AP they can sometimes transmit to the AP at the same time. This can occur when a client cannot detect the other client to determine if it is transmitting on the same channel. In the case of this collision, the AP discards the colliding data and the error rate is increased.

CTS protection helps by choosing which device gets to transmit at each moment. The client has to send a request to send (RTS) to the AP. The AP then replies with a CTS packet and then only listens to that client until the client has completed transmitting. This overhead decreases throughput.

Most APs also allow one to change the RTS threshold. This specifies the packet size above which an RTS is required. On a very noisy network where there are a lot of transmission errors one can bring the RTS threshold right down, making clients send RTS packets for smaller packets, but this will decrease performance.
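The effect of the threshold can be pictured with a couple of lines of Python. This is a toy model, assuming (as I understand it) that an RTS is sent only for frames longer than the threshold; the frame sizes in the usage note are made up for illustration.

```python
def needs_rts(frame_bytes, rts_threshold):
    """True if a frame of this size would be preceded by an RTS."""
    return frame_bytes > rts_threshold


def rts_fraction(frame_sizes, rts_threshold):
    """Fraction of the given frames that would need an RTS/CTS exchange."""
    if not frame_sizes:
        return 0.0
    flagged = sum(1 for size in frame_sizes if needs_rts(size, rts_threshold))
    return flagged / len(frame_sizes)
```

For a hypothetical mix of 100, 600 and 1500 byte frames, lowering the threshold from 2000 to 500 takes the fraction needing an RTS from none to two thirds, which is where the extra overhead comes from.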

Enabling CTS protection should give 802.11b/g devices a better chance to transmit to the AP. If your network error rate is low, disable CTS protection to get better performance.

Hope that helps!

Monday, 1 June 2009

Grub2 - needs some love and attention.

Funny. Whenever I mention using grub2 as my default bootloader I get worried looks as if I'm mad. There seems to be an opinion that grub2 is still not mature enough to use. There are rumours that it's untested and possibly won't work because of BIOS issues. Fear, Uncertainty, Doubt. FUD.

So with all this FUD around, grub2 has been slow in adoption. The problem with FUD is that it can be based on opinions that come from rumour or hearsay - something has to be done to kill this FUD! The wiki page https://wiki.ubuntu.com/KernelTeam/Grub2Testing allows Ubuntu users to test grub2 either from a USB pen drive or by installing it directly on their machine and submit their test results.

So far, the results are very promising. We've not yet seen grub2 fail to boot from any BIOS. It boots from ext2, ext3, ext4, LVM, xfs, jfs and reiserfs partitions successfully. It successfully boots dual boot Windows/Ubuntu systems.

So, if you want to join me in killing the grub2 FUD, test your machine, and submit your results!

Hello World!

Let me introduce myself. I work for Canonical as a Kernel Engineer, mainly focused on ironing out weird BIOS and kernel issues on small netbooks. I'm a bit twiddler at heart, and live in the world of kernel panics and register dumps.

So what makes me tick? Making program code better: be it faster, more compact, less buggy or more power efficient.