As I mentioned in my article on wireless for the Aspire One, one of the things I did was build my own kernel so that some day I could strip out some of the superfluous crud. Well, yesterday I actually started trying to boot that kernel. It was more of a struggle than I thought it would be, but I eventually succeeded. Well, kind of. Before I get into more detail, let me just make something very clear: don’t try this at home. I’m going to talk about doing some things that could seriously screw up your system – not just the Aspire One but whatever other machine you run some of the steps on. Don’t blame me if you turn one of your machines into a very pretty brick.

At this point, I have concluded that the version of GRUB (the boot loader) that’s installed on the A1 is non-standard in some way. No matter what I did in grub.conf (or menu.lst just in case it was lying to me about which config file it was using) I couldn’t get it to show me an actual boot menu. Come to think of it, I couldn’t get it to show me the splash screen referenced in the as-shipped grub.conf either. It would always show the original who-knows-from-where splash screen, and always boot the original kernel. The only thing that seems to have changed since I first received the box is that now I have to hit the space bar a couple of times at the blank screen where the GRUB menu should be. That was good for a few tense moments, but it also tells me that GRUB isn’t being bypassed entirely.

Until I understand exactly what’s special about the Aspire One version of GRUB, I’m loth to replace it or to modify that first entry. Instead, I chose to set up an external boot loader on my oldest and smallest USB key, using my Dell laptop to set it up and test it because I don’t trust any of the GRUB utilities on the A1. I saved everything I used to have on the key, reformatted it as plain ext2 (it had been ext3), copied my Dell’s /boot directory onto it, ran grub-install, and so on. The only non-obvious thing was that after grub-install I still had to go into the grub shell and run setup there. That seems redundant, but here are the commands I used.

grub> root (hd1,0)
grub> setup (hd1)

No, not (hd1,0) as you’d think. This allowed me to use the USB key to boot the Dell, so then I tried on the A1. The boot entry I used looked like this.

title My A1 Kernel
rootnoverify (hd1,0)
kernel /boot/bzimg_jd ro root=LABEL=linpus vga=0x311 splash=silent loglevel=1 console=tty1 quiet nolapic_timer
initrd /boot/initrd-splash.img
map (hd0) (hd1)
boot

The most important part is that (hd1,0) when booting from the USB key means the internal SSD, whereas when booting from the internal SSD – or the Dell’s hard disk while I was setting this up – it meant the USB key. Similarly, the “map” command makes it so that post-boot references to /dev/sda refer to the SSD again. This also makes it possible to remove the USB key after booting, despite having used it to boot the system originally.
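To make the renumbering concrete, here is a toy model in shell. It assumes the BIOS hands GRUB the boot device as the first drive and lists the others after it in order. That is typical but depends on the BIOS, so treat it as a sketch, not a description of what GRUB does internally. The function name grub_names and the disk labels ssd and usb are invented for the illustration.

```shell
#!/bin/sh
# Toy model only: assume the boot device is presented as the first BIOS
# drive, so GRUB legacy names it (hd0); remaining disks follow in order.
# "grub_names", "ssd" and "usb" are made-up names for this illustration.
grub_names() {
    boot="$1"; shift            # first argument: the device we booted from
    n=0
    echo "(hd$n) $boot"
    for disk in "$@"; do        # remaining arguments: all disks present
        [ "$disk" = "$boot" ] && continue
        n=$((n + 1))
        echo "(hd$n) $disk"
    done
}

echo "Booted from the internal SSD:"
grub_names ssd ssd usb

echo "Booted from the USB key:"
grub_names usb ssd usb
```

In the first case the key is (hd1), which is why setup was pointed at (hd1) on the Dell. In the second the SSD is (hd1), which is why the boot entry on the key uses (hd1,0) to reach it.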

The other bit of strangeness had to do with loading kernel modules built for one kernel while booted with another. On my first boot, things mostly worked but some things failed because the version strings weren’t quite the same. To get around this, I used “make menuconfig” to change the local version in my kernel from “lw” to “jd” then rebuilt and installed my own matching modules – in parallel with the shipped ones instead of replacing them. My next reboot went a lot better, except for ndiswrapper which had been installed separately. That was just a matter of unpacking it and running “make install” just as I had before.
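The check that was tripping me up can be sketched in a few lines of shell. Each module records a "vermagic" string whose first word is the kernel release it was built for, and that word has to match the running kernel's release. The "local version" field in menuconfig is CONFIG_LOCALVERSION, which gets appended to the release, and "make modules_install" puts modules under /lib/modules/ in a directory named for that release, which is why the two sets can live side by side. The function name vermagic_matches and the version numbers below are placeholders, not values taken from the A1.

```shell
#!/bin/sh
# Sketch: does a module's vermagic agree with a given kernel release?
# "vermagic_matches" is a hypothetical helper; version strings are placeholders.
vermagic_matches() {
    release="$1"     # e.g. the output of: uname -r
    vermagic="$2"    # e.g. the output of: modinfo -F vermagic <module>
    # Keep only the first space-separated word of vermagic and compare.
    [ "${vermagic%% *}" = "$release" ]
}

# Kernel and module built with the same local version suffix.
vermagic_matches "2.6.0jd" "2.6.0jd SMP mod_unload" && echo "match"

# Kernel built as "jd", module built as "lw".
vermagic_matches "2.6.0jd" "2.6.0lw SMP mod_unload" || echo "mismatch"
```

On a live system the two arguments would come from uname -r and modinfo -F vermagic, run against whichever module refuses to load.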

So now I have my own kernel that’s functionally identical to the original one. I’m out of time now, but some day soon I’ll actually start trimming the fat from my version.

UPDATE 2008-09-22: While I was able to boot my own fully-functional kernel this way a couple of times, on other occasions not everything came up OK (“HAL failed to initialize”) and on still others the system exhibited random flakiness later (failed to make a WiFi connection or recognize a USB device). Apparently, building from the Acer sources with the Acer config file still doesn’t yield something functionally equivalent to the Acer kernel. It might have been an honest mistake/omission on their part, or it might be like when Sun used to accept my fixes and incorporate them into SunOS but leave them out of the supposedly-equivalent NFS source they shipped back to us. Either way, it irks me.