DebConf15: “Continuous Delivery of Debian packages” talk

August 24th, 2015

At the Debian Conference 2015 I gave a talk about Continuous Delivery of Debian packages. My slides are available online (PDF, 753KB). Thanks to the fantastic video team there’s also a recording of the talk available: WebM (471MB) and on YouTube.

HAProxy with Debian/squeeze clients causing random “Hash Sum mismatch”

July 2nd, 2015

Update on 2015-07-02 22:15 UTC: as Petter Reinholdtsen noted in the comments:

Try adding /etc/apt/apt.conf.d/90squid with content like this:

Acquire::http::Pipeline-Depth 0;

It turn off the feature in apt confusing proxies.

This indeed avoids those “Hash Sum mismatch” failures with HAProxy as well. Thanks, Petter!

Many of you might know apt’s “Hash Sum mismatch” issue and there are plenty of bug reports about it (like #517874, #624122, #743298 + #762079).

Lately I have seen “Hash Sum mismatch” mostly when using “random” mirrors, e.g. with httpredir.debian.org in apt’s sources.list; with a static mirror such issues usually don’t occur anymore. A customer of mine runs a Debian mirror and this issue wasn’t a problem there either, until recently:

Since the mirror also includes packages provided to customers and the mirror needs to be available 24/7 we decided to provide another instance of the mirror and put those systems behind HAProxy (version 1.5.8-3 as present in Debian/jessie). The HAProxy setup worked fine and we didn’t notice any issues in our tests, until the daily Q/A builds randomly started to report failures:

Failed to fetch http://example.org/foobar_amd64.deb Hash Sum mismatch

When repeating the download there was no problem though. This problem only appeared about once every 15-20 minutes with random package files and it affected only Debian/squeeze clients (wheezy and jessie aren’t affected at all). The problem also didn’t appear when directly accessing the mirrors behind HAProxy. We tried plenty of different options for apt (Acquire::http::No-Cache=true, Acquire::http::No-Partial=true,…) and also played with some HAProxy configurations, but nothing really helped. With apt’s “Debug::Acquire::http=True” we saw that there really was a checksum failure, and HTTP status code 102 (‘Processing’, or in terms of apt: ‘Waiting for headers’) seems to be involved. The actual problem between apt on Debian/squeeze and HAProxy is still unknown to us though.

While digging deeper into this issue is still on my todo list, I found a way to avoid those “Hash Sum mismatch” failures: switch from http to https in sources.list. As soon as https is used the problem doesn’t appear anymore. I’m documenting it here just in case anyone else should run into it.
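The switch from http to https can be scripted when many client systems are involved. The following is only a minimal sketch in Python, not something taken from the setup described above: the function names are made up for illustration, it only touches “deb” and “deb-src” lines, and it assumes that the mirror actually serves https and that the clients have apt-transport-https installed (older apt releases such as the one in squeeze need that package for https sources).

```python
import re

# Matches the URI field of a "deb" or "deb-src" line, allowing an optional
# [option=value] block between the type and the URI.
_HTTP_URI = re.compile(r"^(\s*deb(?:-src)?\s+(?:\[[^\]]*\]\s+)?)http://")


def use_https(line):
    """Return the sources.list line with an http:// URI switched to https://."""
    return _HTTP_URI.sub(r"\1https://", line)


def rewrite_sources(text):
    """Apply use_https() to every line; comments and other lines pass through."""
    return "".join(use_https(line) for line in text.splitlines(keepends=True))


if __name__ == "__main__":
    # example.org is the placeholder host used in the error message above.
    sample = (
        "deb http://example.org/debian squeeze main\n"
        "# deb http://example.org/debian squeeze contrib\n"
    )
    print(rewrite_sources(sample), end="")
```

Commented-out entries are deliberately left alone, so the rewrite only changes sources that are really in use.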

fork(), once again

May 15th, 2015

On 10th of May 2015 my lovely wife made me a lovely present again: welcome to our family, Johanna.

The #newinjessie game: tools related to RPM packages

May 4th, 2015

Continuing the #newinjessie game:

Bernhard Miklautz, contributor to jenkins-debian-glue and author of jenkins-package-builder (being in an early stage but under active development to provide support for building RPMs, similar to what jenkins-debian-glue provides for building Debian/Ubuntu packages) pointed out that there are new tools related to RPM packaging available in Debian/jessie:

  • mock: Build rpm packages inside a chroot (similar to what cowbuilder/sbuild/… do in the Debian world)
  • obs-build: scripts for building RPM/debian packages for multiple distributions

GLT15: Slides of my “Debian 8 aka jessie, what’s new” talk

April 28th, 2015

I wasn’t sure whether I would make it to Linuxdays Graz (GLT15) this year so I didn’t participate in its call for lectures. But when meeting folks on the evening before the main event I came up with the idea of giving a lightning talk as a special way of celebrating the Debian jessie release.

So I gave a lightning talk about "Debian 8 aka jessie, what’s new" on 25th of April (couldn’t have been a better date :)) and the slides of my talk turned out to be more useful than expected (>3000 downloads within the first 48 hours and I received lots of great feedback), so maybe it’s worth mentioning them here as well: "Debian 8 aka jessie, what’s new" (PDF, 450KB)

PS: please join the #newinjessie game, also see #newinjessie on twitter

The #newinjessie game: new forensic packages in Debian/jessie

April 24th, 2015

Repeating what I did for the last Debian release with the #newinwheezy game it’s time for the #newinjessie game:

Debian/jessie AKA Debian 8.0 includes a bunch of packages for people interested in digital forensics. The following packages, maintained within the Debian Forensics team, are new in the Debian/jessie stable release as compared to Debian/wheezy (ignoring wheezy-backports):

  • ext4magic: recover deleted files from ext3 or ext4 partitions
  • libbfio1: Library to provide basic input/output abstraction
  • lime-forensics-dkms: kernel module to memory dump
  • mac-robber: collects data about allocated files in mounted filesystems
  • pff-tools: library to access various MS Outlook file formats, plus tools to export PAB, PST and OST files
  • ssdeep: Recursive piecewise hashing tool (note: was present in squeeze but not in wheezy)
  • volatility: advanced memory forensics framework
  • yara: help to identify and classify malwares

Join the #newinjessie game and present packages which are new in Debian/jessie.

check-mk: monitor switches for GBit links

January 23rd, 2015

For one of our customers we are using the Open Monitoring Distribution which includes Check_MK as monitoring system. We’re monitoring the switches (Cisco) via SNMP. The switches as well as all the servers support GBit connections, though there are some systems in the wild which are still operating at 100MBit (or even worse on 10MBit). Recently there have been some performance issues related to network access. To make sure it’s not the fault of a server or a service we decided to monitor the switch ports for their network speed. By default we assume all ports to be running at GBit speed. This can be configured either manually via:

cat etc/check_mk/conf.d/wato/rules.mk
[...]
checkgroup_parameters.setdefault('if', [])

checkgroup_parameters['if'] = [
  ( {'speed': 1000000000}, [], ['switch1', 'switch2', 'switch3', 'switch4'], ALL_SERVICES, {'comment': u'GBit links should be used as default on all switches'} ),
] + checkgroup_parameters['if']

or by visiting Check_MK’s admin web-interface at ‘WATO Configuration’ -> ‘Host & Service Parameters’ -> ‘Parameters for Inventorized Checks’ -> ‘Networking’ -> ‘Network interfaces and switch ports’ and creating a rule for the ‘Explicit hosts’ switch1, switch2, etc and setting ‘Operating speed’ to ‘1 GBit/s’ there.

So far so straightforward, and this works fine. Thanks to this setup we could identify several systems which used 100MBit and 10MBit links. Definitely something to investigate on the affected systems and their auto-negotiation configuration. But to avoid flooding the monitoring system and its notifications we want to explicitly ignore those systems in the monitoring setup until those issues have been resolved.

First step: identify the checks and their format by either invoking `cmk -D switch2` or looking at var/check_mk/autochecks/switch2.mk:

OMD[synpros]:~$ cat var/check_mk/autochecks/switch2.mk
[
  ("switch2", "cisco_cpu", None, cisco_cpu_default_levels),
  ("switch2", "cisco_fan", 'Switch#1, Fan#1', None),
  ("switch2", "cisco_mem", 'Driver text', cisco_mem_default_levels),
  ("switch2", "cisco_mem", 'I/O', cisco_mem_default_levels),
  ("switch2", "cisco_mem", 'Processor', cisco_mem_default_levels),
  ("switch2", "cisco_temp_perf", 'SW#1, Sensor#1, GREEN', None),
  ("switch2", "if64", '10101', {'state': ['1'], 'speed': 1000000000}),
  ("switch2", "if64", '10102', {'state': ['1'], 'speed': 1000000000}),
  ("switch2", "if64", '10103', {'state': ['1'], 'speed': 1000000000}),
  [...]
  ("switch2", "snmp_info", None, None),
  ("switch2", "snmp_uptime", None, {}),
]
OMD[synpros]:~$

Second step: translate this into the corresponding format for usage in etc/check_mk/main.mk:

checks = [
  ( 'switch2', 'if64', '10105', {'state': ['1'], 'errors': (0.01, 0.1), 'speed': None}), # MAC: 00:42:de:ad:be:af,  10MBit
  ( 'switch2', 'if64', '10107', {'state': ['1'], 'errors': (0.01, 0.1), 'speed': None}), # MAC: 00:23:de:ad:be:af, 100MBit
  ( 'switch2', 'if64', '10139', {'state': ['1'], 'errors': (0.01, 0.1), 'speed': None}), # MAC: 00:42:de:ad:be:af, 100MBit
  [...]
]

Using this configuration we ignore the operating speed on ports 10105, 10107 and 10139 of switch2 in the if64 check. We kept the state setting untouched where sensible (‘1’ means that the expected operational status of the interface is ‘up’). The errors setting specifies the error rates in percent for warning (0.01%) and critical (0.1%). For further details refer to the online documentation or invoke ‘cmk -M if64’.
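With more than a handful of ports, writing the overrides by hand gets tedious. The translation from autochecks entries to entries for the checks list can be sketched in plain Python. This is an illustration only: speed_ignoring_checks is a hypothetical helper and not part of Check_MK, and it assumes the autochecks data is already available as ordinary tuples with dictionary parameters, as shown in the listing above.

```python
def speed_ignoring_checks(autochecks, ports, errors=(0.01, 0.1)):
    """Build entries for the `checks` list from autochecks-style tuples.

    Only if64 entries whose item (the port) is listed in `ports` are
    returned, with the operating speed check disabled.
    """
    wanted = set(ports)
    result = []
    for host, check_type, item, params in autochecks:
        if check_type != "if64" or item not in wanted:
            continue
        new_params = dict(params)        # copy, keeps e.g. 'state' untouched
        new_params["errors"] = errors    # (warning, critical) error rates in percent
        new_params["speed"] = None       # None: ignore the operating speed
        result.append((host, check_type, item, new_params))
    return result


if __name__ == "__main__":
    autochecks = [
        ("switch2", "if64", "10101", {"state": ["1"], "speed": 1000000000}),
        ("switch2", "if64", "10105", {"state": ["1"], "speed": 1000000000}),
    ]
    for entry in speed_ignoring_checks(autochecks, ["10105"]):
        print(entry)
```

The input entries are copied rather than modified, so the same autochecks data can be reused for other hosts or port lists.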

Final step: after modifying the checks’ configuration make sure to run `cmk -IIu switch2 ; cmk -R` to renew the inventory for switch2 and apply the changes. Do not forget to verify the running configuration by invoking ‘cmk -D switch2’:

Screenshot of 'cmk -D switch2' execution

Revisiting 2014

January 1st, 2015

Business:

  • Attended only four conferences (FOSDEM, Linuxdays Graz, DebConf and Flowcon) due to taking care of my child each Tuesday and Friday (so it wasn’t easy to attend conferences during the week, and weekends are usually family time for me nowadays)
  • Home office/remote work: this year was very remote work and home office focused because I wanted to see my daughter grow up as much as possible and it was totally worth it
  • New company: together with a friend I started a new company called SynPro Solutions, focusing on IT administration for local (as in Graz/Styria/Austria) industry (lesson learned: there’s quite some overhead involved if you’re not doing business alone, but it’s getting better over time and it’s working fine nowadays) [disclaimer: Grml Solutions and Grml-Forensic still exist and are active]

Technology / Open Source:

  • container virtualization docker: worked a lot with it, especially related to integration in Jenkins
  • code review system gerrit: deployed to production at a customer and still manage and support it (in combination with jenkins-debian-glue)
  • Amazon/EC2, GCE & CO: did a bunch of stuff related to auto scaling, automatic deployment and image/AMI generation
  • Ten years of Grml!

Personal:

  • Used the Steiermark-Card to visit many nice places in Styria with my family (Ökopark Almenland is amongst our favorite places for children)
  • Bought myself a Roland TD-30KV drum-kit, allowing me to play drums whenever I want to (which is great if you don’t have to think of your family or neighbors, just use headphones and start rocking!)
  • Started training table tennis in summer with one of my neighbors (who’s a [semi-]professional), have been playing about weekly/bi-weekly since then and became quite proficient
  • Became Styrian academic champion in Badminton, single and double
  • After having a year without that much reading due to new constraints (AKA family time) I managed to start reading on a regular basis in the second half of the year, very much enjoyed that :) (though didn’t read as much as I wanted to in Q4 due to lots of work)
  • The year ended with the 2nd birthday of my daughter. I’m so happy and can’t imagine my life without her anymore.

Conclusion: 2014 was a great year for me, both business- and personal-wise. I would be more than happy if 2015 turned out similar.

Installing Debian in UEFI mode with some stunts

December 30th, 2014

For a recent customer setup of Debian/wheezy on an IBM x3630 M4 server we used my blog entry “State of the art Debian/wheezy deployments with GRUB and LVM/SW-RAID/Crypto” as a base. But this time we wanted to use (U)EFI instead of BIOS legacy boot.

As usual we went for installing via Grml and grml-debootstrap. We started by dd-ing the Grml ISO to a USB stick (‘dd if=grml64-full_2014.11.iso of=/dev/sdX bs=1M’). The IBM server couldn’t boot from it though. As far as we could tell, the server is not able to properly recognize USB sticks that register themselves as mass storage devices instead of removable storage devices (you can check your device via the /sys/devices/…/removable setting). So we enabled Legacy Boot and USB Storage in the boot manager of the server to be able to boot Grml in BIOS/legacy mode from this specific USB stick.

To install the GRUB boot loader in (U)EFI mode you need to be able to execute ‘modprobe efivars’. But our system was booted via BIOS/legacy and in that mode ‘modprobe efivars’ doesn’t work. We could have used a different USB device for booting Grml in UEFI mode, but because we are lazy sysadmins and wanted to save time we went for a different route instead:

First of all we write the Grml 64bit ISO (which is (U)EFI capable out-of-the-box, also when dd-ing it) to the local RAID disk (being /dev/sdb in this example):

root@grml ~ # dd if=grml64-full_2014.11.iso of=/dev/sdb bs=1M

Now we should be able to boot in (U)EFI mode from the local RAID disk. To verify this before actually physically rebooting the system (and possibly getting into trouble) we can use qemu with OVMF:

root@grml ~ # apt-get update
root@grml ~ # apt-get install ovmf
root@grml ~ # qemu-system-x86_64 -bios /usr/share/qemu/OVMF.fd -hda /dev/sdb

The Grml boot splash comes up as expected, perfect. Now we actually reboot the live system and boot the ISO from the local disks in (U)EFI mode. Then we put the running Grml live system into RAM to not use and block the local disks any longer since we want to install Debian there. This can be achieved not just by the toram boot option, but also by executing grml2ram on-demand as needed from user space:

root@grml ~ # grml2ram

Now having the local disks available we verify that we’re running in (U)EFI mode by executing:

root@grml ~ # modprobe efivars
root@grml ~ # echo $?
0
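The same check can be done from a script without loading a module: on Linux the directory /sys/firmware/efi is only present when the kernel was booted via (U)EFI. A minimal sketch, assuming Python is available on the live system; the function name and the sysfs_root parameter (there only to make the check testable against a fake directory tree) are made up for illustration.

```python
import os


def booted_in_uefi_mode(sysfs_root="/sys"):
    """Return True if <sysfs_root>/firmware/efi exists, i.e. the kernel booted via (U)EFI."""
    return os.path.isdir(os.path.join(sysfs_root, "firmware", "efi"))


if __name__ == "__main__":
    print("UEFI" if booted_in_uefi_mode() else "BIOS/legacy")
```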

Great, so we can install the system in (U)EFI mode now. Starting with the according partitioning (/dev/sda being the local RAID disk here):

root@grml ~ # parted /dev/sda
GNU Parted 3.2
Using /dev/sda
Welcome to GNU Parted! Type 'help' to view a list of commands.
(parted) mklabel gpt
(parted) mkpart primary fat16 2048s 4095s
(parted) name 1 "EFI System"
(parted) mkpart primary 4096s 100%
(parted) name 2 "Linux LVM"
(parted) print
Model: IBM ServeRAID M5110 (scsi)
Disk /dev/sda: 9000GB
Sector size (logical/physical): 512B/4096B
Partition Table: gpt
Disk Flags:

  Number  Start   End     Size    File system  Name        Flags
   1      1049kB  2097kB  1049kB  fat16        EFI System
   2      2097kB  9000GB  9000GB               Linux LVM

(parted) quit
Information: You may need to update /etc/fstab.
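The sector numbers passed to mkpart map to the sizes in the print output as follows. The short Python calculation below is just a worked example of that arithmetic, using the 512 byte logical sector size reported by parted; the helper name is made up.

```python
SECTOR_SIZE = 512  # logical sector size as reported by parted above


def sectors_to_bytes(start, end, sector_size=SECTOR_SIZE):
    """Return (start offset in bytes, size in bytes) for an inclusive sector range."""
    return start * sector_size, (end - start + 1) * sector_size


# EFI System partition: sectors 2048 to 4095 (inclusive)
efi_start, efi_size = sectors_to_bytes(2048, 4095)
print(efi_start, efi_size)   # both 1048576 bytes = 1 MiB, which parted displays as 1049kB

# The LVM partition starts right behind it at sector 4096
lvm_start = 4096 * SECTOR_SIZE
print(lvm_start)             # 2097152 bytes, displayed by parted as 2097kB
```

So the EFI System partition starts at 1 MiB and is exactly 1 MiB in size, which is also why mkfs.fat further down warns about not having enough clusters for a 16 bit FAT.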

Then setting up LVM with a logical volume for the root fs and installing Debian via grml-debootstrap on it:

root@grml ~ # pvcreate /dev/sda2
  Physical volume "/dev/sda2" successfully created
root@grml ~ # vgcreate vg0 /dev/sda2
  Volume group "vg0" successfully created
root@grml ~ # lvcreate -n rootfs -L16G vg0
  Logical volume "rootfs" created
root@grml ~ # grml-debootstrap --target /dev/mapper/vg0-rootfs --password secret --hostname foobar --release wheezy
[...]

Now finally set up the (U)EFI partition and properly install GRUB in (U)EFI mode:

root@grml ~ # mkfs.fat -F 16 /dev/sda1
mkfs.fat 3.0.26 (2014-03-07)
WARNING: Not enough clusters for a 16 bit FAT! The filesystem will be
misinterpreted as having a 12 bit FAT without mount option "fat=16".

root@grml ~ # mount /dev/mapper/vg0-rootfs /mnt
root@grml ~ # grml-chroot /mnt /bin/bash
Writing /etc/debian_chroot ...
(foobar)root@grml:/# mkdir -p /boot/efi
(foobar)root@grml:/# mount /dev/sda1 /boot/efi
(foobar)root@grml:/# apt-get install grub-efi-amd64
[...]
(foobar)root@grml:/# grub-install /dev/sda
Timeout: 10 seconds
BootOrder: 0003,0000,0001,0002,0004,0005
Boot0000* CD/DVD Rom
Boot0001* Hard Disk 0
Boot0002* PXE Network
Boot0004* USB Storage
Boot0005* Legacy Only
Boot0003* debian
Installation finished. No error reported.
(foobar)root@grml:/# ls /boot/efi/EFI/debian/
grubx64.efi
(foobar)root@grml:/# update-grub
[...]
(foobar)root@grml:/# exit
root@grml ~ # umount /mnt/boot/efi
root@grml ~ # umount /mnt/
root@grml ~ # vgchange -an
  0 logical volume(s) in volume group "vg0" now active

That’s it. Now rebooting the system should bring you to your Debian installation running in (U)EFI mode. You can verify this before actually rebooting into the system by using the qemu/OVMF trick from above once again.
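To double check that the new “debian” entry really comes first in the boot order, the boot manager listing printed during grub-install above can be parsed. The following Python sketch is an assumption-laden illustration: first_boot_entry is a hypothetical helper, and it only handles output in the simple “BootNNNN* label” form shown above (other efibootmgr versions may print additional columns).

```python
import re


def first_boot_entry(output):
    """Return the label of the first entry listed in BootOrder, or None."""
    order = re.search(r"^BootOrder:\s*([0-9A-Fa-f,]+)", output, re.MULTILINE)
    if not order:
        return None
    first = order.group(1).split(",")[0]
    # Look up the matching "BootNNNN* label" line
    entry = re.search(r"^Boot%s\*?\s+(.*)$" % re.escape(first), output, re.MULTILINE)
    return entry.group(1).strip() if entry else None


if __name__ == "__main__":
    listing = """Timeout: 10 seconds
BootOrder: 0003,0000,0001,0002,0004,0005
Boot0000* CD/DVD Rom
Boot0001* Hard Disk 0
Boot0003* debian
"""
    print(first_boot_entry(listing))
```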

Ten years of Grml

December 22nd, 2014

On 22nd of October 2004 an event called OS04 took place in Seifenfabrik Graz/Austria and it marked the first official release of the Grml project.

Grml was initially started by myself in 2003 – I registered the domain on September 16, 2003 (so technically it would be 11 years already :)). It started with a boot-disk, first created by hand and then based on yard. On 4th of October 2004 we had a first presentation of “grml 0.09 Codename Bughunter” at Kunstlabor in Graz.

I managed to talk a good friend and fellow student – Martin Hecher – into joining me. Soon after Michael Gebetsroither and Andreas Gredler joined and throughout the upcoming years further team members (Nico Golde, Daniel K. Gebhart, Mario Lang, Gerfried Fuchs, Matthias Kopfermann, Wolfgang Scheicher, Julius Plenz, Tobias Klauser, Marcel Wichern, Alexander Wirt, Timo Boettcher, Ulrich Dangel, Frank Terbeck, Alexander Steinböck, Christian Hofstaedtler) and contributors (Hermann Thomas, Andreas Krennmair, Sven Guckes, Jogi Hofmüller, Moritz Augsburger,…) joined our efforts.

Back in those days most efforts went into hardware detection, loading and setting up the according drivers and configurations, packaging software and fighting bugs with lots of reboots (working on our custom /linuxrc for the initrd wasn’t always fun). Throughout the years virtualization became more broadly available, which is especially great for most of the testing you need to do when working on your own (meta) distribution. Once upon a time udev became available and solved most of the hardware detection issues for us. Nowadays X.org doesn’t even need a xorg.conf file anymore (at least by default). We have to acknowledge that Linux grew up over the years quite a bit (and I’m wondering how we’ll look back at the systemd discussions in a few years).

By having Debian Developers within the team we managed to push quite a bit of our work back to Debian (the distribution Grml was and still is based on), years before the Debian Derivatives initiative appeared. We never stopped contributing to Debian though and we also still benefit from the Debian Derivatives initiative, like sharing issues and ideas on DebConf meetings. On 28th of May 2009 I myself became an official Debian Developer.

Over the years we moved from private self-hosted infrastructure to company-sponsored systems, migrated from Subversion (brr) to Mercurial (2006) to Git (2008). Our Zsh-related work became widely known as grml-zshrc. jenkins.grml.org managed to become a continuous integration/deployment/delivery home e.g. for the dpkg, fai, initramfs-tools, screen and zsh Debian packages. The underlying software for creating Debian packages in a CI/CD way became its own project known as jenkins-debian-glue in August 2011. In 2006 I started grml-debootstrap, which grew into a reliable method for installing plain Debian (nowadays even supporting installation as VM, and one of my customers does tens of deployments per day with grml-debootstrap in a fully automated fashion). So one of the biggest achievements of Grml is – from my point of view – that it managed to grow several active and successful sub-projects under its umbrella.

Nowadays the Grml team consists of 3 Debian Developers – Alexander Wirt (formorer), Evgeni Golov (Zhenech) and myself. We couldn’t talk Frank Terbeck (ft) into becoming a DM/DD (yet?), but he’s an active part of our Grml team nonetheless and does a terrific job with maintaining grml-zshrc as well as helping out in Debian’s Zsh packaging (and being a Zsh upstream committer at the same time makes all of that even better :)).

My personal conclusion for 10 years of Grml? Back in the days when I was a student Grml was my main personal pet and hobby. Grml grew into an open source project which wasn’t known just in Graz/Austria, but especially throughout the German system administration scene. I have been self-employed since 2008 and mainly work on open source stuff, so I’m kind of living a dream, which I didn’t even have when I started with Grml in 2003. Nowadays with running my own business and having my own family it’s getting harder for me to consider it still a hobby though, instead it’s more integrated and part of my business – which I personally consider both good and bad at the same time (for various reasons).

Thanks so much to every one of you who was (and possibly still is) part of the Grml journey! Let’s hope for another 10 successful years!

Thanks to Max Amanshauser and Christian Hofstaedtler for reading drafts of this.

Book Review: The Docker Book

July 23rd, 2014

Docker is an open-source project that automates the deployment of applications inside software containers. I’m responsible for a docker setup with Jenkins integration and a private docker-registry setup at a customer and pre-ordered James Turnbull’s “The Docker Book” a few months ago.

Recently James – he’s working for Docker Inc – released the first version of the book and thanks to being on holidays I already had a few hours to read it AND blog about it. :) (Note: I’ve read the Kindle version 1.0.0 and all the issues I found and reported to James have been fixed in the current version already, jey.)

The book is very well written and covers all the basics to get familiar with Docker and in my opinion it does a better job at that than the official user guide because of the way the book is structured. The book is also a more approachable way for learning some best practices and commonly used command lines than going through the official reference (but reading the reference after reading the book is still worth it).

I like James’ approach with “ENV REFRESHED_AT $TIMESTAMP” for better controlling the cache behaviour and will definitely consider using this in my own setups as well. What I wasn’t aware of is that you can directly invoke “docker build $git_repos_url”, and I further noted a few command line switches I should get more comfortable with. I also plan to check out the Automated Builds on Docker Hub.

There are some references to further online resources, which is relevant especially for the more advanced use cases, so I’d recommend having network access available while reading the book.

What I’m missing in the book are best practices for running a private docker-registry in a production environment (high availability, scaling options,…). The provided Jenkins use cases are also very basic and nothing I personally would use. I’d also love to see how other folks are using the Docker plugin, the Docker build step plugin or the Docker build publish plugin in production (the plugins aren’t covered in the book at all). But I’m aware that these are fast-moving parts and specialised use cases – upcoming versions of the book are already supposed to cover orchestration with libswarm, developing Docker plugins and more advanced topics, so I’m looking forward to further updates of the book (which you get for free as existing customer, being another plus).

Conclusion: I enjoyed reading the Docker book and can recommend it, especially if you’re either new to Docker or want to get further ideas and inspirations what folks from Docker Inc consider best practices.

kamailio-deb-jenkins: Open Source Jenkins setup for Debian Packaging

March 25th, 2014

Kamailio is an Open Source SIP Server. Since beginning of March 2014 a new setup for Kamailio‘s Debian packages is available. Development of this setup is sponsored by Sipwise and I am responsible for its infrastructure part (Jenkins, EC2, jenkins-debian-glue).

The setup includes support for building Debian packages for Debian 5 (lenny), 6 (squeeze), 7 (wheezy) and 8 (jessie) as well as Ubuntu 10.04 (lucid) and 12.04 (precise), each of them for the amd64 and i386 architectures.

My work is fully open sourced. Deployment instructions, scripts and configuration are available at kamailio-deb-jenkins, so if you’re interested in setting up your own infrastructure for Continuous Integration with Debian/Ubuntu packages that’s a very decent starting point.

NOTE: I’ll be giving a talk about Continuous Integration with Debian/Ubuntu packages at Linuxdays Graz/Austria on 5th of April. Besides kamailio-deb-jenkins I’ll also cover best practices, Debian packaging, EC2 autoscaling,…

Building Debian+Ubuntu packages on EC2

March 25th, 2014

In a project I recently worked on we wanted to provide a jenkins-debian-glue based setup on Amazon’s EC2 for building Debian and Ubuntu packages. The idea is to keep a modestly powered Jenkins master up and running 24×7, while stronger machines serving as Jenkins slaves are launched only as needed. The project setup in question is fully open sourced (more on that in […]) […] piuparts (a .deb package installation, upgrading, and removal testing tool) on the resulting binary packages. The Debian packages (source+binaries) are then provided back to Jenkins master and put into a reprepro powered Debian repository for public usage.

Preparation

The starting point was one of the official Debian AMIs (x86_64, paravirtual on EBS). We automatically deployed jenkins-debian-glue on the system which is used as Jenkins master (we chose a m1.small instance for our needs).

We started another instance, slightly adjusted it to already include jenkins-debian-glue related stuff out-of-the-box (more details in section “Reduce build time” below) and created an AMI out of it. This new AMI ID can be configured for usage inside Jenkins by using the Amazon EC2 Plugin (see screenshot below).

IAM policy

Before configuring EC2 in Jenkins, though, start by adding a new user (or group) in AWS’s IAM (Identity and Access Management) with a custom policy. This ensures that your EC2 user in Jenkins doesn’t have more permissions than really needed. The following policy should give you a starting point (we restrict the account to allow actions only in the EC2 region eu-west-1, YMMV):

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Action": [
        "ec2:CreateTags",
        "ec2:DescribeInstances",
        "ec2:DescribeImages",
        "ec2:DescribeKeyPairs",
        "ec2:GetConsoleOutput",
        "ec2:RunInstances",
        "ec2:StartInstances",
        "ec2:StopInstances",
        "ec2:TerminateInstances"
      ],
      "Effect": "Allow",
      "Resource": "*",
      "Condition": {
        "StringEquals": {
          "ec2:Region": "eu-west-1"
        }
      }
    }
  ]
}
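If the same policy is needed for several regions, generating the JSON instead of copying it by hand avoids typos. A minimal sketch in Python, assuming nothing beyond the policy shown above: build_policy and EC2_ACTIONS are names made up for illustration, and the result still has to be pasted or uploaded into IAM by whatever means you normally use.

```python
import json

# The EC2 actions from the policy above
EC2_ACTIONS = [
    "ec2:CreateTags",
    "ec2:DescribeInstances",
    "ec2:DescribeImages",
    "ec2:DescribeKeyPairs",
    "ec2:GetConsoleOutput",
    "ec2:RunInstances",
    "ec2:StartInstances",
    "ec2:StopInstances",
    "ec2:TerminateInstances",
]


def build_policy(region, actions=EC2_ACTIONS):
    """Return the policy as a dict, restricted to the given EC2 region."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Action": list(actions),
                "Effect": "Allow",
                "Resource": "*",
                "Condition": {"StringEquals": {"ec2:Region": region}},
            }
        ],
    }


if __name__ == "__main__":
    print(json.dumps(build_policy("eu-west-1"), indent=2))
```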

Jenkins configuration

Configure EC2 access with “Access Key ID”, “Secret Access Key”, “Region” and “EC2 Key Pair’s Private Key” (for SSH login) inside Jenkins in the Cloud section on $YOUR_JENKINS_SERVER/configure. Finally add an AMI in the AMIs Amazon EC2 configuration section (adjust security-group as needed, SSH access is enough):

As you can see the configuration also includes a launch script. This script ensures that slaves are set up as needed (provide all the packages and scripts that are required for building) and always get the latest configuration and scripts before starting to serve as Jenkins slave.

Now your setup should be ready for launching Jenkins slaves as needed:

NOTE: you can use the “Instance Cap” configuration inside the advanced Amazon EC2 Jenkins configuration section to place an upper limit on the number of EC2 instances that Jenkins may launch. This can be useful for avoiding surprises in your AWS invoices. :) Notice though that the cap numbers are calculated for all your running EC2 instances, so be aware if you have further machines running under your account; you might want to e.g. further restrict your IAM policy then.

Reduce build time

Using a plain Debian AMI and automatically installing jenkins-debian-glue and further jenkins-debian-glue-buildenv* packages on each slave startup would work, but it takes time. That’s why we created our own AMI, which is nothing else than an official Debian AMI with the bootstrap.sh script (which is referred to in the screenshot above) already executed. All the necessary packages are pre-installed and all the cowbuilder environments are already present. From time to time we start the instance again to apply (security) updates and execute the bootstrap script with its --update option to also bring all the cowbuilder systems up to date. Creating a new AMI is a no-brainer and we can then use the updated system for our Jenkins slaves. If something should break for whatever reason we can still fall back to an older known-to-be-good AMI.

Final words

How to set up your Jenkins jobs for optimal master/slave usage, multi-distribution support (Debian/Ubuntu) and further details about this setup are part of another blog post.

Thanks to Andreas Granig, Victor Seva and Bernhard Miklautz for reading drafts of this.

Jenkins on-demand slave selection through labels

March 1st, 2014

Problem description: One of my customers had a problem with their Selenium tests in the Jenkins continuous integration system. While Perl’s Test::WebDriver still worked just fine, the Selenium tests using Ruby’s selenium-webdriver suddenly reported failures. The problem was caused by Debian wheezy’s upgrade of the Iceweasel web browser. Debian originally shipped Iceweasel version 17.0.10esr-1~deb7u1 in wheezy, but during a security update version 24.3.0esr-1~deb7u1 was brought in through the wheezy-security channel. Because the Selenium tests are used in an automated fashion in a quite large and long-running build pipeline we immediately rolled back to Iceweasel version 17.0.10esr-1~deb7u1 so everything could continue as expected. Of course we wanted to get the new Iceweasel version up and running, but we didn’t want to break the existing workflow while working on it. This is where on-demand slave selection through labels comes in.

Basics: As soon as you’re using Jenkins slaves you can instruct Jenkins to run a specific project on a particular (slave) node. By attaching labels to your slaves you can also use a label instead of a specific node name, providing more flexibility and scalability (to e.g. avoid problems if a specific node is down or you want to scale to more systems). Then Jenkins decides which of the nodes providing the according label should be considered for job execution. In the following screenshot a job uses the ‘selenium’ label to restrict its execution to the slaves providing selenium and currently there are two nodes available providing this label:

TIP 1: Visiting $JENKINS_SERVER/label/$label/ provides a list of slaves that provide that given $label (as well as list of projects that use $label in their configuration), like:



TIP 2:
Execute the following script on $JENKINS_SERVER/script to get a list of available labels of your Jenkins system:

import hudson.model.*
labels = Hudson.instance.getLabels()
labels.each{ label -> println label.name }

Solution: In the according customer setup we’re using the swarm plugin (with automated Debian deployment through Grml’s netscript boot option, grml-debootstrap + Puppet) to automatically connect our Jenkins slaves to Jenkins master without any manual intervention. The swarm plugin allows you to define the labels through the -labels command line option.

By using the NodeLabel Parameter plugin we can configure additional parameters in Jenkins jobs: ‘node’ and ‘label’. The ‘label’ parameter allows us to execute the jobs on the nodes providing the requested label:

This is what we can use to gradually upgrade from the old Iceweasel version to the new one by keeping a given set of slaves at the old Iceweasel version while we’re upgrading other nodes to the new Iceweasel version (same for the selenium-server version which we want to also control). We can include the version number of the Iceweasel and selenium-server packages inside the labels we announce through the swarm slaves, with something like:

if [ -r /etc/init.d/selenium-server ] ; then
  FLAGS="selenium"

  ICEWEASEL_VERSION="$(dpkg-query --show --showformat='${Version}' iceweasel)"
  if [ -n "$ICEWEASEL_VERSION" ] ; then
    ICEWEASEL_FLAG="iceweasel-${ICEWEASEL_VERSION%%.*}"
    EXTRA_FLAGS="$EXTRA_FLAGS $ICEWEASEL_FLAG"
  fi

  SELENIUM_VERSION="$(dpkg-query --show --showformat='${Version}' selenium-server)"
  if [ -n "$SELENIUM_VERSION" ] ; then
    SELENIUM_FLAG="selenium-${SELENIUM_VERSION%-*}"
    EXTRA_FLAGS="$EXTRA_FLAGS $SELENIUM_FLAG"
  fi
fi

Then by using ‘-labels “$FLAGS $EXTRA_FLAGS”‘ in the swarm invocation script we end up with labels like ‘selenium iceweasel-24 selenium-2.40.0’ for the slaves providing the Iceweasel v24 and selenium v2.40.0 Debian packages and ‘selenium iceweasel-17 selenium-2.40.0’ for the slaves providing Iceweasel v17 and selenium v2.40.0.
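To make the two parameter expansions concrete, here is a minimal sketch that derives the labels from version strings passed in as arguments instead of reading them from dpkg-query. The function name build_labels and the selenium-server revision 2.40.0-1 are assumptions for illustration; only the Iceweasel version strings are taken from this post.

```shell
#!/bin/sh
# Sketch: derive swarm labels from package version strings.
# build_labels is a hypothetical helper, not part of the swarm plugin.
build_labels() {
  iceweasel_version="$1"   # e.g. 24.3.0esr-1~deb7u1
  selenium_version="$2"    # e.g. 2.40.0-1 (assumed Debian revision)

  labels="selenium"
  if [ -n "$iceweasel_version" ] ; then
    # %%.* strips the longest suffix starting at a dot -> major version only
    labels="$labels iceweasel-${iceweasel_version%%.*}"
  fi
  if [ -n "$selenium_version" ] ; then
    # %-* strips the shortest suffix starting at a dash -> drops the Debian revision
    labels="$labels selenium-${selenium_version%-*}"
  fi
  printf '%s\n' "$labels"
}

build_labels "24.3.0esr-1~deb7u1" "2.40.0-1"
build_labels "17.0.10esr-1~deb7u1" "2.40.0-1"
```

The first call prints the label set for an upgraded slave, the second one the label set for a slave still on the old browser.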

This is perfect for our needs, because instead of using the “selenium” label (which is still there) we can configure the selenium jobs that should continue to work as usual to default to the slaves with the iceweasel-17 label now. The development related jobs though can use label iceweasel-24 and fail as often as needed without interrupting the build pipeline used for production.

To illustrate this here we have slave selenium-client2 providing Iceweasel v17 with selenium-server v2.40. When triggering the production selenium job it will get executed on selenium-client2, because that’s the slave providing the requested labels:

Whereas the development selenium job can point to the slaves providing Iceweasel v24, so it will be executed on slave selenium-client1 here:

This setup allowed us to work on the selenium Ruby tests without interfering with any production build pipeline. At the time of writing we’ve already finished the migration to support Iceweasel v24 and the infrastructure is ready for further Iceweasel and selenium-server upgrades.

Full-Crypto setup with GRUB2

February 28th, 2014

Update on 2014-03-03: quoting Colin Watson from the comments:

Note that this is spelled GRUB_ENABLE_CRYPTODISK=y in GRUB 2.02 betas (matching the 2.00 documentation though not the implementation; not sure why Andrey chose to go with the docs).

Since several people asked me how to get such a setup and it’s poorly documented (as in: I found it in the GRUB sources) I decided to blog about this. When using GRUB >=2.00-22 (as of February 2014 available in Debian/jessie and Debian/unstable) it’s possible to boot from a full-crypto setup (this doesn’t mean it’s recommended, but it worked fine in my test setups so far). This means not even an unencrypted /boot partition is needed.

Before running the grub-install commands, execute these steps (inside the system/chroot of course; adjust GRUB_PRELOAD_MODULES for your setup as needed, I’ve used it in a setup with SW-RAID/LVM):

# echo GRUB_CRYPTODISK_ENABLE=y >> /etc/default/grub
# echo 'GRUB_PRELOAD_MODULES="lvm cryptodisk mdraid1x"' >> /etc/default/grub
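Appending with echo adds a duplicate line every time the command is repeated. A minimal sketch of an idempotent alternative follows. set_default_option is a hypothetical helper name, the file path is passed in so it can be tried on a scratch copy first, and it assumes plain KEY=value lines as found in /etc/default/grub.

```shell
#!/bin/sh
# Sketch: set KEY=VALUE in a shell-style defaults file without duplicating lines.
# set_default_option is a hypothetical helper, not part of GRUB.
set_default_option() {
  file="$1" ; key="$2" ; value="$3"
  [ -f "$file" ] || return 1
  tmp="$(mktemp)" || return 1
  # Drop any existing assignment of the key, keep everything else.
  # grep exits non-zero when no line is left over, which is fine here.
  grep -v "^${key}=" "$file" > "$tmp" || true
  printf '%s=%s\n' "$key" "$value" >> "$tmp"
  # Write back through cat so the original file keeps its owner and mode.
  cat "$tmp" > "$file"
  rm -f "$tmp"
}

# Try it on a scratch file rather than the real /etc/default/grub:
demo="$(mktemp)"
echo 'GRUB_DEFAULT=0' > "$demo"
set_default_option "$demo" GRUB_CRYPTODISK_ENABLE y
set_default_option "$demo" GRUB_CRYPTODISK_ENABLE y   # second run adds nothing
set_default_option "$demo" GRUB_PRELOAD_MODULES '"lvm cryptodisk mdraid1x"'
cat "$demo"
```

The same helper would work with the GRUB_ENABLE_CRYPTODISK spelling mentioned in the update above.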

This will result in the following dialog before getting to GRUB’s bootsplash:

State of the art Debian/wheezy deployments with GRUB and LVM/SW-RAID/Crypto

February 28th, 2014

Having moved from Lilo to GRUB and to LVM as the default over the last years, it was time to evaluate how well LVM works without a separate boot partition, possibly also on top of Software RAID. Big disks call for partitioning with GPT, but UEFI isn’t my default yet, so I’m still defaulting to Legacy BIOS for Debian/wheezy (I expect this to change with Debian/jessie and the corresponding hardware arriving at my customers).

So what we have and want in this demonstration setup:

  • Debian 7 AKA wheezy
  • 4 hard-disks with Software RAID (on 8GB RAM), using GPT partitioning + GRUB2
  • using state-of-the-art features without too many workarounds like a separate /boot partition outside of LVM or mdadm with 0.9 metadata, just no (U)EFI yet
  • LVM on top of SW-RAID (RAID5) for /boot partition [SW-RAID->LVM]
  • Cryptsetup-LUKS on top of LVM on top of SW-RAID (RAID5) for data [SW-RAID->LVM->Crypto] (this gives us more flexibility about crypto yes/no and different cryptsetup options for the LVs compared to using it below RAID/LVM)
  • Rescue-system integration via grml-rescueboot (not limited to Grml, but Grml should work out-of-the-box)

System used for installation:

root@grml ~ # grml-version
grml64-full 2013.09 Release Codename Hefeknuddler [2013-09-27]

Partition setup:

root@grml ~ # parted /dev/sda
GNU Parted 2.3
Using /dev/sda
Welcome to GNU Parted! Type 'help' to view a list of commands.
(parted) mklabel gpt
(parted) mkpart primary 2048s 4095s
(parted) set 1 bios_grub on
(parted) name 1 "BIOS Boot Partition"
(parted) mkpart primary 4096s 100%
(parted) set 2 raid on
(parted) name 2 "SW-RAID / Linux"
(parted) quit
Information: You may need to update /etc/fstab.

Clone partition layout from sda to all the other disks:

root@grml ~ # for f in {b,c,d} ; do sgdisk -R=/dev/sd$f /dev/sda ; done
The operation has completed successfully.
The operation has completed successfully.
The operation has completed successfully.

Make sure each disk has its unique UUID:

root@grml ~ # for f in {b,c,d} ; do sgdisk -G /dev/sd$f ; done
The operation has completed successfully.
The operation has completed successfully.
The operation has completed successfully.

SW-RAID setup:

root@grml ~ # mdadm --create /dev/md0 --verbose --level=raid5 --raid-devices=4 /dev/sd{a,b,c,d}2
mdadm: layout defaults to left-symmetric
mdadm: layout defaults to left-symmetric
mdadm: chunk size defaults to 512K
mdadm: size set to 1465004544K
mdadm: Defaulting to version 1.2 metadata
mdadm: array /dev/md0 started.
root@grml ~ #

SW-RAID speedup (system dependent, YMMV):

root@grml ~ # cat /sys/block/md0/md/stripe_cache_size
256
root@grml ~ # echo 16384 > /sys/block/md0/md/stripe_cache_size # number of cache entries, not bytes
root@grml ~ # blockdev --getra /dev/md0
6144
root@grml ~ # blockdev --setra 65536 /dev/md0 # 32 MB
root@grml ~ # sysctl dev.raid.speed_limit_max
dev.raid.speed_limit_max = 200000
root@grml ~ # sysctl -w dev.raid.speed_limit_max=9999999999
dev.raid.speed_limit_max = 9999999999
root@grml ~ # sysctl dev.raid.speed_limit_min
dev.raid.speed_limit_min = 1000
root@grml ~ # sysctl -w dev.raid.speed_limit_min=100000
dev.raid.speed_limit_min = 100000
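
The units of these knobs differ, so a quick conversion helps when picking values. The sketch below assumes 512-byte sectors for the blockdev read-ahead, 4 KiB pages, and the formula commonly cited from the kernel's md documentation for the stripe cache (page size x number of devices x stripe_cache_size). The stripe cache value is a count of entries, not a byte size. The helper names are made up for this example.

```shell
#!/bin/sh
# Sketch: convert the tuning values above into MiB.
# Helper names are hypothetical; page size and sector size are assumptions.

# blockdev --setra counts 512-byte sectors
readahead_mib() {
  echo $(( $1 * 512 / 1024 / 1024 ))
}

# stripe cache memory = page_size * nr_devices * stripe_cache_size
stripe_cache_mib() {
  entries="$1" ; devices="$2" ; page_size="${3:-4096}"
  echo $(( page_size * devices * entries / 1024 / 1024 ))
}

echo "read-ahead 65536 sectors: $(readahead_mib 65536) MiB"
echo "stripe cache 16384 entries on 4 disks: $(stripe_cache_mib 16384 4) MiB"
echo "stripe cache 256 entries on 4 disks: $(stripe_cache_mib 256 4) MiB"
```

Under these assumptions the raised stripe cache costs a noticeable share of the 8GB RAM in this box, which is worth keeping in mind on smaller systems.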

LVM setup:

root@grml ~ # pvcreate /dev/md0
  Physical volume "/dev/md0" successfully created
root@grml ~ # vgcreate homesrv /dev/md0
  Volume group "homesrv" successfully created
root@grml ~ # lvcreate -n rootfs -L4G homesrv
  Logical volume "rootfs" created
root@grml ~ # lvcreate -n bootfs -L1G homesrv
  Logical volume "bootfs" created

Check partition setup + alignment:

root@grml ~ # parted -s /dev/sda print
Model: ATA WDC WD15EADS-00P (scsi)
Disk /dev/sda: 1500GB
Sector size (logical/physical): 512B/512B
Partition Table: gpt

Number  Start   End     Size    File system  Name                 Flags
 1      1049kB  2097kB  1049kB               BIOS Boot Partition  bios_grub
 2      2097kB  1500GB  1500GB               SW-RAID / Linux      raid

root@grml ~ # parted -s /dev/sda unit s print
Model: ATA WDC WD15EADS-00P (scsi)
Disk /dev/sda: 2930277168s
Sector size (logical/physical): 512B/512B
Partition Table: gpt

Number  Start  End          Size         File system  Name                 Flags
 1      2048s  4095s        2048s                     BIOS Boot Partition  bios_grub
 2      4096s  2930276351s  2930272256s               SW-RAID / Linux      raid

root@grml ~ # gdisk -l /dev/sda
GPT fdisk (gdisk) version 0.8.5

Partition table scan:
  MBR: protective
  BSD: not present
  APM: not present
  GPT: present

Found valid GPT with protective MBR; using GPT.
Disk /dev/sda: 2930277168 sectors, 1.4 TiB
Logical sector size: 512 bytes
Disk identifier (GUID): 212E463A-A4E3-428B-B7E5-8D5785141564
Partition table holds up to 128 entries
First usable sector is 34, last usable sector is 2930277134
Partitions will be aligned on 2048-sector boundaries
Total free space is 2797 sectors (1.4 MiB)

Number  Start (sector)    End (sector)  Size       Code  Name
   1            2048            4095   1024.0 KiB  EF02  BIOS Boot Partition
   2            4096      2930276351   1.4 TiB     FD00  SW-RAID / Linux

root@grml ~ # mdadm -E /dev/sda2
/dev/sda2:
          Magic : a92b4efc
        Version : 1.2
    Feature Map : 0x0
     Array UUID : 7ed3b741:0774d529:d5a71c1f:cf942f0a
           Name : grml:0  (local to host grml)
  Creation Time : Fri Jan 31 15:26:12 2014
     Raid Level : raid5
   Raid Devices : 4

 Avail Dev Size : 2928060416 (1396.21 GiB 1499.17 GB)
     Array Size : 4392089088 (4188.62 GiB 4497.50 GB)
  Used Dev Size : 2928059392 (1396.21 GiB 1499.17 GB)
    Data Offset : 262144 sectors
   Super Offset : 8 sectors
          State : clean
    Device UUID : c1e4213b:81822fd1:260df456:2c9926fb

    Update Time : Mon Feb  3 09:41:48 2014
       Checksum : b8af8f6 - correct
         Events : 72

         Layout : left-symmetric
     chunk size : 512k

   Device Role : Active device 0
   Array State : AAAA ('A' == active, '.' == missing)

root@grml ~ # pvs -o +pe_start
  PV         VG      Fmt  Attr PSize PFree 1st PE
  /dev/md0   homesrv lvm2 a--  4.09t 4.09t   1.50m
root@grml ~ # pvs --units s -o +pe_start
  PV         VG      Fmt  Attr PSize       PFree       1st PE
  /dev/md0   homesrv lvm2 a--  8784175104S 8773689344S   3072S
root@grml ~ # vgs -o +pe_start
  VG      #PV #LV #SN Attr   VSize VFree 1st PE
  homesrv   1   2   0 wz--n- 4.09t 4.09t   1.50m
root@grml ~ # vgs --units s -o +pe_start
  VG      #PV #LV #SN Attr   VSize       VFree       1st PE
  homesrv   1   2   0 wz--n- 8784175104S 8773689344S   3072S
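
The offsets in the output above can be checked for alignment with a bit of arithmetic. This sketch assumes 512-byte sectors and uses the 512K chunk size reported by mdadm; is_aligned is a hypothetical helper name.

```shell
#!/bin/sh
# Sketch: check whether an offset given in 512-byte sectors is a multiple
# of a boundary given in KiB. is_aligned is a hypothetical helper.
is_aligned() {
  sectors="$1" ; boundary_kib="$2"
  bytes=$(( sectors * 512 ))
  if [ $(( bytes % (boundary_kib * 1024) )) -eq 0 ] ; then
    echo "aligned"
  else
    echo "misaligned"
  fi
}

echo "partition 1 start (2048s) vs 1 MiB:     $(is_aligned 2048 1024)"
echo "partition 2 start (4096s) vs 1 MiB:     $(is_aligned 4096 1024)"
echo "mdadm data offset (262144s) vs 512 KiB: $(is_aligned 262144 512)"
echo "LVM first PE (3072s) vs 512 KiB chunk:  $(is_aligned 3072 512)"
# 3 data disks x 512 KiB chunk = 1536 KiB per full stripe
echo "LVM first PE (3072s) vs full stripe:    $(is_aligned 3072 1536)"
echo "GPT first usable sector (34s) vs 1 MiB: $(is_aligned 34 1024)"
```

The last line is there only as a counterexample: starting a partition at the first usable GPT sector would not be aligned.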

Cryptsetup:

root@grml ~ # echo cryptsetup >> /etc/debootstrap/packages
root@grml ~ # cryptsetup luksFormat -c aes-xts-plain64 -s 256 /dev/mapper/homesrv-rootfs

WARNING!
========
This will overwrite data on /dev/mapper/homesrv-rootfs irrevocably.

Are you sure? (Type uppercase yes): YES
Enter passphrase:
Verify passphrase:
root@grml ~ # cryptsetup luksOpen /dev/mapper/homesrv-rootfs cryptorootfs
Enter passphrase for /dev/mapper/homesrv-rootfs:

Filesystems:

root@grml ~ # mkfs.ext4 /dev/mapper/cryptorootfs
mke2fs 1.42.8 (20-Jun-2013)
Filesystem label=
OS type: Linux
Block size=4096 (log=2)
Fragment size=4096 (log=2)
Stride=128 blocks, Stripe width=384 blocks
262144 inodes, 1048192 blocks
52409 blocks (5.00%) reserved for the super user
First data block=0
Maximum filesystem blocks=1073741824
32 block groups
32768 blocks per group, 32768 fragments per group
8192 inodes per group
Superblock backups stored on blocks:
        32768, 98304, 163840, 229376, 294912, 819200, 884736

Allocating group tables: done
Writing inode tables: done
Creating journal (16384 blocks): done
Writing superblocks and filesystem accounting information: done

root@grml ~ # mkfs.ext4 /dev/mapper/homesrv-bootfs
mke2fs 1.42.8 (20-Jun-2013)
Filesystem label=
OS type: Linux
Block size=4096 (log=2)
Fragment size=4096 (log=2)
Stride=128 blocks, Stripe width=384 blocks
65536 inodes, 262144 blocks
13107 blocks (5.00%) reserved for the super user
First data block=0
Maximum filesystem blocks=268435456
8 block groups
32768 blocks per group, 32768 fragments per group
8192 inodes per group
Superblock backups stored on blocks:
        32768, 98304, 163840, 229376

Allocating group tables: done
Writing inode tables: done
Creating journal (8192 blocks): done
Writing superblocks and filesystem accounting information: done
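
mke2fs picked up the RAID geometry on its own here. The reported stride and stripe width can be reproduced from the chunk size, the filesystem block size and the number of data disks (RAID5 over four disks leaves three for data). A minimal sketch, with made-up helper names:

```shell
#!/bin/sh
# Sketch: reproduce the ext4 stride / stripe width reported by mke2fs.
# Helper names are hypothetical.

# stride = RAID chunk size / filesystem block size (both in KiB)
ext4_stride() {
  echo $(( $1 / $2 ))
}

# stripe width = stride * number of data-bearing disks
# (for RAID5 that is the number of disks minus one)
ext4_stripe_width() {
  chunk_kib="$1" ; block_kib="$2" ; raid_disks="$3"
  echo $(( (chunk_kib / block_kib) * (raid_disks - 1) ))
}

echo "stride:       $(ext4_stride 512 4) blocks"
echo "stripe width: $(ext4_stripe_width 512 4 4) blocks"
```

If the values are not detected automatically, they can usually be passed to mke2fs as extended options; check the mke2fs man page of your version for the exact spelling.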

Install Debian/wheezy:

root@grml ~ # mount /dev/mapper/cryptorootfs /media
root@grml ~ # mkdir /media/boot
root@grml ~ # mount /dev/mapper/homesrv-bootfs /media/boot
root@grml ~ # grml-debootstrap --target /media --password YOUR_PASSWORD --hostname YOUR_HOSTNAME
 * grml-debootstrap [0.57] - Please recheck configuration before execution:

   Target:          /media
   Install grub:    no
   Using release:   wheezy
   Using hostname:  YOUR_HOSTNAME
   Using mirror:    http://http.debian.net/debian
   Using arch:      amd64

   Important! Continuing will delete all data from /media!

 * Is this ok for you? [y/N] y
[...]

Enable grml-rescueboot (to have easy access to rescue ISO via GRUB):

root@grml ~ # mkdir /media/boot/grml
root@grml ~ # wget -O /media/boot/grml/grml64-full_$(date +%Y.%m.%d).iso http://daily.grml.org/grml64-full_testing/latest/grml64-full_testing_latest.iso
root@grml ~ # grml-chroot /media apt-get -y install grml-rescueboot

[NOTE: We’re installing a daily ISO for grml-rescueboot here because the 2013.09 Grml release doesn’t work for this LVM/SW-RAID setup while newer ISOs are working fine already. The upcoming Grml stable release is supposed to work just fine, so you will be able to choose http://download.grml.org/grml64-full_2014.XX.iso by then. :)]

Install GRUB on all disks and adjust crypttab, fstab + initramfs:

root@grml ~ # grml-chroot /media /bin/bash
(grml)root@grml:/# for f in {a,b,c,d} ; do grub-install /dev/sd$f ; done
(grml)root@grml:/# update-grub
(grml)root@grml:/# echo "cryptorootfs /dev/mapper/homesrv-rootfs none luks" > /etc/crypttab
(grml)root@grml:/# echo "/dev/mapper/cryptorootfs / auto defaults,errors=remount-ro 0   1" > /etc/fstab
(grml)root@grml:/# echo "/dev/mapper/homesrv-bootfs /boot auto defaults 0 0" >> /etc/fstab
(grml)root@grml:/# update-initramfs -k all -u
(grml)root@grml:/# exit

Clean unmount/removal before reboot:

root@grml ~ # umount /media/boot
root@grml ~ # umount /media/
root@grml ~ # cryptsetup luksClose cryptorootfs
root@grml ~ # dmsetup remove homesrv-bootfs
root@grml ~ # dmsetup remove homesrv-rootfs

NOTE: On a previous hardware installation I had to install GRUB 2.00-22 from Debian/unstable to get GRUB working.
Some metadata from different mdadm and LVM experiments seems to have been left behind and confused GRUB 1.99-27+deb7u2 from Debian/wheezy (I wasn’t able to reproduce this issue in my VM demo/test setup).
In case you run into the following error message, try GRUB >=2.00-22:

  # grub-install --recheck /dev/sda
  error: unknown LVM metadata header.
  error: unknown LVM metadata header.
  /usr/sbin/grub-probe: error: cannot find a GRUB drive for /dev/mapper/cryptorootfs.  Check your device.map.
  Auto-detection of a filesystem of /dev/mapper/cryptorootfs failed.
  Try with --recheck.
  If the problem persists please report this together with the output of "/usr/sbin/grub-probe --device-map="/boot/grub/device.map"
  --target=fs -v /boot/grub" to <bug-grub@gnu.org>

Revisiting 2013

January 8th, 2014

2013 was a fantastic year for me. Following the real-life fork it was the first year of my daughter’s life. Whenever I heard people saying “oh, children are growing sooooo fast” in the past it felt like a lie to me, but watching my own child grow I can only confirm it, and it’s fantastic.

It was also the first year in our new home, and I’m not just happy with the building itself but also our neighborhood turned out to be great. New friends for drinking beer. :)

Also my business year was kind of special. jenkins-debian-glue has quite taken off and I had several interesting consulting gigs thanks to it. Business wise I learned a lot of stuff, especially related to distributor handling thanks to business around Grml-Forensic.

2014 has already started out interesting, with several things in the pipeline; more details about that at a later stage.

Conclusion: I tend to call 2013 the best year of my life so far.

Event: mur.at Streitgespräche (debates) on December 5th + 6th, 2013 at the ESC

November 27th, 2013

Copy/paste from mur.at:

mur.at Streitgespräche (debates) on December 5th and 6th, 2013 – 7 pm

Cheep, cheep, cheep, we all love each other? Not a chance! Mur.at wants to revive the power of the good old debate, the cultivated exchange of opinions, the bitter controversy, the discussion thought out loud, the evening exchange of ideas, the friendly battle of opinions, etc.!

That’s why, for the first time, we are hosting a potential dispute between two adversaries on each of two consecutive evenings, with the opportunity for the audience to take part. The venue is the new ESC (esc.mur.at), Bürgergasse 5, Palais Trauttmansdorff, Graz.

Topic 1: Between data excess and storage diet: are you already living in the cloud, or how many external hard disks do you have?

Thursday, December 5th, 2013, 7 pm

In times of one’s own excessive need to store data and of simultaneous, ubiquitous data surveillance by state and business, the invited “Streithansln” (squabblers) discuss different approaches and measures for dealing with one’s own data. Facing each other: Karl Voit and Heinz Wittenbrink.

Topic 2: A/symmetric Internet expansion – how fast should the Internet of the future become?

Friday, December 6th, 2013, 7 pm

In many households the inexpensive ADSL Internet connection (read: lots of download, little upload) rules, and only a few professional users afford the far more expensive fast fiber-optic line. Michael “Mika” Prokop and Josef “Seppo” Gründler argue about the in/direct political and practical consequences of this policy of expanding and rebuilding the information superhighway.

Event: Infracoders Graz meeting on October 9th, 2013

October 3rd, 2013

Edmund Haselwanter invited me to talk about “Continuous Integration im Rechenzentrum” (“Continuous Integration in the data center”) at the first meeting of Infracoders Graz. I’m happy to accept this invitation and, based on my talk at OSDC, will speak about topics such as Continuous Integration/Delivery, Jenkins, jenkins-debian-glue, Puppet and mcollective.

  • When: Wednesday, October 9th, 2013, 7:00 pm
  • Where: BYTEPOETS GmbH / Münzgrabenstraße 92/2/3, Graz

Admission is free; beer and pizza are sponsored by BYTEPOETS. Further details are available on Meetup.

How to get grub-reboot working™

May 10th, 2013

So while testing Proxmox VE 3.0 RC1 I needed to reboot the system into a kernel version other than the default one in the GRUB bootloader. “lilo -R …” worked fine in the past, but with GRUB it’s not as obvious at first sight how to get the equivalent. I remembered having had problems with grub-reboot in the past already, or to quote a friend of mine: “has grub-reboot worked ever?”

Well yes, grub-reboot works – but only once you’re aware of the fact that you need to manually edit /etc/default/grub. :( It’s actually documented at wiki.debian.org/GrubReboot, but not in the man page/info document of grub-reboot itself (great idea to provide a separate wiki page for this issue but not consider editing the official documentation instead, not).

So here you go:

# grep GRUB_DEFAULT /etc/default/grub 
GRUB_DEFAULT=0
# sed -i 's/^GRUB_DEFAULT.*/GRUB_DEFAULT=saved/' /etc/default/grub
# grep GRUB_DEFAULT /etc/default/grub 
GRUB_DEFAULT=saved

# update-grub
[...]
# grep '^menuentry' /boot/grub/grub.cfg
menuentry 'Debian GNU/Linux, with Linux 3.2.0-4-amd64' --class debian --class gnu-linux --class gnu --class os {
menuentry 'Debian GNU/Linux, with Linux 3.2.0-4-amd64 (recovery mode)' --class debian --class gnu-linux --class gnu --class os {
menuentry 'Debian GNU/Linux, with Linux 2.6.32-20-pve' --class debian --class gnu-linux --class gnu --class os {
menuentry 'Debian GNU/Linux, with Linux 2.6.32-20-pve (recovery mode)' --class debian --class gnu-linux --class gnu --class os {
menuentry 'Debian GNU/Linux, with Linux 2.6.32-5-amd64' --class debian --class gnu-linux --class gnu --class os {
menuentry 'Debian GNU/Linux, with Linux 2.6.32-5-amd64 (recovery mode)' --class debian --class gnu-linux --class gnu --class os {

# grub-reboot 2  # to boot the third entry, the command writes to /boot/grub/grubenv
# reboot
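
Counting entries by eye is error-prone, so here is a small sketch that numbers the top-level menuentry lines in the order the grep above lists them. list_menuentries is a hypothetical helper; it assumes single-quoted titles and no submenus (with submenus the numbering works differently).

```shell
#!/bin/sh
# Sketch: print top-level menu entries of a grub.cfg with their zero-based index.
# list_menuentries is a hypothetical helper, not a GRUB tool.
list_menuentries() {
  # Split on single quotes: field 2 is the entry title.
  awk -F"'" '/^menuentry /{ printf "%d: %s\n", n++, $2 }' "$1"
}

# Demo on a scratch file instead of the real /boot/grub/grub.cfg:
cfg="$(mktemp)"
cat > "$cfg" <<'EOF'
menuentry 'Debian GNU/Linux, with Linux 3.2.0-4-amd64' --class debian {
menuentry 'Debian GNU/Linux, with Linux 3.2.0-4-amd64 (recovery mode)' --class debian {
menuentry 'Debian GNU/Linux, with Linux 2.6.32-20-pve' --class debian {
EOF
list_menuentries "$cfg"
```

The number in front of the wanted kernel is the argument to pass to grub-reboot.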

FTR: Filed as #707695.