
State of the art Debian/wheezy deployments with GRUB and LVM/SW-RAID/Crypto

Having moved from LILO to GRUB and made LVM my default over the last years, it was time to evaluate how well LVM works without a separate boot partition, possibly also on top of Software RAID. Big disks call for GPT partitioning, but UEFI isn’t my default yet, so I’m still defaulting to Legacy BIOS for Debian/wheezy (I expect this to change with Debian/jessie and the corresponding hardware arriving at my customers).

So what we have and want in this demonstration setup:

  • Debian 7 AKA wheezy
  • 4 hard-disks with Software RAID (on 8GB RAM), using GPT partitioning + GRUB2
  • using state-of-the-art features without too many workarounds like a separate /boot partition outside of LVM or mdadm with 0.9 metadata, just no (U)EFI yet
  • LVM on top of SW-RAID (RAID5) for /boot partition [SW-RAID->LVM]
  • Cryptsetup-LUKS on top of LVM on top of SW-RAID (RAID5) for data [SW-RAID->LVM->Crypto] (this gives us more flexibility about crypto yes/no and different cryptsetup options for the LVs compared to using it below RAID/LVM)
  • Rescue-system integration via grml-rescueboot (not limited to Grml, but Grml should work out-of-the-box)

System used for installation:

root@grml ~ # grml-version
grml64-full 2013.09 Release Codename Hefeknuddler [2013-09-27]

Partition setup:

root@grml ~ # parted /dev/sda
GNU Parted 2.3
Using /dev/sda
Welcome to GNU Parted! Type 'help' to view a list of commands.
(parted) mklabel gpt
(parted) mkpart primary 2048s 4095s
(parted) set 1 bios_grub on
(parted) name 1 "BIOS Boot Partition"
(parted) mkpart primary 4096s 100%
(parted) set 2 raid on
(parted) name 2 "SW-RAID / Linux"
(parted) quit
Information: You may need to update /etc/fstab.

Clone partition layout from sda to all the other disks:

root@grml ~ # for f in {b,c,d} ; do sgdisk -R=/dev/sd$f /dev/sda ; done
The operation has completed successfully.
The operation has completed successfully.
The operation has completed successfully.

Make sure each disk has its unique UUID:

root@grml ~ # for f in {b,c,d} ; do sgdisk -G /dev/sd$f ; done
The operation has completed successfully.
The operation has completed successfully.
The operation has completed successfully.

SW-RAID setup:

root@grml ~ # mdadm --create /dev/md0 --verbose --level=raid5 --raid-devices=4 /dev/sd{a,b,c,d}2
mdadm: layout defaults to left-symmetric
mdadm: layout defaults to left-symmetric
mdadm: chunk size defaults to 512K
mdadm: size set to 1465004544K
mdadm: Defaulting to version 1.2 metadata
mdadm: array /dev/md0 started.
root@grml ~ #

SW-RAID speedup (system dependent, YMMV):

root@grml ~ # cat /sys/block/md0/md/stripe_cache_size
256
root@grml ~ # echo 16384 > /sys/block/md0/md/stripe_cache_size # cache entries (pages per member disk), not KiB
root@grml ~ # blockdev --getra /dev/md0
6144
root@grml ~ # blockdev --setra 65536 /dev/md0 # 32 MB
root@grml ~ # sysctl dev.raid.speed_limit_max
dev.raid.speed_limit_max = 200000
root@grml ~ # sysctl -w dev.raid.speed_limit_max=9999999999
dev.raid.speed_limit_max = 9999999999
root@grml ~ # sysctl dev.raid.speed_limit_min
dev.raid.speed_limit_min = 1000
root@grml ~ # sysctl -w dev.raid.speed_limit_min=100000
dev.raid.speed_limit_min = 100000
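
To get a feeling for what these values cost in RAM, here is a minimal sketch. It assumes that stripe_cache_size counts cache entries, so that memory use is roughly page size x member disks x entries (which is how I read the kernel's md documentation), and that blockdev read-ahead is given in 512-byte sectors. The function names and the 4096-byte default page size are my own choices for illustration.

```shell
#!/bin/sh
# Estimate RAM used by the md stripe cache: entries x member disks x page size.
# The page size defaults to 4096 bytes (an assumption; check `getconf PAGESIZE`).
stripe_cache_bytes() {
    entries=$1; disks=$2; page_size=${3:-4096}
    echo $(( entries * disks * page_size ))
}

# Convert a blockdev read-ahead value (512-byte sectors) to bytes.
readahead_bytes() {
    echo $(( $1 * 512 ))
}

echo "stripe cache: $(( $(stripe_cache_bytes 16384 4) / 1024 / 1024 )) MiB"
echo "read-ahead:   $(( $(readahead_bytes 65536) / 1024 / 1024 )) MiB"
```

With the four disks used here this works out to 256 MiB for the stripe cache and 32 MiB of read-ahead, which is fine on 8GB RAM but worth recalculating on smaller systems.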

LVM setup:

root@grml ~ # pvcreate /dev/md0
  Physical volume "/dev/md0" successfully created
root@grml ~ # vgcreate homesrv /dev/md0
  Volume group "homesrv" successfully created
root@grml ~ # lvcreate -n rootfs -L4G homesrv
  Logical volume "rootfs" created
root@grml ~ # lvcreate -n bootfs -L1G homesrv
  Logical volume "bootfs" created

Check partition setup + alignment:

root@grml ~ # parted -s /dev/sda print
Model: ATA WDC WD15EADS-00P (scsi)
Disk /dev/sda: 1500GB
Sector size (logical/physical): 512B/512B
Partition Table: gpt

Number  Start   End     Size    File system  Name                 Flags
 1      1049kB  2097kB  1049kB               BIOS Boot Partition  bios_grub
 2      2097kB  1500GB  1500GB               SW-RAID / Linux      raid

root@grml ~ # parted -s /dev/sda unit s print
Model: ATA WDC WD15EADS-00P (scsi)
Disk /dev/sda: 2930277168s
Sector size (logical/physical): 512B/512B
Partition Table: gpt

Number  Start  End          Size         File system  Name                 Flags
 1      2048s  4095s        2048s                     BIOS Boot Partition  bios_grub
 2      4096s  2930276351s  2930272256s               SW-RAID / Linux      raid

root@grml ~ # gdisk -l /dev/sda
GPT fdisk (gdisk) version 0.8.5

Partition table scan:
  MBR: protective
  BSD: not present
  APM: not present
  GPT: present

Found valid GPT with protective MBR; using GPT.
Disk /dev/sda: 2930277168 sectors, 1.4 TiB
Logical sector size: 512 bytes
Disk identifier (GUID): 212E463A-A4E3-428B-B7E5-8D5785141564
Partition table holds up to 128 entries
First usable sector is 34, last usable sector is 2930277134
Partitions will be aligned on 2048-sector boundaries
Total free space is 2797 sectors (1.4 MiB)

Number  Start (sector)    End (sector)  Size       Code  Name
   1            2048            4095   1024.0 KiB  EF02  BIOS Boot Partition
   2            4096      2930276351   1.4 TiB     FD00  SW-RAID / Linux
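
The start sectors reported above can also be checked mechanically. This is only a small sketch: is_aligned is a hypothetical helper name, and the 2048-sector (1 MiB) boundary is taken from the gdisk output.

```shell
#!/bin/sh
# Succeeds when a start sector is a multiple of the alignment (both in sectors).
is_aligned() {
    [ $(( $1 % $2 )) -eq 0 ]
}

# Start sectors from the partition table above; 2048 sectors = 1 MiB.
for start in 2048 4096; do
    if is_aligned "$start" 2048; then
        echo "sector $start: aligned to 1 MiB"
    else
        echo "sector $start: NOT aligned"
    fi
done
```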

root@grml ~ # mdadm -E /dev/sda2
/dev/sda2:
          Magic : a92b4efc
        Version : 1.2
    Feature Map : 0x0
     Array UUID : 7ed3b741:0774d529:d5a71c1f:cf942f0a
           Name : grml:0  (local to host grml)
  Creation Time : Fri Jan 31 15:26:12 2014
     Raid Level : raid5
   Raid Devices : 4

 Avail Dev Size : 2928060416 (1396.21 GiB 1499.17 GB)
     Array Size : 4392089088 (4188.62 GiB 4497.50 GB)
  Used Dev Size : 2928059392 (1396.21 GiB 1499.17 GB)
    Data Offset : 262144 sectors
   Super Offset : 8 sectors
          State : clean
    Device UUID : c1e4213b:81822fd1:260df456:2c9926fb

    Update Time : Mon Feb  3 09:41:48 2014
       Checksum : b8af8f6 - correct
         Events : 72

         Layout : left-symmetric
     chunk size : 512k

   Device Role : Active device 0
   Array State : AAAA ('A' == active, '.' == missing)
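
As a quick plausibility check of the numbers above: a RAID5 array offers the capacity of all members minus one, since one member's worth of space holds parity. The sketch below assumes that "Used Dev Size" is reported in 512-byte sectors and "Array Size" in KiB, which is consistent with the GiB figures mdadm prints next to them; raid5_array_kib is a made-up helper name.

```shell
#!/bin/sh
# RAID5 usable size in KiB = (member count - 1) x per-member size.
# The per-member size is given in 512-byte sectors, hence the division by 2.
raid5_array_kib() {
    devices=$1; used_dev_sectors=$2
    echo $(( (devices - 1) * used_dev_sectors / 2 ))
}

# Values from the mdadm -E output above: 4 devices, Used Dev Size 2928059392.
raid5_array_kib 4 2928059392
```

The result should match the "Array Size" line of the mdadm output.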

root@grml ~ # pvs -o +pe_start
  PV         VG      Fmt  Attr PSize PFree 1st PE
  /dev/md0   homesrv lvm2 a--  4.09t 4.09t   1.50m
root@grml ~ # pvs --units s -o +pe_start
  PV         VG      Fmt  Attr PSize       PFree       1st PE
  /dev/md0   homesrv lvm2 a--  8784175104S 8773689344S   3072S
root@grml ~ # vgs -o +pe_start
  VG      #PV #LV #SN Attr   VSize VFree 1st PE
  homesrv   1   2   0 wz--n- 4.09t 4.09t   1.50m
root@grml ~ # vgs --units s -o +pe_start
  VG      #PV #LV #SN Attr   VSize       VFree       1st PE
  homesrv   1   2   0 wz--n- 8784175104S 8773689344S   3072S
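
Why the 3072S for the first PE looks right to me: with a 512K chunk size and three data disks (four members minus parity) a full RAID5 stripe is 1536 KiB, which is exactly 3072 sectors. A minimal sketch of that arithmetic, with full_stripe_sectors being a hypothetical helper name:

```shell
#!/bin/sh
# Full RAID5 stripe in 512-byte sectors: chunk size (KiB) x data disks x 2.
full_stripe_sectors() {
    chunk_kib=$1; raid_devices=$2
    echo $(( chunk_kib * (raid_devices - 1) * 2 ))
}

stripe=$(full_stripe_sectors 512 4)
pe_start=3072   # "1st PE" from pvs --units s above
if [ $(( pe_start % stripe )) -eq 0 ]; then
    echo "pe_start ${pe_start} is a multiple of the ${stripe}-sector stripe"
else
    echo "pe_start ${pe_start} is NOT stripe aligned"
fi
```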

Cryptsetup:

root@grml ~ # echo cryptsetup >> /etc/debootstrap/packages
root@grml ~ # cryptsetup luksFormat -c aes-xts-plain64 -s 256 /dev/mapper/homesrv-rootfs

WARNING!
========
This will overwrite data on /dev/mapper/homesrv-rootfs irrevocably.

Are you sure? (Type uppercase yes): YES
Enter passphrase:
Verify passphrase:
root@grml ~ # cryptsetup luksOpen /dev/mapper/homesrv-rootfs cryptorootfs
Enter passphrase for /dev/mapper/homesrv-rootfs:

Filesystems:

root@grml ~ # mkfs.ext4 /dev/mapper/cryptorootfs
mke2fs 1.42.8 (20-Jun-2013)
Filesystem label=
OS type: Linux
Block size=4096 (log=2)
Fragment size=4096 (log=2)
Stride=128 blocks, Stripe width=384 blocks
262144 inodes, 1048192 blocks
52409 blocks (5.00%) reserved for the super user
First data block=0
Maximum filesystem blocks=1073741824
32 block groups
32768 blocks per group, 32768 fragments per group
8192 inodes per group
Superblock backups stored on blocks:
        32768, 98304, 163840, 229376, 294912, 819200, 884736

Allocating group tables: done
Writing inode tables: done
Creating journal (16384 blocks): done
Writing superblocks and filesystem accounting information: done

root@grml ~ # mkfs.ext4 /dev/mapper/homesrv-bootfs
mke2fs 1.42.8 (20-Jun-2013)
Filesystem label=
OS type: Linux
Block size=4096 (log=2)
Fragment size=4096 (log=2)
Stride=128 blocks, Stripe width=384 blocks
65536 inodes, 262144 blocks
13107 blocks (5.00%) reserved for the super user
First data block=0
Maximum filesystem blocks=268435456
8 block groups
32768 blocks per group, 32768 fragments per group
8192 inodes per group
Superblock backups stored on blocks:
        32768, 98304, 163840, 229376

Allocating group tables: done
Writing inode tables: done
Creating journal (8192 blocks): done
Writing superblocks and filesystem accounting information: done
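
mkfs.ext4 picked up the RAID geometry on its own here (Stride=128, Stripe width=384). In case the detection doesn't work on your system, the values can be derived by hand: stride is the RAID chunk size divided by the filesystem block size, and the stripe width is the stride times the number of data disks. A small sketch, where ext4_raid_geometry is my own helper name and the output is formatted so it could be handed to mkfs.ext4 -E:

```shell
#!/bin/sh
# stride = RAID chunk / filesystem block; stripe width = stride x data disks.
ext4_raid_geometry() {
    chunk_kib=$1; block_bytes=$2; data_disks=$3
    stride=$(( chunk_kib * 1024 / block_bytes ))
    echo "stride=${stride},stripe_width=$(( stride * data_disks ))"
}

# 512K chunks, 4096-byte blocks, 3 data disks (4-disk RAID5).
ext4_raid_geometry 512 4096 3
```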

Install Debian/wheezy:

root@grml ~ # mount /dev/mapper/cryptorootfs /media
root@grml ~ # mkdir /media/boot
root@grml ~ # mount /dev/mapper/homesrv-bootfs /media/boot
root@grml ~ # grml-debootstrap --target /media --password YOUR_PASSWORD --hostname YOUR_HOSTNAME
 * grml-debootstrap [0.57] - Please recheck configuration before execution:

   Target:          /media
   Install grub:    no
   Using release:   wheezy
   Using hostname:  YOUR_HOSTNAME
   Using mirror:    http://http.debian.net/debian
   Using arch:      amd64

   Important! Continuing will delete all data from /media!

 * Is this ok for you? [y/N] y
[...]

Enable grml-rescueboot (to have easy access to rescue ISO via GRUB):

root@grml ~ # mkdir /media/boot/grml
root@grml ~ # wget -O /media/boot/grml/grml64-full_$(date +%Y.%m.%d).iso http://daily.grml.org/grml64-full_testing/latest/grml64-full_testing_latest.iso
root@grml ~ # grml-chroot /media apt-get -y install grml-rescueboot

[NOTE: We’re installing a daily ISO for grml-rescueboot here because the 2013.09 Grml release doesn’t work for this LVM/SW-RAID setup while newer ISOs are working fine already. The upcoming Grml stable release is supposed to work just fine, so you will be able to choose http://download.grml.org/grml64-full_2014.XX.iso by then. :)]

Install GRUB on all disks and adjust crypttab, fstab + initramfs:

root@grml ~ # grml-chroot /media /bin/bash
(grml)root@grml:/# for f in {a,b,c,d} ; do grub-install /dev/sd$f ; done
(grml)root@grml:/# update-grub
(grml)root@grml:/# echo "cryptorootfs /dev/mapper/homesrv-rootfs none luks" > /etc/crypttab
(grml)root@grml:/# echo "/dev/mapper/cryptorootfs / auto defaults,errors=remount-ro 0   1" > /etc/fstab
(grml)root@grml:/# echo "/dev/mapper/homesrv-bootfs /boot auto defaults 0 0" >> /etc/fstab
(grml)root@grml:/# update-initramfs -k all -u
(grml)root@grml:/# exit
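
Before rebooting it doesn't hurt to double check what ended up in /etc/crypttab. The following is just a sketch with a hypothetical helper (crypttab_entries) that prints the target name and source device of each entry; the demo runs against a temporary file with the line written above, not against the real /etc/crypttab.

```shell
#!/bin/sh
# Print "target source" for each crypttab entry, skipping comments and blank lines.
crypttab_entries() {
    awk '!/^[ \t]*(#|$)/ { print $1, $2 }' "$1"
}

# Demo against a temporary file holding the crypttab line from this setup.
tmp=$(mktemp)
echo "cryptorootfs /dev/mapper/homesrv-rootfs none luks" > "$tmp"
crypttab_entries "$tmp"
rm -f "$tmp"
```

Inside the chroot you would call it as crypttab_entries /etc/crypttab and compare the target name with the root device listed in /etc/fstab.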

Clean unmount/removal before reboot:

root@grml ~ # umount /media/boot
root@grml ~ # umount /media/
root@grml ~ # cryptsetup luksClose cryptorootfs
root@grml ~ # dmsetup remove homesrv-bootfs
root@grml ~ # dmsetup remove homesrv-rootfs

NOTE: On a previous hardware installation I had to install GRUB 2.00-22 from Debian/unstable to get GRUB working.
Some metadata from different mdadm and LVM experiments seems to have been left behind and confused GRUB 1.99-27+deb7u2 from Debian/wheezy (I wasn’t able to reproduce this issue in my VM demo/test setup).
In case you run into the following error message, try GRUB >=2.00-22:

  # grub-install --recheck /dev/sda
  error: unknown LVM metadata header.
  error: unknown LVM metadata header.
  /usr/sbin/grub-probe: error: cannot find a GRUB drive for /dev/mapper/cryptorootfs.  Check your device.map.
  Auto-detection of a filesystem of /dev/mapper/cryptorootfs failed.
  Try with --recheck.
  If the problem persists please report this together with the output of "/usr/sbin/grub-probe --device-map="/boot/grub/device.map"
  --target=fs -v /boot/grub" to <bug-grub@gnu.org>
