State of the art Debian/wheezy deployments with GRUB and LVM/SW-RAID/Crypto
Moving from Lilo to GRUB, using LVM as default, etc throughout the last years it was time to evaluate how well LVM works without a separate boot partition, possibly also on top of Software RAID. Big disks are asking for partitioning with GPT, just UEFI isn’t my default yet, so I’m still defaulting to Legacy BIOS for Debian/wheezy (I expect this to change for Debian/jessie and according hardware approaching at my customers).
So what we have and want in this demonstration setup:
- Debian 7 AKA wheezy
- 4 hard-disks with Software RAID (on 8GB RAM), using GPT partitioning + GRUB2
- using state-of-the-art features without too much workarounds like separate /boot partition outside of LVM or mdadm with 0.9 metadata, just no (U)EFI yet
- LVM on top of SW-RAID (RAID5) for /boot partition [SW-RAID->LVM]
- Cryptsetup-LUKS on top of LVM on top of SW-RAID (RAID5) for data [SW-RAID->LVM->Crypto] (this gives us more flexibility about crypto yes/no and different cryptsetup options for the LVs compared to using it below RAID/LVM)
- Rescue-system intregration via grml-rescueboot (not limited to Grml, but Grml should work out-of-the-box)
System used for installation:
root@grml ~ # grml-version grml64-full 2013.09 Release Codename Hefeknuddler [2013-09-27]
Partition setup:
root@grml ~ # parted /dev/sda GNU Parted 2.3 Using /dev/sda Welcome to GNU Parted! Type 'help' to view a list of commands. (parted) mklabel gpt (parted) mkpart primary 2048s 4095s (parted) set 1 bios_grub on (parted) name 1 "BIOS Boot Partition" (parted) mkpart primary 4096s 100% (parted) set 2 raid on (parted) name 2 "SW-RAID / Linux" (parted) quit Information: You may need to update /etc/fstab.
Clone partition layout from sda to all the other disks:
root@grml ~ # for f in {b,c,d} ; sgdisk -R=/dev/sd$f /dev/sda The operation has completed successfully. The operation has completed successfully. The operation has completed successfully.
Make sure each disk has its unique UUID:
root@grml ~ # for f in {b,c,d} ; sgdisk -G /dev/sd$f The operation has completed successfully. The operation has completed successfully. The operation has completed successfully.
SW-RAID setup:
root@grml ~ # mdadm --create /dev/md0 --verbose --level=raid5 --raid-devices=4 /dev/sd{a,b,c,d}2 mdadm: layout defaults to left-symmetric mdadm: layout defaults to left-symmetric mdadm: chunk size defaults to 512K mdadm: size set to 1465004544K mdadm: Defaulting to version 1.2 metadata mdadm: array /dev/md0 started. root@grml ~ #
SW-RAID speedup (system dependent, YMMV):
root@grml ~ # cat /sys/block/md0/md/stripe_cache_size 256 root@grml ~ # echo 16384 > /sys/block/md0/md/stripe_cache_size # 16MB root@grml ~ # blockdev --getra /dev/md0 6144 root@grml ~ # blockdev --setra 65536 /dev/md0 # 32 MB root@grml ~ # sysctl dev.raid.speed_limit_max dev.raid.speed_limit_max = 200000 root@grml ~ # sysctl -w dev.raid.speed_limit_max=9999999999 dev.raid.speed_limit_max = 9999999999 root@grml ~ # sysctl dev.raid.speed_limit_min dev.raid.speed_limit_min = 1000 root@grml ~ # sysctl -w dev.raid.speed_limit_min=100000 dev.raid.speed_limit_min = 100000
LVM setup:
root@grml ~ # pvcreate /dev/md0 Physical volume "/dev/md0" successfully created root@grml ~ # vgcreate homesrv /dev/md0 Volume group "homesrv" successfully created root@grml ~ # lvcreate -n rootfs -L4G homesrv Logical volume "rootfs" created root@grml ~ # lvcreate -n bootfs -L1G homesrv Logical volume "bootfs" created
Check partition setup + alignment:
root@grml ~ # parted -s /dev/sda print Model: ATA WDC WD15EADS-00P (scsi) Disk /dev/sda: 1500GB Sector size (logical/physical): 512B/512B Partition Table: gpt Number Start End Size File system Name Flags 1 1049kB 2097kB 1049kB BIOS Boot Partition bios_grub 2 2097kB 1500GB 1500GB SW-RAID / Linux raid root@grml ~ # parted -s /dev/sda unit s print Model: ATA WDC WD15EADS-00P (scsi) Disk /dev/sda: 2930277168s Sector size (logical/physical): 512B/512B Partition Table: gpt Number Start End Size File system Name Flags 1 2048s 4095s 2048s BIOS Boot Partition bios_grub 2 4096s 2930276351s 2930272256s SW-RAID / Linux raid root@grml ~ # gdisk -l /dev/sda GPT fdisk (gdisk) version 0.8.5 Partition table scan: MBR: protective BSD: not present APM: not present GPT: present Found valid GPT with protective MBR; using GPT. Disk /dev/sda: 2930277168 sectors, 1.4 TiB Logical sector size: 512 bytes Disk identifier (GUID): 212E463A-A4E3-428B-B7E5-8D5785141564 Partition table holds up to 128 entries First usable sector is 34, last usable sector is 2930277134 Partitions will be aligned on 2048-sector boundaries Total free space is 2797 sectors (1.4 MiB) Number Start (sector) End (sector) Size Code Name 1 2048 4095 1024.0 KiB EF02 BIOS Boot Partition 2 4096 2930276351 1.4 TiB FD00 SW-RAID / Linux root@grml ~ # mdadm -E /dev/sda2 /dev/sda3: Magic : a92b4efc Version : 1.2 Feature Map : 0x0 Array UUID : 7ed3b741:0774d529:d5a71c1f:cf942f0a Name : grml:0 (local to host grml) Creation Time : Fri Jan 31 15:26:12 2014 Raid Level : raid5 Raid Devices : 4 Avail Dev Size : 2928060416 (1396.21 GiB 1499.17 GB) Array Size : 4392089088 (4188.62 GiB 4497.50 GB) Used Dev Size : 2928059392 (1396.21 GiB 1499.17 GB) Data Offset : 262144 sectors Super Offset : 8 sectors State : clean Device UUID : c1e4213b:81822fd1:260df456:2c9926fb Update Time : Mon Feb 3 09:41:48 2014 Checksum : b8af8f6 - correct Events : 72 Layout : left-symmetric chunk size : 512k Device Role : Active device 0 Array State : AAAA ('A' == active, '.' == missing) root@grml ~ # pvs -o +pe_start PV VG Fmt Attr PSize PFree 1st PE /dev/md0 homesrv lvm2 a-- 4.09t 4.09t 1.50m root@grml ~ # pvs --units s -o +pe_start PV VG Fmt Attr PSize PFree 1st PE /dev/md0 homesrv lvm2 a-- 8784175104S 8773689344S 3072S root@grml ~ # pvs -o +pe_start PV VG Fmt Attr PSize PFree 1st PE /dev/md0 homesrv lvm2 a-- 4.09t 4.09t 1.50m root@grml ~ # pvs --units s -o +pe_start PV VG Fmt Attr PSize PFree 1st PE /dev/md0 homesrv lvm2 a-- 8784175104S 8773689344S 3072S root@grml ~ # vgs -o +pe_start VG #PV #LV #SN Attr VSize VFree 1st PE homesrv 1 2 0 wz--n- 4.09t 4.09t 1.50m root@grml ~ # vgs --units s -o +pe_start VG #PV #LV #SN Attr VSize VFree 1st PE homesrv 1 2 0 wz--n- 8784175104S 8773689344S 3072S
Cryptsetup:
root@grml ~ # echo cryptsetup >> /etc/debootstrap/packages root@grml ~ # cryptsetup luksFormat -c aes-xts-plain64 -s 256 /dev/mapper/homesrv-rootfs WARNING! ======== This will overwrite data on /dev/mapper/homesrv-rootfs irrevocably. Are you sure? (Type uppercase yes): YES Enter passphrase: Verify passphrase: root@grml ~ # cryptsetup luksOpen /dev/mapper/homesrv-rootfs cryptorootfs Enter passphrase for /dev/mapper/homesrv-rootfs:
Filesystems:
root@grml ~ # mkfs.ext4 /dev/mapper/cryptorootfs mke2fs 1.42.8 (20-Jun-2013) Filesystem label= OS type: Linux Block size=4096 (log=2) Fragment size=4096 (log=2) Stride=128 blocks, Stripe width=384 blocks 262144 inodes, 1048192 blocks 52409 blocks (5.00%) reserved for the super user First data block=0 Maximum filesystem blocks=1073741824 32 block groups 32768 blocks per group, 32768 fragments per group 8192 inodes per group Superblock backups stored on blocks: 32768, 98304, 163840, 229376, 294912, 819200, 884736 Allocating group tables: done Writing inode tables: done Creating journal (16384 blocks): done Writing superblocks and filesystem accounting information: done root@grml ~ # mkfs.ext4 /dev/mapper/homesrv-bootfs mke2fs 1.42.8 (20-Jun-2013) Filesystem label= OS type: Linux Block size=4096 (log=2) Fragment size=4096 (log=2) Stride=128 blocks, Stripe width=384 blocks 65536 inodes, 262144 blocks 13107 blocks (5.00%) reserved for the super user First data block=0 Maximum filesystem blocks=268435456 8 block groups 32768 blocks per group, 32768 fragments per group 8192 inodes per group Superblock backups stored on blocks: 32768, 98304, 163840, 229376 Allocating group tables: done Writing inode tables: done Creating journal (8192 blocks): done Writing superblocks and filesystem accounting information: done
Install Debian/wheezy:
root@grml ~ # mount /dev/mapper/cryptorootfs /media root@grml ~ # mkdir /media/boot root@grml ~ # mount /dev/mapper/homesrv-bootfs /media/boot root@grml ~ # grml-debootstrap --target /media --password YOUR_PASSWORD --hostname YOUR_HOSTNAME * grml-debootstrap [0.57] - Please recheck configuration before execution: Target: /media Install grub: no Using release: wheezy Using hostname: YOUR_HOSTNAME Using mirror: http://http.debian.net/debian Using arch: amd64 Important! Continuing will delete all data from /media! * Is this ok for you? [y/N] y [...]
Enable grml-rescueboot (to have easy access to rescue ISO via GRUB):
root@grml ~ # mkdir /media/boot/grml root@grml ~ # wget -O /media/boot/grml/grml64-full_$(date +%Y.%m.%d).iso http://daily.grml.org/grml64-full_testing/latest/grml64-full_testing_latest.iso root@grml ~ # grml-chroot /media apt-get -y install grml-rescueboot
[NOTE: We’re installing a daily ISO for grml-rescueboot here because the 2013.09 Grml release doesn’t work for this LVM/SW-RAID setup while newer ISOs are working fine already. The upcoming Grml stable release is supposed to work just fine, so you will be able to choose http://download.grml.org/grml64-full_2014.XX.iso by then. :)]
Install GRUB on all disks and adjust crypttab, fstab + initramfs:
root@grml ~ # grml-chroot /media /bin/bash (grml)root@grml:/# for f in {a,b,c,d} ; do grub-install /dev/sd$f ; done (grml)root@grml:/# update-grub (grml)root@grml:/# echo "cryptorootfs /dev/mapper/homesrv-rootfs none luks" > /etc/crypttab (grml)root@grml:/# echo "/dev/mapper/cryptorootfs / auto defaults,errors=remount-ro 0 1" > /etc/fstab (grml)root@grml:/# echo "/dev/mapper/homesrv-bootfs /boot auto defaults 0 0" >> /etc/fstab (grml)root@grml:/# update-initramfs -k all -u (grml)root@grml:/# exit
Clean unmounted/removal for reboot:
root@grml ~ # umount /media/boot root@grml ~ # umount /media/ root@grml ~ # cryptsetup luksClose cryptorootfs root@grml ~ # dmsetup remove homesrv-bootfs root@grml ~ # dmsetup remove homesrv-rootfs
NOTE: On a previous hardware installation I had to install GRUB 2.00-22 from Debian/unstable to get GRUB working.
Some metadata from different mdadm and LVM experiments seems to have been left and confused GRUB 1.99-27+deb7u2 from Debian/wheezy (I wasn’t able to reproduce this issue in my VM demo/test setup).
Just in cause you might experience the following error message, try GRUB >=2.00-22:
# grub-install --recheck /dev/sda error: unknown LVM metadata header. error: unknown LVM metadata header. /usr/sbin/grub-probe: error: cannot find a GRUB drive for /dev/mapper/cryptorootfs. Check your device.map. Auto-detection of a filesystem of /dev/mapper/cryptorootfs failed. Try with --recheck. If the problem persists please report this together with the output of "/usr/sbin/grub-probe --device-map="/boot/grub/device.map" --target=fs -v /boot/grub" to <bug-grub@gnu.org>