From: Marius Bakke
Date: Wed, 8 Jul 2020 12:11:18 +0200
Subject: [bug#42261] [PATCH] website: Add draft of a Ganeti cluster post.

* website/drafts/ganeti-cluster-on-guix.md: New file.
title: Running a Ganeti cluster on Guix
date: 2020-07-10 12:00
author: Marius Bakke
tags: Virtualization, Ganeti
---
The latest addition to Guix's ever-growing list of services is a little-known
virtualization toolkit called [Ganeti](http://www.ganeti.org/). Ganeti is
designed to keep virtual machines running on a cluster of servers even in the
event of hardware failures, and to make maintenance and recovery tasks easy.

It is comparable to tools such as
[Proxmox](https://www.proxmox.com/en/proxmox-ve) or
[oVirt](https://www.ovirt.org/), but has some distinctive features. One is
that there is no GUI: [third](https://github.com/osuosl/ganeti_webmgr)
[party](https://github.com/grnet/ganetimgr)
[ones](https://github.com/sipgate/ganeti-control-center) exist, but are not
currently packaged in Guix, so you are left with a rich command-line client
and a fully featured
[remote API](http://docs.ganeti.org/ganeti/master/html/rapi.html).

Another interesting feature is that installing Ganeti on its own leaves you
no way to actually deploy any virtual machines. That probably sounds crazy,
but it stems from the fact that Ganeti is designed to be API-driven and
automated, so it comes with an
[OS API](http://docs.ganeti.org/ganeti/master/html/man-ganeti-os-interface.html)
and users need to install one or more *OS providers* in addition to Ganeti.
OS providers offer a declarative way to deploy virtual machine variants and
should feel natural to Guix users.
At the time of writing, the providers
available in Guix are [debootstrap](https://github.com/ganeti/instance-debootstrap)
for provisioning Debian- and Ubuntu-based VMs, and of course a
[Guix](https://github.com/mbakke/ganeti-instance-guix) provider.

Finally, Ganeti comes with a sophisticated scheduler that efficiently packs
virtual machines across a cluster while maintaining N+1 redundancy in case
of a failover scenario. It can also make informed scheduling decisions
based on various cluster tags, such as ensuring primary and secondary nodes
are on different power distribution lines.

(Note: if you are looking for a way to run just a few virtual machines on
your local computer, you are probably better off using
[libvirt](https://guix.gnu.org/manual/en/guix.html#index-libvirt) or even
a [Childhurd](https://guix.gnu.org/manual/devel/en/guix.html#index-hurd_002dvm_002dservice_002dtype),
as Ganeti is fairly heavyweight and requires a complicated networking
setup.)


# Preparing the configuration

With introductions out of the way, let's see how we can deploy a Ganeti
cluster using Guix. For this tutorial we will create a two-node cluster
and connect instances to the local network using an
[Open vSwitch](https://www.openvswitch.org/) bridge with no VLANs. We assume
that each node has a single network interface named `eth0` connected to the
same network, and that a dedicated partition `/dev/sdz3` is available for
virtual machine storage. It is possible to store VMs on a number of other
storage backends, but a dedicated drive (or rather LVM volume group) is
necessary to use the [DRBD](https://www.linbit.com/drbd/) integration to
replicate VM disks.

We'll start off by defining a few helper services to create the Open vSwitch
bridge and ensure the physical network interface is in the "up" state.
Since
Open vSwitch stores the configuration in a database, you could just as well
run the equivalent `ovs-vsctl` commands on the host once and be done with it,
but we do it through the configuration system to ensure we don't forget it in
the future when adding or reinstalling nodes.

```
(define (start-interface if)
  #~(let ((ip (string-append #$iproute "/sbin/ip")))
      (invoke/quiet ip "link" "set" #$if "up")))

(define (stop-interface if)
  #~(let ((ip (string-append #$iproute "/sbin/ip")))
      (invoke/quiet ip "link" "set" #$if "down")))

;; This service is necessary to ensure eth0 is in the "up" state on boot
;; since it is otherwise unmanaged from Guix's point of view.
(define (ifup-service if)
  (let ((name (string-append "ifup-" if)))
    (simple-service name shepherd-root-service-type
                    (list (shepherd-service
                           (provision (list (string->symbol name)))
                           (start #~(lambda ()
                                      #$(start-interface if)))
                           (stop #~(lambda ()
                                     #$(stop-interface if)))
                           (respawn? #f))))))

;; Note: the keyword argument is spelled `vlan-mode' throughout, whereas
;; "vlan_mode" inside the strings is the Open vSwitch column name.
(define* (create-openvswitch-bridge bridge uplink
                                    #:key (vlan-mode #f))
  #~(let ((ovs-vsctl (lambda (cmd)
                       (apply invoke/quiet
                              #$(file-append openvswitch "/bin/ovs-vsctl")
                              (string-tokenize cmd)))))
      (and (ovs-vsctl (string-append "--may-exist add-br " #$bridge))
           (ovs-vsctl (string-append "--may-exist add-port " #$bridge " "
                                     #$uplink
                                     (if #$vlan-mode
                                         (format #f " vlan_mode=~a " #$vlan-mode)
                                         ""))))))

(define* (create-openvswitch-internal-port bridge port
                                           #:key (vlan-mode #f))
  #~(invoke/quiet #$(file-append openvswitch "/bin/ovs-vsctl")
                  "--may-exist" "add-port" #$bridge #$port
                  (if #$vlan-mode
                      (string-append "vlan_mode=" #$vlan-mode)
                      "")
                  "--" "set" "Interface" #$port "type=internal"))

(define %openvswitch-configuration-service
  (simple-service 'openvswitch-configuration shepherd-root-service-type
                  (list (shepherd-service
                         (provision '(openvswitch-configuration))
                         (requirement '(vswitchd))
                         (start #~(lambda ()
                                    #$(create-openvswitch-bridge
                                       "br0" "eth0"
                                       #:vlan-mode "native-untagged")
                                    #$(create-openvswitch-internal-port
                                       "br0" "gnt0"
                                       #:vlan-mode "native-untagged")))
                         (respawn? #f)))))
```

This defines an `openvswitch-configuration` service object that creates a
logical switch `br0`, connects `eth0` as the "uplink", and creates a logical
port `gnt0` that we will use later as the main network interface for this
system. We also create an `ifup` service that can bring network interfaces
up and down. By themselves these variables do nothing; we also have to add
them to our `operating-system` configuration below.

A configuration like this might be suitable for a small home network. In most
"real world" deployments you would use tagged VLANs, and maybe a traditional
Linux bridge instead of Open vSwitch. You can also forgo bridging altogether
with a `routed` networking setup, or do any combination of the three.

With this in place, we can start creating the `operating-system` configuration
that we will use for the Ganeti servers:

```
(operating-system
  (host-name "node1")
  [...]
  ;; Ganeti requires that each node and the cluster address resolve to an
  ;; IP address.  The easiest way to achieve this is by adding everything
  ;; to the hosts file.
  (hosts-file (plain-file "hosts" (format #f "\
127.0.0.1       localhost
::1             localhost

192.168.1.101   node1
192.168.1.102   node2
192.168.1.254   ganeti.lan
")))
  (kernel-arguments
   (append %default-kernel-arguments
           '(;; Disable DRBD's usermode helper, as Ganeti
             ;; is the only thing that should manage DRBD.
             "drbd.usermode_helper=/run/current-system/profile/bin/true")))

  (packages (append (map specification->package
                         '("qemu" "drbd-utils" "lvm2"
                           "ganeti-instance-guix"
                           "ganeti-instance-debootstrap"))
                    %base-packages))

  (services (cons* (service ganeti-service-type
                            (ganeti-configuration
                             (file-storage-paths '("/srv/ganeti/file-storage"))
                             (os
                              (list (ganeti-os
                                     (name "debootstrap")
                                     (variants
                                      (list (debootstrap-variant
                                             "buster"
                                             (debootstrap-configuration
                                              (hooks
                                               (local-file
                                                "debootstrap-hooks"
                                                #:recursive? #t))))
                                            (debootstrap-variant
                                             "testing+contrib"
                                             (debootstrap-configuration
                                              (suite "testing")
                                              (components '("main" "contrib")))))))))))

                   ;; Create a static IP on the "gnt0" Open vSwitch interface.
                   (service openvswitch-service-type)
                   %openvswitch-configuration-service
                   (ifup-service "eth0")
                   (static-networking-service "gnt0" "192.168.1.101"
                                              #:netmask "255.255.255.0"
                                              #:gateway "192.168.1.1"
                                              #:requirement '(openvswitch-configuration)
                                              #:name-servers '("192.168.1.1"))

                   ;; Ganeti needs SSH to communicate between nodes.
                   (service openssh-service-type
                            (openssh-configuration
                             (permit-root-login 'without-password)))
                   %base-services)))
```

Debootstrap variants rely on a set of scripts (known as "hooks") in the
installation process to do things like configure networking, install a
bootloader, create users, etc. In the example above, the "buster" variant will
use a local directory next to the configuration file named "debootstrap-hooks"
(it is copied into the final system closure), whereas the "testing+contrib"
variant has no hooks defined and will use
`/etc/ganeti/instance-debootstrap/hooks` if it exists.

Ganeti veterans may be surprised that each OS variant has its own hooks. All
Ganeti clusters I know of use a single set of hooks for all variants, sometimes
with additional logic inside the script based on the variant.
Guix offers a
powerful abstraction that makes it trivial to create per-variant hooks, obsoleting
the need for a big `/etc/ganeti/instance-debootstrap/hooks` directory. Of course
you can still create it using `extra-special-file` and leave the `hooks` property
of the variants as `#f`.

Not all Ganeti options are exposed in the configuration system yet. If you
find it limiting, you can add custom files using `extra-special-file`, or
ideally extend the `ganeti-configuration` data type to suit your needs.
You can also use `gnt-cluster copyfile` and `gnt-cluster command`
to distribute files or run executables, but beware that undeclared changes
in `/etc` may be lost on the next reboot or reconfigure.


# Initializing a cluster

At this stage, you should run `guix system reconfigure` with the new
configuration on all nodes that will participate in the cluster. If you
do this over SSH or with
[guix deploy](https://guix.gnu.org/blog/2019/managing-servers-with-gnu-guix-a-tutorial/),
beware that `eth0` will lose network connectivity once it is "plugged in to"
the virtual switch, and you need to add any IP configuration to `gnt0`.

The Guix configuration system does not currently support declaring LVM
volume groups, so we will create these manually on each node. We could
write our own declarative configuration like the `ifup-service`, but for
brevity and safety reasons we'll do it "by hand":

```
pvcreate /dev/sdz3
vgcreate ganetivg /dev/sdz3
```

On the node that will act as the "master node", run the init command:

```
gnt-cluster init \
    --master-netdev=gnt0 \
    --vg-name=ganetivg \
    --enabled-disk-templates=file,plain,drbd \
    --drbd-usermode-helper=/run/current-system/profile/bin/true \
    --enabled-hypervisors=kvm \
    --no-etc-hosts \
    --no-ssh-init \
    ganeti.lan
```

If you are okay with Ganeti taking control over SSH `authorized_keys` and
`known_hosts`, remove the `--no-ssh-init` option.
Guix users might prefer
to manage the relevant files using `openssh-configuration`. All nodes in
the cluster must be able to reach each other over SSH as the root user.

Similarly, Ganeti can update the `/etc/hosts` file when nodes are added or
removed, but that makes little sense on Guix as the file is recreated on
every reboot.

If all goes well, the command returns no output and you should have the
`ganeti.lan` IP address visible on `gnt0`. You can run `gnt-cluster verify`
to check that the cluster is in good shape. Most likely it complains about
something:

```
# TODO
```

Use `gnt-cluster modify` to change the running state of the cluster:

```
gnt-cluster modify -H kvm:kernel_path=
```

The command above removes the warning about the default KVM kernel being
missing, making `gnt-cluster verify` happy. For this tutorial we only use
fully virtualized instances, but users might want to set `kernel_path` to a
suitable VM kernel.

Now let's add our other machine to the cluster:

```
gnt-node add node2
```

Ganeti will log into the node, copy the cluster configuration and start the
relevant Shepherd services. No output means the command succeeded. Run
`gnt-cluster verify` again to check that everything is in order:

```
gnt-cluster verify
```

If you get warnings about SSH authorizations here, you should fix those
before proceeding. If you used `--no-ssh-init` earlier you may need to
update `/var/lib/ganeti/known_hosts` with the new node information, either
with `gnt-cluster copyfile` or by adding it to the OS configuration.

The above configuration will make three operating systems available:

```
# gnt-os list
Name
guix
debootstrap+buster
debootstrap+testing+contrib
```

Let's try them out. But first we'll make Ganeti aware of our network
so it can choose a static IP for the virtual machines.

```
# gnt-network add --network=192.168.1.0/24 --gateway=192.168.1.1 lan
# gnt-network connect -N mode=openvswitch,link=br0 lan
```

Now we can add an instance:

```
gnt-instance add --no-name-check --no-ip-check -o debootstrap+buster \
    -t drbd --disk 0:size=5G -B memory=256m,vcpus=2 \
    --net 0:network=lan,ip=pool bustervm1
```

Ganeti will automatically select the optimal primary and secondary node
for this VM based on available cluster resources. You can manually
specify primary and secondary nodes with the `-n` and `-s` options.

By default Ganeti assumes that the new instance is already configured in DNS,
so we need `--no-name-check` and `--no-ip-check` to bypass some sanity tests.

Try adding another instance, now using the Guix OS provider:

```
gnt-instance add --no-name-check --no-ip-check -o guix \
    -t plain --disk 0:size=5G -B memory=1G,vcpus=4 \
    --net 0:network=lan,ip=pool guix1
```

The Guix OS has a built-in configuration that starts an SSH server, authorizes
the host's SSH key, and configures static networking based on information from
Ganeti. It is possible to specify a custom configuration file, and even a
specific Guix commit:

```
gnt-instance add --no-name-check --no-ip-check -o guix \
    -t file --file-storage-dir=/srv/ganeti/file-storage \
    --disk 0:size=20G -B memory=4G,vcpus=3 \
    --net 0:network=lan,ip=pool \
    -O "config=$(base64 /the/config/file.scm),commit=" \
    custom-guix
```

That's it for this tutorial! If you are new to Ganeti, you should
familiarize yourself with the `gnt-` family of commands. Fun stuff to
do includes `gnt-instance migrate` to move VMs between hosts,
`gnt-node evacuate` to migrate _all_ VMs off a node, and
`gnt-cluster master-failover` to move the master role to a different node.
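
The `gnt-instance add` invocations above share most of their options, so they
could be wrapped in a small shell function. The following is only a sketch:
`add_instance` and the `DRY_RUN` variable are hypothetical names that are not
part of Ganeti, and the options are simply the ones used in this tutorial
(including the `lan` network defined earlier). With `DRY_RUN=1`, the default
here, the function prints the command instead of running it, so it can be
inspected before touching the cluster.

```shell
#!/bin/sh
# Hypothetical helper, not part of Ganeti: wraps the `gnt-instance add'
# invocation used in this tutorial.  With DRY_RUN=1 (the default) the
# command is only printed, not executed.
DRY_RUN="${DRY_RUN:-1}"

add_instance() {
    # Usage: add_instance NAME OS DISK_TEMPLATE DISK_SIZE MEMORY VCPUS
    if [ "$#" -ne 6 ]; then
        echo "usage: add_instance NAME OS TEMPLATE SIZE MEMORY VCPUS" >&2
        return 1
    fi
    name="$1" os="$2" template="$3" size="$4" memory="$5" vcpus="$6"

    # Rebuild the positional parameters as the full command line, using the
    # same options as the examples in this tutorial.
    set -- gnt-instance add --no-name-check --no-ip-check \
        -o "$os" -t "$template" --disk "0:size=$size" \
        -B "memory=$memory,vcpus=$vcpus" \
        --net 0:network=lan,ip=pool "$name"

    if [ "$DRY_RUN" = 1 ]; then
        echo "$@"          # print the command only
    else
        "$@"               # run it (must be done on the master node)
    fi
}
```

For example, `add_instance bustervm1 debootstrap+buster drbd 5G 256m 2` prints
the first `gnt-instance add` command from this section; setting `DRY_RUN=0`
would run it instead.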


# Final remarks

Like most services in Guix, Ganeti comes with a
[system test](https://guix.gnu.org/blog/2016/guixsd-system-tests/)
that [runs in a VM](FIXME) and ensures that things like initializing a cluster
work. The continuous integration system
[runs this automatically](https://ci.guix.gnu.org/search?query=ganeti), and
users can run it locally with `make check-system TESTS=ganeti`. Such
tests give us confidence that both the package and configuration system work,
and allow rapid testing of the configuration API. Currently it does little
more than `gnt-cluster verify`, but it can be extended to provision a real
cluster inside Ganeti and try things like live migration.

The author had a lot of fun creating
[native data types](FIXME manual link)
in the Guix configuration system for the Ganeti OS specification. The API
went through at least three major revisions during the writing of this blog
post. There is still room for improvement, but I decided I had to stop
tweaking it and instead focus on shipping the thing. Feedback welcome!

Having OS support in the configuration system lets us benefit from Guix's
provenance tracking and we can easily `guix system roll-back` any breaking
changes. Ganeti is usually coupled with tools such as Puppet or SaltStack to
keep things in sync between nodes, but that should not be necessary here.

So far only the `KVM` hypervisor has been tested. If you use LXC or Xen with
Ganeti, please reach out to `guix-devel@gnu.org` and share your experience.

#### About GNU Guix

[GNU Guix](https://guix.gnu.org) is a transactional package
manager and an advanced distribution of the GNU system that [respects
user
freedom](https://www.gnu.org/distros/free-system-distribution-guidelines.html).
Guix can be used on top of any system running the kernel Linux, or it
can be used as a standalone operating system distribution for i686,
x86_64, ARMv7, and AArch64 machines.

In addition to standard package management features, Guix supports
transactional upgrades and roll-backs, unprivileged package management,
per-user profiles, and garbage collection. When used as a standalone
GNU/Linux distribution, Guix offers a declarative, stateless approach to
operating system configuration management. Guix is highly customizable
and hackable through [Guile](https://www.gnu.org/software/guile)
programming interfaces and extensions to the
[Scheme](http://schemers.org) language.