mbox

[bug#36404,0/6] Add 'guix deploy'.

Message ID 87o92ianbj.fsf@sdf.lonestar.org
Headers show

Message

Jakob L. Kreuze June 27, 2019, 6:35 p.m. UTC
Hello, Guix!

This patch provides the basis for 'guix deploy', implementing what I've
referred to as the "simple case" in my progress reports: in-place
updates to machines (physical or virtual) whose name and IP address we
know well. Do note that these commits depend on Ludovic's implementation
of 'remote-eval'.[1]

There's certainly more to be done with this -- the GSoC period is far
from over, and I'm hoping to use that time to implement more complex
use-cases such as automatically provisioning virtual machines in the
cloud. I'm submitting a patch series now per the recommendation of my
mentors to break the project into a few chunks to submit over the
duration of the summer.

Quite a bit has changed since my last email about this.[2] For one,
GOOPS is no longer used. Machine declarations now look just like any
other sort of declaration in Guix.

#+BEGIN_SRC scheme
(use-modules (gnu) (guix))
(use-machine-modules ssh)
(use-service-modules networking ssh)
(use-package-modules bootloaders)

(define %system
  (operating-system
   (host-name "gnu-deployed")
   (timezone "Etc/UTC")
   (bootloader (bootloader-configuration
                (bootloader grub-bootloader)
                (target "/dev/vda")
                (terminal-outputs '(console))))
   (file-systems (cons (file-system
                        (mount-point "/")
                        (device "/dev/vda1")
                        (type "ext4"))
                       %base-file-systems))
   (services
    (append (list (service dhcp-client-service-type)
                  (service openssh-service-type
                           (openssh-configuration
                            (permit-root-login #t)
                            (allow-empty-passwords? #t))))
            %base-services))))

(list (machine
       (system %system)
       (environment 'managed-host)
       (configuration (machine-ssh-configuration
                       (host-name "localhost")
                       (identity "./id_rsa")
                       (port 2222)))))
#+END_SRC scheme

There are a number of other differences here as well. For one, the SSH
configuration now has an 'identity' field for specifying a private key
to use when authenticating with the host. Any key management scheme you
might have set up in '~/.ssh/config' will also work if the 'identity'
field is omitted.

The 'environment' field is where we declare how machines should be
provisioned. In this case, the only type of provisioning that's been
implemented is 'managed-host' -- the "simple case" of in-place updates
to a machine that's already running GuixSD. The parameters for
provisioning are given in the form of an environment-specific
configuration type. In the example, this is 'machine-ssh-configuration',
which describes how 'guix deploy' should make an SSH connection to the
machine. I'm sure you can imagine something along the lines of a
'machine-digitalocean-configuration', describing some parameters for a
droplet.

There are two things in this patch series that I'd like comments on in
particular.

First, I still haven't figured out the whole testing situation. The
tests, as of now, spin up a virtual machine, create a machine instance,
deploy that to the virtual machine, and then make assertions about
changes made to the system. These tests were originally in the system
test suite as they deal with virtual machines, but I've since moved it
into the normal Guix test suite because of how much needs to be done on
the host side -- I spent an absurd amount of time trying to fit a call
to 'deploy-machine' into a derivation that could be run by the system
test suite, but I just wasn't able to make it work. I'm hoping someone
will have thoughts about how we can test 'guix deploy'. Should we have
them disabled by default? Is there some way to implement them in the a
system test suite that I've overlooked? Should the tests be included at
all?

Second, I'd like some suggestions on how to go about the documentation.
I have a cursory description of how to invoke the command-line tool, and
an example of a deployment specification, but I'm wondering if the
documentation should be split up into multiple sections across the
manual -- especially if we're going to have multiple 'environment' types
with their own configuration records down the line.

I look forward to your comments.

Regards,
Jakob

[1]: https://lists.gnu.org/archive/html/guix-patches/2019-06/msg00201.html
[2]: https://lists.gnu.org/archive/html/guix-devel/2019-06/msg00078.html

David Thompson (1):
  Take another stab at this whole guix deploy thing.

Jakob L. Kreuze (5):
  ssh: Add 'identity' keyword to 'open-ssh-session'.
  gnu: Add machine type for deployment specifications.
  Export the (gnu machine) interface.
  Add 'guix deploy'.
  doc: Add section for 'guix deploy'.

 Makefile.am             |   4 +-
 doc/guix.texi           | 103 +++++++++
 gnu.scm                 |   8 +-
 gnu/local.mk            |   5 +-
 gnu/machine.scm         |  89 ++++++++
 gnu/machine/ssh.scm     | 355 +++++++++++++++++++++++++++++++
 guix/scripts/deploy.scm |  90 ++++++++
 guix/ssh.scm            |   3 +-
 tests/machine.scm       | 450 ++++++++++++++++++++++++++++++++++++++++
 9 files changed, 1103 insertions(+), 4 deletions(-)
 create mode 100644 gnu/machine.scm
 create mode 100644 gnu/machine/ssh.scm
 create mode 100644 guix/scripts/deploy.scm
 create mode 100644 tests/machine.scm

Comments

Thompson, David June 27, 2019, 8:05 p.m. UTC | #1
Hi Jakob,

On Thu, Jun 27, 2019 at 2:38 PM Jakob L. Kreuze
<zerodaysfordays@sdf.lonestar.org> wrote:
>
> Hello, Guix!
>
> This patch provides the basis for 'guix deploy', implementing what I've
> referred to as the "simple case" in my progress reports: in-place
> updates to machines (physical or virtual) whose name and IP address we
> know well.

First of all: Wooooooooooooooo!!!!!!!!!!!!!! This is a huge first step!

Second of all: Could you please squash these 5 commits into one
commit? No one needs to review my WIP code that uses GOOPS that later
gets dropped in one of your commits. :)

Thanks!

- Dave
Jakob L. Kreuze June 28, 2019, 1:34 p.m. UTC | #2
Hey Dave,

Thanks for the initial feedback. I squashed that first commit of yours
and used the opportunity to move the addition of 'deploy.scm' into the
"Add 'guix deploy' commit". Here's the cleaned up patch set.

Jakob L. Kreuze (5):
  ssh: Add 'identity' keyword to 'open-ssh-session'.
  gnu: Add machine type for deployment specifications.
  Add 'guix deploy'.
  Export the (gnu machine) interface.
  doc: Add section for 'guix deploy'.

 Makefile.am             |   4 +-
 doc/guix.texi           | 103 +++++++++
 gnu.scm                 |   8 +-
 gnu/local.mk            |   5 +-
 gnu/machine.scm         |  89 ++++++++
 gnu/machine/ssh.scm     | 355 +++++++++++++++++++++++++++++++
 guix/scripts/deploy.scm |  90 ++++++++
 guix/ssh.scm            |   3 +-
 tests/machine.scm       | 450 ++++++++++++++++++++++++++++++++++++++++
 9 files changed, 1103 insertions(+), 4 deletions(-)
 create mode 100644 gnu/machine.scm
 create mode 100644 gnu/machine/ssh.scm
 create mode 100644 guix/scripts/deploy.scm
 create mode 100644 tests/machine.scm
Christine Lemmer-Webber June 29, 2019, 2:37 p.m. UTC | #3
Jakob L. Kreuze writes:

> Hello, Guix!
>
> This patch provides the basis for 'guix deploy', implementing what I've
> referred to as the "simple case" in my progress reports: in-place
> updates to machines (physical or virtual) whose name and IP address we
> know well. Do note that these commits depend on Ludovic's implementation
> of 'remote-eval'.[1]

Horray!

> #+BEGIN_SRC scheme
> ;; [...]
> (list (machine
>        (system %system)
>        (environment 'managed-host)
>        (configuration (machine-ssh-configuration
>                        (host-name "localhost")
>                        (identity "./id_rsa")
>                        (port 2222)))))
> #+END_SRC scheme
>
> The 'environment' field is where we declare how machines should be
> provisioned. In this case, the only type of provisioning that's been
> implemented is 'managed-host' -- the "simple case" of in-place updates
> to a machine that's already running GuixSD. The parameters for
> provisioning are given in the form of an environment-specific
> configuration type. In the example, this is 'machine-ssh-configuration',
> which describes how 'guix deploy' should make an SSH connection to the
> machine. I'm sure you can imagine something along the lines of a
> 'machine-digitalocean-configuration', describing some parameters for a
> droplet.

In the future I think it would be good to make this extensible as well.
Dispatching on a symbol means that Guix must itself provide a fixed set
of possible environment types.  If we made this an extensible structure,
akin to services or something, we could allow for more flexibility in
the future.  Thoughts for the future, but not a blocker on this patch.

> There are two things in this patch series that I'd like comments on in
> particular.
>
> First, I still haven't figured out the whole testing situation. The
> tests, as of now, spin up a virtual machine, create a machine instance,
> deploy that to the virtual machine, and then make assertions about
> changes made to the system. These tests were originally in the system
> test suite as they deal with virtual machines, but I've since moved it
> into the normal Guix test suite because of how much needs to be done on
> the host side -- I spent an absurd amount of time trying to fit a call
> to 'deploy-machine' into a derivation that could be run by the system
> test suite, but I just wasn't able to make it work. I'm hoping someone
> will have thoughts about how we can test 'guix deploy'. Should we have
> them disabled by default? Is there some way to implement them in the a
> system test suite that I've overlooked? Should the tests be included at
> all?

Ludo, do you have comments?  I suspect this is up your area of expertise.

> I look forward to your comments.

Yes, now for me to look at the actual patches :)
Jakob L. Kreuze June 29, 2019, 11:42 p.m. UTC | #4
Hi, Chris!

Christopher Lemmer Webber <cwebber@dustycloud.org> writes:

> In the future I think it would be good to make this extensible as
> well. Dispatching on a symbol means that Guix must itself provide a
> fixed set of possible environment types. If we made this an extensible
> structure, akin to services or something, we could allow for more
> flexibility in the future. Thoughts for the future, but not a blocker
> on this patch.

+1. Initially, I thought the service types _were_ symbols, but I see now
that they're actually procedures. Thanks for pointing that out. I'll see
about implementing environment types similarly in my revised patch set,
since I think that's a change that we'd want to make before any other
environment types come into existence.

Regards,
Jakob
Ludovic Courtès July 1, 2019, 12:53 p.m. UTC | #5
Hi!

Christopher Lemmer Webber <cwebber@dustycloud.org> skribis:

> Jakob L. Kreuze writes:

[...]

>> There are two things in this patch series that I'd like comments on in
>> particular.
>>
>> First, I still haven't figured out the whole testing situation. The
>> tests, as of now, spin up a virtual machine, create a machine instance,
>> deploy that to the virtual machine, and then make assertions about
>> changes made to the system. These tests were originally in the system
>> test suite as they deal with virtual machines, but I've since moved it
>> into the normal Guix test suite because of how much needs to be done on
>> the host side -- I spent an absurd amount of time trying to fit a call
>> to 'deploy-machine' into a derivation that could be run by the system
>> test suite, but I just wasn't able to make it work. I'm hoping someone
>> will have thoughts about how we can test 'guix deploy'. Should we have
>> them disabled by default? Is there some way to implement them in the a
>> system test suite that I've overlooked? Should the tests be included at
>> all?
>
> Ludo, do you have comments?  I suspect this is up your area of expertise.

As Ricardo wrote, I think that’s too much work to do in “make check”.
Plus this would only run when a “host store” is available, as we can’t
reasonably build QEMU and everything in $builddir/test-tmp.

So I feel that the system test suite is a better fit, but I don’t fully
understand the limitations you hit, Jakob.

Do you still have a draft of a system test that you wrote and/or notes
about what went wrong?

Ludo’.
Jakob L. Kreuze July 2, 2019, 12:10 a.m. UTC | #6
Hi, Ludovic + Ricardo!

Ricardo Wurmus <rekado@elephly.net> writes:

> Building and running virtual machines as part of the tests
> seems expensive. Would it be feasible to mock the remote
> interactions?

I agree 100%. I've decoupled it from my patch series for now. We can
always add it back later when it's implemented in a less expensive way.

As for mocking -- I do like that idea, but that would only really be
testing that calls to 'deploy-machine' et al. don't fail rather than
ensuring that the implementation of 'guix deploy' does what it's
supposed to do. The current tests make assertions about changes to the
virtual machine.

Ludovic Courtès <ludo@gnu.org> writes:

> As Ricardo wrote, I think that’s too much work to do in “make check”.
> Plus this would only run when a “host store” is available, as we can’t
> reasonably build QEMU and everything in $builddir/test-tmp.
>
> So I feel that the system test suite is a better fit, but I don’t
> fully understand the limitations you hit, Jakob.
>
> Do you still have a draft of a system test that you wrote and/or notes
> about what went wrong?

Yep, I have an unsquashed commit history on my personal branch with all
renditions of the test suite. I can pull it out tomorrow and write a
detailed report on the issues I ran into.

Thanks for both of your comments!
Jakob L. Kreuze July 2, 2019, 12:14 a.m. UTC | #7
Huge thanks to everyone who commented on the first two renditions of this
patch series. Here's a summary of the changes I've incorporated:

% The 'environment' field of <machine> is now an instance of
  <environment-type> -- a record similar to <service-type>. See the manual
  page for an example of how this looks in a deployment specification.
% Deployment specifications are loaded in an environment with '(gnu)', '(gnu
  machine)', and '(gnu machine ssh)'. '(gnu machine)' and its descendant
  modules are no longer exported from '(gnu)'.
% Environment and load path excursions have been removed from the deployment
  internals for 'managed-host-environment-type'. 'remote-eval' spawns a new
  Guile REPL with each invocation, so modifications to $PATH et al. aren't
  really relevant -- at least not with how 'deploy-managed-host' is
  implemented.
% Wording in the manual section has been updated.
% The docstring for 'open-ssh-session' has been updated.
% Tests have been decoupled from the commit adding '(gnu machine)' and omitted
  from this patch series. I will add them back in a future patch.

Jakob L. Kreuze (4):
  ssh: Add 'identity' keyword to 'open-ssh-session'.
  gnu: Add machine type for deployment specifications.
  Add 'guix deploy'.
  doc: Add section for 'guix deploy'.

 Makefile.am             |   4 +-
 doc/guix.texi           | 101 ++++++++++++
 gnu/local.mk            |   5 +-
 gnu/machine.scm         | 118 +++++++++++++
 gnu/machine/ssh.scm     | 355 ++++++++++++++++++++++++++++++++++++++++
 guix/scripts/deploy.scm |  90 ++++++++++
 guix/ssh.scm            |  10 +-
 7 files changed, 677 insertions(+), 6 deletions(-)
 create mode 100644 gnu/machine.scm
 create mode 100644 gnu/machine/ssh.scm
 create mode 100644 guix/scripts/deploy.scm
Jakob L. Kreuze July 2, 2019, 10:14 p.m. UTC | #8
Hi Ludovic,

zerodaysfordays@sdf.lonestar.org (Jakob L. Kreuze) writes:

> Yep, I have an unsquashed commit history on my personal branch with
> all renditions of the test suite. I can pull it out tomorrow and write
> a detailed report on the issues I ran into.

So we begin as I did about a month ago with a very naïve test, ensuring
that we can create a 'machine' object. This isn't particularly hard to
pull off in the system test suite.

#+BEGIN_SRC scheme
(define (run-sshable-machine-test)
  (define os
    (marionette-operating-system
     (simple-operating-system
      (service dhcp-client-service-type)
      (service openssh-service-type
               (openssh-configuration
                (permit-root-login #t)
                (allow-empty-passwords? #t))))
     #:imported-modules '((gnu services herd)
                          (guix combinators))))

  (define vm
    (virtual-machine
     (operating-system os)
     (port-forwardings '((2222 . 22)))))

  (define test
    (with-extensions (list guile-bytestructures guile-gcrypt
                           guile-git guile-ssh guile-sqlite3 guix)
      (with-imported-modules '((gnu build marionette)
                               (gnu)
                               (gnu machine)
                               (gnu machine ssh)
                               (guix remote))
        #~(begin
            (use-modules (gnu build marionette)
                         (gnu)
                         (gnu machine)
                         (gnu machine ssh)
                         (srfi srfi-64))
            (use-service-modules networking ssh)

            (define %system
              (operating-system
                (host-name "gnu-deployed")
                (timezone "Etc/UTC")
                (bootloader (bootloader-configuration
                             (bootloader grub-bootloader)
                             (target "/dev/vda")
                             (terminal-outputs '(console))))
                (file-systems (cons (file-system
                                      (mount-point "/")
                                      (device "/dev/vda1")
                                      (type "ext4"))
                                    %base-file-systems))
                (services
                 (append (list (service dhcp-client-service-type)
                               (service openssh-service-type
                                        (openssh-configuration
                                         (permit-root-login #t)
                                         (allow-empty-passwords? #t))))
                         %base-services))))


            (define %machine
              (machine
               (system %system)
               (environment managed-host-environment-type)
               (configuration (machine-ssh-configuration
                               (host-name "localhost")
                               (port 2222)))))

            (define marionette
              (make-marionette (list #$vm)))

            (mkdir #$output)
            (chdir #$output)

            (test-begin "remote-eval")

            (test-assert "machine instance was created"
              %machine)

            (test-end)
            (exit (= (test-runner-fail-count (test-runner-current)) 0))))))

  (gexp->derivation "sshable-machine" test))

(define %test-sshable-machine
  (system-test
   (name "sshable-machine")
   (description "Create a machine object")
   (value (run-sshable-machine-test))))
#+END_SRC

For onlookers unfamiliar with the system test suite, this is mostly
boilerplate. The important code begins at 'define %system' and ends at
'test-end'.

Wonderful! We've ensured that we can import '(gnu machine)', and that we
can create instances of 'machine'. Where to now? How about testing
'remote-eval'? (This snippet requires more changes to the surrounding
code to work. If you need a reproducible version, let me know.)

#+BEGIN_SRC scheme
(test-assert "can invoke machine-remote-eval"
  (with-store store
    (run-with-store store
      (machine-remote-eval %machine #~#t))))
#+END_SRC

Alas, this doesn't work in the context of a derivation.

#+BEGIN_SRC scheme
(srfi-34 #<condition &store-connection-error [file: "/var/guix/daemon-socket/socket" errno: 2] 101e0c0>)
#+END_SRC

This is around when I began to pester you on IRC with questions that I
realize are kind of silly now. In general, system tests can't use the
store. The only workaround that I'm aware of is 'gnu/tests/install.scm',
which makes use of the store monad to perform store operations before
running the test. For example:

#+BEGIN_SRC scheme
(define %test-iso-image-installer
  (system-test
   (name "iso-image-installer")
   (description
    "")
   (value
    (mlet* %store-monad ((image   (run-install
                                   %minimal-os-on-vda
                                   %minimal-os-on-vda-source
                                   #:script
                                   %simple-installation-script-for-/dev/vda
                                   #:installation-disk-image-file-system-type
                                   "iso9660"))
                         (command (qemu-command/writable-image image)))
      (run-basic-test %minimal-os-on-vda command name)))))
#+END_SRC

This is a bit less complicated than system deployment, since the tests
only need the store to build the virtual machine image. Deployment to a
machine requires that the machine is /up/, but if you look at the
initial, naïve test, you can see that the virtual machine isn't started
until the test derivation runs -- which is after everything in the store
monad is run.

c6e01898[1] has a version that starts the virtual machine while the
store monad is running so it can deploy to it. This is an absolute mess,
as seen in 'call-with-marionette'. Also, the use of 'dynamic-wind' in
that rendition causes the SSH session to close during deployment, which
is why that test fails. (I didn't figure that out until around the time
I began reimplementing the tests in the normal test suite.) In theory,
_I could fix that issue and implement the tests this way_. Another
possibility would be to spawn two virtual machines and have one deploy
to the other. This is implemented in 358f1287[2], which I believe I
would also be able to adapt now that I know I need to create writable
disk images for the virtual machines.

Before I go ahead with either, though, I'd like to know if either is the
"right way". Or if there's something better than what I'm suggesting.

Regards,
Jakob

[1]: https://git.sr.ht/~jakob/guix/tree/c6e01898dc774eef318c042595d6490e50e19486/gnu/tests/machine.scm
[2]: https://git.sr.ht/~jakob/guix/tree/358f12871326085c3e108181887ea36a8577de73/gnu/tests/machine.scm
Ludovic Courtès July 5, 2019, 8 a.m. UTC | #9
Hi Jakob,

zerodaysfordays@sdf.lonestar.org (Jakob L. Kreuze) skribis:

> Something hit me today. There aren't any tests for 'guix system
> reconfigure'. There are for 'guix system init' in
> 'gnu/tests/install.scm', but not for 'guix system reconfigure', which
> makes me think that I'm going about testing this the wrong way. I feel I
> should begin by isolate the behavior that's common between 'guix system
> reconfigure' and 'guix deploy' as you suggested, and then writing tests
> for that common code in the system test suite.

That would be great, especially factorizing these bits.

Note that writing tests could be tricky because it’s about testing the
effect of these reconfigure actions.  At any rate, let us know how it
goes!

> Then, as Ricardo suggested, mocking can be used for the parts that are
> specific only to 'guix deploy'.

Sounds good.

Thank you!

Ludo’.
Christine Lemmer-Webber July 5, 2019, 10:32 a.m. UTC | #10
References: <87o92ianbj.fsf@sdf.lonestar.org> <87o92glap5.fsf@dustycloud.org> <878sthoqzi.fsf@gnu.org> <87imsl9tsx.fsf_-_@sdf.lonestar.org> <87ef399tpu.fsf_-_@sdf.lonestar.org> <87a7dx9tog.fsf_-_@sdf.lonestar.org> <875zol9tn2.fsf_-_@sdf.lonestar.org> <871rz99tl9.fsf_-_@sdf.lonestar.org> <875zoldqah.fsf@kyleam.com> <87muhwtmfp.fsf@sdf.lonestar.org> <871rz874l2.fsf@kyleam.com> <877e90tj7l.fsf_-_@sdf.lonestar.org> <8736jotj5v.fsf_-_@sdf.lonestar.org> <87y31gs4k5.fsf_-_@sdf.lonestar.org>
User-agent: mu4e 1.2.0; emacs 26.2
In-reply-to: <87y31gs4k5.fsf_-_@sdf.lonestar.org>
Jakob L. Kreuze writes:

> +(define-record-type* <environment-type> environment-type
> +  make-environment-type
> +  environment-type?
> +
> +  ;; Interface to the environment type's deployment code. Each procedure
> +  ;; should take the same arguments as the top-level procedure of this file
> +  ;; that shares the same name. For example, 'machine-remote-eval' should be
> +  ;; of the form '(machine-remote-eval machine exp)'.
> +  (machine-remote-eval environment-type-machine-remote-eval) ; procedure
> +  (deploy-machine      environment-type-deploy-machine)      ; procedure
> +
> +  ;; Metadata.
> +  (name        environment-type-name)       ; symbol
> +  (description environment-type-description ; string
> +               (default #f))
> +  (location    environment-type-location    ; <location>
> +               (default (and=> (current-source-location)
> +                               source-properties->location))
> +               (innate)))

Yeah!  I think this is much nicer. :)

> +
> +
> +;;;
> +;;; Declarations for machines in a deployment.
> +;;;
> +
> +(define-record-type* <machine> machine
> +  make-machine
> +  machine?
> +  this-machine
> +  (system        machine-system)       ; <operating-system>
> +  (environment   machine-environment)  ; symbol
> +  (configuration machine-configuration ; configuration object
> +                 (default #f)))        ; specific to environment
> +
> +(define (machine-display-name machine)
> +  "Return the host-name identifying MACHINE."
> +  (operating-system-host-name (machine-system machine)))
> +
> +(define (build-machine machine)
> +  "Monadic procedure that builds the system derivation for MACHINE and returning
> +a list containing the path of the derivation file and the path of the derivation
> +output."
> +  (let ((os (machine-system machine)))
> +    (mlet* %store-monad ((osdrv (operating-system-derivation os))
> +                         (_ ((store-lift build-derivations) (list osdrv))))
> +      (return (list (derivation-file-name osdrv)
> +                    (derivation->output-path osdrv))))))
> +
> +(define (machine-remote-eval machine exp)
> +  "Evaluate EXP, a gexp, on MACHINE. Ensure that all the elements EXP refers to
> +are built and deployed to MACHINE beforehand."
> +  (let ((environment (machine-environment machine)))
> +    ((environment-type-machine-remote-eval environment) machine exp)))
> +
> +(define (deploy-machine machine)
> +  "Monadic procedure transferring the new system's OS closure to the remote
> +MACHINE, activating it on MACHINE and switching MACHINE to the new generation."
> +  (let ((environment (machine-environment machine)))
> +    ((environment-type-deploy-machine environment) machine)))

Oooooh so much cleaner.  Nice nice nice!  I like this.

> diff --git a/gnu/machine/ssh.scm b/gnu/machine/ssh.scm
> new file mode 100644
> index 000000000..6ce106bb2
> --- /dev/null
> +++ b/gnu/machine/ssh.scm
> @@ -0,0 +1,363 @@
> +;;; GNU Guix --- Functional package management for GNU
> +;;; Copyright © 2019 Jakob L. Kreuze <zerodaysfordays@sdf.lonestar.org>
> +;;;
> +;;; This file is part of GNU Guix.
> +;;;
> +;;; GNU Guix is free software; you can redistribute it and/or modify it
> +;;; under the terms of the GNU General Public License as published by
> +;;; the Free Software Foundation; either version 3 of the License, or (at
> +;;; your option) any later version.
> +;;;
> +;;; GNU Guix is distributed in the hope that it will be useful, but
> +;;; WITHOUT ANY WARRANTY; without even the implied warranty of
> +;;; MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> +;;; GNU General Public License for more details.
> +;;;
> +;;; You should have received a copy of the GNU General Public License
> +;;; along with GNU Guix.  If not, see <http://www.gnu.org/licenses/>.
> +
> +(define-module (gnu machine ssh)
> +  #:use-module (gnu bootloader)
> +  #:use-module (gnu machine)
> +  #:autoload   (gnu packages gnupg) (guile-gcrypt)
> +  #:use-module (gnu services)
> +  #:use-module (gnu services shepherd)
> +  #:use-module (gnu system)
> +  #:use-module (guix derivations)
> +  #:use-module (guix gexp)
> +  #:use-module (guix i18n)
> +  #:use-module (guix modules)
> +  #:use-module (guix monads)
> +  #:use-module (guix records)
> +  #:use-module (guix remote)
> +  #:use-module (guix ssh)
> +  #:use-module (guix store)
> +  #:use-module (ice-9 match)
> +  #:use-module (srfi srfi-19)
> +  #:export (managed-host-environment-type
> +
> +            machine-ssh-configuration
> +            machine-ssh-configuration?
> +            machine-ssh-configuration
> +
> +            machine-ssh-configuration-host-name
> +            machine-ssh-configuration-port
> +            machine-ssh-configuration-user
> +            machine-ssh-configuration-session))
> +
> +;;; Commentary:
> +;;;
> +;;; This module implements remote evaluation and system deployment for
> +;;; machines that are accessable over SSH and have a known host-name. In the
> +;;; sense of the broader "machine" interface, we describe the environment for
> +;;; such machines as 'managed-host.
> +;;;
> +;;; Code:
> +
> +
> +;;;
> +;;; Parameters for the SSH client.
> +;;;
> +
> +(define-record-type* <machine-ssh-configuration> machine-ssh-configuration
> +  make-machine-ssh-configuration
> +  machine-ssh-configuration?
> +  this-machine-ssh-configuration
> +  (host-name machine-ssh-configuration-host-name) ; string
> +  (port      machine-ssh-configuration-port       ; integer
> +             (default 22))
> +  (user      machine-ssh-configuration-user       ; string
> +             (default "root"))
> +  (identity  machine-ssh-configuration-identity   ; path to a private key
> +             (default #f))
> +  (session   machine-ssh-configuration-session    ; session
> +             (default #f)))
> +
> +(define (machine-ssh-session machine)
> +  "Return the SSH session that was given in MACHINE's configuration, or create
> +one from the configuration's parameters if one was not provided."
> +  (let ((config (machine-configuration machine)))
> +    (if (machine-ssh-configuration? config)
> +        (or (machine-ssh-configuration-session config)
> +            (let ((host-name (machine-ssh-configuration-host-name config))
> +                  (user (machine-ssh-configuration-user config))
> +                  (port (machine-ssh-configuration-port config))
> +                  (identity (machine-ssh-configuration-identity config)))
> +              (open-ssh-session host-name
> +                                #:user user
> +                                #:port port
> +                                #:identity identity)))
> +        (error "unsupported configuration type"))))
> +
> +
> +;;;
> +;;; Remote evaluation.
> +;;;
> +
> +(define (managed-host-remote-eval machine exp)
> +  "Internal implementation of 'machine-remote-eval' for MACHINE instances with
> +an environment type of 'managed-host."
> +  (maybe-raise-missing-configuration-error machine)
> +  (remote-eval exp (machine-ssh-session machine)))
> +
> +
> +;;;
> +;;; System deployment.
> +;;;
> +
> +(define (switch-to-system machine)
> +  "Monadic procedure creating a new generation on MACHINE and execute the
> +activation script for the new system configuration."
> +  (define (remote-exp drv script)
> +    (with-extensions (list guile-gcrypt)
> +      (with-imported-modules (source-module-closure '((guix config)
> +                                                      (guix profiles)
> +                                                      (guix utils)))
> +        #~(begin
> +            (use-modules (guix config)
> +                         (guix profiles)
> +                         (guix utils))
> +
> +            (define %system-profile
> +              (string-append %state-directory "/profiles/system"))
> +
> +            (let* ((system #$(derivation->output-path drv))
> +                   (number (1+ (generation-number %system-profile)))
> +                   (generation (generation-file-name %system-profile number)))
> +              (switch-symlinks generation system)
> +              (switch-symlinks %system-profile generation)
> +              ;; The implementation of 'guix system reconfigure' saves the
> +              ;; load path and environment here. This is unnecessary here
> +              ;; because each invocation of 'remote-eval' runs in a distinct
> +              ;; Guile REPL.
> +              (setenv "GUIX_NEW_SYSTEM" system)
> +              ;; The activation script may write to stdout, which confuses
> +              ;; 'remote-eval' when it attempts to read a result from the
> +              ;; remote REPL. We work around this by forcing the output to a
> +              ;; string.
> +              (with-output-to-string
> +                (lambda ()
> +                  (primitive-load #$script))))))))
> +
> +  (let* ((os (machine-system machine))
> +         (script (operating-system-activation-script os)))
> +    (mlet* %store-monad ((drv (operating-system-derivation os)))
> +      (machine-remote-eval machine (remote-exp drv script)))))
> +
> +;; XXX: Currently, this does NOT attempt to restart running services. This is
> +;; also the case with 'guix system reconfigure'.
> +;;
> +;; See <https://issues.guix.info/issue/33508>.
> +(define (upgrade-shepherd-services machine)
> +  "Monadic procedure unloading and starting services on the remote as needed
> +to realize the MACHINE's system configuration."
> +  (define target-services
> +    ;; Monadic expression evaluating to a list of (name output-path) pairs for
> +    ;; all of MACHINE's services.
> +    (mapm %store-monad
> +          (lambda (service)
> +            (mlet %store-monad ((file ((compose lower-object
> +                                                shepherd-service-file)
> +                                       service)))
> +              (return (list (shepherd-service-canonical-name service)
> +                            (derivation->output-path file)))))
> +          (service-value
> +           (fold-services (operating-system-services (machine-system machine))
> +                          #:target-type shepherd-root-service-type))))
> +
> +  (define (remote-exp target-services)
> +    (with-imported-modules '((gnu services herd))
> +      #~(begin
> +          (use-modules (gnu services herd)
> +                       (srfi srfi-1))
> +
> +          (define running
> +            (filter live-service-running (current-services)))
> +
> +          (define (essential? service)
> +            ;; Return #t if SERVICE is essential and should not be unloaded
> +            ;; under any circumstance.
> +            (memq (first (live-service-provision service))
> +                  '(root shepherd)))
> +
> +          (define (obsolete? service)
> +            ;; Return #t if SERVICE can be safely unloaded.
> +            (and (not (essential? service))
> +                 (every (lambda (requirements)
> +                          (not (memq (first (live-service-provision service))
> +                                     requirements)))
> +                        (map live-service-requirement running))))
> +
> +          (define to-unload
> +            (filter obsolete?
> +                    (remove (lambda (service)
> +                              (memq (first (live-service-provision service))
> +                                    (map first '#$target-services)))
> +                            running)))
> +
> +          (define to-start
> +            (remove (lambda (service-pair)
> +                      (memq (first service-pair)
> +                            (map (compose first live-service-provision)
> +                                 running)))
> +                    '#$target-services))
> +
> +          ;; Unload obsolete services.
> +          (for-each (lambda (service)
> +                      (false-if-exception
> +                       (unload-service service)))
> +                    to-unload)
> +
> +          ;; Load the service files for any new services and start them.
> +          (load-services/safe (map second to-start))
> +          (for-each start-service (map first to-start))
> +
> +          #t)))
> +
> +  (mlet %store-monad ((target-services target-services))
> +    (machine-remote-eval machine (remote-exp target-services))))
> +
> +(define (machine-boot-parameters machine)
> +  "Monadic procedure returning a list of 'boot-parameters' for the generations
> +of MACHINE's system profile, ordered from most recent to oldest."
> +  (define bootable-kernel-arguments
> +    (@@ (gnu system) bootable-kernel-arguments))
> +
> +  (define remote-exp
> +    (with-extensions (list guile-gcrypt)
> +      (with-imported-modules (source-module-closure '((guix config)
> +                                                      (guix profiles)))
> +        #~(begin
> +            (use-modules (guix config)
> +                         (guix profiles)
> +                         (ice-9 textual-ports))
> +
> +            (define %system-profile
> +              (string-append %state-directory "/profiles/system"))
> +
> +            (define (read-file path)
> +              (call-with-input-file path
> +                (lambda (port)
> +                  (get-string-all port))))
> +
> +            (map (lambda (generation)
> +                   (let* ((system-path (generation-file-name %system-profile
> +                                                             generation))
> +                          (boot-parameters-path (string-append system-path
> +                                                               "/parameters"))
> +                          (time (stat:mtime (lstat system-path))))
> +                     (list generation
> +                           system-path
> +                           time
> +                           (read-file boot-parameters-path))))
> +                 (reverse (generation-numbers %system-profile)))))))
> +
> +  (mlet* %store-monad ((generations (machine-remote-eval machine remote-exp)))
> +    (return
> +     (map (lambda (generation)
> +            (match generation
> +              ((generation system-path time serialized-params)
> +               (let* ((params (call-with-input-string serialized-params
> +                                read-boot-parameters))
> +                      (root (boot-parameters-root-device params))
> +                      (label (boot-parameters-label params)))
> +                 (boot-parameters
> +                  (inherit params)
> +                  (label
> +                   (string-append label " (#"
> +                                  (number->string generation) ", "
> +                                  (let ((time (make-time time-utc 0 time)))
> +                                    (date->string (time-utc->date time)
> +                                                  "~Y-~m-~d ~H:~M"))
> +                                  ")"))
> +                  (kernel-arguments
> +                   (append (bootable-kernel-arguments system-path root)
> +                           (boot-parameters-kernel-arguments params))))))))
> +          generations))))
> +
> +(define (install-bootloader machine)
> +  "Create a bootloader entry for the new system generation on MACHINE, and
> +configure the bootloader to boot that generation by default."
> +  (define bootloader-installer-script
> +    (@@ (guix scripts system) bootloader-installer-script))
> +
> +  (define (remote-exp installer bootcfg bootcfg-file)
> +    (with-extensions (list guile-gcrypt)
> +      (with-imported-modules (source-module-closure '((gnu build install)
> +                                                      (guix store)
> +                                                      (guix utils)))
> +        #~(begin
> +            (use-modules (gnu build install)
> +                         (guix store)
> +                         (guix utils))
> +            (let* ((gc-root (string-append "/" %gc-roots-directory "/bootcfg"))
> +                   (temp-gc-root (string-append gc-root ".new")))
> +
> +              (switch-symlinks temp-gc-root gc-root)
> +
> +              (unless (false-if-exception
> +                       (begin
> +                         ;; The implementation of 'guix system reconfigure'
> +                         ;; saves the load path here. This is unnecessary here
> +                         ;; because each invocation of 'remote-eval' runs in a
> +                         ;; distinct Guile REPL.
> +                         (install-boot-config #$bootcfg #$bootcfg-file "/")
> +                         ;; The installation script may write to stdout, which
> +                         ;; confuses 'remote-eval' when it attempts to read a
> +                         ;; result from the remote REPL. We work around this
> +                         ;; by forcing the output to a string.
> +                         (with-output-to-string
> +                           (lambda ()
> +                             (primitive-load #$installer)))))
> +                (delete-file temp-gc-root)
> +                (error "failed to install bootloader"))
> +
> +              (rename-file temp-gc-root gc-root)
> +              #t)))))
> +
> +  (mlet* %store-monad ((boot-parameters (machine-boot-parameters machine)))
> +    (let* ((os (machine-system machine))
> +           (bootloader ((compose bootloader-configuration-bootloader
> +                                 operating-system-bootloader)
> +                        os))
> +           (bootloader-target (bootloader-configuration-target
> +                               (operating-system-bootloader os)))
> +           (installer (bootloader-installer-script
> +                       (bootloader-installer bootloader)
> +                       (bootloader-package bootloader)
> +                       bootloader-target
> +                       "/"))
> +           (menu-entries (map boot-parameters->menu-entry boot-parameters))
> +           (bootcfg (operating-system-bootcfg os menu-entries))
> +           (bootcfg-file (bootloader-configuration-file bootloader)))
> +      (machine-remote-eval machine (remote-exp installer bootcfg bootcfg-file)))))
> +
> +(define (deploy-managed-host machine)
> +  "Internal implementation of 'deploy-machine' for MACHINE instances with an
> +environment type of 'managed-host."
> +  (maybe-raise-missing-configuration-error machine)
> +  (mbegin %store-monad
> +    (switch-to-system machine)
> +    (upgrade-shepherd-services machine)
> +    (install-bootloader machine)))
> +
> +
> +;;;
> +;;; Environment type.
> +;;;
> +
> +(define managed-host-environment-type
> +  (environment-type
> +   (machine-remote-eval managed-host-remote-eval)
> +   (deploy-machine      deploy-managed-host)
> +   (name                'managed-host-environment-type)
> +   (description         "Provisioning for machines that are accessable over SSH
> +and have a known host-name. This entails little more than maintaining an SSH
> +connection to the host.")))
> +
> +(define (maybe-raise-missing-configuration-error machine)
> +  "Raise an error if MACHINE's configuration is #f."
> +  (let ((environment (machine-environment machine)))
> +    (unless (machine-configuration machine)
> +      (error (format #f (G_ "no configuration specified for environment '~a'")
> +                     (symbol->string (environment-type-name environment)))))))

Yeah ok!  This looks good to me.  I think my issues are all addressed
here.