diff mbox series

[bug#44800,v2,3/3] Use substitute servers on the local network.

Message ID 20201124132145.217751-4-othacehe@gnu.org
State Accepted
Headers show
Series publish: Add Avahi support. | expand

Checks

Context Check Description
cbaines/comparison success View comparision
cbaines/git branch success View Git branch
cbaines/applying patch success View Laminar job
cbaines/issue success View issue
cbaines/comparison success View comparision
cbaines/git branch success View Git branch
cbaines/applying patch success View Laminar job
cbaines/issue success View issue

Commit Message

Mathieu Othacehe Nov. 24, 2020, 1:21 p.m. UTC
* guix/scripts/discover.scm: New file.
* Makefile.am (MODULES): Add it.
* nix/nix-daemon/guix-daemon.cc (options): Add "use-local-publish" option,
(parse-opt): parse it,
(main): start "guix discover" process when the option is set.
* nix/libstore/globals.hh (Settings): Add "useLocalPublish" public member.
* nix/libstore/globals.cc (Settings): Initialize it.
* guix/scripts/substitute.scm (%local-substitute-urls): New variable,
(substitute-urls): add it.
* gnu/services/base.scm (<guix-configuration>): Add "use-local-publish?"
field,
(guix-shepherd-service): honor it.
* doc/guix.texi (Invoking guix-daemon): Document "use-local-publish" option,
(Base Services): ditto.
---
 Makefile.am                   |   1 +
 doc/guix.texi                 |   8 ++
 gnu/services/base.scm         |   9 +-
 guix/scripts/discover.scm     | 158 ++++++++++++++++++++++++++++++++++
 guix/scripts/substitute.scm   |  11 ++-
 nix/libstore/globals.cc       |   1 +
 nix/libstore/globals.hh       |   4 +
 nix/nix-daemon/guix-daemon.cc |  20 +++++
 8 files changed, 209 insertions(+), 3 deletions(-)
 create mode 100644 guix/scripts/discover.scm

Comments

Ludovic Courtès Nov. 27, 2020, 5:37 p.m. UTC | #1
Mathieu Othacehe <othacehe@gnu.org> skribis:

> * guix/scripts/discover.scm: New file.
> * Makefile.am (MODULES): Add it.
> * nix/nix-daemon/guix-daemon.cc (options): Add "use-local-publish" option,
> (parse-opt): parse it,
> (main): start "guix discover" process when the option is set.
> * nix/libstore/globals.hh (Settings): Add "useLocalPublish" public member.
> * nix/libstore/globals.cc (Settings): Initialize it.
> * guix/scripts/substitute.scm (%local-substitute-urls): New variable,
> (substitute-urls): add it.
> * gnu/services/base.scm (<guix-configuration>): Add "use-local-publish?"
> field,
> (guix-shepherd-service): honor it.
> * doc/guix.texi (Invoking guix-daemon): Document "use-local-publish" option,
> (Base Services): ditto.

[...]

> +@item --use-local-publish[=yes|no]
> +Whether to use publish servers discovered a the local network, using
> +Avahi, for substitutution.

How about ‘--discover-substitute-servers’ or ‘--discover-substitutes’ or
even ‘--discover’?

s/publish servers/substitute servers/

I think we need a note about the performance, security, and privacy
implications of this here, namely:

  0. It might be faster/less expensive than fetching from remote
     servers; 

  1. There are no security risks, only genuine substitutes will be used
     (add cross-ref);

  2. An attacker advertising ‘guix publish’ on your LAN cannot serve you
     malicious binaries, but they can learn what software you’re
     installing.

  3. Servers may serve substitute over HTTP, unencrypted, so anyone on
     the LAN can see what software you’re installing.

IWBN to have an action of the Shepherd service to turn it on and off;
you might want to do that depending on how much you trust the LAN you’re
on.  (That can come later though.)

> +++ b/gnu/services/base.scm
> @@ -1529,6 +1529,8 @@ archive' public keys, with GUIX."
>                      (default 0))
>    (log-compression  guix-configuration-log-compression
>                      (default 'bzip2))
> +  (use-local-publish? guix-configuration-use-local-publish?
> +                      (default #f))

Same here.

> +(define %publish-services
> +  ;; Set of discovered publish services.
> +  (make-hash-table))
> +
> +(define (publish-file cache-directory)
> +  "Return the name of the file storing the discovered publish services inside
> +CACHE-DIRECTORY."
> +  (let ((directory (string-append cache-directory "/discover")))
> +    (string-append directory "/publish")))
> +
> +(define %publish-file
> +  (make-parameter (publish-file %state-directory)))
> +
> +(define* (write-publish-file #:key (file (%publish-file)))
> +  "Dump the content of %PUBLISH-SERVICES hash table into FILE.  Use a write
> +lock on FILE to synchronize with any potential readers."

Aren’t we partly duplicating what avahi-daemon’s already doing?
avahi-daemon maintains a list of currently valid advertisements, which
can be seen with:

  avahi-browse --cache _workstation._tcp

However, that cache first needs to be initialized by running the same
command without ‘--cache’.  Hmm, maybe there’s no other choice.  I
wonder how others deal with that.

> +(define-command (guix-discover . args)
> +  (category plumbing)

Should be “internal” IMO.

> +++ b/nix/libstore/globals.cc
> @@ -35,6 +35,7 @@ Settings::Settings()
>      maxSilentTime = 0;
>      buildTimeout = 0;
>      useBuildHook = true;
> +    useLocalPublish = false;
>      printBuildTrace = false;
>      multiplexedBuildOutput = false;
>      reservedSize = 8 * 1024 * 1024;
> diff --git a/nix/libstore/globals.hh b/nix/libstore/globals.hh
> index 27616a2283..43653aef48 100644
> --- a/nix/libstore/globals.hh
> +++ b/nix/libstore/globals.hh
> @@ -116,6 +116,10 @@ struct Settings {
>         users want to disable this from the command-line. */
>      bool useBuildHook;
>  
> +    /* Whether to use publish servers found on the local network for
> +       substitution. */
> +    bool useLocalPublish;

I think you don’t even need to field here since the variable is only
used in guix-daemon.cc.

> +    case GUIX_OPT_USE_LOCAL_PUBLISH:
> +      settings.useLocalPublish = string_to_bool (arg);
> +      settings.set("use-local-publish", arg);
> +      break;

Just set a variable local to this file and that’s enough.  You still
need the second line so that (guix scripts substitute) knows whether it
should read the thing.

> diff --git a/guix/scripts/substitute.scm b/guix/scripts/substitute.scm
> index ddb885d344..16e8fe6106 100755
> --- a/guix/scripts/substitute.scm
> +++ b/guix/scripts/substitute.scm
> @@ -27,6 +27,7 @@
>    #:use-module (guix config)
>    #:use-module (guix records)
>    #:use-module ((guix serialization) #:select (restore-file))
> +  #:use-module (guix scripts discover)
>    #:use-module (gcrypt hash)
>    #:use-module (guix base32)
>    #:use-module (guix base64)
> @@ -1078,9 +1079,17 @@ found."
>       ;; daemon.
>       '("http://ci.guix.gnu.org"))))
>  
> +(define %local-substitute-urls
> +  ;; If the following option is passed to the daemon, use the substitutes list
> +  ;; provided by "guix discover" process.
> +  (if (find-daemon-option "use-local-publish")
> +      (read-publish-urls)
> +      '()))
> +
>  (define substitute-urls
>    ;; List of substitute URLs.
> -  (make-parameter %default-substitute-urls))
> +  (make-parameter (append %local-substitute-urls
> +                          %default-substitute-urls)))

As discussed on IRC, we should probably need to set an upper limit. on
the number of local substitute URLs.

Imagine: you’re at GuixCon 2021, there are 500 participants all of which
are running ‘guix publish --advertise’; every Guix operation leads to
everyone’s Guix talking to every other person’s Guix, the whole thing
gets slow as hell, 500 people staring at “updating list of substitutes”,
500 people eventually giving up and signing up for CONDACon.

Also, we must make sure ‘guix substitute’ gracefully handles disconnects
and servers still advertised but no longer around (timeouts etc.)

We’ll need real world tests to see how it behaves I think.  In the
meantime, we can describe it as a technology preview™ in the manual.

WDYT?

Ludo’.
Mathieu Othacehe Nov. 29, 2020, 2:29 p.m. UTC | #2
Hey,

> How about ‘--discover-substitute-servers’ or ‘--discover-substitutes’ or
> even ‘--discover’?

"--discover" seems nice.

> I think we need a note about the performance, security, and privacy
> implications of this here, namely:
>
>   0. It might be faster/less expensive than fetching from remote
>      servers; 
>
>   1. There are no security risks, only genuine substitutes will be used
>      (add cross-ref);
>
>   2. An attacker advertising ‘guix publish’ on your LAN cannot serve you
>      malicious binaries, but they can learn what software you’re
>      installing.
>
>   3. Servers may serve substitute over HTTP, unencrypted, so anyone on
>      the LAN can see what software you’re installing.

I added a variant of this snippet to the documentation.

> IWBN to have an action of the Shepherd service to turn it on and off;
> you might want to do that depending on how much you trust the LAN you’re
> on.  (That can come later though.)

Yup, I agree.

> Aren’t we partly duplicating what avahi-daemon’s already doing?
> avahi-daemon maintains a list of currently valid advertisements, which
> can be seen with:
>
>   avahi-browse --cache _workstation._tcp
>
> However, that cache first needs to be initialized by running the same
> command without ‘--cache’.  Hmm, maybe there’s no other choice.  I
> wonder how others deal with that.

If the local network machines are connected with multiple interfaces
such as Wifi and Ethernet, then the discovered services will appear
multiple times, regardless of the "cache" option I think.

Couldn't find any useful resources about that, someone maybe?

> Just set a variable local to this file and that’s enough.  You still
> need the second line so that (guix scripts substitute) knows whether it
> should read the thing.

Right, fixed.

> Imagine: you’re at GuixCon 2021, there are 500 participants all of which
> are running ‘guix publish --advertise’; every Guix operation leads to
> everyone’s Guix talking to every other person’s Guix, the whole thing
> gets slow as hell, 500 people staring at “updating list of substitutes”,
> 500 people eventually giving up and signing up for CONDACon.

Haha, that would be a shame. I limited the number of local substitute
servers to 50. Maybe that's too high. I think that we will be able to
fine tune this value once we have more experience with it. Deploying
this mechanism on berlin will probably help.

> Also, we must make sure ‘guix substitute’ gracefully handles disconnects
> and servers still advertised but no longer around (timeouts etc.)
>
> We’ll need real world tests to see how it behaves I think.  In the
> meantime, we can describe it as a technology preview™ in the manual.

Sure, I described this option as "experimental" in the
documentation. Regarding the disconnections and timeouts, there's
probably some work, but I think it's transverse to this development.

Pushed the whole patchset, taking your remarks into account. Thanks
again for reviewing.

Thanks,

Mathieu
Ludovic Courtès Nov. 30, 2020, 1:46 p.m. UTC | #3
Hi,

Mathieu Othacehe <othacehe@gnu.org> skribis:

> Pushed the whole patchset, taking your remarks into account. Thanks
> again for reviewing.

Woohoo, thanks!

I’ll give it a spin in the coming days/weeks and we’ll see.  Can’t wait
to be physically back at the office to see how it goes with more
publishers/users.  :-)

Ludo’.
Mathieu Othacehe Dec. 1, 2020, 8:43 a.m. UTC | #4
Hey Ludo,

> I’ll give it a spin in the coming days/weeks and we’ll see.  Can’t wait
> to be physically back at the office to see how it goes with more
> publishers/users.  :-)

Great, I hope it will work fine :). Here's the system configuration I
used to test this feature.

Thanks,

Mathieu
diff mbox series

Patch

diff --git a/Makefile.am b/Makefile.am
index 7049da9594..41b366eb75 100644
--- a/Makefile.am
+++ b/Makefile.am
@@ -257,6 +257,7 @@  MODULES =					\
   guix/import/texlive.scm   			\
   guix/import/utils.scm				\
   guix/scripts.scm				\
+  guix/scripts/discover.scm			\
   guix/scripts/download.scm			\
   guix/scripts/perform-download.scm		\
   guix/scripts/build.scm			\
diff --git a/doc/guix.texi b/doc/guix.texi
index f8efc34310..72531533ff 100644
--- a/doc/guix.texi
+++ b/doc/guix.texi
@@ -1584,6 +1584,10 @@  Unless @option{--lose-logs} is used, all the build logs are kept in the
 @var{localstatedir}.  To save space, the daemon automatically compresses
 them with Bzip2 by default.
 
+@item --use-local-publish[=yes|no]
+Whether to use publish servers discovered a the local network, using
+Avahi, for substitutution.
+
 @item --disable-deduplication
 @cindex deduplication
 Disable automatic file ``deduplication'' in the store.
@@ -14999,6 +15003,10 @@  disables the timeout.
 The type of compression used for build logs---one of @code{gzip},
 @code{bzip2}, or @code{none}.
 
+@item @code{use-local-publish?} (default: @code{#f})
+Whether to use publish servers discovered a the local network, using
+Avahi, for substitutution.
+
 @item @code{extra-options} (default: @code{'()})
 List of extra command-line options for @command{guix-daemon}.
 
diff --git a/gnu/services/base.scm b/gnu/services/base.scm
index 87c247bdf1..718fa4096a 100644
--- a/gnu/services/base.scm
+++ b/gnu/services/base.scm
@@ -1529,6 +1529,8 @@  archive' public keys, with GUIX."
                     (default 0))
   (log-compression  guix-configuration-log-compression
                     (default 'bzip2))
+  (use-local-publish? guix-configuration-use-local-publish?
+                      (default #f))
   (extra-options    guix-configuration-extra-options ;list of strings
                     (default '()))
   (log-file         guix-configuration-log-file   ;string
@@ -1570,8 +1572,8 @@  proxy of 'guix-daemon'...~%")
   (match-record config <guix-configuration>
     (guix build-group build-accounts authorize-key? authorized-keys
           use-substitutes? substitute-urls max-silent-time timeout
-          log-compression extra-options log-file http-proxy tmpdir
-          chroot-directories)
+          log-compression use-local-publish? extra-options log-file
+          http-proxy tmpdir chroot-directories)
     (list (shepherd-service
            (documentation "Run the Guix daemon.")
            (provision '(guix-daemon))
@@ -1605,6 +1607,9 @@  proxy of 'guix-daemon'...~%")
                           #$@(if use-substitutes?
                                  '()
                                  '("--no-substitutes"))
+                          #$@(if use-local-publish?
+                                 '("--use-local-publish=yes")
+                                 '())
                           "--substitute-urls" #$(string-join substitute-urls)
                           #$@extra-options
 
diff --git a/guix/scripts/discover.scm b/guix/scripts/discover.scm
new file mode 100644
index 0000000000..d17b2bcfe4
--- /dev/null
+++ b/guix/scripts/discover.scm
@@ -0,0 +1,158 @@ 
+;;; GNU Guix --- Functional package management for GNU
+;;; Copyright © 2020 Mathieu Othacehe <othacehe@gnu.org>
+;;;
+;;; This file is part of GNU Guix.
+;;;
+;;; GNU Guix is free software; you can redistribute it and/or modify it
+;;; under the terms of the GNU General Public License as published by
+;;; the Free Software Foundation; either version 3 of the License, or (at
+;;; your option) any later version.
+;;;
+;;; GNU Guix is distributed in the hope that it will be useful, but
+;;; WITHOUT ANY WARRANTY; without even the implied warranty of
+;;; MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+;;; GNU General Public License for more details.
+;;;
+;;; You should have received a copy of the GNU General Public License
+;;; along with GNU Guix.  If not, see <http://www.gnu.org/licenses/>.
+
+(define-module (guix scripts discover)
+  #:use-module (guix avahi)
+  #:use-module (guix config)
+  #:use-module (guix scripts)
+  #:use-module (guix ui)
+  #:use-module (guix build syscalls)
+  #:use-module (guix build utils)
+  #:use-module (guix scripts publish)
+  #:use-module (ice-9 rdelim)
+  #:use-module (srfi srfi-37)
+  #:export (read-publish-urls
+
+            guix-discover))
+
+(define (show-help)
+  (format #t (G_ "Usage: guix discover [OPTION]...
+Discover Guix related services using Avahi.\n"))
+  (display (G_ "
+  -c, --cache=DIRECTORY     cache discovery results in DIRECTORY"))
+  (display (G_ "
+  -h, --help                display this help and exit"))
+  (display (G_ "
+  -V, --version             display version information and exit"))
+  (newline)
+  (show-bug-report-information))
+
+(define %options
+  (list (option '(#\c "cache") #t #f
+                (lambda (opt name arg result)
+                  (alist-cons 'cache arg result)))
+        (option '(#\h "help") #f #f
+                (lambda _
+                  (show-help)
+                  (exit 0)))
+        (option '(#\V "version") #f #f
+                (lambda _
+                  (show-version-and-exit "guix discover")))))
+
+(define %default-options
+  `((cache . ,%state-directory)))
+
+
+;;;
+;;; Publish servers.
+;;;
+
+(define %publish-services
+  ;; Set of discovered publish services.
+  (make-hash-table))
+
+(define (publish-file cache-directory)
+  "Return the name of the file storing the discovered publish services inside
+CACHE-DIRECTORY."
+  (let ((directory (string-append cache-directory "/discover")))
+    (string-append directory "/publish")))
+
+(define %publish-file
+  (make-parameter (publish-file %state-directory)))
+
+(define* (write-publish-file #:key (file (%publish-file)))
+  "Dump the content of %PUBLISH-SERVICES hash table into FILE.  Use a write
+lock on FILE to synchronize with any potential readers."
+  (with-file-lock file
+    (call-with-output-file file
+      (lambda (port)
+        (hash-for-each
+         (lambda (name service)
+           (format port "http://~a:~a~%"
+                   (avahi-service-address service)
+                   (avahi-service-port service)))
+         %publish-services)))
+        (chmod file #o644)))
+
+(define (call-with-read-file-lock file thunk)
+  "Call THUNK with a read lock on FILE."
+  (let ((port #f))
+    (dynamic-wind
+      (lambda ()
+        (set! port
+              (let ((port (open-file file "r0")))
+                (fcntl-flock port 'read-lock)
+                port)))
+      thunk
+      (lambda ()
+        (when port
+          (unlock-file port))))))
+
+(define-syntax-rule (with-read-file-lock file exp ...)
+  "Wait to acquire a read lock on FILE and evaluate EXP in that context."
+  (call-with-read-file-lock file (lambda () exp ...)))
+
+(define* (read-publish-urls #:key (file (%publish-file)))
+  "Read publish urls list from FILE and return it.  Use a read lock on FILE to
+synchronize with the writer."
+  (with-read-file-lock file
+    (call-with-input-file file
+      (lambda (port)
+        (let loop ((url (read-line port))
+                   (urls '()))
+          (if (eof-object? url)
+              urls
+              (loop (read-line port) (cons url urls))))))))
+
+
+;;;
+;;; Entry point.
+;;;
+
+(define %services
+  ;; List of services we want to discover.
+  (list publish-service-type))
+
+(define (service-proc action service)
+  (let ((name (avahi-service-name service))
+        (type (avahi-service-type service)))
+    (when (string=? type publish-service-type)
+      (case action
+        ((new-service)
+         (hash-set! %publish-services name service))
+        ((remove-service)
+         (hash-remove! %publish-services name)))
+      (write-publish-file))))
+
+(define-command (guix-discover . args)
+  (category plumbing)
+  (synopsis "discover Guix related services using Avahi")
+
+  (with-error-handling
+    (let* ((opts (args-fold* args %options
+                             (lambda (opt name arg result)
+                               (leave (G_ "~A: unrecognized option~%") name))
+                             (lambda (arg result)
+                               (leave (G_ "~A: extraneous argument~%") arg))
+                             %default-options))
+           (cache (assoc-ref opts 'cache))
+           (publish-file (publish-file cache)))
+      (parameterize ((%publish-file publish-file))
+        (mkdir-p (dirname publish-file))
+        (avahi-browse-service-thread service-proc
+                                     #:types %services)))))
diff --git a/guix/scripts/substitute.scm b/guix/scripts/substitute.scm
index ddb885d344..16e8fe6106 100755
--- a/guix/scripts/substitute.scm
+++ b/guix/scripts/substitute.scm
@@ -27,6 +27,7 @@ 
   #:use-module (guix config)
   #:use-module (guix records)
   #:use-module ((guix serialization) #:select (restore-file))
+  #:use-module (guix scripts discover)
   #:use-module (gcrypt hash)
   #:use-module (guix base32)
   #:use-module (guix base64)
@@ -1078,9 +1079,17 @@  found."
      ;; daemon.
      '("http://ci.guix.gnu.org"))))
 
+(define %local-substitute-urls
+  ;; If the following option is passed to the daemon, use the substitutes list
+  ;; provided by "guix discover" process.
+  (if (find-daemon-option "use-local-publish")
+      (read-publish-urls)
+      '()))
+
 (define substitute-urls
   ;; List of substitute URLs.
-  (make-parameter %default-substitute-urls))
+  (make-parameter (append %local-substitute-urls
+                          %default-substitute-urls)))
 
 (define (client-terminal-columns)
   "Return the number of columns in the client's terminal, if it is known, or a
diff --git a/nix/libstore/globals.cc b/nix/libstore/globals.cc
index 0cc001fbe4..2b621af982 100644
--- a/nix/libstore/globals.cc
+++ b/nix/libstore/globals.cc
@@ -35,6 +35,7 @@  Settings::Settings()
     maxSilentTime = 0;
     buildTimeout = 0;
     useBuildHook = true;
+    useLocalPublish = false;
     printBuildTrace = false;
     multiplexedBuildOutput = false;
     reservedSize = 8 * 1024 * 1024;
diff --git a/nix/libstore/globals.hh b/nix/libstore/globals.hh
index 27616a2283..43653aef48 100644
--- a/nix/libstore/globals.hh
+++ b/nix/libstore/globals.hh
@@ -116,6 +116,10 @@  struct Settings {
        users want to disable this from the command-line. */
     bool useBuildHook;
 
+    /* Whether to use publish servers found on the local network for
+       substitution. */
+    bool useLocalPublish;
+
     /* Whether buildDerivations() should print out lines on stderr in
        a fixed format to allow its progress to be monitored.  Each
        line starts with a "@".  The following are defined:
diff --git a/nix/nix-daemon/guix-daemon.cc b/nix/nix-daemon/guix-daemon.cc
index cd949aca67..f0ace9ce8b 100644
--- a/nix/nix-daemon/guix-daemon.cc
+++ b/nix/nix-daemon/guix-daemon.cc
@@ -89,6 +89,7 @@  builds derivations on behalf of its clients.");
 #define GUIX_OPT_TIMEOUT 18
 #define GUIX_OPT_MAX_SILENT_TIME 19
 #define GUIX_OPT_LOG_COMPRESSION 20
+#define GUIX_OPT_USE_LOCAL_PUBLISH 21
 
 static const struct argp_option options[] =
   {
@@ -129,6 +130,9 @@  static const struct argp_option options[] =
       n_("disable compression of the build logs") },
     { "log-compression", GUIX_OPT_LOG_COMPRESSION, "TYPE", 0,
       n_("use the specified compression type for build logs") },
+    { "use-local-publish", GUIX_OPT_USE_LOCAL_PUBLISH,
+      "yes/no", OPTION_ARG_OPTIONAL,
+      n_("use publish servers discovered on the local network") },
 
     /* '--disable-deduplication' was known as '--disable-store-optimization'
        up to Guix 0.7 included, so keep the alias around.  */
@@ -261,6 +265,10 @@  parse_opt (int key, char *arg, struct argp_state *state)
     case GUIX_OPT_NO_BUILD_HOOK:
       settings.useBuildHook = false;
       break;
+    case GUIX_OPT_USE_LOCAL_PUBLISH:
+      settings.useLocalPublish = string_to_bool (arg);
+      settings.set("use-local-publish", arg);
+      break;
     case GUIX_OPT_DEBUG:
       verbosity = lvlDebug;
       break;
@@ -506,6 +514,18 @@  using `--build-users-group' is highly recommended\n"));
 		    format ("extra chroot directories: '%1%'") % chroot_dirs);
 	}
 
+      if (settings.useLocalPublish)
+      {
+        Strings args;
+
+        args.push_back("guix");
+        args.push_back("discover");
+
+        startProcess([&]() {
+          execv(settings.guixProgram.c_str(), stringsToCharPtrs(args).data());
+        });
+      }
+
       printMsg (lvlDebug,
 		format ("automatic deduplication set to %1%")
 		% settings.autoOptimiseStore);