[bug#61803,0/3,shepherd] improve race-free spawn+wait

Message ID 87cz5x1jr2.fsf@tilde.club
State New

Commit Message

ulfvonbelow Feb. 25, 2023, 10:08 p.m. UTC
These patches fill out shepherd's procedures for running processes to
completion.  They add a replacement for 'system' to complement the
existing replacement for 'system*', and add a 'fork+exec+wait-process'
procedure so that the flexibility of that family of procedures is
available for this use case as well.  They also improve error handling in
the event that an exception occurs while spawning a process in the
process monitor, which would normally kill that essential fiber.

Note: I previously tried to send this to guix-devel, but it didn't seem
to make it (I didn't see it in the archives after half a day), and after
some consideration I recalled that guix-patches exists.  Is this the
right place for shepherd patches?

Comments

Ludovic Courtès March 2, 2023, 10:16 p.m. UTC | #1
Hi Ulf,

Ulf Herrman <striness@tilde.club> writes:

> These patches fill out shepherd's procedures for running processes to
> completion.  They add a replacement for 'system' to complement the
> existing replacement for 'system*', and add a 'fork+exec+wait-process'
> procedure so that the flexibility of that family of procedures is
> available for this use case as well.  They also improve error handling in
> the event that an exception occurs while spawning a process in the
> process monitor, which would normally kill that essential fiber.

Nice!

> Note: I previously tried to send this to guix-devel, but it didn't seem
> to make it (I didn't see it in the archives after half a day), and after
> some consideration I recalled that guix-patches exists.  Is this the
> right place for shepherd patches?

Yes, as Tobias confirmed already.  :-)

> From 64370a98dfc17f0531de7397a38362c03a1d89bc Mon Sep 17 00:00:00 2001
> From: ulfvonbelow <striness@tilde.club>
> Date: Sat, 25 Feb 2023 00:42:41 -0600
> Subject: [PATCH 1/3] service: Propagate exceptions while spawning in process
>  monitor.
>
> * modules/shepherd/service.scm (unboxed-errors): new procedure.
>   (boxed-errors): new syntax.
>   (process-monitor): use it to propagate exceptions from fork+exec-command via
>   reply channel.
>   (spawn-via-monitor): new procedure.
>   (spawn-command): use it.

Good catch!  I added a test and a copyright line for you (let me know if
I got it wrong) and pushed as 18989f2fffa6ecdbd0f9b77834e1a54c9c45ee73.
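
To illustrate the general idea behind this patch: one way to keep a
long-lived worker alive is to catch any exception around the risky call,
hand it back as an ordinary value, and re-raise it on the caller's side.
The sketch below is plain Guile without fibers or channels.  The names
`call-boxing-errors`, `unbox-result`, and `monitor-handle-request` are
hypothetical and are not the `boxed-errors` / `unboxed-errors`
definitions from the commit.

```scheme
(use-modules (ice-9 match))

;; Hypothetical helper: run THUNK and return a tagged list instead of
;; letting an exception escape to the caller of this procedure.
(define (call-boxing-errors thunk)
  (catch #t
    (lambda ()
      (list 'ok (thunk)))
    (lambda (key . args)
      (list 'exception key args))))

;; Hypothetical helper: return the boxed value, or re-throw the boxed
;; exception in the context of whoever unboxes it.
(define (unbox-result boxed)
  (match boxed
    (('ok value) value)
    (('exception key args) (apply throw key args))))

;; Stand-in for the monitor's side of the exchange: whatever THUNK does,
;; this procedure returns normally, so the loop calling it keeps running.
(define (monitor-handle-request thunk)
  (call-boxing-errors thunk))
```

In this sketch the requester would call `unbox-result` on whatever the
monitor sends back, so a failed spawn surfaces as an exception in the
requester rather than in the monitor.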

> From 51ee63ace6f3f52eb196c990664cc6b9af3d3683 Mon Sep 17 00:00:00 2001
> From: ulfvonbelow <striness@tilde.club>
> Date: Sat, 25 Feb 2023 00:46:27 -0600
> Subject: [PATCH 2/3] service: accept fork+exec-command argument list in
>  monitor.
>
> Sometimes it's necessary to run startup / shutdown programs as a certain user,
> in a certain directory, with certain environment variables, etc.  Shepherd
> currently provides a replacement for system* that won't race against the
> child process auto-reaper, but this lacks the flexibility Shepherd users are
> used to.
>
> * modules/shepherd/service.scm (process-monitor): treat command instead as
>   argument list to fork+exec-command.
>   (spawn-via-monitor): update to new convention.
>   (fork+exec+wait-command): new procedure.

I’ll take a closer look at this one and report back soon.

> From 177592ee9d4b7fc6dcc80e545e8ad615a1d6786c Mon Sep 17 00:00:00 2001
> From: ulfvonbelow <striness@tilde.club>
> Date: Sat, 25 Feb 2023 00:56:57 -0600
> Subject: [PATCH 3/3] service: add spawn-shell-command replacement for
>  `system'.
>
> We already have a replacement for `system*' that avoids racing, but not for
> `system'.
>
> * configure.ac (SHELL): new substitution variable.
> * modules/shepherd/system.scm.in (%shell-filename): new variable.
> * modules/shepherd/service.scm
>   (spawn-shell-command, real-system): new procedures.
> * modules/shepherd.scm (main): replace `system' with `spawn-shell-command'.

Out of curiosity, do you have a need for ‘system’?  I’m inclined to
recommend against its use, in which case this patch is unnecessary.

> +(define %shell-filename "@SHELL@")

This is the configure-time shell so it will be wrong when
cross-compiling.

I’d just do:

  (define %shell (or (getenv "SHELL") "/bin/sh"))
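
As a rough sketch of how a run-time lookup like this could feed a
'system'-style wrapper built on 'system*' (the helper names
`shell-command-line` and `run-shell-command` are hypothetical and not
part of Shepherd):

```scheme
;; Hypothetical helper: build the argument list for running COMMAND
;; through a shell chosen at run time rather than at configure time.
(define (shell-command-line command)
  (list (or (getenv "SHELL") "/bin/sh") "-c" command))

;; Hypothetical wrapper: like 'system', but implemented with 'system*',
;; so it returns the same kind of wait status.
(define (run-shell-command command)
  (apply system* (shell-command-line command)))
```

The result can be decoded with 'status:exit-val', for instance
`(status:exit-val (run-shell-command "exit 3"))`.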

Thanks!

Ludo’.

Ludovic Courtès March 4, 2023, 10:09 p.m. UTC | #2
Hi Ulf,

Ulf Herrman <striness@tilde.club> writes:

> From 51ee63ace6f3f52eb196c990664cc6b9af3d3683 Mon Sep 17 00:00:00 2001
> From: ulfvonbelow <striness@tilde.club>
> Date: Sat, 25 Feb 2023 00:46:27 -0600
> Subject: [PATCH 2/3] service: accept fork+exec-command argument list in
>  monitor.
>
> Sometimes it's necessary to run startup / shutdown programs as a certain user,
> in a certain directory, with certain environment variables, etc.  Shepherd
> currently provides a replacement for system* that won't race against the
> child process auto-reaper, but this lacks the flexibility Shepherd users are
> used to.
>
> * modules/shepherd/service.scm (process-monitor): treat command instead as
>   argument list to fork+exec-command.
>   (spawn-via-monitor): update to new convention.
>   (fork+exec+wait-command): new procedure.

On this one I took a similar approach but chose to extend
‘spawn-command’ instead of introducing a new procedure—see commit
0f3276a9c3dafbef41b0aab88ba5dda1bb78dc99.

Another difference is explicitly listing keyword arguments so that their
default values are taken from the caller’s dynamic state and not from
that of the process monitoring fiber.  This fixes
<https://issues.guix.gnu.org/60106>.
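
A minimal sketch of why this matters, in plain Guile without fibers.
Keyword defaults in 'define*' are evaluated when the procedure is
called, so they see the dynamic state of whoever makes the call.  Here
`with-dynamic-state` simulates a separate monitoring fiber.  The names
`%directory`, `describe-spawn`, `call-in-monitor`, `spawn/implicit`, and
`spawn/explicit` are all hypothetical, and this is not the code from the
commit.

```scheme
;; Hypothetical parameter standing in for a per-caller default.
(define %directory (make-parameter "/"))

;; The default for #:directory is computed at call time.
(define* (describe-spawn command #:key (directory (%directory)))
  (list command directory))

;; Dynamic state captured outside any 'parameterize'; it plays the role
;; of the monitoring fiber's own state.
(define monitor-state (current-dynamic-state))

(define (call-in-monitor thunk)
  (with-dynamic-state monitor-state thunk))

;; Default left implicit: it is resolved inside the monitor's state.
(define (spawn/implicit command)
  (call-in-monitor (lambda () (describe-spawn command))))

;; Default listed explicitly: it is resolved in the caller's state and
;; then passed along as an ordinary argument.
(define* (spawn/explicit command #:key (directory (%directory)))
  (call-in-monitor
   (lambda () (describe-spawn command #:directory directory))))

(define results
  (parameterize ((%directory "/tmp"))
    (list (spawn/implicit "ls") (spawn/explicit "ls"))))
```

In this sketch the first element of `results` carries the monitor's
value of `%directory` and the second carries the caller's, which is the
difference explicit keyword arguments are meant to address.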

Let me know what you think!

Thanks,
Ludo’.

ulfvonbelow March 9, 2023, 3:48 a.m. UTC | #3
Ludovic Courtès <ludo@gnu.org> writes:

>> From 177592ee9d4b7fc6dcc80e545e8ad615a1d6786c Mon Sep 17 00:00:00 2001
>> From: ulfvonbelow <striness@tilde.club>
>> Date: Sat, 25 Feb 2023 00:56:57 -0600
>> Subject: [PATCH 3/3] service: add spawn-shell-command replacement for
>>  `system'.
>>
>> We already have a replacement for `system*' that avoids racing, but not for
>> `system'.
>>
>> * configure.ac (SHELL): new substitution variable.
>> * modules/shepherd/system.scm.in (%shell-filename): new variable.
>> * modules/shepherd/service.scm
>>   (spawn-shell-command, real-system): new procedures.
>> * modules/shepherd.scm (main): replace `system' with `spawn-shell-command'.
> Out of curiosity, do you have a need for ‘system’?

I don't.

> I’m inclined to recommend against its use, in which case this patch is
> unnecessary.

I tend to agree, but make-system-constructor and make-system-destructor
both use it and are documented in the manual, so we should either make
them work properly or remove them.

>> +(define %shell-filename "@SHELL@")
>
> This is the configure-time shell so it will be wrong when
> cross-compiling.
>
> I’d just do:
>
>   (define %shell (or (getenv "SHELL") "/bin/sh"))
>

The rationale for not taking that straightforward approach was to
closely emulate the normal behavior of 'system' on Guix, where the shell
path used is a hardcoded store path.  However, since Guix's libc is
likely the only one where this is anything other than /bin/sh, I suppose
it makes a lot more sense to patch it in the Guix package definition (or
accept the minor behavioral difference) than to try to automagically
figure it out at configure time, which also has the problems you
mentioned.

Ludovic Courtès March 11, 2023, 2:57 p.m. UTC | #4
Hi Ulf,

Ulf Herrman <striness@tilde.club> writes:

> I tend to agree, but make-system-constructor and make-system-destructor
> both use it and are documented in the manual, so we should either make
> them work properly or remove them.

Yeah, you’re right.  So I guess we need to support it for now and start
a deprecation cycle so we can eventually remove it.

For now, I’ve pushed a simplified version of your patch as
89dd3bb57fa3e3a23cf85385b0788046b7e45170.

Let me know if you notice something wrong!

Ludo’.

Patch

From 177592ee9d4b7fc6dcc80e545e8ad615a1d6786c Mon Sep 17 00:00:00 2001
From: ulfvonbelow <striness@tilde.club>
Date: Sat, 25 Feb 2023 00:56:57 -0600
Subject: [PATCH 3/3] service: add spawn-shell-command replacement for
 `system'.

We already have a replacement for `system*' that avoids racing, but not for
`system'.

* configure.ac (SHELL): new substitution variable.
* modules/shepherd/system.scm.in (%shell-filename): new variable.
* modules/shepherd/service.scm
  (spawn-shell-command, real-system): new procedures.
* modules/shepherd.scm (main): replace `system' with `spawn-shell-command'.
---
 configure.ac                   |  1 +
 modules/shepherd.scm           |  7 +++++--
 modules/shepherd/service.scm   | 13 +++++++++++++
 modules/shepherd/system.scm.in |  5 ++++-
 4 files changed, 23 insertions(+), 3 deletions(-)

diff --git a/configure.ac b/configure.ac
index 6f681dc..19c177a 100644
--- a/configure.ac
+++ b/configure.ac
@@ -32,6 +32,7 @@  guilemoduledir="${datarootdir}/guile/site/$GUILE_EFFECTIVE_VERSION"
 guileobjectdir="${libdir}/guile/$GUILE_EFFECTIVE_VERSION/site-ccache"
 AC_SUBST([guilemoduledir])
 AC_SUBST([guileobjectdir])
+AC_SUBST([SHELL])
 
 dnl Check for extra dependencies.
 GUILE_MODULE_AVAILABLE([have_fibers], [(fibers)])
diff --git a/modules/shepherd.scm b/modules/shepherd.scm
index cce0507..1f6342e 100644
--- a/modules/shepherd.scm
+++ b/modules/shepherd.scm
@@ -420,8 +420,10 @@  already ~a threads running, disabling 'signalfd' support")
 
                ;; Replace the default 'system*' binding with one that
                ;; cooperates instead of blocking on 'waitpid'.
-               (let ((real-system* system*))
+               (let ((real-system* system*)
+                     (real-system system))
                  (set! system* spawn-command)
+                 (set! system spawn-shell-command)
 
                  ;; Restore 'system*' after fork.
                  (set! primitive-fork
@@ -430,7 +432,8 @@  already ~a threads running, disabling 'signalfd' support")
                            (let ((result (real-fork)))
                              (when (zero? result)
                                (set! primitive-fork real-fork)
-                               (set! system* real-system*))
+                               (set! system* real-system*)
+                               (set! system real-system))
                              result)))))
 
                (run-daemon #:socket-file socket-file
diff --git a/modules/shepherd/service.scm b/modules/shepherd/service.scm
index a36e486..f8df3a9 100644
--- a/modules/shepherd/service.scm
+++ b/modules/shepherd/service.scm
@@ -81,6 +81,7 @@ 
             handle-SIGCHLD
             with-process-monitor
             spawn-command
+            spawn-shell-command
             %precious-signals
             register-services
             provided-by
@@ -1938,6 +1939,18 @@  context.  The process monitoring fiber is responsible for handling
       (spawn-via-monitor (list (cons program arguments)))
       (apply system* program arguments)))
 
+(define real-system system)
+
+(define* (spawn-shell-command #:optional command)
+  "Like 'system' but do not block while waiting for COMMAND to terminate."
+  (if (current-process-monitor)
+      (if command
+          (spawn-command %shell-filename "-c" command)
+          #t)
+      (if command
+          (real-system command)
+          (real-system))))
+
 (define (fork+exec+wait-command command . arguments)
   "Like 'fork+exec' but also wait for PROGRAM to terminate, giving its exit
 status."
diff --git a/modules/shepherd/system.scm.in b/modules/shepherd/system.scm.in
index 29357aa..4646e81 100644
--- a/modules/shepherd/system.scm.in
+++ b/modules/shepherd/system.scm.in
@@ -41,7 +41,8 @@ 
             unblock-signals
             set-blocked-signals
             with-blocked-signals
-            without-automatic-finalization))
+            without-automatic-finalization
+            %shell-filename))
 
 ;; The <sys/reboot.h> constants.
 (define RB_AUTOBOOT @RB_AUTOBOOT@)
@@ -328,3 +329,5 @@  Turning finalization off shuts down the finalization thread as a side effect."
         exp ...)
       (lambda ()
         (%set-automatic-finalization-enabled?! enabled?)))))
+
+(define %shell-filename "@SHELL@")
-- 
2.38.1