
[bug#70542,0/4] Improve Shepherd service support for networked file systems

Message ID cover.1713904784.git.richard@freakingpenguin.com

Message

Richard Sent April 23, 2024, 8:44 p.m. UTC
Hi Guix!

This patch series aims to improve the experience when using Guix and Shepherd
to manage networked file systems.

Previously, operating-system file-system entries would all be started before
the symbol 'file-systems was provided, which many other Shepherd services
depend on. This meant that adding a networked file-system with (mount? #t)
would (depending on mount-can-fail?) either halt boot due to 'user-processes
(and thus 'networking) not being provisioned, or fail to mount, even though
Guix contained the code to successfully mount that file system.

Now, file system entries can specify arbitrary Shepherd symbols that other
services provision. When this is done, that specific file-system entry is not
mounted as part of providing 'file-systems.
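
As a rough illustration of the paragraph above, a file-system entry using the
new field might look like the sketch below. The field name `requirements' is
taken from the patch titles in this series and is an assumption here, as are
the host name, share name, and mount point, which are placeholders. The
fragment would only evaluate with this series applied.

```scheme
(use-modules (gnu system file-systems))

;; A CIFS share that should only be mounted once the 'networking
;; Shepherd service has been provisioned.
;; Assumption: the new field is called `requirements', as suggested by
;; the patch titles.  Device and mount point are placeholders.
(define %nas-share
  (file-system
    (device "//nas.example.org/share")
    (mount-point "/mnt/share")
    (type "cifs")
    (mount? #t)
    (mount-can-fail? #t)
    (requirements '(networking))))
```

Such an entry would then be listed in the operating-system's file-systems
field alongside the usual entries.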

I considered adding a (network?) flag to the file-system record instead, but
that wouldn't handle every case (say, if an Avahi .local address was used). So
instead I went with the more general approach.

Prior workarounds were verbose [1] and required creating a custom service
entry. This method allows for reusing code already present in (gnu
services base) and (gnu build file-systems).

I considered splitting CIFS support into its own patch series, but since the
support is fairly meaningless without the preceding commits, I figured keeping
it in this series was best.

This patch series resolves https://issues.guix.gnu.org/46563.

Richard Sent (4):
  file-systems: Add requirements field to file-systems
  services: base: Use requirements to delay some file-systems
  file-systems: Add support for mounting CIFS file systems
  system: Do not check for CIFS file system availability

 doc/guix.texi               | 13 ++++++++
 gnu/build/file-systems.scm  | 60 ++++++++++++++++++++++++++++++++-----
 gnu/machine/ssh.scm         |  3 +-
 gnu/services/base.scm       | 16 ++++++++--
 gnu/system/file-systems.scm |  3 ++
 guix/scripts/system.scm     |  3 +-
 6 files changed, 87 insertions(+), 11 deletions(-)


base-commit: 0f68306268773f0eaa4327e1f6fdcb39442e4a34

Comments

Richard Sent April 23, 2024, 8:51 p.m. UTC | #1
Oops, forgot to include the link to [1] in the cover letter.

(If you see this Felix, nothing's wrong with your code! :) I just needed
an example of how it's currently done.)

[1] https://lists.gnu.org/archive/html/guix-devel/2024-04/msg00233.html

Jonathan Brielmaier April 25, 2024, 6:51 a.m. UTC | #2
Hello Richard,

thanks for improving the CIFS mounting problem!

I'm using a CIFS share on one of my servers. There I ran into a
problem: the share disappears (e.g. when the CIFS server is unavailable
for a short time) and is not automatically remounted.

So I'm using a simple cron job to work around this problem:
```
;; CIFS mount disappears often
(define mount-all-job
   #~(job "0 * * * *"
          "mount --all"
          #:user "root"))
```

Do you know if this particular problem gets resolved with your patch?

~Jonathan

Richard Sent April 25, 2024, 1:43 p.m. UTC | #3
Hi Jonathan!

Jonathan Brielmaier <jonathan.brielmaier@web.de> writes:

> Hello Richard,
>
> thanks for improving the CIFS mounting problem!
>
> I'm using a CIFS share on one of my servers. There I ran into a
> problem: the share disappears (e.g. when the CIFS server is unavailable
> for a short time) and is not automatically remounted.
>
> Do you know if this particular problem gets resolved with your patch?
>

I've never experienced that issue myself so I can't say for sure.
However, I don't believe my patch would resolve that issue.

file-system-shepherd-service in (gnu services base) is in charge of
mounting the file system. That service does not attempt to monitor the
file system's status after running. There's no daemon. If the file
system is mounted successfully, Shepherd will think there's no problem.

My understanding is that Shepherd will not respawn a service that
starts, then exits successfully. From Shepherd's manual:

> start’.  If the starting attempt failed, it must return ‘#f’
> or throw an exception; otherwise, the return value is stored
> as the “running value” of the service.

This could be solved by, for example, adding a remount? flag and/or
remount-delay field to the file-system record and changing
file-system-shepherd-service to conditionally use the fork-style
constructor that many other services use. Within that process, a loop
would check whether a file system is mounted at the target location.
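
To make the idea concrete, here is a minimal sketch of such a loop in plain
Guile. It is not part of this series. The names mount-point-listed?,
mounted?, and watch-mount are hypothetical, as is the choice to detect the
mount by reading /proc/self/mounts. The loop blocks forever, so it is meant
to run in a separate process rather than inside shepherd itself.

```scheme
(use-modules (ice-9 rdelim))            ;for read-line

;; Return #t if MOUNT-POINT is the second field of some line read from
;; PORT, which is expected to be in /proc/self/mounts format.
(define (mount-point-listed? port mount-point)
  (let loop ((line (read-line port)))
    (cond ((eof-object? line) #f)
          ((let ((fields (string-tokenize line)))
             (and (>= (length fields) 2)
                  (string=? (cadr fields) mount-point)))
           #t)
          (else (loop (read-line port))))))

;; Return #t if something is currently mounted at MOUNT-POINT.
(define (mounted? mount-point)
  (call-with-input-file "/proc/self/mounts"
    (lambda (port)
      (mount-point-listed? port mount-point))))

;; Hypothetical watcher: every INTERVAL seconds, call the thunk REMOUNT
;; if nothing is mounted at MOUNT-POINT.  Never returns.
(define (watch-mount mount-point remount interval)
  (let loop ()
    (unless (mounted? mount-point)
      (remount))
    (sleep interval)
    (loop)))
```

How REMOUNT would actually mount the file system, and how the interval would
be exposed in the file-system record, are left open here.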

There might be a better way to structure this. I'd be a little worried
about adding many new file-system record fields that aren't always used.
Consider that when needed-for-boot? is #t, file-system-shepherd-service isn't
used at all, so those new flags would silently do nothing. I think that's fine
when it's just one (requirements), but it's probably worth some thought
if we add more later.

Either way, that's probably a problem for another patch.