[bug#75810,v5,00/14] Rootless guix-daemon

Message ID cover.1741973869.git.ludo@gnu.org
Headers
Series Rootless guix-daemon |

Message

Ludovic Courtès March 14, 2025, 5:47 p.m. UTC
Hello Guix!

Changes since v4:

  • Remove qualifiers such as “new” from the documentation
    and clarify that unprivileged guix-daemon is the option
    chosen by default in some cases (Simon, Maxim).

  • Change ‘deleteTmpDir’ to deal with the case where
    CAP_SYS_CHOWN is available but ‘--disable-chroot’ is used
    (Reepca).

  • Add ‘unshare’ call in the build process before ‘execve’
    to create new user and mount namespaces, thereby locking
    together all the previous mounts; check by calling
    ‘umount’ and ensuring that it returns EINVAL that mounts
    are indeed locked (Reepca).

  • In ‘guix-install.sh’, keep /var/guix/profiles/per-user/root
    root-owned (previously it was chowned to ‘guix-daemon’).

  • In ‘guix-install.sh’, start ‘gnu-store.mount’ explicitly
    since it is no longer a dependency of ‘guix-daemon.service’.

  • In ‘guix-daemon.service.in’, set
    ‘GUIX_DATABASE_DIRECTORY=/var/guix’ for forward compatibility
    (I’m thinking of eventually changing the default database
    location when not running as root).

With these changes, the ‘debian-install’ and ‘guix-daemon’
system tests both pass.

I think we’ve never been this close to completion.  :-)

Thoughts?

Thanks a lot for your feedback, comrades.

Ludo’.

Ludovic Courtès (14):
  daemon: Use ‘close_range’ where available.
  daemon: Bind-mount /etc/nsswitch.conf & co. only if it exists.
  daemon: Bind-mount all the inputs, not just directories.
  daemon: Remount inputs as read-only.
  daemon: Remount root directory as read-only.
  daemon: Allow running as non-root with unprivileged user namespaces.
  daemon: Create /var/guix/profiles/per-user unconditionally.
  daemon: Drop Linux ambient capabilities before executing builder.
  daemon: Move comments where they belong.
  tests: Add missing derivation inputs.
  tests: Run in a chroot and unprivileged user namespaces.
  etc: systemd services: Run ‘guix-daemon’ as an unprivileged user.
  guix-install.sh: Support the unprivileged daemon where possible.
  DRAFT gnu: guix: Update to 00562be.

 build-aux/test-env.in               |  16 +-
 config-daemon.ac                    |   5 +-
 doc/guix.texi                       | 102 +++++++++---
 etc/gnu-store.mount.in              |   3 +-
 etc/guix-daemon.service.in          |  22 ++-
 etc/guix-install.sh                 | 109 +++++++++---
 gnu/packages/package-management.scm |   6 +-
 guix/substitutes.scm                |   2 +-
 nix/libstore/build.cc               | 247 ++++++++++++++++++++-------
 nix/libstore/local-store.cc         |  26 ++-
 nix/libutil/util.cc                 |  23 ++-
 tests/derivations.scm               |  24 ++-
 tests/packages.scm                  |  13 +-
 tests/processes.scm                 |   9 +-
 tests/store.scm                     | 250 ++++++++++++++++++++++++----
 15 files changed, 675 insertions(+), 182 deletions(-)


base-commit: 519fc51b6ecfe9ac9f2fa2f4ae052ab1984eed22
  

Comments

Reepca Russelstein March 15, 2025, 11:44 p.m. UTC | #1
Ludovic Courtès <ludo@gnu.org> writes:

>   • In ‘guix-daemon.service.in’, set
>     ‘GUIX_DATABASE_DIRECTORY=/var/guix’ for forward compatibility
>     (I’m thinking of eventually changing the default database
>     location when not running as root).

Did you intend GUIX_STATE_DIRECTORY here, or GUIX_DATABASE_DIRECTORY in
the patch?  GUIX_STATE_DIRECTORY is what's in the patch.

> @@ -1087,12 +1091,19 @@ string runProgram(Path program, bool searchPath, const Strings & args)
>  
>  void closeMostFDs(const set<int> & exceptions)
>  {
> -    int maxFD = 0;
> -    maxFD = sysconf(_SC_OPEN_MAX);
> -    for (int fd = 0; fd < maxFD; ++fd)
> -        if (fd != STDIN_FILENO && fd != STDOUT_FILENO && fd != STDERR_FILENO
> -            && exceptions.find(fd) == exceptions.end())
> -            close(fd); /* ignore result */
> +#ifdef HAVE_CLOSE_RANGE
> +    if (exceptions.empty())
> +	 close_range(3, ~0U, 0);
> +    else
> +#endif
> +    {
> +	 int maxFD = 0;
> +	 maxFD = sysconf(_SC_OPEN_MAX);
> +	 for (int fd = 0; fd < maxFD; ++fd)
> +	      if (fd != STDIN_FILENO && fd != STDOUT_FILENO && fd != STDERR_FILENO
> +		  && exceptions.find(fd) == exceptions.end())
> +		   close(fd); /* ignore result */
> +    }
>  }

(in patch 01/14, in nix/libutil/util.cc)

Minor note: this could be implemented solely in terms of close_range, by
sorting the exceptions and iterating over the gaps between them.  It's
fine as it is though.

> +The second and preferred option is to run @command{guix-daemon}
> +@emph{as an unprivileged user}.  It has the advantage of reducing the
> +harm that can be done should a build process manage to exploit a
> +vulnerability in the daemon.  This option requires the user of Linux's
> +unprivileged user namespace mechanism; today it is available and enabled
> +by most GNU/Linux distributions but can still be disabled.  The
> +installation script automatically determines whether this option is
> +available on your system (@pxref{Binary Installation}).

(in patch 06/14, in doc/guix.texi)

In the third sentence: s/requires the user/requires the use/

> +When using this option, you only need to create one user account, and
> +@command{guix-daemon} will run with the authority of that account:
> +
> +@example
> +# groupadd --system guix-daemon
> +# useradd -g guix-daemon -G guix-daemon              \
> +          -d /var/empty -s $(which nologin)          \
> +          -c "Guix daemon privilege separation user" \
> +          --system guix-daemon
> +@end example
> +
> +In this configuration, @file{/gnu/store} is owned by the
> +@code{guix-daemon} user.

It may be somewhat confusing to the reader looking at this page in
isolation whether they still need to create this account after running
the installation script, since it was mentioned immediately before this.

> @@ -1567,10 +1612,17 @@ Invoking guix-daemon
>  @item --disable-chroot
>  Disable chroot builds.
> 
> -Using this option is not recommended since, again, it would allow build
> -processes to gain access to undeclared dependencies.  It is necessary,
> -though, when @command{guix-daemon} is running under an unprivileged user
> -account.
> +@quotation Warning
> +Using this option is not recommended since it allows build processes to
> +gain access to undeclared dependencies, to interfere with one another,
> +and more generally to do anything that can be done with the authority of
> +the daemon---which includes at least the ability to tamper with any file
> +in the store!
> +
> +You may find it necessary, though, when support for Linux unprivileged
> +user namespaces is missing (@pxref{Build Environment Setup}).  Use at
> +your own risk!
> +@end quotation

Note that the "do anything that can be done with the authority of the
daemon" part is only true in the case where the builder and the daemon
run under the same user.  For example, on Hurd, we use --disable-chroot
but still use the separate builder users.  More generally,
--disable-chroot allows any user that can start builds to gain the
privileges of the build user their build runs as.  While this allows for
the output of any build run as that user to be controlled, to my
knowledge it can't change the contents of existing outputs (unless they
can be gc'ed and rebuilt, of course).

Perhaps this would be a good place to recommend setting the permissions
of LOCALSTATEDIR/guix/daemon-socket to allow only trusted users to
access it?


> diff --git a/nix/libstore/build.cc b/nix/libstore/build.cc
> index c8b778362a..76f75e00df 100644
> --- a/nix/libstore/build.cc
> +++ b/nix/libstore/build.cc
> @@ -744,6 +744,10 @@ private:
>  
>      friend int childEntry(void *);
>  
> +    /* Pipe to notify readiness to the child process when using unprivileged
> +       user namespaces.  */
> +    Pipe readiness;
> +
>      /* Check that the derivation outputs all exist and register them
>         as valid. */
>      void registerOutputs();

(in nix/libstore/build.cc)

This is inside of DerivationGoal.  We found out while hunting that
deadlock bug that it doesn't always get disposed of in a timely fashion.
So we probably shouldn't rely on the AutoCloseFD type of
readiness.readSide and readiness.writeSide to close them in a timely
fashion, opting instead to explicitly close the end we use immediately
after we're done with it.  It would also probably be wise to follow the
usual practice of the reader closing the write end before reading, and
the writer closing the read end before writing, lest any
sudden-process-death cause indefinite hangs.

On a slightly-related note, while investigating what we currently do for
the other file-descriptor-bearing objects in DerivationGoal, I noticed
that we appear to never explicitly close builderOut.readSide in the
child (we do implicitly close it eventually with closeMostFDs).
Probably not a problem in practice, since the parent never writes to the
pipe, and if the parent process gets suddenly wiped out, the possibility
of the child blocking indefinitely is probably the least of our worries,
but maybe it would be good to close it in commonChildInit.

> @@ -102,10 +102,22 @@ then
>      rm -rf "$GUIX_STATE_DIRECTORY/daemon-socket"
>      mkdir -m 0700 "$GUIX_STATE_DIRECTORY/daemon-socket"
>  
> +    # If unprivileged user namespaces are not supported, pass
> +    # '--disable-chroot'.
> +    if [ ! -f /proc/sys/kernel/unprivileged_userns_clone ] \
> +       || [ "$(cat /proc/sys/kernel/unprivileged_userns_clone)" -eq 1 ]; then
> +	extra_options=""
> +    else
> +	extra_options="--disable-chroot"
> +	echo "unprivileged user namespaces not supported; \
> +running 'guix-daemon $extra_options'" >&2
> +    fi
> +
>      # Launch the daemon without chroot support because is may be
>      # unavailable, for instance if we're not running as root.
>      "@abs_top_builddir@/pre-inst-env"				\
> -	"@abs_top_builddir@/guix-daemon" --disable-chroot	\
> +	"@abs_top_builddir@/guix-daemon"			\
> +        $extra_options						\
>  	--substitute-urls="$GUIX_BINARY_SUBSTITUTE_URL" &

(in patch 11/14, in build-aux/test-env.in)

I think this test for /proc/sys/kernel/unprivileged_userns_clone will
only work on linux, no?  On Hurd it will see that the file doesn't exist
and assume that means it shouldn't pass --disable-chroot.

Maybe also require that something like /proc/self/ns/user exists?  I see
in (gnu build linux-container) this is what we do for
'user-namespace-supported?'.  Perhaps
'unprivileged-user-namespace-supported?' should also be updated to only
return true if user namespaces are supported at all?  Otherwise many of
the test skips in this patch are going to do the opposite of what they
should on Hurd.

Also, perhaps the "Launch the daemon without chroot support because is
..."  comment should be updated (also s/is/it/ there) or removed.

> +(unless (unprivileged-user-namespace-supported?)
> +  (test-skip 1))
> +(test-assert "build root cannot be made world-readable"
> +  (let ((drv
> +         (run-with-store %store
> +           (gexp->derivation
> +            "attempt-to-make-root-world-readable"
> +            (with-imported-modules (source-module-closure
> +                                    '((guix build syscalls)))
> +              #~(begin
> +                  (use-modules (guix build syscalls))
> +
> +                  (let ((guile (string-append (assoc-ref %guile-build-info
> +                                                         'bindir)
> +                                              "/guile")))
> +                    (catch 'system-error
> +                      (lambda ()
> +                        (chmod "/" #o777))
> +                      (lambda args
> +                        (format #t "failed to make root writable: ~a~%"
> +                                (strerror (system-error-errno args)))
> +                        (format #t "attempting read-write remount~%")
> +                        (mount "none" "/" "/" (logior MS_BIND MS_REMOUNT))
> +                        (chmod "/" #o777)))
> +                    (copy-file guile "/guile")
> +                    (chmod "/guile" #o6755)
> +                    ;; At this point, there's a world-readable setuid 'guile'
> +                    ;; binary in the store that remains visible until this
> +                    ;; build completes.
> +                    (list #$output))))))))
> +    (guard (c ((store-protocol-error? c) #t))
> +      (build-derivations %store (list drv))
> +      #f)))

(in tests/store.scm)

It may be good to forego actually creating the setuid guile here, since
it's not part of what's actually being tested, just a looming threat to
motivate what's being tested.  In the event that this test fails
someday, let's be kind to our testers and leave them with only a
world-readable setuid empty file or file containing a frowny face or
something.

... actually, on closer inspection, won't this test never fail?  The
gexp doesn't actually create #$output, so the build will always fail,
and the store-protocol-error will always be signaled, no?

> +# Provide the CAP_CHOWN capability so that guix-daemon cran create and chown
> +# /var/guix/profiles/per-user/$USER and also chown failed build directories
> +# when using '--keep-failed'.  Note that guix-daemon explicitly drops ambient
> +# capabilities before executing build processes so they don't inherit them.
> +AmbientCapabilities=CAP_CHOWN

(in patch 12/14, in etc/guix-daemon.service.in)

s/cran create/can create/

Also, I'm curious how this interacts with the changes to the installer
script.  What happens if the installer detects that the rootless daemon
isn't supported, but the guix that it's installing only has this version
of guix-daemon.service?  I haven't checked, but if the answer is "the
service is broken", perhaps we should have two variants of the service
file, and sys_enable_guix_daemon decides which one to symlink
/etc/systemd/system/guix-daemon.service to?

> +can_install_unprivileged_daemon()
> +{ # Return true if we can install guix-daemon running without privileges.
> +    [ "$INIT_SYS" = systemd ] && \
> +	grep -q "User=guix-daemon" \
> +	     ~root/.config/guix/current/lib/systemd/system/guix-daemon.service \
> +	&& ([ ! -f /proc/sys/kernel/unprivileged_userns_clone ] \
> +		|| [ "$(cat /proc/sys/kernel/unprivileged_userns_clone)" -eq 1 ])
> +}

(in patch 13/14, in etc/guix-install.sh)

I vaguely recall hearing that systemd only supports linux, so it might
not be strictly necessary to check whether user namespaces are supported
at all here like it was in test-env.  If we get support for the rootless
daemon working for the other init systems a check like that may be
necessary, though.

It does indeed look like it's going to be necessary to make some changes
to make the script able to support installing a newer guix to a
systemd-using system that doesn't support unprivileged user namespaces.

Perhaps a test case should be added for this system configuration?


We're getting close indeed!

- reepca
  
Ludovic Courtès March 18, 2025, 9:34 a.m. UTC | #2
Hi!

(Oops, forgot to hit “send” on this one…)

Reepca Russelstein <reepca@russelstein.xyz> skribis:

> Ludovic Courtès <ludo@gnu.org> writes:
>
>>   • In ‘guix-daemon.service.in’, set
>>     ‘GUIX_DATABASE_DIRECTORY=/var/guix’ for forward compatibility
>>     (I’m thinking of eventually changing the default database
>>     location when not running as root).
>
> Did you intend GUIX_STATE_DIRECTORY here, or GUIX_DATABASE_DIRECTORY in
> the patch?  GUIX_STATE_DIRECTORY is what's in the patch.

Sorry, I meant GUIX_STATE_DIRECTORY.

> Minor note: this could be implemented solely in terms of close_range, by
> sorting the exceptions and iterating over the gaps between them.  It's
> fine as it is though.

Yes.  In practice ‘exceptions’ is always empty in this code, so we
should eventually remove the argument.

> (in patch 06/14, in doc/guix.texi)
>
> In the third sentence: s/requires the user/requires the use/

Noted.

>> +When using this option, you only need to create one user account, and
>> +@command{guix-daemon} will run with the authority of that account:

[...]

> It may be somewhat confusing to the reader looking at this page in
> isolation whether they still need to create this account after running
> the installation script, since it was mentioned immediately before this.

Note that this doesn’t change the situation compared to ‘master’: it
describes what guix-daemon expects, without regard for the installation
method.  I’ll see how this can be clarified though.

>> +@quotation Warning
>> +Using this option is not recommended since it allows build processes to
>> +gain access to undeclared dependencies, to interfere with one another,
>> +and more generally to do anything that can be done with the authority of
>> +the daemon---which includes at least the ability to tamper with any file
>> +in the store!
>> +
>> +You may find it necessary, though, when support for Linux unprivileged
>> +user namespaces is missing (@pxref{Build Environment Setup}).  Use at
>> +your own risk!
>> +@end quotation
>
> Note that the "do anything that can be done with the authority of the
> daemon" part is only true in the case where the builder and the daemon
> run under the same user.  For example, on Hurd, we use --disable-chroot
> but still use the separate builder users.

Oh right.

[...]

> Perhaps this would be a good place to recommend setting the permissions
> of LOCALSTATEDIR/guix/daemon-socket to allow only trusted users to
> access it?

I don’t think so; we’re really just describing ‘--disable-chroot’ here.

>> +    /* Pipe to notify readiness to the child process when using unprivileged
>> +       user namespaces.  */
>> +    Pipe readiness;
>> +
>>      /* Check that the derivation outputs all exist and register them
>>         as valid. */
>>      void registerOutputs();
>
> (in nix/libstore/build.cc)
>
> This is inside of DerivationGoal.  We found out while hunting that
> deadlock bug that it doesn't always get disposed of in a timely fashion.
> So we probably shouldn't rely on the AutoCloseFD type of
> readiness.readSide and readiness.writeSide to close them in a timely
> fashion, opting instead to explicitly close the end we use immediately
> after we're done with it.

Yes, agreed.

> On a slightly-related note, while investigating what we currently do for
> the other file-descriptor-bearing objects in DerivationGoal, I noticed
> that we appear to never explicitly close builderOut.readSide in the
> child (we do implicitly close it eventually with closeMostFDs).
> Probably not a problem in practice, since the parent never writes to the
> pipe, and if the parent process gets suddenly wiped out, the possibility
> of the child blocking indefinitely is probably the least of our worries,
> but maybe it would be good to close it in commonChildInit.

Agreed.

>> +    if [ ! -f /proc/sys/kernel/unprivileged_userns_clone ] \
>> +       || [ "$(cat /proc/sys/kernel/unprivileged_userns_clone)" -eq 1 ]; then
>> +	extra_options=""
>> +    else
>> +	extra_options="--disable-chroot"

[...]

> I think this test for /proc/sys/kernel/unprivileged_userns_clone will
> only work on linux, no?  On Hurd it will see that the file doesn't exist
> and assume that means it shouldn't pass --disable-chroot.

Oops yes.

> Maybe also require that something like /proc/self/ns/user exists?  I see
> in (gnu build linux-container) this is what we do for
> 'user-namespace-supported?'.  Perhaps
> 'unprivileged-user-namespace-supported?' should also be updated to only
> return true if user namespaces are supported at all?  Otherwise many of
> the test skips in this patch are going to do the opposite of what they
> should on Hurd.

OK.

> Also, perhaps the "Launch the daemon without chroot support because is
> ..."  comment should be updated (also s/is/it/ there) or removed.

OK.

>> +(unless (unprivileged-user-namespace-supported?)
>> +  (test-skip 1))
>> +(test-assert "build root cannot be made world-readable"

[...]

> It may be good to forego actually creating the setuid guile here, since
> it's not part of what's actually being tested, just a looming threat to
> motivate what's being tested.  In the event that this test fails
> someday, let's be kind to our testers and leave them with only a
> world-readable setuid empty file or file containing a frowny face or
> something.
>
> ... actually, on closer inspection, won't this test never fail?  The
> gexp doesn't actually create #$output, so the build will always fail,
> and the store-protocol-error will always be signaled, no?

Oops!  Done.

> (in patch 12/14, in etc/guix-daemon.service.in)
>
> s/cran create/can create/

Done.

> Also, I'm curious how this interacts with the changes to the installer
> script.  What happens if the installer detects that the rootless daemon
> isn't supported, but the guix that it's installing only has this version
> of guix-daemon.service?  I haven't checked, but if the answer is "the
> service is broken", perhaps we should have two variants of the service
> file, and sys_enable_guix_daemon decides which one to symlink
> /etc/systemd/system/guix-daemon.service to?

The assumption is indeed that unprivileged user namespaces are
supported.

I have reasons to believe that this assumption holds these days, which
is why I didn’t bother with the fallback you suggest.

>> +can_install_unprivileged_daemon()
>> +{ # Return true if we can install guix-daemon running without privileges.
>> +    [ "$INIT_SYS" = systemd ] && \
>> +	grep -q "User=guix-daemon" \
>> +	     ~root/.config/guix/current/lib/systemd/system/guix-daemon.service \
>> +	&& ([ ! -f /proc/sys/kernel/unprivileged_userns_clone ] \
>> +		|| [ "$(cat /proc/sys/kernel/unprivileged_userns_clone)" -eq 1 ])
>> +}
>
> (in patch 13/14, in etc/guix-install.sh)
>
> I vaguely recall hearing that systemd only supports linux, so it might
> not be strictly necessary to check whether user namespaces are supported
> at all here like it was in test-env.  If we get support for the rootless
> daemon working for the other init systems a check like that may be
> necessary, though.

Right, systemd is Linux-only, but you’re right that this should be
changed like ./test-env for other systems.

> It does indeed look like it's going to be necessary to make some changes
> to make the script able to support installing a newer guix to a
> systemd-using system that doesn't support unprivileged user namespaces.

A system running systemd definitely supports unprivileged user
namespaces; it is still possible to disable it manually, but I’m willing
to assume that it’s uncommon.

I’m sending v6 with these changes.

Ludo’.