From patchwork Wed Apr 27 22:01:54 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
X-Patchwork-Submitter: Ludovic Courtès
X-Patchwork-Id: 38977
Subject: [bug#54997] [PATCH 00/12] Add "least authority" program wrapper
From: Ludovic Courtès
To: Maxime Devos
Cc: 54997@debbugs.gnu.org
Date: Thu, 28 Apr 2022 00:01:54 +0200
Message-ID: <878rrqgp7x.fsf@gnu.org>
In-Reply-To: <616af1474c44d6c1caf71fa1f9d263ff46462201.camel@telenet.be>
 (Maxime Devos's message of "Fri, 22 Apr 2022 16:39:43 +0200")
References: <20220417210453.27884-1-ludo@gnu.org>
 <20220417210453.27884-9-ludo@gnu.org>
 <4eac7fd571ddafd46bcadfa2ef5c6b3e41a162ab.camel@telenet.be>
 <8735i8ratp.fsf_-_@gnu.org>
 <616af1474c44d6c1caf71fa1f9d263ff46462201.camel@telenet.be>

Hi,

Maxime Devos wrote:

> Many of these are supported by 'least-authority-wrapper' but these POLA
> wrappers require creating an additional process which seems a bit
> unoptimal to me (memory- and latency-wise).

Yeah, that’s why I initially looked at unshare(2), just to find out that
we can’t quite do the same as with clone(2); in particular we cannot
escape the current PID namespace.  (There were also complications, such
as the fact that you can only unshare(2) a single-threaded process,
meaning that Guile had to be started with GC_MARKERS=1.  For posterity,
part of the patch I had is attached below.)

> Also, having to do fork, waitpid and primitive-fork seems rather
> low-level to me, so I prefer moving this code into somewhere like (gnu
> build SOMEWHERE) or to keep the old make-forkexec-constructor/container
> code.

‘primitive-fork’ and ‘waitpid’ calls are in (gnu build linux-container)
right now so I guess we’re fine?
The goal though is to replace uses of
‘make-forkexec-constructor/container’ with uses of
‘least-authority-wrapper’, as done in this patch series.

Ludo’.

diff --git a/gnu/build/linux-container.scm b/gnu/build/linux-container.scm
index bdeca2cdb9..308c0bb325 100644
--- a/gnu/build/linux-container.scm
+++ b/gnu/build/linux-container.scm
@@ -1,6 +1,6 @@
 ;;; GNU Guix --- Functional package management for GNU
 ;;; Copyright © 2015 David Thompson
-;;; Copyright © 2017, 2018, 2019 Ludovic Courtès
+;;; Copyright © 2017, 2018, 2019, 2022 Ludovic Courtès
 ;;;
 ;;; This file is part of GNU Guix.
 ;;;
@@ -21,6 +21,7 @@ (define-module (gnu build linux-container)
   #:use-module (ice-9 format)
   #:use-module (ice-9 match)
   #:use-module (ice-9 rdelim)
+  #:use-module (srfi srfi-1)
   #:use-module (srfi srfi-98)
   #:use-module (guix build utils)
   #:use-module (guix build syscalls)
@@ -33,7 +34,8 @@ (define-module (gnu build linux-container)
             run-container
             call-with-container
             container-excursion
-            container-excursion*))
+            container-excursion*
+            self-sever))

 (define (user-namespace-supported?)
   "Return #t if user namespaces are supported on this system."
@@ -174,50 +176,53 @@ (define* (mount* source target type #:optional (flags 0) options
     (chmod "/" #o755)))

 (define* (initialize-user-namespace pid host-uids
-                                    #:key (guest-uid 0) (guest-gid 0))
+                                    #:key (guest-uid 0) (guest-gid 0)
+                                    (uid (getuid)) (gid (getgid)))
   "Configure the user namespace for PID.  HOST-UIDS specifies the number of
 host user identifiers to map into the user namespace.  GUEST-UID and GUEST-GID
 specify the first UID (respectively GID) that host UIDs (respectively GIDs)
 map to in the namespace."
   (define proc-dir
-    (string-append "/proc/" (number->string pid)))
+    (string-append "/proc/"
+                   (match pid
+                     ('self "self")
+                     (_ (number->string pid)))))

   (define (scope file)
     (string-append proc-dir file))

-  (let ((uid (getuid))
-        (gid (getgid)))
-
-    ;; Only root can write to the gid map without first disabling the
-    ;; setgroups syscall.
-    (unless (and (zero? uid) (zero? gid))
-      (call-with-output-file (scope "/setgroups")
-        (lambda (port)
-          (display "deny" port))))
-
-    ;; Map the user/group that created the container to the root user
-    ;; within the container.
-    (call-with-output-file (scope "/uid_map")
+  ;; Only root can write to the gid map without first disabling the
+  ;; setgroups syscall.
+  (unless (and (zero? uid) (zero? gid))
+    (call-with-output-file (scope "/setgroups")
       (lambda (port)
-        (format port "~d ~d ~d" guest-uid uid host-uids)))
-    (call-with-output-file (scope "/gid_map")
-      (lambda (port)
-        (format port "~d ~d ~d" guest-gid gid host-uids)))))
+        (display "deny" port))))
+
+  ;; Map the user/group that created the container to the root user
+  ;; within the container.
+  (call-with-output-file (scope "/uid_map")
+    (lambda (port)
+      (format port "~d ~d ~d" guest-uid uid host-uids)))
+  (call-with-output-file (scope "/gid_map")
+    (lambda (port)
+      (format port "~d ~d ~d" guest-gid gid host-uids))))

 (define (namespaces->bit-mask namespaces)
   "Return the number suitable for the 'flags' argument of 'clone' that
 corresponds to the symbols in NAMESPACES."
   ;; Use the same flags as fork(3) in addition to the namespace flags.
-  (apply logior SIGCHLD
-         (map (match-lambda
-                ('cgroup CLONE_NEWCGROUP)
-                ('mnt CLONE_NEWNS)
-                ('uts CLONE_NEWUTS)
-                ('ipc CLONE_NEWIPC)
-                ('user CLONE_NEWUSER)
-                ('pid CLONE_NEWPID)
-                ('net CLONE_NEWNET))
-              namespaces)))
+  (fold (lambda (namespace flags)
+          (logior flags
+                  (match namespace
+                    ('cgroup CLONE_NEWCGROUP)
+                    ('mnt CLONE_NEWNS)
+                    ('uts CLONE_NEWUTS)
+                    ('ipc CLONE_NEWIPC)
+                    ('user CLONE_NEWUSER)
+                    ('pid CLONE_NEWPID)
+                    ('net CLONE_NEWNET))))
+        0
+        namespaces))

 (define* (run-container root mounts namespaces host-uids thunk
                         #:key (guest-uid 0) (guest-gid 0))
@@ -236,7 +241,7 @@ (define* (run-container root mounts namespaces host-uids thunk
   (match (socketpair PF_UNIX SOCK_STREAM 0)
     ((child . parent)
      (let ((flags (namespaces->bit-mask namespaces)))
-       (match (clone flags)
+       (match (clone (logior SIGCHLD flags))
          (0
           (call-with-clean-exit
            (lambda ()
@@ -392,3 +397,23 @@ (define (container-excursion* pid thunk)
              (close-port out)
              (close-port in)
              #f)))))
+
+(define* (self-sever mounts
+                     #:key (namespaces %namespaces) (host-uids 1)
+                     (guest-uid 0) (guest-gid 0))
+  (let ((uid (getuid))
+        (gid (getgid)))
+    (unshare (namespaces->bit-mask namespaces))
+
+    (initialize-user-namespace 'self host-uids
+                               #:uid uid #:gid gid
+                               #:guest-uid uid
+                               #:guest-gid guest-gid)
+
+    (when (memq 'mnt namespaces)
+      ;; (mount "none" "/" #f (logior MS_REC MS_PRIVATE))
+      (call-with-temporary-directory
+       (lambda (root)
+         (mount-file-systems root mounts
+                             #:mount-/proc? (memq 'pid namespaces)
+                             #:mount-/sys? (memq 'net namespaces)))))))
diff --git a/guix/build/syscalls.scm b/guix/build/syscalls.scm
index a7401fd73f..5ee6bd1229 100644
--- a/guix/build/syscalls.scm
+++ b/guix/build/syscalls.scm
@@ -1,5 +1,5 @@
 ;;; GNU Guix --- Functional package management for GNU
-;;; Copyright © 2014, 2015, 2016, 2017, 2018, 2019, 2020, 2021 Ludovic Courtès
+;;; Copyright © 2014-2022 Ludovic Courtès
 ;;; Copyright © 2015 David Thompson
 ;;; Copyright © 2015 Mark H Weaver
 ;;; Copyright © 2017 Mathieu Othacehe
@@ -49,6 +49,11 @@ (define-module (guix build syscalls)
             MS_RELATIME
             MS_BIND
             MS_MOVE
+            MS_REC
+            MS_SILENT
+            MS_POSIXACL
+            MS_UNBINDABLE
+            MS_PRIVATE
             MS_LAZYTIME
             MNT_FORCE
             MNT_DETACH
@@ -140,6 +145,7 @@ (define-module (guix build syscalls)
             CLONE_NEWPID
             CLONE_NEWNET
             clone
+            unshare
             setns

             PF_PACKET
@@ -537,6 +543,11 @@ (define MS_REMOUNT 32)
 (define MS_NOATIME 1024)
 (define MS_BIND 4096)
 (define MS_MOVE 8192)
+(define MS_REC 16384)
+(define MS_SILENT 32768)
+(define MS_POSIXACL 65536)
+(define MS_UNBINDABLE 131072)
+(define MS_PRIVATE 262144)
 (define MS_RELATIME 2097152)
 (define MS_STRICTATIME 16777216)
 (define MS_LAZYTIME 33554432)
@@ -1101,6 +1112,23 @@ (define clone
                  (list err))
           ret)))))

+(define unshare
+  (let ((proc (syscall->procedure int "unshare" (list int))))
+    (lambda (flags)
+      "Disassociate the current process from parts of its execution context
+according to FLAGS, which must be a logical or of CLONE_NEW* constants.
+
+Note that CLONE_NEWUSER requires that the calling process be single-threaded,
+which is possible if and only if libgc is running a single marker thread; this
+can be achieved by setting the GC_MARKERS environment variable to 1.  If the
+calling process is multi-threaded, this throws to 'system-error' with EINVAL."
+      (let-values (((ret err)
+                    (without-automatic-finalization (proc flags))))
+        (unless (zero? ret)
+          (throw 'system-error "unshare" "~a: ~A"
+                 (list flags (strerror err))
+                 err))))))
+
 (define setns
   ;; Some systems may be using an old (pre-2.14) version of glibc where there
   ;; is no 'setns' function available.
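
For illustration only, and not part of the patch above: a minimal
standalone sketch of why the diff moves SIGCHLD out of
'namespaces->bit-mask'.  clone(2) expects the child's termination signal
in the low byte of its flags, whereas unshare(2) accepts only CLONE_*
bits, so the namespace mask has to be usable on its own.  The CLONE_NEW*
values are copied from <linux/sched.h> as an assumption, so that the
snippet runs in plain Guile without (guix build syscalls).

```scheme
(use-modules (srfi srfi-1)              ;fold
             (ice-9 match))

;; Assumed values, taken from <linux/sched.h>; the patch itself gets them
;; from (guix build syscalls).
(define CLONE_NEWNS     #x00020000)
(define CLONE_NEWCGROUP #x02000000)
(define CLONE_NEWUTS    #x04000000)
(define CLONE_NEWIPC    #x08000000)
(define CLONE_NEWUSER   #x10000000)
(define CLONE_NEWPID    #x20000000)
(define CLONE_NEWNET    #x40000000)

(define (namespaces->bit-mask namespaces)
  "Return the logical or of the CLONE_NEW* flags named by NAMESPACES."
  (fold (lambda (namespace flags)
          (logior flags
                  (match namespace
                    ('cgroup CLONE_NEWCGROUP)
                    ('mnt    CLONE_NEWNS)
                    ('uts    CLONE_NEWUTS)
                    ('ipc    CLONE_NEWIPC)
                    ('user   CLONE_NEWUSER)
                    ('pid    CLONE_NEWPID)
                    ('net    CLONE_NEWNET))))
        0
        namespaces))

;; Namespace bits only: this is what one would hand to unshare(2).
(define unshare-flags (namespaces->bit-mask '(user mnt)))

;; clone(2) additionally wants the termination signal in the low byte,
;; which is what 'run-container' now adds at the call site.
(define clone-flags (logior SIGCHLD unshare-flags))

(format #t "unshare flags: #x~a~%" (number->string unshare-flags 16))
(format #t "clone flags:   #x~a~%" (number->string clone-flags 16))
```

With the mask free of signal bits, the same procedure can serve both
'run-container' and an unshare-based procedure such as 'self-sever'.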