From patchwork Fri Feb 21 15:40:10 2025 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: Romain GARBAGE X-Patchwork-Id: 38918 Return-Path: X-Original-To: patchwork@mira.cbaines.net Delivered-To: patchwork@mira.cbaines.net Received: by mira.cbaines.net (Postfix, from userid 113) id 83BF227BBEC; Fri, 21 Feb 2025 15:42:47 +0000 (GMT) X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on mira.cbaines.net X-Spam-Level: X-Spam-Status: No, score=-7.6 required=5.0 tests=BAYES_00,DKIMWL_WL_HIGH, DKIM_SIGNED,DKIM_VALID,MAILING_LIST_MULTI,RCVD_IN_DNSWL_BLOCKED, RCVD_IN_VALIDITY_CERTIFIED,RCVD_IN_VALIDITY_RPBL,RCVD_IN_VALIDITY_SAFE, SPF_HELO_PASS,URIBL_BLOCKED autolearn=ham autolearn_force=no version=3.4.6 Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) by mira.cbaines.net (Postfix) with ESMTPS id 7158F27BBE2 for ; Fri, 21 Feb 2025 15:42:46 +0000 (GMT) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1tlVAE-0000ub-PH; Fri, 21 Feb 2025 10:42:06 -0500 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1tlVAC-0000uJ-Nm for guix-patches@gnu.org; Fri, 21 Feb 2025 10:42:04 -0500 Received: from debbugs.gnu.org ([2001:470:142:5::43]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1tlVAC-0002Rn-Eo for guix-patches@gnu.org; Fri, 21 Feb 2025 10:42:04 -0500 DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=debbugs.gnu.org; s=debbugs-gnu-org; h=MIME-Version:Date:From:To:In-Reply-To:References:Subject; bh=GZ9Vt/+3exncLsJkVeAUAzaPucqP2ol5s5FhXbuMpgM=; b=UC0mtvdP4P7cUnWtoqR0ONBGo4YThQCnpNDCQ8DxXsor9D7DsFMC4lRHK9iF2tWDA9n4N9yPewnO+g1VqEXJRH1KiCl7VlPBd1p453JbB8VSJDbsnmwzg6aPTF2MBvufh56rs/qudFC87Es6q+gv9fIvpfqlo6er/fX6kNYvCE88lCsT9EM4yCprJl5I6mb8lE9IPcp3ezhou+NWoRAVbuVA1xRAHgO2scLvxC/OAGYwKaf5Os4wDBGomz3jJGfKqgSifHtodpIgTQEja/I+Lz0E77Asjbt2/xA74AwA6UCj+birHIdzxuWtIa/KJ0ZWHkBdsmlgyWLLy8rzYZvIQQ==; Received: from Debian-debbugs by debbugs.gnu.org with local (Exim 4.84_2) (envelope-from ) id 1tlVAB-0001qi-U6 for guix-patches@gnu.org; Fri, 21 Feb 2025 10:42:03 -0500 X-Loop: help-debbugs@gnu.org Subject: [bug#76474] [PATCH Cuirass 1/6] base: =?utf-8?b?4oCYcmVtb3RlLWJ1?= =?utf-8?b?aWxkZXLigJk=?= listens for notifications from =?utf-8?b?4oCYY3Vp?= =?utf-8?b?cmFzcyByZW1vdGUtc2VydmVy4oCZLg==?= References: <20250221153551.22658-1-romain.garbage@inria.fr> In-Reply-To: <20250221153551.22658-1-romain.garbage@inria.fr> Resent-From: Romain GARBAGE Original-Sender: "Debbugs-submit" Resent-CC: guix-patches@gnu.org Resent-Date: Fri, 21 Feb 2025 15:42:03 +0000 Resent-Message-ID: Resent-Sender: help-debbugs@gnu.org X-GNU-PR-Message: followup 76474 X-GNU-PR-Package: guix-patches X-GNU-PR-Keywords: patch To: 76474@debbugs.gnu.org Cc: ludovic.courtes@inria.fr, Ludovic =?utf-8?q?Court=C3=A8s?= Received: via spool by 76474-submit@debbugs.gnu.org id=B76474.17401524886758 (code B ref 76474); Fri, 21 Feb 2025 15:42:03 +0000 Received: (at 76474) by debbugs.gnu.org; 21 Feb 2025 15:41:28 +0000 Received: from localhost ([127.0.0.1]:33088 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1tlV9Y-0001kT-KJ for submit@debbugs.gnu.org; Fri, 21 Feb 2025 10:41:27 -0500 Received: from mail3-relais-sop.national.inria.fr ([192.134.164.104]:63151) by debbugs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.84_2) (envelope-from ) id 1tlV9V-0001j2-BH for 76474@debbugs.gnu.org; Fri, 21 Feb 2025 10:41:22 -0500 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=inria.fr; s=dc; h=from:to:cc:subject:date:message-id:mime-version: content-transfer-encoding; bh=GZ9Vt/+3exncLsJkVeAUAzaPucqP2ol5s5FhXbuMpgM=; b=vsV+zYKBn/Fc/wVdegbuJ1pSE2Gm2QvD2sspqkT0BImoxRKKSNT02P8a Bagk+QX/o/kG360IdBQJFV3qBjR8BKdJr9kxzYOCLHYh5ucObHLusklEn TROyjPb00DH5UYdj/UYSSF3BExyqfFqFk2YRnloUf8XFqUWuqR/ysMh7D o=; Authentication-Results: mail3-relais-sop.national.inria.fr; dkim=none (message not signed) header.i=none; spf=SoftFail smtp.mailfrom=romain.garbage@inria.fr; dmarc=fail (p=none dis=none) d=inria.fr X-IronPort-AV: E=Sophos;i="6.13,305,1732575600"; d="scan'208";a="109734821" Received: from 91-160-179-8.subs.proxad.net (HELO localhost.localdomain) ([91.160.179.8]) by mail3-relais-sop.national.inria.fr with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 21 Feb 2025 16:41:14 +0100 From: Romain GARBAGE Date: Fri, 21 Feb 2025 16:40:10 +0100 Message-ID: <20250221154108.16375-1-romain.garbage@inria.fr> X-Mailer: git-send-email 2.48.1 MIME-Version: 1.0 X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list X-BeenThere: guix-patches@gnu.org List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: guix-patches-bounces+patchwork=mira.cbaines.net@gnu.org Sender: guix-patches-bounces+patchwork=mira.cbaines.net@gnu.org X-getmail-retrieved-from-mailbox: Patches From: Ludovic Courtès * src/cuirass/base.scm (%remote-server-socket-file-name): New variable. (remote-builder-listener): New procedure. (remote-builder): Add ‘socket’ parameter. Call ‘remote-builder-listener’. Handle ‘build-status-change’ messages. (spawn-remote-builder): Add call to ‘open-unix-listening-socket’ and pass it to ‘remote-builder’. * src/cuirass/scripts/remote-server.scm (%notification-socket-pool): New variable. (send-build-status-change-notification, update-build-status): New procedures. (run-fetch, serve-build-requests): Call ‘update-build-status’ instead of ‘db-update-build-status!’ and ‘set-build-successful!’. (open-build-notification-socket): New procedure. (cuirass-remote-server): Use it and parameterize ‘%notification-socket-pool’. * tests/remote.scm (notification-server): New variable. (terminate-process): New procedure. (stop-worker, stop-server): Use it. (start-notification-server, stop-notification-server): New procedures. ("remote-server"): Call ‘start-notification-server’. ("clean-up"): Call ‘stop-notification-server’. --- src/cuirass/base.scm | 53 +++++++++++++-- src/cuirass/scripts/remote-server.scm | 96 ++++++++++++++++++--------- tests/remote.scm | 44 ++++++++++-- 3 files changed, 151 insertions(+), 42 deletions(-) base-commit: b6c2e340b24c4ea9701eb687669a091027bd361e diff --git a/src/cuirass/base.scm b/src/cuirass/base.scm index 27eb89a..631e81f 100644 --- a/src/cuirass/base.scm +++ b/src/cuirass/base.scm @@ -90,6 +90,7 @@ ;; Parameters. %bridge-socket-file-name open-bridge-connection + %remote-server-socket-file-name %package-cachedir %fallback?)) @@ -166,6 +167,8 @@ (string-append (%cuirass-state-directory) "/run")))) (define %bridge-socket-file-name + ;; Socket 'cuirass register' listens to, allowing 'cuirass web' to connect + ;; to it. (make-parameter (string-append (%cuirass-run-state-directory) "/cuirass/bridge"))) @@ -178,6 +181,12 @@ socket." (connect sock AF_UNIX file) sock)) +(define %remote-server-socket-file-name + ;; Socket 'cuirass register' listens to, allowing 'cuirass remote-server' to + ;; connect to it. + (make-parameter (string-append (%cuirass-run-state-directory) + "/cuirass/remote-builds"))) + ;;; ;;; Read parameters. @@ -456,22 +465,56 @@ by handing them to the local build daemon." (spawn-fiber (local-builder channel)) channel)) -(define (remote-builder channel) +(define (remote-builder-listener socket channel) + "Spawn a server that accepts connections on SOCKET and forwards messages it +reads to CHANNEL." + (define (serve-client client) + (let loop () + (match (read client) + ((? eof-object?) + (log-info "terminating remote server client connection on EOF")) + (message + (put-message channel message) + (loop))))) + + (lambda () + (let loop () + (match (accept socket (logior SOCK_NONBLOCK SOCK_CLOEXEC)) + ((connection . peer) + (spawn-fiber (lambda () + (log-info "remote builder accepted connection: ~s" + connection) + (serve-client connection))) + (loop)))))) + +(define (remote-builder channel socket) + "Spawn a remote builder that accepts messages on CHANNEL and receives +notifications from 'cuirass remote-server' over SOCKET." (lambda () (log-info "builds will be delegated to 'cuirass remote-server'") + (spawn-fiber (remote-builder-listener socket channel)) + (let loop () (match (get-message channel) (`(build ,derivations) ;; Currently there's nothing to do here: 'cuirass remote-server' ;; periodically calls 'db-get-pending-build'. ;; TODO: Push notifications to 'remote-server' instead. - (log-info "~a pending derivation builds" (length derivations)))) + (log-info "~a pending derivation builds" (length derivations))) + (`(build-status-change ,derivation ,status) + ;; TODO: Handle database operations, notifications, etc. from here. + (log-info "status of '~a' changed to ~a" derivation status))) (loop)))) (define (spawn-remote-builder) - "Spawn a build actor that performs builds using \"remote workers\"." - (let ((channel (make-channel))) - (spawn-fiber (remote-builder channel)) + "Spawn a build actor that performs builds using \"remote workers\". Return +once ready to listen for incoming connections from 'cuirass remote-server'." + (log-info "listening for 'cuirass remote-server' notifications on '~a'" + (%remote-server-socket-file-name)) + (let ((channel (make-channel)) + (socket (open-unix-listening-socket + (%remote-server-socket-file-name)))) + (spawn-fiber (remote-builder channel socket)) channel)) diff --git a/src/cuirass/scripts/remote-server.scm b/src/cuirass/scripts/remote-server.scm index 90a3db7..ae45232 100644 --- a/src/cuirass/scripts/remote-server.scm +++ b/src/cuirass/scripts/remote-server.scm @@ -1,6 +1,6 @@ ;;; remote-server.scm -- Remote build server. ;;; Copyright © 2020, 2021 Mathieu Othacehe -;;; Copyright © 2023, 2024 Ludovic Courtès +;;; Copyright © 2023-2025 Ludovic Courtès ;;; ;;; This file is part of Cuirass. ;;; @@ -20,7 +20,8 @@ (define-module (cuirass scripts remote-server) #:autoload (cuirass base) (read-parameters set-build-successful! - spawn-build-maintainer) + spawn-build-maintainer + %remote-server-socket-file-name) #:use-module (cuirass config) #:use-module (cuirass database) #:use-module (cuirass logging) @@ -88,6 +89,10 @@ (define %publish-port (make-parameter #f)) +(define %notification-socket-pool + ;; Pool of sockets connected to 'cuirass register'. + (make-parameter #f)) + (define service-name "Cuirass remote server") @@ -288,6 +293,22 @@ signing key for URL is not authorized." #t)))) outputs)))) +(define (send-build-status-change-notification drv status) + "Send a notification to 'cuirass register' that the status of DRV changed to +STATUS." + (with-resource-from-pool (%notification-socket-pool) sock + (write `(build-status-change ,drv ,status) sock) + (newline sock))) + +(define* (update-build-status drv status + #:key log-file) + "Change the status of DRV to STATUS, both in the database and by sending a +notification to 'cuirass register'." + (if (= (build-status succeeded) status) + (set-build-successful! drv) + (db-update-build-status! drv status #:log-file log-file)) + (send-build-status-change-notification drv status)) + (define* (run-fetch message) "Read MESSAGE and download the corresponding build outputs. If %CACHE-DIRECTORY is set, download the matching NAR and NARINFO files in this @@ -311,15 +332,15 @@ directory." (add-to-store drv outputs url)) (begin (log-info "build succeeded: '~a'" drv) - (set-build-successful! drv)) + (update-build-status drv (build-status succeeded))) (begin (log-error "failed to retrieve output of \ successful build '~a'; rescheduling" drv) - (db-update-build-status! drv (build-status scheduled)))))) + (update-build-status drv (build-status scheduled)))))) (('build-failed ('drv drv) ('url url) _ ...) (log-info "build failed: '~a'" drv) - (db-update-build-status! drv (build-status failed))))) + (update-build-status drv (build-status failed))))) (define (fetch-worker channel max-parallel-downloads) (define queue-size @@ -529,7 +550,7 @@ Use WORKER-DIRECTORY to maintain the list of active workers." ('dependency dependency) _ ...) (log-info "build failed: dependency '~a' of '~a'" dependency drv) - (db-update-build-status! drv (build-status failed-dependency)) + (update-build-status drv (build-status failed-dependency)) (put-message build-maintainer 'failed-dependency) ;mark 'failed-dependency' builds (let ((parent (db-get-build drv))) (when parent @@ -567,7 +588,7 @@ Use WORKER-DIRECTORY to maintain the list of active workers." (build-id build) derivation (build-system build))) (db-update-build-worker! derivation name) - (db-update-build-status! derivation (build-status submitted)) + (update-build-status derivation (build-status submitted)) (catch 'zmq-error (lambda () (reply-worker @@ -583,7 +604,7 @@ Use WORKER-DIRECTORY to maintain the list of active workers." (worker-name worker) (worker-address worker) message) - (db-update-build-status! derivation + (update-build-status derivation (build-status scheduled))))) (begin (when worker @@ -605,8 +626,8 @@ Use WORKER-DIRECTORY to maintain the list of active workers." (worker-name worker) drv)) (db-update-build-worker! drv name) - (db-update-build-status! drv (build-status started) - #:log-file log-file))) + (update-build-status drv (build-status started) + #:log-file log-file))) (`(build-rejected (drv ,drv) (worker ,name)) ;; Worker rejected the build, which might be either because the ;; derivation is unavailable or because of a transient error. In @@ -619,11 +640,11 @@ Use WORKER-DIRECTORY to maintain the list of active workers." (begin (log-warning "~a: build rejected: ~a; rescheduling" name drv) - (db-update-build-status! drv (build-status scheduled))) + (update-build-status drv (build-status scheduled))) (begin (log-warning "~a: build rejected: ~a; canceling" name drv) - (db-update-build-status! drv (build-status canceled))))) + (update-build-status drv (build-status canceled))))) (_ (log-warning "ignoring unrecognized message: ~s" command))))) @@ -699,6 +720,15 @@ exiting." (terminate-helper-processes) (primitive-exit 1)))) +(define (open-build-notification-socket) + "Return a socket connected to the 'cuirass register' process, used to send +status updates." + (let ((sock (socket AF_UNIX + (logior SOCK_STREAM SOCK_CLOEXEC SOCK_NONBLOCK) + 0))) + (connect sock AF_UNIX (%remote-server-socket-file-name)) + sock)) + (define (cuirass-remote-server args) (signal-handler) (with-error-handling @@ -793,24 +823,30 @@ exiting." (run-fibers (lambda () - (with-database - (receive-logs log-port (%cache-directory)) - (spawn-notification-fiber) - (spawn-build-log-cleaner (assoc-ref opts 'build-log-expiry)) + (parameterize ((%notification-socket-pool + (make-resource-pool + (map (lambda (i) + (open-build-notification-socket)) + (iota 8)) + 'notification-socket))) + (with-database + (receive-logs log-port (%cache-directory)) + (spawn-notification-fiber) + (spawn-build-log-cleaner (assoc-ref opts 'build-log-expiry)) - (let ((fetch-worker (spawn-fetch-worker)) - (worker-directory (spawn-worker-directory))) - (catch 'zmq-error - (lambda () - (serve-build-requests backend-port - fetch-worker - worker-directory - #:build-maintainer - (spawn-build-maintainer))) - (lambda (key errno message . _) - (log-error (G_ "ZeroMQ error in build server: ~a") - message) - (terminate-helper-processes) - (primitive-exit 1)))))) + (let ((fetch-worker (spawn-fetch-worker)) + (worker-directory (spawn-worker-directory))) + (catch 'zmq-error + (lambda () + (serve-build-requests backend-port + fetch-worker + worker-directory + #:build-maintainer + (spawn-build-maintainer))) + (lambda (key errno message . _) + (log-error (G_ "ZeroMQ error in build server: ~a") + message) + (terminate-helper-processes) + (primitive-exit 1))))))) #:hz 0 #:parallelism (min 8 (current-processor-count))))))) diff --git a/tests/remote.scm b/tests/remote.scm index 797c92e..3fa9299 100644 --- a/tests/remote.scm +++ b/tests/remote.scm @@ -29,6 +29,7 @@ symbol)))) (cuirass specification) ((cuirass remote) #:select (worker-systems)) + ((cuirass base) #:select (spawn-remote-builder)) (gnu packages base) (guix build utils) (guix channels) @@ -39,6 +40,7 @@ ((guix store) #:hide (build)) ((guix utils) #:select (%current-system)) (tests common) + (fibers) (squee) (simple-zmq) ((zlib) #:select (call-with-gzip-input-port)) @@ -57,6 +59,9 @@ (define worker (make-parameter #f)) +(define notification-server + (make-parameter #f)) + (define (start-worker) (setenv "REQUEST_PERIOD" "1") (setenv "CUIRASS_LOGGING_LEVEL" "debug") @@ -67,10 +72,12 @@ "--public-key=tests/signing-key.pub") #:search-path? #t))) +(define (terminate-process pid) + (kill pid SIGINT) + (waitpid pid)) + (define (stop-worker) - (let ((worker (worker))) - (kill worker SIGINT) - (waitpid worker))) + (terminate-process (worker))) (define (start-server) (mkdir-p "tests/cache") @@ -85,9 +92,30 @@ #:search-path? #t))) (define (stop-server) - (let ((server (server))) - (kill server SIGINT) - (waitpid server))) + (terminate-process (server))) + +(define (start-notification-server) + ;; Spawn the notification server that normally runs as part of 'cuirass + ;; register', and which 'cuirass remote-server' connects to. Do so in a + ;; separate process because 'run-fibers' installs suspendable ports, which + ;; this process may not be able to deal with. + (notification-server + (spawn "guile" + (list "guile" "-c" + (object->string + '(begin + (use-modules (cuirass base) (fibers)) + + (setvbuf (current-output-port) 'none) + (setvbuf (current-error-port) 'none) + (run-fibers + (lambda () + (spawn-remote-builder) + (sleep 120)) ;wait + #:drain? #t))))))) + +(define (stop-notification-server) + (terminate-process (notification-server))) (define* (dummy-drv #:optional sleep #:key (name "foo") dependency message) (let ((dependency (and=> dependency read-derivation-from-file))) @@ -185,6 +213,7 @@ (test-assert "remote-server" (begin + (start-notification-server) (start-server) #t)) @@ -384,4 +413,5 @@ Failing dependency ~s.\n" (test-assert "clean-up" (begin (stop-worker) - (stop-server)))) + (stop-server) + (stop-notification-server))))