Message ID | 460cdfa67b473ea2f1593668b2d9d0fd159378d0.1700244314.git.mail@cbaines.net |
---|---|
State | New |
Headers |
Return-Path: <guix-patches-bounces+patchwork=mira.cbaines.net@gnu.org> X-Original-To: patchwork@mira.cbaines.net Delivered-To: patchwork@mira.cbaines.net Received: by mira.cbaines.net (Postfix, from userid 113) id 89EE527BBEA; Fri, 17 Nov 2023 18:06:25 +0000 (GMT) X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on mira.cbaines.net X-Spam-Level: X-Spam-Status: No, score=-2.9 required=5.0 tests=BAYES_00,MAILING_LIST_MULTI, SPF_HELO_PASS autolearn=ham autolearn_force=no version=3.4.6 Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) by mira.cbaines.net (Postfix) with ESMTPS id 861F827BBE2 for <patchwork@mira.cbaines.net>; Fri, 17 Nov 2023 18:06:21 +0000 (GMT) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from <guix-patches-bounces@gnu.org>) id 1r43EF-0006tF-3a; Fri, 17 Nov 2023 13:06:07 -0500 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from <Debian-debbugs@debbugs.gnu.org>) id 1r43EC-0006sw-TT for guix-patches@gnu.org; Fri, 17 Nov 2023 13:06:04 -0500 Received: from debbugs.gnu.org ([2001:470:142:5::43]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from <Debian-debbugs@debbugs.gnu.org>) id 1r43EB-0000ji-ST; Fri, 17 Nov 2023 13:06:03 -0500 Received: from Debian-debbugs by debbugs.gnu.org with local (Exim 4.84_2) (envelope-from <Debian-debbugs@debbugs.gnu.org>) id 1r43E9-00085k-Ou; Fri, 17 Nov 2023 13:06:01 -0500 X-Loop: help-debbugs@gnu.org Subject: [bug#67245] [PATCH] store: Use a non-blocking socket for store connections. Resent-From: Christopher Baines <mail@cbaines.net> Original-Sender: "Debbugs-submit" <debbugs-submit-bounces@debbugs.gnu.org> Resent-CC: guix@cbaines.net, dev@jpoiret.xyz, ludo@gnu.org, othacehe@gnu.org, rekado@elephly.net, zimon.toutoune@gmail.com, me@tobias.gr, guix-patches@gnu.org Resent-Date: Fri, 17 Nov 2023 18:06:01 +0000 Resent-Message-ID: <handler.67245.B.170024433631067@debbugs.gnu.org> Resent-Sender: help-debbugs@gnu.org X-GNU-PR-Message: report 67245 X-GNU-PR-Package: guix-patches X-GNU-PR-Keywords: patch To: 67245@debbugs.gnu.org Cc: Christopher Baines <guix@cbaines.net>, Josselin Poiret <dev@jpoiret.xyz>, Ludovic =?utf-8?q?Court=C3=A8s?= <ludo@gnu.org>, Mathieu Othacehe <othacehe@gnu.org>, Ricardo Wurmus <rekado@elephly.net>, Simon Tournier <zimon.toutoune@gmail.com>, Tobias Geerinckx-Rice <me@tobias.gr> X-Debbugs-Original-To: guix-patches@gnu.org X-Debbugs-Original-Xcc: Christopher Baines <guix@cbaines.net>, Josselin Poiret <dev@jpoiret.xyz>, Ludovic =?utf-8?q?Court=C3=A8s?= <ludo@gnu.org>, Mathieu Othacehe <othacehe@gnu.org>, Ricardo Wurmus <rekado@elephly.net>, Simon Tournier <zimon.toutoune@gmail.com>, Tobias Geerinckx-Rice <me@tobias.gr> Received: via spool by submit@debbugs.gnu.org id=B.170024433631067 (code B ref -1); Fri, 17 Nov 2023 18:06:01 +0000 Received: (at submit) by debbugs.gnu.org; 17 Nov 2023 18:05:36 +0000 Received: from localhost ([127.0.0.1]:47108 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from <debbugs-submit-bounces@debbugs.gnu.org>) id 1r43Dj-000851-T2 for submit@debbugs.gnu.org; Fri, 17 Nov 2023 13:05:36 -0500 Received: from lists.gnu.org ([2001:470:142::17]:39432) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from <mail@cbaines.net>) id 1r43Dh-00084l-Kv for submit@debbugs.gnu.org; Fri, 17 Nov 2023 13:05:34 -0500 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from <mail@cbaines.net>) id 1r43Db-0006kN-O9 for guix-patches@gnu.org; Fri, 17 Nov 2023 13:05:27 -0500 Received: from mira.cbaines.net ([2a01:7e00:e000:2f8:fd4d:b5c7:13fb:3d27]) by eggs.gnu.org with esmtp (Exim 4.90_1) (envelope-from <mail@cbaines.net>) id 1r43DV-0000ZB-3w for guix-patches@gnu.org; Fri, 17 Nov 2023 13:05:27 -0500 Received: from localhost (unknown [217.155.61.229]) by mira.cbaines.net (Postfix) with ESMTPSA id B1DA227BBE2 for <guix-patches@gnu.org>; Fri, 17 Nov 2023 18:05:15 +0000 (GMT) Received: from localhost (localhost [local]) by localhost (OpenSMTPD) with ESMTPA id fbe70812 for <guix-patches@gnu.org>; Fri, 17 Nov 2023 18:05:14 +0000 (UTC) From: Christopher Baines <mail@cbaines.net> Date: Fri, 17 Nov 2023 18:05:14 +0000 Message-ID: <460cdfa67b473ea2f1593668b2d9d0fd159378d0.1700244314.git.mail@cbaines.net> X-Mailer: git-send-email 2.41.0 MIME-Version: 1.0 Content-Transfer-Encoding: 8bit Received-SPF: pass client-ip=2a01:7e00:e000:2f8:fd4d:b5c7:13fb:3d27; envelope-from=mail@cbaines.net; helo=mira.cbaines.net X-Spam_score_int: -18 X-Spam_score: -1.9 X-Spam_bar: - X-Spam_report: (-1.9 / 5.0 requ) BAYES_00=-1.9, SPF_HELO_PASS=-0.001, SPF_PASS=-0.001, T_SCC_BODY_TEXT_LINE=-0.01, UNPARSEABLE_RELAY=0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list X-BeenThere: guix-patches@gnu.org List-Id: <guix-patches.gnu.org> List-Unsubscribe: <https://lists.gnu.org/mailman/options/guix-patches>, <mailto:guix-patches-request@gnu.org?subject=unsubscribe> List-Archive: <https://lists.gnu.org/archive/html/guix-patches> List-Post: <mailto:guix-patches@gnu.org> List-Help: <mailto:guix-patches-request@gnu.org?subject=help> List-Subscribe: <https://lists.gnu.org/mailman/listinfo/guix-patches>, <mailto:guix-patches-request@gnu.org?subject=subscribe> Errors-To: guix-patches-bounces+patchwork=mira.cbaines.net@gnu.org Sender: guix-patches-bounces+patchwork=mira.cbaines.net@gnu.org X-getmail-retrieved-from-mailbox: Patches |
Series |
[bug#67245] store: Use a non-blocking socket for store connections.
|
|
Commit Message
Christopher Baines
Nov. 17, 2023, 6:05 p.m. UTC
For some applications, it's important to do this here rather than just making the socket non-blocking after the connection is established because there can be I/O on the socket that will block during the handshake. I've noticed this blocking during the handshake causing issues in the build coordinator for example. * guix/store.scm (open-unix-domain-socket, open-inet-socket): Pass SOCK_NONBLOCK when calling socket. Change-Id: I8225762b78448bc1f7b698c8de5d736e13f577bf --- guix/store.scm | 7 +++++-- 1 file changed, 5 insertions(+), 2 deletions(-) base-commit: e35b7c5386c1bfacf47ed31bac9b503373dd26fc
Comments
Hi Christopher, Christopher Baines <mail@cbaines.net> skribis: > For some applications, it's important to do this here rather than just making > the socket non-blocking after the connection is established because there can > be I/O on the socket that will block during the handshake. > > I've noticed this blocking during the handshake causing issues in the build > coordinator for example. > > * guix/store.scm (open-unix-domain-socket, open-inet-socket): Pass > SOCK_NONBLOCK when calling socket. > > Change-Id: I8225762b78448bc1f7b698c8de5d736e13f577bf I feel we should really discuss on Guix + Fibers since we’ve apparently been going through the exact same set of issues. :-) (The other thing that comes to mind is the resource pool!) > +++ b/guix/store.scm > @@ -460,7 +460,9 @@ (define (open-unix-domain-socket file) > '&store-connection-error' upon error." > (let ((s (with-fluids ((%default-port-encoding #f)) > ;; This trick allows use of the `scm_c_read' optimization. > - (socket PF_UNIX (logior SOCK_STREAM SOCK_CLOEXEC) 0))) > + (socket PF_UNIX > + (logior SOCK_STREAM SOCK_CLOEXEC SOCK_NONBLOCK) > + 0))) We cannot do this here because callers have to be prepared to deal with non-blocking sockets, and that’s not the case in Guix itself. In Cuirass, I have this: --8<---------------cut here---------------start------------->8--- (define (non-blocking-port port) "Make PORT non-blocking and return it." (let ((flags (fcntl port F_GETFL))) (when (zero? (logand O_NONBLOCK flags)) (fcntl port F_SETFL (logior O_NONBLOCK flags))) port)) (define (ensure-non-blocking-store-connection store) "Mark the file descriptor that backs STORE, a <store-connection>, as O_NONBLOCK." (match (store-connection-socket store) ((? file-port? port) (non-blocking-port port)) (_ #f))) (define-syntax-rule (with-store/non-blocking store exp ...) "Like 'with-store', bind STORE to a connection to the store, but ensure that said connection is non-blocking (O_NONBLOCK). Evaluate EXP... in that context." (with-store store (ensure-non-blocking-store-connection store) (let () exp ...))) --8<---------------cut here---------------end--------------->8--- Then ‘with-store/non-blocking’ is used in fiberized context where I know this is fine. I think it’ll have to remain this way until Guix itself is fiberized or something. Does that make sense? Ludo’.
Ludovic Courtès <ludo@gnu.org> writes: > Hi Christopher, > > Christopher Baines <mail@cbaines.net> skribis: > >> For some applications, it's important to do this here rather than just making >> the socket non-blocking after the connection is established because there can >> be I/O on the socket that will block during the handshake. >> >> I've noticed this blocking during the handshake causing issues in the build >> coordinator for example. >> >> * guix/store.scm (open-unix-domain-socket, open-inet-socket): Pass >> SOCK_NONBLOCK when calling socket. >> >> Change-Id: I8225762b78448bc1f7b698c8de5d736e13f577bf > > I feel we should really discuss on Guix + Fibers since we’ve apparently > been going through the exact same set of issues. :-) > > (The other thing that comes to mind is the resource pool!) I'm mostly ignoring these issues then coping the code once you write it :) >> +++ b/guix/store.scm >> @@ -460,7 +460,9 @@ (define (open-unix-domain-socket file) >> '&store-connection-error' upon error." >> (let ((s (with-fluids ((%default-port-encoding #f)) >> ;; This trick allows use of the `scm_c_read' optimization. >> - (socket PF_UNIX (logior SOCK_STREAM SOCK_CLOEXEC) 0))) >> + (socket PF_UNIX >> + (logior SOCK_STREAM SOCK_CLOEXEC SOCK_NONBLOCK) >> + 0))) > > We cannot do this here because callers have to be prepared to deal with > non-blocking sockets, and that’s not the case in Guix itself. I can see potential problems for programs outside of Guix which use suspendable ports, but given Guix doesn't use suspendable ports, this won't change behaviour, right? Obviously Guile will be working a bit differently, using poll when it needs to wait for I/O, but at the scheme level within Guix, things should be no different. I tried guix weather with this change, and things seemed fine. Is there a specific bit of Guix you're concerned about?
Hi Chris, Christopher Baines <mail@cbaines.net> skribis: > Ludovic Courtès <ludo@gnu.org> writes: >> I feel we should really discuss on Guix + Fibers since we’ve apparently >> been going through the exact same set of issues. :-) >> >> (The other thing that comes to mind is the resource pool!) > > I'm mostly ignoring these issues then coping the code once you write it > :) Heh, so we’re already in sync maybe, not bad. :-) >>> +++ b/guix/store.scm >>> @@ -460,7 +460,9 @@ (define (open-unix-domain-socket file) >>> '&store-connection-error' upon error." >>> (let ((s (with-fluids ((%default-port-encoding #f)) >>> ;; This trick allows use of the `scm_c_read' optimization. >>> - (socket PF_UNIX (logior SOCK_STREAM SOCK_CLOEXEC) 0))) >>> + (socket PF_UNIX >>> + (logior SOCK_STREAM SOCK_CLOEXEC SOCK_NONBLOCK) >>> + 0))) >> >> We cannot do this here because callers have to be prepared to deal with >> non-blocking sockets, and that’s not the case in Guix itself. > > I can see potential problems for programs outside of Guix which use > suspendable ports, but given Guix doesn't use suspendable ports, this > won't change behaviour, right? > > Obviously Guile will be working a bit differently, using poll when it > needs to wait for I/O, but at the scheme level within Guix, things > should be no different. Hmm yes, I think you’re right. One issue is if we hand over the file descriptor to something that’s not Guile. Off the top of my head, this happens with inferiors and in the ‘build’ procedure of ‘build-self.scm’ (well, the process that receives that file descriptor is Guile, but if it’s older than 3.0 (?), then it may behave differently.) So it should be safe indeed, but adds a bit of overhead (hopping via ‘current-{read,write}-waiter’) and needs good testing. Laziness gives an incentive for the status quo, but I’m not opposed to the change if we get more confidence (test suite passing, tests with inferiors and ‘time-machine’, and some more auditing.) Ludo’.
Ludovic Courtès <ludo@gnu.org> writes: >>>> +++ b/guix/store.scm >>>> @@ -460,7 +460,9 @@ (define (open-unix-domain-socket file) >>>> '&store-connection-error' upon error." >>>> (let ((s (with-fluids ((%default-port-encoding #f)) >>>> ;; This trick allows use of the `scm_c_read' optimization. >>>> - (socket PF_UNIX (logior SOCK_STREAM SOCK_CLOEXEC) 0))) >>>> + (socket PF_UNIX >>>> + (logior SOCK_STREAM SOCK_CLOEXEC SOCK_NONBLOCK) >>>> + 0))) >>> >>> We cannot do this here because callers have to be prepared to deal with >>> non-blocking sockets, and that’s not the case in Guix itself. >> >> I can see potential problems for programs outside of Guix which use >> suspendable ports, but given Guix doesn't use suspendable ports, this >> won't change behaviour, right? >> >> Obviously Guile will be working a bit differently, using poll when it >> needs to wait for I/O, but at the scheme level within Guix, things >> should be no different. > > Hmm yes, I think you’re right. > > One issue is if we hand over the file descriptor to something that’s not > Guile. Off the top of my head, this happens with inferiors and in the > ‘build’ procedure of ‘build-self.scm’ (well, the process that receives > that file descriptor is Guile, but if it’s older than 3.0 (?), then it > may behave differently.) > > So it should be safe indeed, but adds a bit of overhead (hopping via > ‘current-{read,write}-waiter’) and needs good testing. > > Laziness gives an incentive for the status quo, but I’m not opposed to > the change if we get more confidence (test suite passing, tests with > inferiors and ‘time-machine’, and some more auditing.) Maybe we can just move the with-store/non-blocking in to Guix, as that will solve the immediate issue. I've sent a new patch for that.
diff --git a/guix/store.scm b/guix/store.scm index f8e77b2cd9..216be98c05 100644 --- a/guix/store.scm +++ b/guix/store.scm @@ -460,7 +460,9 @@ (define (open-unix-domain-socket file) '&store-connection-error' upon error." (let ((s (with-fluids ((%default-port-encoding #f)) ;; This trick allows use of the `scm_c_read' optimization. - (socket PF_UNIX (logior SOCK_STREAM SOCK_CLOEXEC) 0))) + (socket PF_UNIX + (logior SOCK_STREAM SOCK_CLOEXEC SOCK_NONBLOCK) + 0))) (a (make-socket-address PF_UNIX file))) (system-error-to-connection-error file @@ -488,7 +490,8 @@ (define (open-inet-socket host port) ((ai rest ...) (let ((s (socket (addrinfo:fam ai) ;; TCP/IP only - (logior SOCK_STREAM SOCK_CLOEXEC) IPPROTO_IP))) + (logior SOCK_STREAM SOCK_CLOEXEC SOCK_NONBLOCK) + IPPROTO_IP))) (catch 'system-error (lambda ()