Message ID | 20200605091151.GB3225@zpidnp36 |
---|---|
State | Accepted |
Headers | show |
Series | [bug#41720] store: Use buffered I/O for all protocol writes | expand |
Context | Check | Description |
---|---|---|
cbaines/comparison | success | View comparision |
cbaines/git branch | success | View Git branch |
cbaines/applying patch | fail | View Laminar job |
Hi, Lars-Dominik Braun <ldb@leibniz-psychology.org> skribis: > I did some digging to investigate why SSH-based guix operations are currently > so slow. One of the reasons seems to be that some operations write single words > to the socket and due to NOWAIT these result in a lot of small packets. Good catch! > The attached patch modifies store.scm to buffer all writes, so that won’t happen > any more. I’m seeing about ~40% speedup (6.194s vs 10.075s for the best out of > five runs) for `GUIX_DAEMON_SOCKET=ssh://localhost guix environment guix -- > true`, but it also seems to have a negative impact on UNIX domain socket > communication by ~10% (1.561s vs 1.385s). For the notorious r-learnr package > it’s still ~20% better (ssh, 16.620/21.418) and only ~3% worse (unix, > 11.489/11.199). Keep in mind localhost has a much larger MTU and much lower > latency than usual networks though. It might benefit from having an even bigger > write buffer (64k vs. currently 8k). Nice, though the slowdown on Unix-domain sockets is annoying given that it’s the primary transport. Are the 10% significant and stable over multiple runs? It may be that writing through the buffered port, with the extra allocations, indirection through the custom port, etc., outweighs the gains. OTOH all the other RPCs are buffered, so it makes sense to just do the same. (It may be wiser to make the ‘operation’ macro smarter and emit code that directly allocates a bytevector of the right size and sends it. Future work!) > From eb38dc246e5dd33f8a48a99aded8b8dc0c378376 Mon Sep 17 00:00:00 2001 > From: Lars-Dominik Braun <ldb@leibniz-psychology.org> > Date: Fri, 5 Jun 2020 10:38:32 +0200 > Subject: [PATCH] store: Use buffered I/O for all protocol writes > > * guix/store.scm (run-gc) Use buffered output port. > (export-path) Same. > (add-file-tree-to-store) Same. > (set-build-options): Same. Add explicit flush. Look: --8<---------------cut here---------------start------------->8--- $ GUIX_PROFILING=rpc guix environment guix -- true Remote procedure call summary: 1927 RPCs built-in-builders ... 1 build-things ... 1 query-substitutable-path-infos ... 1 valid-path? ... 2 add-to-store/tree ... 19 query-references ... 108 add-to-store ... 133 add-text-to-store ... 1662 --8<---------------cut here---------------end--------------->8--- So probably the gain you’re seeing comes from ‘add-file-tree-to-store’. Otherwise the patch LGTM as long as it doesn’t introduce any test suite regressions. Thanks for investigating! Ludo’.
Hi Ludo, > Nice, though the slowdown on Unix-domain sockets is annoying given that > it’s the primary transport. Are the 10% significant and stable over > multiple runs? I don’t want to blind your eyes with meaningless statistics here, but let’s do this properly then. So for `guix environment guix -- true` with 10 trials these are the results: ---snip--- $ ministat -A -c 99 before_ssh after_ssh x before_ssh + after_ssh N Min Max Median Avg Stddev x 10 7.63 10.635 10.1585 9.9382 0.85517585 + 10 5.986 6.424 6.1755 6.1917 0.15398416 Difference at 99.0% confidence -3.7465 +/- 0.790815 -37.698% +/- 5.0788% (Student's t, pooled s = 0.614425) $ ministat -A -c 99 before_unix after_unix x before_unix + after_unix N Min Max Median Avg Stddev x 10 1.528 1.624 1.5445 1.5636 0.035842867 + 10 1.576 1.701 1.6145 1.6229 0.037869513 Difference at 99.0% confidence 0.0593 +/- 0.0474548 3.79253% +/- 3.08989% (Student's t, pooled s = 0.0368701) ---snap--- And the same for `guix environment --ad-hoc r-learnr`: ---snip--- $ ministat -A -c 99 before_ssh after_ssh x before_ssh + after_ssh N Min Max Median Avg Stddev x 10 20.104 21.629 21.082 21.0115 0.45142472 + 10 16.082 16.713 16.292 16.3416 0.25018224 Difference at 99.0% confidence -4.6699 +/- 0.469719 -22.2254% +/- 1.86734% (Student's t, pooled s = 0.364949) $ ministat -A -c 99 before_unix after_unix x before_unix + after_unix N Min Max Median Avg Stddev x 10 11.322 11.529 11.401 11.4127 0.062467858 + 10 11.249 11.469 11.3515 11.368 0.078689262 No difference proven at 99.0% confidence ---snap--- > Otherwise the patch LGTM as long as it doesn’t introduce any test suite > regressions. Yes, it passes the test suite. Lars
Hi! Lars-Dominik Braun <ldb@leibniz-psychology.org> skribis: >> Nice, though the slowdown on Unix-domain sockets is annoying given that >> it’s the primary transport. Are the 10% significant and stable over >> multiple runs? > I don’t want to blind your eyes with meaningless statistics here, but let’s do > this properly then. So for `guix environment guix -- true` with 10 trials these > are the results: > > ---snip--- > $ ministat -A -c 99 before_ssh after_ssh > x before_ssh > + after_ssh > N Min Max Median Avg Stddev > x 10 7.63 10.635 10.1585 9.9382 0.85517585 > + 10 5.986 6.424 6.1755 6.1917 0.15398416 > Difference at 99.0% confidence > -3.7465 +/- 0.790815 > -37.698% +/- 5.0788% > (Student's t, pooled s = 0.614425) > > $ ministat -A -c 99 before_unix after_unix > x before_unix > + after_unix > N Min Max Median Avg Stddev > x 10 1.528 1.624 1.5445 1.5636 0.035842867 > + 10 1.576 1.701 1.6145 1.6229 0.037869513 > Difference at 99.0% confidence > 0.0593 +/- 0.0474548 > 3.79253% +/- 3.08989% > (Student's t, pooled s = 0.0368701) > ---snap--- > > And the same for `guix environment --ad-hoc r-learnr`: > > ---snip--- > $ ministat -A -c 99 before_ssh after_ssh > x before_ssh > + after_ssh > N Min Max Median Avg Stddev > x 10 20.104 21.629 21.082 21.0115 0.45142472 > + 10 16.082 16.713 16.292 16.3416 0.25018224 > Difference at 99.0% confidence > -4.6699 +/- 0.469719 > -22.2254% +/- 1.86734% > (Student's t, pooled s = 0.364949) > > $ ministat -A -c 99 before_unix after_unix > x before_unix > + after_unix > N Min Max Median Avg Stddev > x 10 11.322 11.529 11.401 11.4127 0.062467858 > + 10 11.249 11.469 11.3515 11.368 0.078689262 > No difference proven at 99.0% confidence > ---snap--- Awesome, thanks a lot. Looks like the difference on Unix-domain sockets is not significant. Applied! What package provides ‘ministat’? Thanks, Ludo’.
Hi,
great!
> What package provides ‘ministat’?
It’s originally a FreeBSD tool[1], but there are several unofficial Linux ports
on GitHub, like [2].
Cheers,
Lars
[1] https://www.freebsd.org/cgi/man.cgi?query=ministat&sektion=1&manpath=freebsd-release-ports
[2] https://github.com/csjayp/ministat
From eb38dc246e5dd33f8a48a99aded8b8dc0c378376 Mon Sep 17 00:00:00 2001 From: Lars-Dominik Braun <ldb@leibniz-psychology.org> Date: Fri, 5 Jun 2020 10:38:32 +0200 Subject: [PATCH] store: Use buffered I/O for all protocol writes * guix/store.scm (run-gc) Use buffered output port. (export-path) Same. (add-file-tree-to-store) Same. (set-build-options): Same. Add explicit flush. --- guix/store.scm | 70 ++++++++++++++++++++++++++++---------------------- 1 file changed, 39 insertions(+), 31 deletions(-) diff --git a/guix/store.scm b/guix/store.scm index 014d08aaec..9b3879b4a7 100644 --- a/guix/store.scm +++ b/guix/store.scm @@ -3,6 +3,7 @@ ;;; Copyright © 2018 Jan Nieuwenhuizen <janneke@gnu.org> ;;; Copyright © 2019, 2020 Mathieu Othacehe <m.othacehe@gmail.com> ;;; Copyright © 2020 Florian Pelz <pelzflorian@pelzflorian.de> +;;; Copyright © 2020 Lars-Dominik Braun <ldb@leibniz-psychology.org> ;;; ;;; This file is part of GNU Guix. ;;; @@ -821,8 +822,8 @@ encoding conversion errors." (locale (false-if-exception (setlocale LC_ALL)))) ;; Must be called after `open-connection'. - (define socket - (store-connection-socket server)) + (define buffered + (store-connection-output-port server)) (unless (unspecified? use-build-hook?) (warn-about-deprecation #:use-build-hook? #f @@ -831,9 +832,9 @@ encoding conversion errors." (let-syntax ((send (syntax-rules () ((_ (type option) ...) (begin - (write-arg type option socket) + (write-arg type option buffered) ...))))) - (write-int (operation-id set-options) socket) + (write-int (operation-id set-options) buffered) (send (boolean keep-failed?) (boolean keep-going?) (boolean fallback?) (integer verbosity)) (when (< (store-connection-minor-version server) #x61) @@ -896,6 +897,7 @@ encoding conversion errors." `(("locale" . ,locale)) '())))) (send (string-pairs pairs)))) + (write-buffered-output server) (let loop ((done? (process-stderr server))) (or done? (process-stderr server))))) @@ -1108,13 +1110,14 @@ path." ;; We don't use the 'operation' macro so we can pass SELECT? to ;; 'write-file'. (record-operation 'add-to-store) - (let ((port (store-connection-socket server))) - (write-int (operation-id add-to-store) port) - (write-string basename port) - (write-int 1 port) ;obsolete, must be #t - (write-int (if recursive? 1 0) port) - (write-string hash-algo port) - (write-file file-name port #:select? select?) + (let ((port (store-connection-socket server)) + (buffered (store-connection-output-port server))) + (write-int (operation-id add-to-store) buffered) + (write-string basename buffered) + (write-int 1 buffered) ;obsolete, must be #t + (write-int (if recursive? 1 0) buffered) + (write-string hash-algo buffered) + (write-file file-name buffered #:select? select?) (write-buffered-output server) (let loop ((done? (process-stderr server))) (or done? (loop (process-stderr server)))) @@ -1220,13 +1223,14 @@ an arbitrary directory layout in the store without creating a derivation." ;; We don't use the 'operation' macro so we can use 'write-file-tree' ;; instead of 'write-file'. (record-operation 'add-to-store/tree) - (let ((port (store-connection-socket server))) - (write-int (operation-id add-to-store) port) - (write-string basename port) - (write-int 1 port) ;obsolete, must be #t - (write-int (if recursive? 1 0) port) - (write-string hash-algo port) - (write-file-tree basename port + (let ((port (store-connection-socket server)) + (buffered (store-connection-output-port server))) + (write-int (operation-id add-to-store) buffered) + (write-string basename buffered) + (write-int 1 buffered) ;obsolete, must be #t + (write-int (if recursive? 1 0) buffered) + (write-string hash-algo buffered) + (write-file-tree basename buffered #:file-type+size file-type+size #:file-port file-port #:symlink-target symlink-target @@ -1644,17 +1648,19 @@ the list of store paths to delete. IGNORE-LIVENESS? should always be #f. MIN-FREED is the minimum amount of disk space to be freed, in bytes, before the GC can stop. Return the list of store paths delete, and the number of bytes freed." - (let ((s (store-connection-socket server))) - (write-int (operation-id collect-garbage) s) - (write-int action s) - (write-store-path-list to-delete s) - (write-arg boolean #f s) ; ignore-liveness? - (write-long-long min-freed s) - (write-int 0 s) ; obsolete + (let ((s (store-connection-socket server)) + (buffered (store-connection-output-port server))) + (write-int (operation-id collect-garbage) buffered) + (write-int action buffered) + (write-store-path-list to-delete buffered) + (write-arg boolean #f buffered) ; ignore-liveness? + (write-long-long min-freed buffered) + (write-int 0 buffered) ; obsolete (when (>= (store-connection-minor-version server) 5) ;; Obsolete `use-atime' and `max-atime' parameters. - (write-int 0 s) - (write-int 0 s)) + (write-int 0 buffered) + (write-int 0 buffered)) + (write-buffered-output server) ;; Loop until the server is done sending error output. (let loop ((done? (process-stderr server))) @@ -1711,10 +1717,12 @@ is raised if the set of paths read from PORT is not signed (as per (define* (export-path server path port #:key (sign? #t)) "Export PATH to PORT. When SIGN? is true, sign it." - (let ((s (store-connection-socket server))) - (write-int (operation-id export-path) s) - (write-store-path path s) - (write-arg boolean sign? s) + (let ((s (store-connection-socket server)) + (buffered (store-connection-output-port server))) + (write-int (operation-id export-path) buffered) + (write-store-path path buffered) + (write-arg boolean sign? buffered) + (write-buffered-output server) (let loop ((done? (process-stderr server port))) (or done? (loop (process-stderr server port)))) (= 1 (read-int s)))) -- 2.20.1