diff mbox series

[bug#69291,4/5] scripts: substitute: Untangle selecting fast vs small compressions.

Message ID 2a4b5b57cb676c5a0149296d0251b269b2cebd6e.1708458147.git.mail@cbaines.net
State New
Headers show
Series None | expand

Commit Message

Christopher Baines Feb. 20, 2024, 7:42 p.m. UTC
Pulling the logic up to the script makes this code more portable and not
reliant on setting a global variable.

* guix/scripts/substitute.scm (%prefer-fast-decompression?): Rename to…
(%default-prefer-fast-decompression?): this.
(display-narinfo-data): Update accordingly.
(download-nar): Add prefer-fast-decompression? as a keyword argument, remove
code to set! it and return the cpu-usage recorded.
(process-substitution, process-substitution/fallback): Accept and pass through
prefer-fast-decompression? to download-nar.
(guix-substitute): Move the prefer fast decompression switching logic here.

Change-Id: I4e80b457b55bcda8c0ff4ee224dd94a55e1b24fb
---
 guix/scripts/substitute.scm | 90 +++++++++++++++++++++----------------
 1 file changed, 52 insertions(+), 38 deletions(-)

Comments

Ludovic Courtès Feb. 23, 2024, 4:26 p.m. UTC | #1
Christopher Baines <mail@cbaines.net> skribis:

> Pulling the logic up to the script makes this code more portable and not
> reliant on setting a global variable.
>
> * guix/scripts/substitute.scm (%prefer-fast-decompression?): Rename to…
> (%default-prefer-fast-decompression?): this.
> (display-narinfo-data): Update accordingly.
> (download-nar): Add prefer-fast-decompression? as a keyword argument, remove
> code to set! it and return the cpu-usage recorded.
> (process-substitution, process-substitution/fallback): Accept and pass through
> prefer-fast-decompression? to download-nar.
> (guix-substitute): Move the prefer fast decompression switching logic here.
>
> Change-Id: I4e80b457b55bcda8c0ff4ee224dd94a55e1b24fb

[...]

> -(define %prefer-fast-decompression?
> -  ;; Whether to prefer fast decompression over good compression ratios.  This
> -  ;; serves in particular to choose between lzip (high compression ratio but
> -  ;; low decompression throughput) and zstd (lower compression ratio but high
> -  ;; decompression throughput).
> -  #f)
> +(define %default-prefer-fast-decompression? #f)

I would either remove this variable or add a comment describing it (we
should do that for all top-level variables).

> @@ -604,7 +585,9 @@ (define* (download-nar narinfo destination
>              (format status-port "hash-mismatch ~a ~a ~a~%"
>                      (hash-algorithm-name algorithm)
>                      (bytevector->nix-base32-string expected)
> -                    (bytevector->nix-base32-string actual)))))))
> +                    (bytevector->nix-base32-string actual))))
> +
> +      cpu-usage)))

[...]

> +               (let ((cpu-usage
> +                      (process-substitution reply-port store-path destination
> +                                            #:cache-urls (substitute-urls)
> +                                            #:acl (current-acl)
> +                                            #:deduplicate? deduplicate?
> +                                            #:print-build-trace?
> +                                            print-build-trace?
> +                                            #:prefer-fast-decompression?
> +                                            prefer-fast-decompression?)))
> +
> +                 ;; Create a hysteresis: depending on CPU usage, favor
> +                 ;; compression methods with faster decompression (like ztsd)
> +                 ;; or methods with better compression ratios (like lzip).
> +                 ;; This stems from the observation that substitution can be
> +                 ;; CPU-bound when high-speed networks are used:
> +                 ;; <https://lists.gnu.org/archive/html/guix-devel/2020-12/msg00177.html>.
> +                 ;; To simulate "slow" networking or changing conditions, run:
> +                 ;; sudo tc qdisc add dev eno1 root tbf rate 512kbit latency
> +                 ;; 50ms burst 1540 and then cancel with: sudo tc qdisc del
> +                 ;; dev eno1 root
> +                 (loop (cond
> +                        ;; Whether to prefer fast decompression over good
> +                        ;; compression ratios.  This serves in particular to
> +                        ;; choose between lzip (high compression ratio but low
> +                        ;; decompression throughput) and zstd (lower
> +                        ;; compression ratio but high decompression
> +                        ;; throughput).
> +                        ((> cpu-usage .8) #t)
> +                        ((< cpu-usage .2) #f)
> +                        (else prefer-fast-decompression?)))))))))


Instead of having ‘download-nar’ return its CPU usage, which is
surprising, maybe should wrap the ‘process-substitution’ call in
‘guix-substitute’ in ‘with-cpu-usage-monitoring’ and keep all the logic
in ‘guix-substitute’?

Ludo’.
Christopher Baines April 3, 2024, 5:28 p.m. UTC | #2
Ludovic Courtès <ludo@gnu.org> writes:

> Christopher Baines <mail@cbaines.net> skribis:
>
>> Pulling the logic up to the script makes this code more portable and not
>> reliant on setting a global variable.
>>
>> * guix/scripts/substitute.scm (%prefer-fast-decompression?): Rename to…
>> (%default-prefer-fast-decompression?): this.
>> (display-narinfo-data): Update accordingly.
>> (download-nar): Add prefer-fast-decompression? as a keyword argument, remove
>> code to set! it and return the cpu-usage recorded.
>> (process-substitution, process-substitution/fallback): Accept and pass through
>> prefer-fast-decompression? to download-nar.
>> (guix-substitute): Move the prefer fast decompression switching logic here.
>>
>> Change-Id: I4e80b457b55bcda8c0ff4ee224dd94a55e1b24fb
>
> [...]
>
>> -(define %prefer-fast-decompression?
>> -  ;; Whether to prefer fast decompression over good compression ratios.  This
>> -  ;; serves in particular to choose between lzip (high compression ratio but
>> -  ;; low decompression throughput) and zstd (lower compression ratio but high
>> -  ;; decompression throughput).
>> -  #f)
>> +(define %default-prefer-fast-decompression? #f)
>
> I would either remove this variable or add a comment describing it (we
> should do that for all top-level variables).

I've added a comment now, and I'll sent an updated patch.

>> @@ -604,7 +585,9 @@ (define* (download-nar narinfo destination
>>              (format status-port "hash-mismatch ~a ~a ~a~%"
>>                      (hash-algorithm-name algorithm)
>>                      (bytevector->nix-base32-string expected)
>> -                    (bytevector->nix-base32-string actual)))))))
>> +                    (bytevector->nix-base32-string actual))))
>> +
>> +      cpu-usage)))
>
> [...]
>
>> +               (let ((cpu-usage
>> +                      (process-substitution reply-port store-path destination
>> +                                            #:cache-urls (substitute-urls)
>> +                                            #:acl (current-acl)
>> +                                            #:deduplicate? deduplicate?
>> +                                            #:print-build-trace?
>> +                                            print-build-trace?
>> +                                            #:prefer-fast-decompression?
>> +                                            prefer-fast-decompression?)))
>> +
>> +                 ;; Create a hysteresis: depending on CPU usage, favor
>> +                 ;; compression methods with faster decompression (like ztsd)
>> +                 ;; or methods with better compression ratios (like lzip).
>> +                 ;; This stems from the observation that substitution can be
>> +                 ;; CPU-bound when high-speed networks are used:
>> +                 ;; <https://lists.gnu.org/archive/html/guix-devel/2020-12/msg00177.html>.
>> +                 ;; To simulate "slow" networking or changing conditions, run:
>> +                 ;; sudo tc qdisc add dev eno1 root tbf rate 512kbit latency
>> +                 ;; 50ms burst 1540 and then cancel with: sudo tc qdisc del
>> +                 ;; dev eno1 root
>> +                 (loop (cond
>> +                        ;; Whether to prefer fast decompression over good
>> +                        ;; compression ratios.  This serves in particular to
>> +                        ;; choose between lzip (high compression ratio but low
>> +                        ;; decompression throughput) and zstd (lower
>> +                        ;; compression ratio but high decompression
>> +                        ;; throughput).
>> +                        ((> cpu-usage .8) #t)
>> +                        ((< cpu-usage .2) #f)
>> +                        (else prefer-fast-decompression?)))))))))
>
>
> Instead of having ‘download-nar’ return its CPU usage, which is
> surprising, maybe should wrap the ‘process-substitution’ call in
> ‘guix-substitute’ in ‘with-cpu-usage-monitoring’ and keep all the logic
> in ‘guix-substitute’?

Yeah, that makes sense. I'll send an updated patch shortly.
diff mbox series

Patch

diff --git a/guix/scripts/substitute.scm b/guix/scripts/substitute.scm
index 1875a4332d..61e16b22db 100755
--- a/guix/scripts/substitute.scm
+++ b/guix/scripts/substitute.scm
@@ -261,12 +261,7 @@  (define (show-help)
 ;;; Daemon/substituter protocol.
 ;;;
 
-(define %prefer-fast-decompression?
-  ;; Whether to prefer fast decompression over good compression ratios.  This
-  ;; serves in particular to choose between lzip (high compression ratio but
-  ;; low decompression throughput) and zstd (lower compression ratio but high
-  ;; decompression throughput).
-  #f)
+(define %default-prefer-fast-decompression? #f)
 
 (define (call-with-cpu-usage-monitoring proc)
   (let ((before (times)))
@@ -297,7 +292,7 @@  (define (display-narinfo-data port narinfo)
   (let ((uri compression file-size
              (narinfo-best-uri narinfo
                                #:fast-decompression?
-                               %prefer-fast-decompression?)))
+                               %default-prefer-fast-decompression?)))
     (format port "~a\n~a\n"
             (or file-size 0)
             (or (narinfo-size narinfo) 0))))
@@ -453,7 +448,8 @@  (define-syntax-rule (catch-system-error exp)
 (define* (download-nar narinfo destination
                        #:key status-port
                        deduplicate? print-build-trace?
-                       (fetch-timeout %fetch-timeout))
+                       (fetch-timeout %fetch-timeout)
+                       prefer-fast-decompression?)
   "Download the nar prescribed in NARINFO, which is assumed to be authentic
 and authorized, and write it to DESTINATION.  When DEDUPLICATE? is true, and
 if DESTINATION is in the store, deduplicate its files.  Print a status line to
@@ -525,7 +521,7 @@  (define* (download-nar narinfo destination
 
   (let ((choices (narinfo-preferred-uris narinfo
                                          #:fast-decompression?
-                                         %prefer-fast-decompression?)))
+                                         prefer-fast-decompression?)))
     ;; 'guix publish' without '--cache' doesn't specify a Content-Length, so
     ;; DOWNLOAD-SIZE is #f in this case.
     (let* ((raw uri compression download-size (try-fetch choices))
@@ -566,21 +562,6 @@  (define* (download-nar narinfo destination
                                             deduplicate?)
                                        dump-file/deduplicate*
                                        dump-file))))
-
-      ;; Create a hysteresis: depending on CPU usage, favor compression
-      ;; methods with faster decompression (like ztsd) or methods with better
-      ;; compression ratios (like lzip).  This stems from the observation that
-      ;; substitution can be CPU-bound when high-speed networks are used:
-      ;; <https://lists.gnu.org/archive/html/guix-devel/2020-12/msg00177.html>.
-      ;; To simulate "slow" networking or changing conditions, run:
-      ;;   sudo tc qdisc add dev eno1 root tbf rate 512kbit latency 50ms burst 1540
-      ;; and then cancel with:
-      ;;   sudo tc qdisc del dev eno1 root
-      (when (> cpu-usage .8)
-        (set! %prefer-fast-decompression? #t))
-      (when (< cpu-usage .2)
-        (set! %prefer-fast-decompression? #f))
-
       (close-port hashed)
       (close-port input)
 
@@ -604,7 +585,9 @@  (define* (download-nar narinfo destination
             (format status-port "hash-mismatch ~a ~a ~a~%"
                     (hash-algorithm-name algorithm)
                     (bytevector->nix-base32-string expected)
-                    (bytevector->nix-base32-string actual)))))))
+                    (bytevector->nix-base32-string actual))))
+
+      cpu-usage)))
 
 (define (system-error? exception)
   "Return true if EXCEPTION is a Guile 'system-error exception."
@@ -628,7 +611,8 @@  (define network-error?
 
 (define* (process-substitution/fallback port narinfo destination
                                         #:key cache-urls acl
-                                        deduplicate? print-build-trace?)
+                                        deduplicate? print-build-trace?
+                                        prefer-fast-decompression?)
   "Attempt to substitute NARINFO, which is assumed to be authorized or
 equivalent, by trying to download its nar from each entry in CACHE-URLS.
 
@@ -662,14 +646,17 @@  (define* (process-substitution/fallback port narinfo destination
                 (download-nar alternate destination
                               #:status-port port
                               #:deduplicate? deduplicate?
-                              #:print-build-trace? print-build-trace?))
+                              #:print-build-trace? print-build-trace?
+                              #:prefer-fast-decompression?
+                              prefer-fast-decompression?))
               (loop rest)))
          (()
           (loop rest)))))))
 
 (define* (process-substitution port store-item destination
                                #:key cache-urls acl
-                               deduplicate? print-build-trace?)
+                               deduplicate? print-build-trace?
+                               prefer-fast-decompression?)
   "Substitute STORE-ITEM (a store file name) from CACHE-URLS, and write it to
 DESTINATION as a nar file.  Verify the substitute against ACL, and verify its
 hash against what appears in the narinfo.  When DEDUPLICATE? is true, and if
@@ -701,11 +688,14 @@  (define* (process-substitution port store-item destination
                                             #:acl acl
                                             #:deduplicate? deduplicate?
                                             #:print-build-trace?
-                                            print-build-trace?)))
+                                            print-build-trace?
+                                            #:prefer-fast-decompression?
+                                            prefer-fast-decompression?)))
     (download-nar narinfo destination
                   #:status-port port
                   #:deduplicate? deduplicate?
-                  #:print-build-trace? print-build-trace?)))
+                  #:print-build-trace? print-build-trace?
+                  #:prefer-fast-decompression? prefer-fast-decompression?)))
 
 
 ;;;
@@ -895,18 +885,42 @@  (define-command (guix-substitute . args)
         ;; Specify the number of columns of the terminal so the progress
         ;; report displays nicely.
         (parameterize ((current-terminal-columns (client-terminal-columns)))
-          (let loop ()
+          (let loop ((prefer-fast-decompression?
+                      %default-prefer-fast-decompression?))
             (match (read-line)
               ((? eof-object?)
                #t)
               ((= string-tokenize ("substitute" store-path destination))
-               (process-substitution reply-port store-path destination
-                                     #:cache-urls (substitute-urls)
-                                     #:acl (current-acl)
-                                     #:deduplicate? deduplicate?
-                                     #:print-build-trace?
-                                     print-build-trace?)
-               (loop))))))
+               (let ((cpu-usage
+                      (process-substitution reply-port store-path destination
+                                            #:cache-urls (substitute-urls)
+                                            #:acl (current-acl)
+                                            #:deduplicate? deduplicate?
+                                            #:print-build-trace?
+                                            print-build-trace?
+                                            #:prefer-fast-decompression?
+                                            prefer-fast-decompression?)))
+
+                 ;; Create a hysteresis: depending on CPU usage, favor
+                 ;; compression methods with faster decompression (like ztsd)
+                 ;; or methods with better compression ratios (like lzip).
+                 ;; This stems from the observation that substitution can be
+                 ;; CPU-bound when high-speed networks are used:
+                 ;; <https://lists.gnu.org/archive/html/guix-devel/2020-12/msg00177.html>.
+                 ;; To simulate "slow" networking or changing conditions, run:
+                 ;; sudo tc qdisc add dev eno1 root tbf rate 512kbit latency
+                 ;; 50ms burst 1540 and then cancel with: sudo tc qdisc del
+                 ;; dev eno1 root
+                 (loop (cond
+                        ;; Whether to prefer fast decompression over good
+                        ;; compression ratios.  This serves in particular to
+                        ;; choose between lzip (high compression ratio but low
+                        ;; decompression throughput) and zstd (lower
+                        ;; compression ratio but high decompression
+                        ;; throughput).
+                        ((> cpu-usage .8) #t)
+                        ((< cpu-usage .2) #f)
+                        (else prefer-fast-decompression?)))))))))
        (opts
         (leave (G_ "~a: unrecognized options~%") opts))))))