diff mbox series

[bug#58017,2/2] substitute: Retry downloading when a nar is unavailable.

Message ID 20220923061957.5658-2-ludo@gnu.org
State Accepted
Headers show
Series Retry nar downloads upon failure | expand

Checks

Context Check Description
cbaines/comparison success View comparision
cbaines/git-branch success View Git branch
cbaines/applying patch success View Laminar job
cbaines/issue success View issue

Commit Message

Ludovic Courtès Sept. 23, 2022, 6:19 a.m. UTC
Fixes <https://issues.guix.gnu.org/57978>
Reported by Attila Lendvai <attila@lendvai.name>.

Previously, if a narinfo was available but its corresponding nar was
missing (for instance because the narinfo was cached and the server
became unreachable in the meantime), 'guix substitute --substitute'
would try to download the nar from its preferred location and abort when
that fails.  This change forces one retry with each of the URLs.

* guix/scripts/substitute.scm (download-nar): Do not catch
'http-get-error?' exceptions.
(system-error?, network-error?, process-substitution/fallback): New
procedures.
(process-substitution): Call 'process-substitution/fallback' upon
'network-error?'.
* tests/substitute.scm ("substitute, first URL has narinfo but lacks nar, second URL unauthorized")
("substitute, first URL has narinfo but nar is 404, both URLs authorized")
("substitute, first URL has narinfo but nar is 404, one URL authorized")
("substitute, narinfo is available but nar is missing"): New tests.
---
 guix/scripts/substitute.scm | 113 ++++++++++++++++++++++++++++--------
 tests/substitute.scm        | 113 ++++++++++++++++++++++++++++++++++++
 2 files changed, 203 insertions(+), 23 deletions(-)

Comments

Simon Tournier Sept. 23, 2022, 8:17 a.m. UTC | #1
Hi,

On ven., 23 sept. 2022 at 08:19, Ludovic Courtès <ludo@gnu.org> wrote:

> Fixes <https://issues.guix.gnu.org/57978>
> Reported by Attila Lendvai <attila@lendvai.name>.
>
> Previously, if a narinfo was available but its corresponding nar was
> missing (for instance because the narinfo was cached and the server
> became unreachable in the meantime), 'guix substitute --substitute'
> would try to download the nar from its preferred location and abort when
> that fails.  This change forces one retry with each of the URLs.
>
> * guix/scripts/substitute.scm (download-nar): Do not catch
> 'http-get-error?' exceptions.
> (system-error?, network-error?, process-substitution/fallback): New
> procedures.
> (process-substitution): Call 'process-substitution/fallback' upon
> 'network-error?'.
> * tests/substitute.scm ("substitute, first URL has narinfo but lacks nar, second URL unauthorized")
> ("substitute, first URL has narinfo but nar is 404, both URLs authorized")
> ("substitute, first URL has narinfo but nar is 404, one URL authorized")
> ("substitute, narinfo is available but nar is missing"): New tests.

LGTM.


> +(test-equal "substitute, first URL has narinfo but nar is 404, one URL authorized"
> +  "Substitutable data."
> +  (with-narinfo*
> +      (string-append %narinfo "Signature: "
> +                     (signature-field
> +                      %narinfo
> +                      #:public-key %wrong-public-key))
> +      %main-substitute-directory
> +
> +    (with-http-server `((200 ,(string-append %narinfo "Signature: "
> +                                             (signature-field
> +                                              %narinfo
> +                                              #:public-key %wrong-public-key)))
> +                        (404 "Sorry, nar is missing!"))
> +      (let ((url1 (%local-url)))
> +        (parameterize ((%http-server-port 0))
> +          (with-http-server `((200 ,(string-append %narinfo "Signature: "
> +                                                   (signature-field %narinfo)))
> +                              (404 "Sorry, nar is missing!"))
> +            (let ((url2 (%local-url)))
> +              (dynamic-wind
> +                (const #t)
> +                (lambda ()
> +                  (parameterize ((substitute-urls
> +                                  (list url1 url2
> +                                        (string-append "file://"
> +                                                       %main-substitute-directory))))
> +                    (request-substitution (string-append (%store-prefix)
> +                                                         "/aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa-foo")
> +                                          "substitute-retrieved"))
> +                  (call-with-input-file "substitute-retrieved" get-string-all))
> +                (lambda ()
> +                  (false-if-exception (delete-file "substitute-retrieved")))))))))))

Although I do not understand this test.  Why is 404 appearing twice?


Cheers,
simon
M Sept. 24, 2022, 1:57 a.m. UTC | #2
> +(test-equal "substitute, first URL has narinfo but nar is 404, both URLs authorized"
> +  "Substitutable data."
> +  (with-narinfo*
> +      (string-append %narinfo "Signature: "
> +                     (signature-field %narinfo))
> +      %main-substitute-directory
> +
> +    (with-http-server `((200 ,(string-append %narinfo "Signature: "
> +                                             (signature-field %narinfo)))
> +                        (404 "Sorry, nar is missing!"))
> +      (dynamic-wind
> +        (const #t)
> +        (lambda ()
> +          (parameterize ((substitute-urls
> +                          (list (%local-url)
> +                                (string-append "file://"
> +                                               %main-substitute-directory))))
> +            (request-substitution (string-append (%store-prefix)
> +                                                 "/aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa-foo")
> +                                  "substitute-retrieved"))
> +          (call-with-input-file "substitute-retrieved" get-string-all))
> +        (lambda ()
> +          (false-if-exception (delete-file "substitute-retrieved")))))))

Shouldn't it only ignore 'file not found' (ENOENT?) exceptions?
If the exception handling is refined a bit, it becomes a bit more 
complicated, and could be simplified to (when [exists] [delete]), as 
there are no atomicity concerns.

This test, and some others, can be improved by also checking the URI. 
While currently 'with-http-server' does not support that, there are (5 
months, with the v1 having seen some reviewing and a v2 available) 
patches for that at <https://issues.guix.gnu.org/53389>.

That patch also _requires_ always mentioning the URI, if the cover 
letter is correct.  It also allows simplifying the use of '%local-url' a 
bit.

Greetings,
Maxime.
Ludovic Courtès Sept. 24, 2022, 4:20 p.m. UTC | #3
Hi!

zimoun <zimon.toutoune@gmail.com> skribis:

>> +  (with-narinfo*
>> +      (string-append %narinfo "Signature: "
>> +                     (signature-field
>> +                      %narinfo
>> +                      #:public-key %wrong-public-key))
>> +      %main-substitute-directory
>> +
>> +    (with-http-server `((200 ,(string-append %narinfo "Signature: "
>> +                                             (signature-field
>> +                                              %narinfo
>> +                                              #:public-key %wrong-public-key)))
>> +                        (404 "Sorry, nar is missing!"))
>> +      (let ((url1 (%local-url)))
>> +        (parameterize ((%http-server-port 0))
>> +          (with-http-server `((200 ,(string-append %narinfo "Signature: "
>> +                                                   (signature-field %narinfo)))
>> +                              (404 "Sorry, nar is missing!"))
>> +            (let ((url2 (%local-url)))
>> +              (dynamic-wind
>> +                (const #t)
>> +                (lambda ()
>> +                  (parameterize ((substitute-urls
>> +                                  (list url1 url2
>> +                                        (string-append "file://"
>> +                                                       %main-substitute-directory))))

[...]

> Although I do not understand this test.  Why is 404 appearing twice?

That’s because it’s testing with 3 substitute URLs.

Thanks for taking a look!

Ludo’.
Ludovic Courtès Sept. 24, 2022, 4:22 p.m. UTC | #4
Hi Maxime,

Maxime Devos <maximedevos@telenet.be> skribis:

>> +(test-equal "substitute, first URL has narinfo but nar is 404, both URLs authorized"
>> +  "Substitutable data."
>> +  (with-narinfo*
>> +      (string-append %narinfo "Signature: "
>> +                     (signature-field %narinfo))
>> +      %main-substitute-directory
>> +
>> +    (with-http-server `((200 ,(string-append %narinfo "Signature: "
>> +                                             (signature-field %narinfo)))
>> +                        (404 "Sorry, nar is missing!"))
>> +      (dynamic-wind
>> +        (const #t)
>> +        (lambda ()
>> +          (parameterize ((substitute-urls
>> +                          (list (%local-url)
>> +                                (string-append "file://"
>> +                                               %main-substitute-directory))))
>> +            (request-substitution (string-append (%store-prefix)
>> +                                                 "/aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa-foo")
>> +                                  "substitute-retrieved"))
>> +          (call-with-input-file "substitute-retrieved" get-string-all))
>> +        (lambda ()
>> +          (false-if-exception (delete-file "substitute-retrieved")))))))
>
> Shouldn't it only ignore 'file not found' (ENOENT?) exceptions?

By “it”, do you mean ‘dynamic-wind’ should be replaced by a ‘catch’
form?

We could discuss it, but note that this patch just keeps with the style
of existing tests.

> This test, and some others, can be improved by also checking the
> URI. While currently 'with-http-server' does not support that, there
> are (5 months, with the v1 having seen some reviewing and a v2
> available) patches for that at <https://issues.guix.gnu.org/53389>.
>
> That patch also _requires_ always mentioning the URI, if the cover
> letter is correct.  It also allows simplifying the use of '%local-url'
> a bit.

Ah, thanks for the reminder!  I’ve just spent most of the day reviewing
patches, but not that one…

Ludo’.
M Sept. 24, 2022, 5:18 p.m. UTC | #5
On 24-09-2022 18:22, Ludovic Courtès wrote:
> Hi Maxime,
> 
> Maxime Devos <maximedevos@telenet.be> skribis:
> 
>>> +(test-equal "substitute, first URL has narinfo but nar is 404, both URLs authorized"
>>> +  "Substitutable data."
>>> +  (with-narinfo*
>>> +      (string-append %narinfo "Signature: "
>>> +                     (signature-field %narinfo))
>>> +      %main-substitute-directory
>>> +
>>> +    (with-http-server `((200 ,(string-append %narinfo "Signature: "
>>> +                                             (signature-field %narinfo)))
>>> +                        (404 "Sorry, nar is missing!"))
>>> +      (dynamic-wind
>>> +        (const #t)
>>> +        (lambda ()
>>> +          (parameterize ((substitute-urls
>>> +                          (list (%local-url)
>>> +                                (string-append "file://"
>>> +                                               %main-substitute-directory))))
>>> +            (request-substitution (string-append (%store-prefix)
>>> +                                                 "/aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa-foo")
>>> +                                  "substitute-retrieved"))
>>> +          (call-with-input-file "substitute-retrieved" get-string-all))
>>> +        (lambda ()
>>> +          (false-if-exception (delete-file "substitute-retrieved")))))))
>>
>> Shouldn't it only ignore 'file not found' (ENOENT?) exceptions?
> 
> By “it”, do you mean ‘dynamic-wind’ should be replaced by a ‘catch’
> form?

No, I'm not referring to the dynamic-wind as a whole, rather 'it' = the 
following code:

  (false-if-exception (delete-file "substitute-retrieved"))

-- the catch can stay, AFAIK.

> We could discuss it, but note that this patch just keeps with the style
> of existing tests.

For the reasons given, I don't think this style should be continued, 
though I suppose all of them can be done at once in a separate patch.

Greetings,
Maxime.
diff mbox series

Patch

diff --git a/guix/scripts/substitute.scm b/guix/scripts/substitute.scm
index e3b382d0d8..cf59db4315 100755
--- a/guix/scripts/substitute.scm
+++ b/guix/scripts/substitute.scm
@@ -460,25 +460,20 @@  (define (fetch uri)
        (let ((port (open-file (uri-path uri) "r0b")))
          (values port (stat:size (stat port)))))
       ((http https)
-       (guard (c ((http-get-error? c)
-                  (leave (G_ "download from '~a' failed: ~a, ~s~%")
-                         (uri->string (http-get-error-uri c))
-                         (http-get-error-code c)
-                         (http-get-error-reason c))))
-         ;; Test this with:
-         ;;   sudo tc qdisc add dev eth0 root netem delay 1500ms
-         ;; and then cancel with:
-         ;;   sudo tc qdisc del dev eth0 root
-         (with-timeout %fetch-timeout
-           (begin
-             (warning (G_ "while fetching ~a: server is somewhat slow~%")
-                      (uri->string uri))
-             (warning (G_ "try `--no-substitutes' if the problem persists~%")))
-           (with-cached-connection uri port
-             (http-fetch uri #:text? #f
-                         #:port port
-                         #:keep-alive? #t
-                         #:buffered? #f)))))
+       ;; Test this with:
+       ;;   sudo tc qdisc add dev eth0 root netem delay 1500ms
+       ;; and then cancel with:
+       ;;   sudo tc qdisc del dev eth0 root
+       (with-timeout %fetch-timeout
+         (begin
+           (warning (G_ "while fetching ~a: server is somewhat slow~%")
+                    (uri->string uri))
+           (warning (G_ "try `--no-substitutes' if the problem persists~%")))
+         (with-cached-connection uri port
+           (http-fetch uri #:text? #f
+                       #:port port
+                       #:keep-alive? #t
+                       #:buffered? #f))))
       (else
        (leave (G_ "unsupported substitute URI scheme: ~a~%")
               (uri->string uri)))))
@@ -572,6 +567,68 @@  (define cpu-usage
                     (bytevector->nix-base32-string expected)
                     (bytevector->nix-base32-string actual)))))))
 
+(define system-error?
+  (let ((kind-and-args? (exception-predicate &exception-with-kind-and-args)))
+    (lambda (exception)
+      "Return true if EXCEPTION is a Guile 'system-error exception."
+      (and (kind-and-args? exception)
+           (eq? 'system-error (exception-kind exception))))))
+
+(define network-error?
+  (let ((kind-and-args? (exception-predicate &exception-with-kind-and-args)))
+    (lambda (exception)
+      "Return true if EXCEPTION denotes a networking error."
+      (or (and (system-error? exception)
+               (let ((errno (system-error-errno
+                             (cons 'system-error (exception-args exception)))))
+                 (memv errno (list ECONNRESET ECONNABORTED
+                                   ECONNREFUSED EHOSTUNREACH
+                                   ENOENT))))     ;for "file://"
+          (and (kind-and-args? exception)
+               (memq (exception-kind exception)
+                     '(gnutls-error getaddrinfo-error)))
+          (and (http-get-error? exception)
+               (begin
+                 (warning (G_ "download from '~a' failed: ~a, ~s~%")
+                          (uri->string (http-get-error-uri exception))
+                          (http-get-error-code exception)
+                          (http-get-error-reason exception))
+                 #t))))))
+
+(define* (process-substitution/fallback port narinfo destination
+                                        #:key cache-urls acl
+                                        deduplicate? print-build-trace?)
+  "Attempt to substitute NARINFO, which is assumed to be authorized or
+equivalent, by trying to download its nar from each entry in CACHE-URLS.
+
+This can be less efficient than 'lookup-narinfo', which stops at the first
+entry that provides a valid narinfo, but it makes sure we eventually find a
+way to download the nar."
+  ;; Note: Keep NARINFO's uri-base in CACHE-URLS: that lets us retry in case
+  ;; this was a transient issue.
+  (let loop ((cache-urls cache-urls))
+    (match cache-urls
+      (()
+       (leave (G_ "failed to find alternative substitute for '~a'~%")
+              (narinfo-path narinfo)))
+      ((cache-url rest ...)
+       (match (lookup-narinfos cache-url
+                               (list (narinfo-path narinfo))
+                               #:open-connection
+                               open-connection-for-uri/cached)
+         ((alternate)
+          (if (or (equivalent-narinfo? narinfo alternate)
+                  (valid-narinfo? alternate acl)
+                  (%allow-unauthenticated-substitutes?))
+              (guard (c ((network-error? c) (loop rest)))
+                (download-nar alternate destination
+                              #:status-port port
+                              #:deduplicate? deduplicate?
+                              #:print-build-trace? print-build-trace?))
+              (loop rest)))
+         (()
+          (loop rest)))))))
+
 (define* (process-substitution port store-item destination
                                #:key cache-urls acl
                                deduplicate? print-build-trace?)
@@ -590,10 +647,20 @@  (define narinfo
     (leave (G_ "no valid substitute for '~a'~%")
            store-item))
 
-  (download-nar narinfo destination
-                #:status-port port
-                #:deduplicate? deduplicate?
-                #:print-build-trace? print-build-trace?))
+  (guard (c ((network-error? c)
+             (format (current-error-port)
+                     (G_ "retrying download of '~a' with other substitute URLs...~%")
+                     store-item)
+             (process-substitution/fallback port narinfo destination
+                                            #:cache-urls cache-urls
+                                            #:acl acl
+                                            #:deduplicate? deduplicate?
+                                            #:print-build-trace?
+                                            print-build-trace?)))
+    (download-nar narinfo destination
+                  #:status-port port
+                  #:deduplicate? deduplicate?
+                  #:print-build-trace? print-build-trace?)))
 
 
 ;;;
diff --git a/tests/substitute.scm b/tests/substitute.scm
index 5315292987..9032a50268 100644
--- a/tests/substitute.scm
+++ b/tests/substitute.scm
@@ -523,6 +523,119 @@  (define-syntax-rule (with-narinfo* narinfo directory body ...)
         (lambda ()
           (false-if-exception (delete-file "substitute-retrieved")))))))
 
+(test-equal "substitute, first URL has narinfo but lacks nar, second URL unauthorized"
+  "Substitutable data."
+  (with-narinfo*
+      (string-append %narinfo "Signature: "
+                     (signature-field
+                      %narinfo
+                      #:public-key %wrong-public-key))
+      %alternate-substitute-directory
+
+    (with-narinfo* (string-append %narinfo "Signature: "
+                                  (signature-field %narinfo))
+        %main-substitute-directory
+
+      (dynamic-wind
+        (const #t)
+        (lambda ()
+          ;; Remove this file so that the substitute can only be retrieved
+          ;; from %ALTERNATE-SUBSTITUTE-DIRECTORY.
+          (delete-file (string-append %main-substitute-directory
+                                      "/example.nar"))
+
+          (parameterize ((substitute-urls
+                          (map (cut string-append "file://" <>)
+                               (list %main-substitute-directory
+                                     %alternate-substitute-directory))))
+            (request-substitution (string-append (%store-prefix)
+                                                 "/aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa-foo")
+                                  "substitute-retrieved"))
+          (call-with-input-file "substitute-retrieved" get-string-all))
+        (lambda ()
+          (false-if-exception (delete-file "substitute-retrieved")))))))
+
+(test-equal "substitute, first URL has narinfo but nar is 404, both URLs authorized"
+  "Substitutable data."
+  (with-narinfo*
+      (string-append %narinfo "Signature: "
+                     (signature-field %narinfo))
+      %main-substitute-directory
+
+    (with-http-server `((200 ,(string-append %narinfo "Signature: "
+                                             (signature-field %narinfo)))
+                        (404 "Sorry, nar is missing!"))
+      (dynamic-wind
+        (const #t)
+        (lambda ()
+          (parameterize ((substitute-urls
+                          (list (%local-url)
+                                (string-append "file://"
+                                               %main-substitute-directory))))
+            (request-substitution (string-append (%store-prefix)
+                                                 "/aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa-foo")
+                                  "substitute-retrieved"))
+          (call-with-input-file "substitute-retrieved" get-string-all))
+        (lambda ()
+          (false-if-exception (delete-file "substitute-retrieved")))))))
+
+(test-equal "substitute, first URL has narinfo but nar is 404, one URL authorized"
+  "Substitutable data."
+  (with-narinfo*
+      (string-append %narinfo "Signature: "
+                     (signature-field
+                      %narinfo
+                      #:public-key %wrong-public-key))
+      %main-substitute-directory
+
+    (with-http-server `((200 ,(string-append %narinfo "Signature: "
+                                             (signature-field
+                                              %narinfo
+                                              #:public-key %wrong-public-key)))
+                        (404 "Sorry, nar is missing!"))
+      (let ((url1 (%local-url)))
+        (parameterize ((%http-server-port 0))
+          (with-http-server `((200 ,(string-append %narinfo "Signature: "
+                                                   (signature-field %narinfo)))
+                              (404 "Sorry, nar is missing!"))
+            (let ((url2 (%local-url)))
+              (dynamic-wind
+                (const #t)
+                (lambda ()
+                  (parameterize ((substitute-urls
+                                  (list url1 url2
+                                        (string-append "file://"
+                                                       %main-substitute-directory))))
+                    (request-substitution (string-append (%store-prefix)
+                                                         "/aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa-foo")
+                                          "substitute-retrieved"))
+                  (call-with-input-file "substitute-retrieved" get-string-all))
+                (lambda ()
+                  (false-if-exception (delete-file "substitute-retrieved")))))))))))
+
+(test-quit "substitute, narinfo is available but nar is missing"
+    "failed to find alternative substitute"
+  (with-narinfo*
+      (string-append %narinfo "Signature: "
+                     (signature-field
+                      %narinfo
+                      #:public-key %wrong-public-key))
+      %main-substitute-directory
+
+    (with-http-server `((200 ,(string-append %narinfo "Signature: "
+                                             (signature-field %narinfo)))
+                        (404 "Sorry, nar is missing!"))
+      (parameterize ((substitute-urls
+                      (list (%local-url)
+                            (string-append "file://"
+                                           %main-substitute-directory))))
+        (delete-file (string-append %main-substitute-directory
+                                    "/example.nar"))
+        (request-substitution (string-append (%store-prefix)
+                                             "/aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa-foo")
+                              "substitute-retrieved")
+        (not (file-exists? "substitute-retrieved"))))))
+
 (test-equal "substitute, first narinfo is unsigned and has wrong hash"
   "Substitutable data."
   (with-narinfo* (regexp-substitute #f