Message ID | 20200209202358.17bb4a39@tachikoma.lepiller.eu |
---|---|
State | Work in progress |
Headers | show |
Series | [bug#39530] guix: Support partial download | expand |
Context | Check | Description |
---|---|---|
cbaines/comparison | success | View comparision |
cbaines/git branch | success | View Git branch |
cbaines/applying patch | fail | View Laminar job |
Hi, Julien Lepiller <julien@lepiller.eu> skribis: > First, I make sure that the guix daemon will not remove previously > failed attempts when trying to build something again, when that is a > fixed-output derivation. Then, I add a Range HTTP header when > performing an HTTP fetch; this ensures that we only query for the part > we don't already have, and append it to the target file. > > If a partial download fails, the same mirror/url is tried again, but > the partial file is removed first, ensuring we do a complete fetch this > time around. If that failed too, we try with the following url. If we > only perform a complete fetch, we proceed as usual. The next url will > be a partial fetch if there is already something locally. Nice! > However, with that daemon there was a lot of new builds required to run > guix environment guix as my user (and nothing was substituted, which > is weird), whereas with the system's daemon, there was nothing to > build. Maybe there's something fishy in that patch... Hmm, that sounds really weird. Could you clarify what you did? >>From 332793b7f29ea68ac9a1af22e3d1c4745200da7e Mon Sep 17 00:00:00 2001 > From: Julien Lepiller <julien@lepiller.eu> > Date: Sun, 9 Feb 2020 19:47:27 +0100 > Subject: [PATCH] guix: download: Add partial download support. Nitpick: you can remove “guix:” from the subject. > * nix/libstore/build.cc (tryToBuild): Do not remove invalid fixed-output > derivations. > * guix/build/download.scm (http-fetch): Add a range argument. > (url-fetch): Performa partial download if a file already exists. [...] > -(define* (http-fetch uri #:key timeout (verify-certificate? #t)) > +(define* (http-fetch uri #:key timeout (verify-certificate? #t) range) > "Return an input port containing the data at URI, and the expected number of > bytes available or #f. When TIMEOUT is true, bail out if the connection could > not be established in less than TIMEOUT seconds. When VERIFY-CERTIFICATE? is > -true, verify HTTPS certificates; otherwise simply ignore them." > +true, verify HTTPS certificates; otherwise simply ignore them. When RANGE is > +a number, it is the number of bytes we want to skip from the data at URI; > +otherwise the full document is requested." I’d suggest to rename #:range to #:offset because it denotes the start offset. What response do we get if the server doesn’t support “Range”? Can servers silently ignore “Range”? > + (if (file-exists? file) > + (http-fetch uri > + #:verify-certificate? verify-certificate? > + #:timeout timeout > + #:range (stat:size (stat file))) > + (http-fetch uri > + #:verify-certificate? verify-certificate? > + #:timeout timeout)))) I’d remove the ‘if’: (http-fetch … #:offset (and=> (stat file #f) stat:size)) > --- a/nix/libstore/build.cc > +++ b/nix/libstore/build.cc > @@ -1320,6 +1320,7 @@ void DerivationGoal::tryToBuild() > Path path = i->second.path; > if (worker.store.isValidPath(path)) continue; > if (!pathExists(path)) continue; > + if (fixedOutput) continue; Please add a comment above explaining why fixed outputs are not deleted. Also please: not tabs. :-) Thanks! Ludo’.
Le 19 février 2020 11:04:32 GMT-05:00, "Ludovic Courtès" <ludo@gnu.org> a écrit : >Hi, > >Julien Lepiller <julien@lepiller.eu> skribis: > >> First, I make sure that the guix daemon will not remove previously >> failed attempts when trying to build something again, when that is a >> fixed-output derivation. Then, I add a Range HTTP header when >> performing an HTTP fetch; this ensures that we only query for the >part >> we don't already have, and append it to the target file. >> >> If a partial download fails, the same mirror/url is tried again, but >> the partial file is removed first, ensuring we do a complete fetch >this >> time around. If that failed too, we try with the following url. If we >> only perform a complete fetch, we proceed as usual. The next url will >> be a partial fetch if there is already something locally. > >Nice! > >> However, with that daemon there was a lot of new builds required to >run >> guix environment guix as my user (and nothing was substituted, which >> is weird), whereas with the system's daemon, there was nothing to >> build. Maybe there's something fishy in that patch... > >Hmm, that sounds really weird. Could you clarify what you did? Actually I'm not sure how to test the new daemon. I ran the guix daemon from a checkout and probably forgot to set sysconfdir to /etc, so it was missing authorized keys, hence no substitutes. > >>>From 332793b7f29ea68ac9a1af22e3d1c4745200da7e Mon Sep 17 00:00:00 >2001 >> From: Julien Lepiller <julien@lepiller.eu> >> Date: Sun, 9 Feb 2020 19:47:27 +0100 >> Subject: [PATCH] guix: download: Add partial download support. > >Nitpick: you can remove “guix:” from the subject. > >> * nix/libstore/build.cc (tryToBuild): Do not remove invalid >fixed-output >> derivations. >> * guix/build/download.scm (http-fetch): Add a range argument. >> (url-fetch): Performa partial download if a file already exists. > >[...] > >> -(define* (http-fetch uri #:key timeout (verify-certificate? #t)) >> +(define* (http-fetch uri #:key timeout (verify-certificate? #t) >range) >> "Return an input port containing the data at URI, and the expected >number of >> bytes available or #f. When TIMEOUT is true, bail out if the >connection could >> not be established in less than TIMEOUT seconds. When >VERIFY-CERTIFICATE? is >> -true, verify HTTPS certificates; otherwise simply ignore them." >> +true, verify HTTPS certificates; otherwise simply ignore them. When >RANGE is >> +a number, it is the number of bytes we want to skip from the data at >URI; >> +otherwise the full document is requested." > >I’d suggest to rename #:range to #:offset because it denotes the start >offset. > >What response do we get if the server doesn’t support “Range”? > >Can servers silently ignore “Range”? > >> + (if (file-exists? file) >> + (http-fetch uri >> + #:verify-certificate? >verify-certificate? >> + #:timeout timeout >> + #:range (stat:size (stat file))) >> + (http-fetch uri >> + #:verify-certificate? >verify-certificate? >> + #:timeout timeout)))) > >I’d remove the ‘if’: > > (http-fetch … > #:offset (and=> (stat file #f) stat:size)) > >> --- a/nix/libstore/build.cc >> +++ b/nix/libstore/build.cc >> @@ -1320,6 +1320,7 @@ void DerivationGoal::tryToBuild() >> Path path = i->second.path; >> if (worker.store.isValidPath(path)) continue; >> if (!pathExists(path)) continue; >> + if (fixedOutput) continue; > >Please add a comment above explaining why fixed outputs are not >deleted. > >Also please: not tabs. :-) > >Thanks! > >Ludo’. Thanks!
Julien Lepiller <julien@lepiller.eu> skribis:
> Actually I'm not sure how to test the new daemon. I ran the guix daemon from a checkout and probably forgot to set sysconfdir to /etc, so it was missing authorized keys, hence no substitutes.
Ah yes. This should work:
sudo -E ./pre-inst-env guix-daemon --build-users-group=… …
Ludo’.
From 332793b7f29ea68ac9a1af22e3d1c4745200da7e Mon Sep 17 00:00:00 2001 From: Julien Lepiller <julien@lepiller.eu> Date: Sun, 9 Feb 2020 19:47:27 +0100 Subject: [PATCH] guix: download: Add partial download support. * nix/libstore/build.cc (tryToBuild): Do not remove invalid fixed-output derivations. * guix/build/download.scm (http-fetch): Add a range argument. (url-fetch): Performa partial download if a file already exists. --- guix/build/download.scm | 36 +++++++++++++++++++++++++++--------- nix/libstore/build.cc | 1 + 2 files changed, 28 insertions(+), 9 deletions(-) diff --git a/guix/build/download.scm b/guix/build/download.scm index 0f2d5f402a..c2b710becc 100644 --- a/guix/build/download.scm +++ b/guix/build/download.scm @@ -654,11 +654,13 @@ Return the resulting target URI." #:query (uri-query ref) #:fragment (uri-fragment ref))))) -(define* (http-fetch uri #:key timeout (verify-certificate? #t)) +(define* (http-fetch uri #:key timeout (verify-certificate? #t) range) "Return an input port containing the data at URI, and the expected number of bytes available or #f. When TIMEOUT is true, bail out if the connection could not be established in less than TIMEOUT seconds. When VERIFY-CERTIFICATE? is -true, verify HTTPS certificates; otherwise simply ignore them." +true, verify HTTPS certificates; otherwise simply ignore them. When RANGE is +a number, it is the number of bytes we want to skip from the data at URI; +otherwise the full document is requested." (define headers `(;; Some web sites, such as http://dist.schmorp.de, would block you if @@ -670,6 +672,10 @@ true, verify HTTPS certificates; otherwise simply ignore them." ;; Acceptable" when not explicitly told that everything is accepted. (Accept . "*/*") + ,@(if range + `((Range . ,(string-append "bytes=" (number->string range) "-"))) + '()) + ;; Basic authentication, if needed. ,@(match (uri-userinfo uri) ((? string? str) @@ -690,7 +696,8 @@ true, verify HTTPS certificates; otherwise simply ignore them." ((code) (response-code resp))) (case code - ((200) ; OK + ((200 ; OK + 206) ; Partial content (values port (response-content-length resp))) ((301 ; moved permanently 302 ; found (redirection) @@ -703,7 +710,8 @@ true, verify HTTPS certificates; otherwise simply ignore them." (close connection) (http-fetch uri #:timeout timeout - #:verify-certificate? verify-certificate?))) + #:verify-certificate? verify-certificate? + #:range range))) (else (error "download failed" (uri->string uri) code (response-reason-phrase resp)))))) @@ -775,10 +783,15 @@ otherwise simply ignore them." ((http https) (false-if-exception* (let-values (((port size) - (http-fetch uri - #:verify-certificate? verify-certificate? - #:timeout timeout))) - (call-with-output-file file + (if (file-exists? file) + (http-fetch uri + #:verify-certificate? verify-certificate? + #:timeout timeout + #:range (stat:size (stat file))) + (http-fetch uri + #:verify-certificate? verify-certificate? + #:timeout timeout)))) + (call-with-port (open-file file (if (file-exists? file) "a" "w")) (lambda (output) (dump-port* port output #:buffer-size %http-receive-buffer-size @@ -817,7 +830,12 @@ otherwise simply ignore them." (let try ((uri (append uri content-addressed-uris))) (match uri ((uri tail ...) - (or (fetch uri file) + (or (if (file-exists? file) + ;; If the file already exists, we do a partial fetch for the + ;; remainder. If something went wrong, we remove the file + ;; and try to fetch the whole file instead. + (or (fetch uri file) (begin (delete-file file) (fetch uri file))) + (fetch uri file)) (try tail))) (() (format (current-error-port) "failed to download ~s from ~s~%" diff --git a/nix/libstore/build.cc b/nix/libstore/build.cc index 17e92c68a7..176ab40226 100644 --- a/nix/libstore/build.cc +++ b/nix/libstore/build.cc @@ -1320,6 +1320,7 @@ void DerivationGoal::tryToBuild() Path path = i->second.path; if (worker.store.isValidPath(path)) continue; if (!pathExists(path)) continue; + if (fixedOutput) continue; debug(format("removing invalid path `%1%'") % path); deletePath(path); } -- 2.24.0