Message ID | 8900fa8c8eef7f72fc97adc2408be26c88de7803.camel@telenet.be |
---|---|
State | New |
Headers | show |
Series | [bug#50384] Optimise search-patch (reducing I/O) | expand |
Context | Check | Description |
---|---|---|
cbaines/applying patch | fail | View Laminar job |
cbaines/issue | success | View issue |
Hi! Some initial comments… Maxime Devos <maximedevos@telenet.be> skribis: > +++ b/guix/gexp.scm > @@ -531,13 +531,37 @@ appears." > (define-gexp-compiler (local-file-compiler (file <local-file>) system target) > ;; "Compile" FILE by adding it to the store. > (match file > - (($ <local-file> file (= force absolute) name sha256 recursive? select?) > - ;; Canonicalize FILE so that if it's a symlink, it is resolved. Failing > - ;; to do that, when RECURSIVE? is #t, we could end up creating a dangling > - ;; symlink in the store, and when RECURSIVE? is #f 'add-to-store' would > - ;; just throw an error, both of which are inconvenient. > - (interned-file absolute name > - #:recursive? recursive? #:select? select?)))) > + ;; Delay computing the absolute file name until 'intern', as this > + ;; might be a relatively expensive computation (e.g. if search-patch > + ;; is used), especially on a spinning disk. > + (($ <local-file> file absolute-promise name sha256 recursive? select?) > + (let () > + (define (intern) > + ;; Canonicalize FILE so that if it's a symlink, it is resolved. > + ;; Failing to do that, when RECURSIVE? is #t, we could end up creating > + ;; a dangling symlink in the store, and when RECURSIVE? is #f > + ;; 'add-to-store' would just throw an error, both of which are > + ;; inconvenient. > + (interned-file (force absolute-promise) name > + #:recursive? recursive? #:select? select?)) > + (if sha256 > + (let ((path (fixed-output-path name sha256 #:recursive? recursive?))) > + ;; If the hash is known in advance and the store already has the > + ;; item, there is no need to intern the file. > + (if (file-exists? path) > + (mbegin %store-monad > + ;; Tell the GC that PATH will be used, such that it won't > + ;; be deleted. > + ((store-lift add-temp-root) path) > + ;; The GC could have deleted the item before add-temp-root > + ;; completed, so check again if PATH exists. > + (if (file-exists? path) > + (return path) > + ;; If it has been removed, fall-back interning. > + (intern))) > + ;; If PATH does not yet exist, fall back to interning. > + (intern))) > + (intern)))))) ‘file-exists?’ won’t work when talking to a remote store (e.g., GUIX_DAEMON_SOCKET=ssh://…). ‘add-temp-root’ doesn’t throw if the given store item does not exist. So it could be written like this: (if sha256 (mbegin %store-monad (add-temp-root* item) (if (valid-path?* item) (return item) (intern))) (intern)) But then, we’d add one RPC for every ‘add-to-store’ RPC corresponding to a patch (you can set “GUIX_PROFILING=rpc” to see the numbers), which is not great. Ludo’.
Maxime Devos <maximedevos@telenet.be> skribis: > +(define-syntax search-patch > + (lambda (s) > + "Search the patch FILE-NAME and compute its hash at expansion time > +if possible. Return #f if not found." > + (syntax-case s () > + ((_ file-name) > + (string? (syntax->datum #'file-name)) > + ;; FILE-NAME is a constant string, so the hash can be computed > + ;; in advance. > + (let ((patch (try-search-patch (syntax->datum #'file-name)))) > + (if patch > + #`(%local-patch-file file-name #,(file-hash* patch #:select? true)) > + (begin > + (warning (source-properties->location > + (syntax-source #'file-name)) > + (G_ "~a: patch not found at expansion time") > + (syntax->datum #'ile-name)) > + #'(%search-patch file-name))))) > + ;; FILE-NAME is variable, so the hash cannot be pre-computed. > + ((_ file-name) #'(%search-patch file-name)) > + ;; search-patch is being used used in a construct like > + ;; (map search-patch ...). > + (id (identifier? #'id) #'%search-patch)))) It’s clever… but also a bit evil, in that it changes the semantics of package files in a surprising way. Modifying foo.patch without recompiling foo.scm would lead you to still use the old foo.patch, which can be rather off-putting and error-prone IMO. To address this, ‘local-file’ could store the inode/mtime + computed store file name (rather than the SHA256). ‘local-file-compiler’ would check whether the actual file has matching inode/mtime before returning the computed store file name. Problem is that inode/mtime are guaranteed to differ once you’ve run “make install”. :-/ Intuitively, I’d have imagined a cache populated at run time; it would map, say, file name/inode/mtime to a store file name. ‘add-to-store’ (or some wrapper above it) would check the cache and return the store file name directly, unless ‘valid-path?’ says it no longer exists. Downside is that this would be a per-user cache and you’d still pay the cost until it’s warm. Advantage is that you could easily tell whether it’s stale. Thoughts? Ludo’.
I split off the base16 and base32 optimisations to a separate patch series: 50456@debbugs.gnu.org.
From 0fc54bdd9ccc9729fff54f5935a552e5e608a1d0 Mon Sep 17 00:00:00 2001 From: Maxime Devos <maximedevos@telenet.be> Date: Sat, 4 Sep 2021 18:10:32 +0200 Subject: [PATCH 6/6] gexp: Do not intern if the file is already in the store. * guix/gexp.scm (local-file-compiler): When the file is already in the store, re-use the fixed output path instead of interning the file again. --- guix/gexp.scm | 38 +++++++++++++++++++++++++++++++------- 1 file changed, 31 insertions(+), 7 deletions(-) diff --git a/guix/gexp.scm b/guix/gexp.scm index c69e4aa299..da1e918801 100644 --- a/guix/gexp.scm +++ b/guix/gexp.scm @@ -531,13 +531,37 @@ appears." (define-gexp-compiler (local-file-compiler (file <local-file>) system target) ;; "Compile" FILE by adding it to the store. (match file - (($ <local-file> file (= force absolute) name sha256 recursive? select?) - ;; Canonicalize FILE so that if it's a symlink, it is resolved. Failing - ;; to do that, when RECURSIVE? is #t, we could end up creating a dangling - ;; symlink in the store, and when RECURSIVE? is #f 'add-to-store' would - ;; just throw an error, both of which are inconvenient. - (interned-file absolute name - #:recursive? recursive? #:select? select?)))) + ;; Delay computing the absolute file name until 'intern', as this + ;; might be a relatively expensive computation (e.g. if search-patch + ;; is used), especially on a spinning disk. + (($ <local-file> file absolute-promise name sha256 recursive? select?) + (let () + (define (intern) + ;; Canonicalize FILE so that if it's a symlink, it is resolved. + ;; Failing to do that, when RECURSIVE? is #t, we could end up creating + ;; a dangling symlink in the store, and when RECURSIVE? is #f + ;; 'add-to-store' would just throw an error, both of which are + ;; inconvenient. + (interned-file (force absolute-promise) name + #:recursive? recursive? #:select? select?)) + (if sha256 + (let ((path (fixed-output-path name sha256 #:recursive? recursive?))) + ;; If the hash is known in advance and the store already has the + ;; item, there is no need to intern the file. + (if (file-exists? path) + (mbegin %store-monad + ;; Tell the GC that PATH will be used, such that it won't + ;; be deleted. + ((store-lift add-temp-root) path) + ;; The GC could have deleted the item before add-temp-root + ;; completed, so check again if PATH exists. + (if (file-exists? path) + (return path) + ;; If it has been removed, fall-back interning. + (intern))) + ;; If PATH does not yet exist, fall back to interning. + (intern))) + (intern)))))) (define-record-type <plain-file> (%plain-file name content references) -- 2.33.0