Message ID | 20210928214044.437-1-attila@lendvai.name |
---|---|
State | New |
Series | [bug#50878] union: Resolve collisions by stable-sort'ing them. |
Context | Check | Description |
---|---|---|
cbaines/comparison | success | View comparison |
cbaines/git branch | success | View Git branch |
cbaines/applying patch | success | View Laminar job |
cbaines/issue | success | View issue |
Attila Lendvai wrote on Tue 28-09-2021 at 23:40 [+0200]:
> * guix/build/union.scm (resolve-collision/alphanumeric-last): New function.
> (warn-about-collision): Renamed to default-collision-resolver.
> ---
>
> this should work, but i cannot test it, because srfi-43 seems not to be
> available on the build side:
>
> unpacking bootstrap Guile to '/home/alendvai/workspace/guix/guix/test-tmp/store/qky0jf68rr7pnsvmhj0ay42rzh4qk6r9-guile-bootstrap-2.0'...
> [...] output without srfi-43.go
>
> and then unsurprisingly: "no code for module (srfi srfi-43)"

SRFI-43 is in Guile since Guile 2.0.10, according to Guile's NEWS.
The bootstrap guile is older:

$(guix build -e '(@@ (gnu packages bootstrap) %bootstrap-guile)')/bin/guile --version
guile (GNU Guile) 2.0.9
[...]

Greetings,
Maxime.

> SRFI-43 is in Guile since Guile 2.0.10, according to Guile's NEWS.
> The bootstrap guile is older:
>
> $(guix build -e '(@@ (gnu packages bootstrap) %bootstrap-guile)')/bin/guile --version
>
> guile (GNU Guile) 2.0.9

thank you for the analysis!

is it easy and desirable to upgrade it to 2.0.10 or newer?

shall i try to do it, or advocate for it if it's not trivial, by e.g. opening an issue?

- attila
PGP: 5D5F 45C7 DFCD 0A39

Hi,

On Tuesday, 28.09.2021, 23:40 +0200, Attila Lendvai wrote:
> [...]
> index 961ac3298b..747902ec6c 100644
> --- a/guix/build/union.scm
> +++ b/guix/build/union.scm
> @@ -23,11 +23,12 @@
>    #:use-module (ice-9 format)
>    #:use-module (srfi srfi-1)
>    #:use-module (srfi srfi-26)
> +  #:use-module (srfi srfi-43)
>    #:use-module (rnrs bytevectors)
>    #:use-module (rnrs io ports)
>    #:export (union-build
>
> -            warn-about-collision
> +            default-collision-resolver
>
>              relative-file-name
>              symlink-relative))
> @@ -102,10 +103,23 @@ identical, #f otherwise."
>    ;; applications via 'glib-or-gtk-build-system'.
>    '("icon-theme.cache" "gschemas.compiled"))
>
> -(define (warn-about-collision files)
> -  "Handle the collision among FILES by emitting a warning and choosing the
> -first one of THEM."
> -  (let ((file (first files)))
> +(define (resolve-collision/alphanumeric-last files)
> +  ;; Let's do a stable-sort at least, so that multiple foo-1.2.3/bin/foo
> +  ;; variants will predictably resolve to the highest versioned one.
> +  (let* ((original-files (list->vector files))
> +         (count (vector-length original-files))
> +         (stripped-files (vector-map (lambda (_ el)
> +                                       (strip-store-file-name el))
> +                                     original-files))
> +         (indices (vector-unfold values count)))
> +    (stable-sort! indices
> +                  (lambda (a b)
> +                    (string> (vector-ref stripped-files a)
> +                             (vector-ref stripped-files b))))
> +    (vector-ref original-files (vector-ref indices 0))))

Instead of stable-sort!-ing the indices of a vector, what about
stable-sort!-ing (map strip-store-file-name original-files) in more or less
one go?

> +(define (default-collision-resolver files)
> +  (let ((file (resolve-collision/alphanumeric-last files)))
>      (unless (member (basename file) %harmless-collisions)
>        (format (current-error-port)
>                "~%warning: collision encountered:~%~{ ~a~%~}"
> @@ -117,7 +131,7 @@ first one of THEM."
>                        #:key (log-port (current-error-port))
>                        (create-all-directories? #f)
>                        (symlink symlink)
> -                      (resolve-collision warn-about-collision))
> +                      (resolve-collision default-collision-resolver))
>    "Build in the OUTPUT directory a symlink tree that is the union of all the
>  INPUTS, using SYMLINK to create symlinks. As a special case, if
>  CREATE-ALL-DIRECTORIES?, creates the subdirectories in the output directory to

I don't think the default collision resolver ought to sort the files.

The rationale behind ignoring certain collisions, e.g. icon caches
relies on the fact that Guix will use the correct files because they
are put first in the manifest. The hooks themselves have no special
names that could put them "always first" and profiles are themselves
union-built.

I do however support the addition of sorting methods as collision
resolvers in general and would welcome a way of doing so for profiles
pre-hook.

Regards,
Liliana

Attila Lendvai wrote on Wed 29-09-2021 at 16:03 [+0000]:
> > SRFI-43 is in Guile since Guile 2.0.10, according to Guile's NEWS.
> > The bootstrap guile is older:
> >
> > $(guix build -e '(@@ (gnu packages bootstrap) %bootstrap-guile)')/bin/guile --version
> >
> > guile (GNU Guile) 2.0.9
>
> thank you for the analysis!
>
> is it easy and desirable to upgrade it to 2.0.10 or newer?

It's possible by modifying 'bootstrap-guile-url-path' and
'bootstrap-guile-hash'. Apparently, some architectures already use a
newer guile. E.g., aarch64 has 2.0.14.

However, this probably would entail a world-rebuild I think, so this
probably needs to be done on core-updates. Why limit to guile 2.0.10,
why not go for guile 3.0.7 instead? I don't know how one would go about
updating the bootstrap binaries though.

Anyway, there have been quite a few bug fixes and new features since
2.0.9, and updating to guile 3.0.? would allow dropping some
compatibility code in various guix/build/, so I wouldn't be opposed to
such a change.

Greetings,
Maxime

> > - (let* ((original-files (list->vector files))
> > -        (count (vector-length original-files))
> > -        (stripped-files (vector-map (lambda (_ el)
> > -                                      (strip-store-file-name el))
> > -                                    original-files))
> > -        (indices (vector-unfold values count)))
> >
> > - (stable-sort! indices
> > -               (lambda (a b)
> > -                 (string> (vector-ref stripped-files a)
> > -                          (vector-ref stripped-files b))))
> > - (vector-ref original-files (vector-ref indices 0))))
>
> Instead of stable-sort!-ing the indices of a vector, what about
> stable-sort!-ing (map strip-store-file-name original-files) in more or less
> one go?

the hash also needs to be dropped from the path for sorting to be
useful, but the return value must be the full path, hence the
complexity with sorting the indices, pointing both to the full paths
and the cut parts.

> I don't think the default collision resolver ought to sort the files.
>
> The rationale behind ignoring certain collisions, e.g. icon caches
> relies on the fact that Guix will use the correct files because they
> are put first in the manifest. The hooks themselves have no special
> names that could put them "always first" and profiles are themselves
> union-built.
>
> I do however support the addition of sorting methods as collision
> resolvers in general and would welcome a way of doing so for profiles
> pre-hook.

please note that i almost completely lack the knowledge of the
relevant internals.

with that in mind, do i read you right? this should add a new
exported resolver function, and leave the default one as it was?

and use the new resolver somewhere (where?) that will affect only the
union of profiles, i.e. user visible effects when installing software?
(because DIRECTORY-UNION is called in other contexts also?)

thanks for the insights! i'd appreciate a bit more higher level
guidance, and then i'll resend the patch accordingly.

- attila
PGP: 5D5F 45C7 DFCD 0A39

Attila Lendvai wrote on Thu 30-09-2021 at 08:10 [+0000]:
> > > - (let* ((original-files (list->vector files))
> > > -        (count (vector-length original-files))
> > > -        (stripped-files (vector-map (lambda (_ el)
> > > -                                      (strip-store-file-name el))
> > > -                                    original-files))
> > > -        (indices (vector-unfold values count)))
> > >
> > > - (stable-sort! indices
> > > -               (lambda (a b)
> > > -                 (string> (vector-ref stripped-files a)
> > > -                          (vector-ref stripped-files b))))
> > > - (vector-ref original-files (vector-ref indices 0))))
> >
> > Instead of stable-sort!-ing the indices of a vector, what about
> > stable-sort!-ing (map strip-store-file-name original-files) in more or less
> > one go?
>
> the hash also needs to be dropped from the path for sorting to be
> useful, but the return value must be the full path, hence the
> complexity with sorting the indices, pointing both to the full paths
> and the cut parts.

You can replace the 'less' argument of 'stable-sort'.
Example sorting by the second character of a string:

(sort '("za" "yb" "xc") (lambda (x y)
                          (char>? (string-ref x 1)
                                  (string-ref y 1))))

IIUC, you would need to replace char>? by string> and string-ref
by strip-store-file-name.

Greetings,
Maxime.

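The comparator approach suggested above can be sketched as follows. This is a minimal illustration under stated assumptions, not the code that was eventually committed: `strip-store-prefix` is a hypothetical stand-in for `strip-store-file-name` from `(guix build utils)` that assumes a `/gnu/store` prefix and a 32-character hash, and `pick-alphanumeric-last` and `store-file` are illustrative names.

```scheme
(use-modules (srfi srfi-1))

;; Hypothetical stand-in for 'strip-store-file-name': assumes the store
;; directory is "/gnu/store" and that hashes are 32 characters long.
(define (strip-store-prefix file)
  (let ((prefix-length (+ (string-length "/gnu/store/") 32 1)))
    (if (and (string-prefix? "/gnu/store/" file)
             (> (string-length file) prefix-length))
        (string-drop file prefix-length)
        file)))

;; Sort the full file names, comparing only their stripped forms, and
;; return the first one.  'stable-sort' does not modify FILES.
(define (pick-alphanumeric-last files)
  (first (stable-sort files
                      (lambda (a b)
                        (string>? (strip-store-prefix a)
                                  (strip-store-prefix b))))))

;; Illustrative helper that builds a fake store file name.
(define (store-file hash-char name)
  (string-append "/gnu/store/" (make-string 32 hash-char) "-" name))

;; Picks the idris-2.0.0 entry, regardless of the hash prefixes.
(display (pick-alphanumeric-last
          (list (store-file #\c "idris-0.0.0/bin/idris")
                (store-file #\6 "idris-2.0.0/bin/idris")
                (store-file #\z "idris-1.3.5/bin/idris"))))
(newline)
```

Because the sort is stable, entries whose stripped names are equal keep their original (manifest) order. Note that `string>?` compares character by character, so a name such as `idris-10.0` would sort below `idris-9.0`; this is alphanumeric ordering, not version ordering.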
Hi,

Maxime Devos <maximedevos@telenet.be> writes:

> Attila Lendvai wrote on Thu 30-09-2021 at 08:10 [+0000]:
>> > > - (let* ((original-files (list->vector files))
>> > > -        (count (vector-length original-files))
>> > > -        (stripped-files (vector-map (lambda (_ el)
>> > > -                                      (strip-store-file-name el))
>> > > -                                    original-files))
>> > > -        (indices (vector-unfold values count)))
>> > >
>> > > - (stable-sort! indices
>> > > -               (lambda (a b)
>> > > -                 (string> (vector-ref stripped-files a)
>> > > -                          (vector-ref stripped-files b))))
>> > > - (vector-ref original-files (vector-ref indices 0))))
>> >
>> > Instead of stable-sort!-ing the indices of a vector, what about
>> > stable-sort!-ing (map strip-store-file-name original-files) in more or less
>> > one go?
>>
>> the hash also needs to be dropped from the path for sorting to be
>> useful, but the return value must be the full path, hence the
>> complexity with sorting the indices, pointing both to the full paths
>> and the cut parts.
>
> You can replace the 'less' argument of 'stable-sort'.
> Example sorting by the second character of a string:
>
> (sort '("za" "yb" "xc") (lambda (x y)
>                           (char>? (string-ref x 1)
>                                   (string-ref y 1))))
>
> IIUC, you would need to replace char>? by string> and string-ref
> by strip-store-file-name.

Agreed. I’d advise using this strategy rather than resorting to SRFI-43; it
should have the desired effect.

BTW, because TeX Live packages rely on ‘union-build’, this patch
triggers a lot of rebuilds, but we can try to squeeze it in the upcoming
‘core-updates-frozen’ rebuild.

Thanks,
Ludo’.

> > the hash also needs to be dropped from the path for sorting to be
> > useful, but the return value must be the full path, hence the
> > complexity with sorting the indices, pointing both to the full paths
> > and the cut parts.
>
> You can replace the 'less' argument of 'stable-sort'.
> Example sorting by the second character of a string:
>
> (sort '("za" "yb" "xc") (lambda (x y)
>                           (char>? (string-ref x 1)
>                                   (string-ref y 1))))

i don't know about the expected size of the collision list here, but
that would cons much more, because that would cons up two substrings
at each comparison, i.e. O(n^2) vs O(n) at least in GC load, probably
in time also.

i think it's not worth it, but let me know, and then i can simplify
the code somewhat at the cost of more consing.

sorting lists is probably also much slower than sorting vectors, but
there may be some tricks i don't know about.

random note: civodul said on IRC that there's a certain reluctance to
update the opaque binary blobs, and the bootstrap guile will be
replaced by mes anyway. so, if we want to have this merged, then we
will need to give up on testing it until mes becomes the bootstrap
scheme (and assuming it will have srfi vectors).

in light of the above, i think i'll stop pursuing this patch, but i'd
be happy to implement something if you can give me highlevel guidance.

with the above pointed out, are you still happy with the list sorting?

i'd love to have idris packaged so that it has bin/idris symlinked to
bin/idris-1.2.3, and installing multiple versions of it would pick the
newest one for bin/idris.

--
• attila lendvai
• PGP: 963F 5D5F 45C7 DFCD 0A39
--
“Have the courage to take your own thoughts seriously, for they will shape you.”
Albert Einstein (1879–1955)

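One way to address the allocation concern raised above without SRFI-43 is to compute each stripped name once and sort (key . file) pairs. A comparison sort calls its predicate on the order of n log n times, so stripping inside the predicate repeats that work, whereas the sketch below strips each name exactly once. It is an illustration only: `strip-store-prefix` is a hypothetical stand-in for `strip-store-file-name` (assuming a `/gnu/store` prefix and 32-character hashes), and `pick-alphanumeric-last/keyed` is an invented name.

```scheme
(use-modules (srfi srfi-1))

;; Hypothetical stand-in for 'strip-store-file-name': assumes the store
;; directory is "/gnu/store" and that hashes are 32 characters long.
(define (strip-store-prefix file)
  (let ((prefix-length (+ (string-length "/gnu/store/") 32 1)))
    (if (and (string-prefix? "/gnu/store/" file)
             (> (string-length file) prefix-length))
        (string-drop file prefix-length)
        file)))

(define (pick-alphanumeric-last/keyed files)
  ;; Decorate: build one (KEY . FILE) pair per file, so that each sort
  ;; key is computed exactly once.
  (let* ((keyed  (map (lambda (file)
                        (cons (strip-store-prefix file) file))
                      files))
         ;; Sort by key in descending order; the sort is stable, so files
         ;; with equal keys keep their original relative order.
         (sorted (stable-sort keyed
                              (lambda (a b)
                                (string>? (car a) (car b))))))
    ;; Undecorate: return the full file name of the winner.
    (cdr (first sorted))))
```

This trades n small pairs for the index vector of the original patch and needs only core Guile procedures plus SRFI-1, which the module already imports.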
Attila Lendvai wrote on Thu 30-09-2021 at 14:12 [+0000]:
> > > the hash also needs to be dropped from the path for sorting to be
> > > useful, but the return value must be the full path, hence the
> > > complexity with sorting the indices, pointing both to the full paths
> > > and the cut parts.
> >
> > You can replace the 'less' argument of 'stable-sort'.
> > Example sorting by the second character of a string:
> >
> > (sort '("za" "yb" "xc") (lambda (x y)
> >                           (char>? (string-ref x 1)
> >                                   (string-ref y 1))))
>
> i don't know about the expected size of the collision list here, but
> that would cons much more, because that would cons up two substrings
> at each comparison, i.e. O(n^2) vs O(n) at least in GC load, probably
> in time also.

I don't see any consing here, or any increase in complexity?

> i think it's not worth it, but let me know, and then i can simplify
> the code somewhat at the cost of more consing.
>
> sorting lists is probably also much slower than sorting vectors, but
> there may be some tricks i don't know about.

I was suggesting replacing the 'less' procedure with a procedure like
(lambda (x y) (char>? (string-ref x 1) (string-ref y 1)))
instead of doing the sorting in two steps. I wasn't suggesting using lists.
The '("za" "yb" "xc") was only for demonstration, a vector should work as well.

Greetings,
Maxime.

On Thursday, 30.09.2021, 10:42 +0200, Maxime Devos wrote:
> Attila Lendvai wrote on Thu 30-09-2021 at 08:10 [+0000]:
> > > > - (let* ((original-files (list->vector files))
> > > > -        (count (vector-length original-files))
> > > > -        (stripped-files (vector-map (lambda (_ el)
> > > > -                                      (strip-store-file-name el))
> > > > -                                    original-files))
> > > > -        (indices (vector-unfold values count)))
> > > >
> > > > - (stable-sort! indices
> > > > -               (lambda (a b)
> > > > -                 (string> (vector-ref stripped-files a)
> > > > -                          (vector-ref stripped-files b))))
> > > > - (vector-ref original-files (vector-ref indices 0))))
> > >
> > > Instead of stable-sort!-ing the indices of a vector, what about
> > > stable-sort!-ing (map strip-store-file-name original-files) in more or
> > > less one go?
> >
> > the hash also needs to be dropped from the path for sorting to be
> > useful, but the return value must be the full path, hence the
> > complexity with sorting the indices, pointing both to the full
> > paths and the cut parts.
>
> You can replace the 'less' argument of 'stable-sort'.
> Example sorting by the second character of a string:
>
> (sort '("za" "yb" "xc") (lambda (x y)
>                           (char>? (string-ref x 1)
>                                   (string-ref y 1))))
>
> IIUC, you would need to replace char>? by string> and string-ref
> by strip-store-file-name.

You could also store a mapping of long file name to stripped file name
in a hash table for fast lookup, so as to not compute
(strip-store-file-name) over and over. That way you can stable-sort! the
long names, but don't forget to list-copy them at least for functional
purity.

Greetings,
Liliana

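The hash-table variant described above might look roughly like this. It is a sketch under assumptions rather than the committed implementation: `strip-store-prefix` is a hypothetical stand-in for `strip-store-file-name` (assuming a `/gnu/store` prefix and 32-character hashes), and `pick-alphanumeric-last/cached` is an invented name.

```scheme
(use-modules (srfi srfi-1))

;; Hypothetical stand-in for 'strip-store-file-name': assumes the store
;; directory is "/gnu/store" and that hashes are 32 characters long.
(define (strip-store-prefix file)
  (let ((prefix-length (+ (string-length "/gnu/store/") 32 1)))
    (if (and (string-prefix? "/gnu/store/" file)
             (> (string-length file) prefix-length))
        (string-drop file prefix-length)
        file)))

(define (pick-alphanumeric-last/cached files)
  ;; Map each full file name to its stripped form, computed once.
  (let ((stripped (make-hash-table)))
    (for-each (lambda (file)
                (hash-set! stripped file (strip-store-prefix file)))
              files)
    ;; 'stable-sort!' may reuse the cells of its argument, so sort a copy
    ;; and leave the caller's list untouched.
    (first (stable-sort! (list-copy files)
                         (lambda (a b)
                           (string>? (hash-ref stripped a)
                                     (hash-ref stripped b)))))))
```

`hash-set!` and `hash-ref` compare keys with `equal?`, so string keys work as expected here.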
Hi Attila,

On Thursday, 30.09.2021, 08:10 +0000, Attila Lendvai wrote:
> > > - (let* ((original-files (list->vector files))
> > > -        (count (vector-length original-files))
> > > -        (stripped-files (vector-map (lambda (_ el)
> > > -                                      (strip-store-file-name el))
> > > -                                    original-files))
> > > -        (indices (vector-unfold values count)))
> > >
> > > - (stable-sort! indices
> > > -               (lambda (a b)
> > > -                 (string> (vector-ref stripped-files a)
> > > -                          (vector-ref stripped-files b))))
> > > - (vector-ref original-files (vector-ref indices 0))))
> >
> > Instead of stable-sort!-ing the indices of a vector, what about
> > stable-sort!-ing (map strip-store-file-name original-files) in more
> > or less one go?
>
> the hash also needs to be dropped from the path for sorting to be
> useful, but the return value must be the full path, hence the
> complexity with sorting the indices, pointing both to the full paths
> and the cut parts.

This thing has received some replies already to which I already dropped
a comment, so I'll try not to repeat myself here. If something is
still unclear, go down the sub-thread started by Maxime.

> > I don't think the default collision resolver ought to sort the
> > files.
> >
> > The rationale behind ignoring certain collisions, e.g. icon caches
> > relies on the fact that Guix will use the correct files because
> > they are put first in the manifest. The hooks themselves have no
> > special names that could put them "always first" and profiles are
> > themselves union-built.
> >
> > I do however support the addition of sorting methods as collision
> > resolvers in general and would welcome a way of doing so for
> > profiles pre-hook.
>
> please note that i almost completely lack the knowledge of the
> relevant internals.
>
> with that in mind, do i read you right? this should add a new
> exported resolver function, and leave the default one as it was?

Yes, I think that is safer than actually overriding the default for now
at least.

> and use the new resolver somewhere (where?) that will affect only the
> union of profiles, i.e. user visible effects when installing software?
> (because DIRECTORY-UNION is called in other contexts also?)
>
> thanks for the insights! i'd appreciate a bit more higher level
> guidance, and then i'll resend the patch accordingly.

I must admit, I'm at a loss of knowledge myself here. Perhaps someone
else can chime in and tell you which union-build is done without hooks,
but I fear there might also be none.

For now, simply having the resolver is in my opinion enough if we don't
find the right location as well. We can default it later with
"first-come-first-serve" exceptions where needed.

Cheers

sorry for being slow with understanding the suggested solution. i
have automatically dismissed everything with leaky abstractions, and
that made me blind to it.

i didn't realize that i could either introduce constants for the hash
length, and the store path, or straight out export a special
comparator function to be used.

either way, i have organized the patches so that the first 3 are
useful, and the 4th one is more of a demo/inspiration for posterity.

apply 1-3 as you see fit, and ignore or finish the 4th.

- attila
PGP: 5D5F 45C7 DFCD 0A39

Hi Attila,

sorry for the long wait.

On Sunday, 03.10.2021 at 12:59 +0000, Attila Lendvai wrote:
> sorry for being slow with understanding the suggested solution. i
> have automatically dismissed everything with leaky abstractions, and
> that made me blind to it.
>
> i didn't realize that i could either introduce constants for the hash
> length, and the store path, or straight out export a special
> comparator function to be used.
>
> either way, i have organized the patches so that the first 3 are
> useful, and the 4th one is more of a demo/inspiration for posterity.
>
> apply 1-3 as you see fit, and ignore or finish the 4th.

I've applied 1-3 with some changes to core-updates (particularly
reducing the number of indirections in the third patch). I verified
that gcc-toolchain builds and the collision is still resolved as-is.
If you still wish to stable-sort things, go ahead.

Cheers

diff --git a/guix/build/union.scm b/guix/build/union.scm
index 961ac3298b..747902ec6c 100644
--- a/guix/build/union.scm
+++ b/guix/build/union.scm
@@ -23,11 +23,12 @@
   #:use-module (ice-9 format)
   #:use-module (srfi srfi-1)
   #:use-module (srfi srfi-26)
+  #:use-module (srfi srfi-43)
   #:use-module (rnrs bytevectors)
   #:use-module (rnrs io ports)
   #:export (union-build
 
-            warn-about-collision
+            default-collision-resolver
 
             relative-file-name
             symlink-relative))
@@ -102,10 +103,23 @@ identical, #f otherwise."
   ;; applications via 'glib-or-gtk-build-system'.
   '("icon-theme.cache" "gschemas.compiled"))
 
-(define (warn-about-collision files)
-  "Handle the collision among FILES by emitting a warning and choosing the
-first one of THEM."
-  (let ((file (first files)))
+(define (resolve-collision/alphanumeric-last files)
+  ;; Let's do a stable-sort at least, so that multiple foo-1.2.3/bin/foo
+  ;; variants will predictably resolve to the highest versioned one.
+  (let* ((original-files (list->vector files))
+         (count (vector-length original-files))
+         (stripped-files (vector-map (lambda (_ el)
+                                       (strip-store-file-name el))
+                                     original-files))
+         (indices (vector-unfold values count)))
+    (stable-sort! indices
+                  (lambda (a b)
+                    (string> (vector-ref stripped-files a)
+                             (vector-ref stripped-files b))))
+    (vector-ref original-files (vector-ref indices 0))))
+
+(define (default-collision-resolver files)
+  (let ((file (resolve-collision/alphanumeric-last files)))
     (unless (member (basename file) %harmless-collisions)
       (format (current-error-port)
               "~%warning: collision encountered:~%~{ ~a~%~}"
@@ -117,7 +131,7 @@ first one of THEM."
                       #:key (log-port (current-error-port))
                       (create-all-directories? #f)
                       (symlink symlink)
-                      (resolve-collision warn-about-collision))
+                      (resolve-collision default-collision-resolver))
   "Build in the OUTPUT directory a symlink tree that is the union of all the
 INPUTS, using SYMLINK to create symlinks. As a special case, if
 CREATE-ALL-DIRECTORIES?, creates the subdirectories in the output directory to
diff --git a/guix/gexp.scm b/guix/gexp.scm
index f3d278b3e6..32e8748443 100644
--- a/guix/gexp.scm
+++ b/guix/gexp.scm
@@ -1983,7 +1983,7 @@ This yields an 'etc' directory containing these two files."
 (define* (directory-union name things
                           #:key (copy? #f) (quiet? #f)
-                          (resolve-collision 'warn-about-collision))
+                          (resolve-collision 'default-collision-resolver))
   "Return a directory that is the union of THINGS, where THINGS is a list of
 file-like objects denoting directories. For example:
 
diff --git a/tests/union.scm b/tests/union.scm
index a8387edf42..cbf8840793 100644
--- a/tests/union.scm
+++ b/tests/union.scm
@@ -204,4 +204,13 @@
     ("/a/b" "/a/b/c/d" => "c/d")
     ("/a/b/c" "/a/d/e/f" => "../../d/e/f")))
 
+(test-assert "resolve-collision/alphanumeric-last sorts alphanumerically"
+  (string=
+   ((@@ (guix build union) resolve-collision/alphanumeric-last)
+    (list "/gnu/store/c0000000000000000000000000000000-idris-0.0.0/bin/idris"
+          "/gnu/store/60000000000000000000000000000000-idris-2.0.0/bin/idris"
+          "/gnu/store/z0000000000000000000000000000000-idris-1.3.5/bin/idris"
+          "/gnu/store/00000000000000000000000000000000-idris-1.3.3/bin/idris"))
+   "/gnu/store/60000000000000000000000000000000-idris-2.0.0/bin/idris"))
+
 (test-end)