Message ID | c89e5b8436f2c53c9bb51a83a31eff0c9a71a1b5.camel@student.tugraz.at |
---|---|
State | Accepted |
Series | [bug#38649] Parallelize `guix package` |
Dec 17, 2019 8:19:14 AM Leo Prikler :

> Hi Guix!
>
> Yesterday I had an interesting conversation on IRC about the behaviour
> of multiple `guix package` processes running in parallel.
> Specifically, when two transactions target the same profile (usually
> /var/guix/profiles/per-user/$USER/guix-profile) at the same time, one
> of them will fail to claim the lock and abort.  0001 makes it so that
> the process waits for the lock.  0002 makes it so that packages
> specified via -i can be built in parallel.
>
> Regards,
> Leo

Can we extend this to include things like environment --ad-hoc?

Hi Leo,

(Cc: Julien, who worked on this part.)

Leo Prikler <leo.prikler@student.tugraz.at> writes:

> Yesterday I had an interesting conversation on IRC about the behaviour
> of multiple `guix package` processes running in parallel.
> Specifically, when two transactions target the same profile (usually
> /var/guix/profiles/per-user/$USER/guix-profile) at the same time, one
> of them will fail to claim the lock and abort.  0001 makes it so that
> the process waits for the lock.  0002 makes it so that packages
> specified via -i can be built in parallel.

I actually like the current behavior, FWIW.  Julien came up with this
locking mostly so that people do not inadvertently attempt to perform
several operations concurrently.

The key word here is “inadvertently”: IMO, there’s no reason to run
multiple ‘guix package’ on the same profile concurrently.  With a
wait-for-lock policy, the result would be non-deterministic: you cannot
tell which one of the two processes will complete first.

WDYT?

Thanks,
Ludo’.

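The two policies under discussion (abort when the lock is taken versus wait for it) can be sketched outside Guix. The following is a minimal model using POSIX advisory locks from Python; it is not how Guix implements its profile lock, and `try_lock`, `wait_lock` and the lock file path are hypothetical names chosen for illustration.

```python
import fcntl
import os
import tempfile


def try_lock(path):
    """Fail-fast policy: return a locked file object, or None if the lock is held."""
    handle = open(path, "w")
    try:
        # LOCK_NB makes flock raise instead of blocking when the lock is taken.
        fcntl.flock(handle, fcntl.LOCK_EX | fcntl.LOCK_NB)
    except BlockingIOError:
        handle.close()
        return None
    return handle


def wait_lock(path):
    """Wait-for-lock policy: block until the lock can be taken."""
    handle = open(path, "w")
    fcntl.flock(handle, fcntl.LOCK_EX)  # blocks while another holder exists
    return handle


# Hypothetical lock file standing in for a profile lock.
lock_path = os.path.join(tempfile.mkdtemp(), "profile.lock")

first = try_lock(lock_path)    # the first transaction gets the lock
second = try_lock(lock_path)   # the second one is refused immediately
first.close()                  # closing the file releases the lock
third = wait_lock(lock_path)   # nobody holds the lock now, so this returns at once
third.close()
```

With the fail-fast policy the second caller learns immediately that another transaction is running; with the waiting policy it would simply proceed once the lock is free, and which of several waiters proceeds first is not something the caller controls.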
On Tuesday, 17.12.2019, 14:20 +0000, Brett Gilio wrote:
> Dec 17, 2019 8:19:14 AM Leo Prikler :
>
> > Hi Guix!
> >
> > Yesterday I had an interesting conversation on IRC about the
> > behaviour of multiple `guix package` processes running in parallel.
> > Specifically, when two transactions target the same profile (usually
> > /var/guix/profiles/per-user/$USER/guix-profile) at the same time,
> > one of them will fail to claim the lock and abort.  0001 makes it so
> > that the process waits for the lock.  0002 makes it so that packages
> > specified via -i can be built in parallel.
> >
> > Regards,
> > Leo
>
> Can we extend this to include things like environment --ad-hoc?

`guix environment` does not claim any locks, so it does not suffer from
the problem that this patch tries to address.  Perhaps my wording was
bad: By "can be built in parallel", I meant that if one starts two
processes, e.g. `guix install emacs` and `guix install ffmpeg`, emacs
and ffmpeg are built in parallel.  This does not mean that
dependencies of emacs are built in parallel – for that you'd have to
dig closer to the core.

Regards,
Leo

Dec 17, 2019 8:34:25 AM Leo Prikler :

> On Tuesday, 17.12.2019, 14:20 +0000, Brett Gilio wrote:
> > Dec 17, 2019 8:19:14 AM Leo Prikler :
> >
> > > Hi Guix!
> > >
> > > Yesterday I had an interesting conversation on IRC about the
> > > behaviour of multiple `guix package` processes running in
> > > parallel.  Specifically, when two transactions target the same
> > > profile (usually /var/guix/profiles/per-user/$USER/guix-profile)
> > > at the same time, one of them will fail to claim the lock and
> > > abort.  0001 makes it so that the process waits for the lock.
> > > 0002 makes it so that packages specified via -i can be built in
> > > parallel.
> > >
> > > Regards,
> > > Leo
> >
> > Can we extend this to include things like environment --ad-hoc?
>
> `guix environment` does not claim any locks, so it does not suffer from
> the problem that this patch tries to address.  Perhaps my wording was
> bad: By "can be built in parallel", I meant that if one starts two
> processes, e.g. `guix install emacs` and `guix install ffmpeg`, emacs
> and ffmpeg are built in parallel.  This does not mean that
> dependencies of emacs are built in parallel – for that you'd have to
> dig closer to the core.
>
> Regards,
> Leo

Ah right. My mistake. I just woke up, so I think I need more coffee. :)

Brett Gilio

Hi Ludo’,

On Tuesday, 17.12.2019, 15:32 +0100, Ludovic Courtès wrote:
> Hi Leo,
>
> (Cc: Julien, who worked on this part.)
>
> Leo Prikler <leo.prikler@student.tugraz.at> writes:
>
> > Yesterday I had an interesting conversation on IRC about the
> > behaviour of multiple `guix package` processes running in parallel.
> > Specifically, when two transactions target the same profile (usually
> > /var/guix/profiles/per-user/$USER/guix-profile) at the same time,
> > one of them will fail to claim the lock and abort.  0001 makes it so
> > that the process waits for the lock.  0002 makes it so that packages
> > specified via -i can be built in parallel.
>
> I actually like the current behavior, FWIW.  Julien came up with this
> locking mostly so that people do not inadvertently attempt to perform
> several operations concurrently.

Fair enough, and that is an improvement over non-locking behaviour,
where one could spawn multiple profile generations from one, neither of
which is complete.  Perhaps my attempt at doing this in a somewhat
controlled manner is equally harmful, but I will still try my best
arguing for it, as I believe it can make a positive impact.

> The key word here is “inadvertently”: IMO, there’s no reason to run
> multiple ‘guix package’ on the same profile concurrently.  With a
> wait-for-lock policy, the result would be non-deterministic: you
> cannot tell which one of the two processes will complete first.
>
> WDYT?

I think the current policy is wait-for-lock deferred to the user.  The
user has to let the first task complete before they can start the
second.  With a wait-for-lock policy, the user can simply launch the
second task and trust that it will complete later while taking into
account the changes the first one has made.

Let's talk about three classes of operations – installations, removals
and upgrades – and their interactions.  I will not take into account
roll-back, switch-generation and delete-generation, as it is
nonsensical to perform these in parallel to any other action.  Perhaps
we could check for their presence first and acquire the lock with
no-wait semantics in that case.

- any operation on different packages: Either succeeds first and the
  other builds on the profile it generates.  As there is no collision in
  the packages themselves, there will be no harm.
- install same package twice: Either succeeds first, the other will be
  a no-op.
- install vs. remove same package: Non-deterministic, but why would you
  do that?
- install vs. upgrade same package: Upgrade will be a no-op in either
  case.
- remove vs. upgrade same package: Upgrade may inadvertently upgrade
  the old package if it happens to come first, but in the final profile
  it will be removed either way.

Of course, any operation can also fail midway due to some step not
succeeding.  In that case it would be as if one had issued the other
command right after that, which may perhaps not be what one wanted to
do (assuming I install package A, and some guide suggests to also build
related, but not dependency-connected package B, so I end up installing
B without A).  However, such cases can easily be fixed by either
installing a fixed version of A later, using B on its own if it can be,
or rolling back.

Of course, both solutions are flawed in the way that they assume user
intent either way.  Perhaps a better one would be to let the user
specify whether they want to wait or not through a command line
parameter, using the current behaviour as the default approach.

WDYT?

Regards,
Leo

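The case analysis above can be checked against a toy model. The sketch below is only an illustration under stated assumptions: a profile is a plain dictionary from package name to version, "install" and "upgrade" both move a package to the newest version in a made-up `LATEST` table, and upgrading an absent package does nothing. None of this is Guix code; `apply_operation` and `order_matters` are hypothetical helpers.

```python
# Newest available version of each package (made-up data for the model).
LATEST = {"emacs": "2", "ffmpeg": "2"}


def apply_operation(profile, operation):
    """Return a new profile (dict: name -> version) after one operation."""
    kind, name = operation
    result = dict(profile)
    if kind == "install":
        result[name] = LATEST[name]
    elif kind == "remove":
        result.pop(name, None)
    elif kind == "upgrade" and name in result:
        result[name] = LATEST[name]  # upgrading an absent package is a no-op
    return result


def order_matters(profile, first, second):
    """True if running the two operations in either order gives different profiles."""
    one = apply_operation(apply_operation(profile, first), second)
    other = apply_operation(apply_operation(profile, second), first)
    return one != other


start = {"emacs": "1"}  # an old emacs is already installed
cases = {
    "different packages": (("upgrade", "emacs"), ("install", "ffmpeg")),
    "install twice": (("install", "emacs"), ("install", "emacs")),
    "install vs. remove": (("install", "emacs"), ("remove", "emacs")),
    "install vs. upgrade": (("install", "emacs"), ("upgrade", "emacs")),
    "remove vs. upgrade": (("remove", "emacs"), ("upgrade", "emacs")),
}
results = {label: order_matters(start, a, b) for label, (a, b) in cases.items()}
```

In this model only the "install vs. remove" pair ends in a different profile depending on which transaction runs first, which agrees with the list above. The model says nothing about transactions that fail midway.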
On 17 December 2019 16:19:34 GMT+01:00, Leo Prikler <leo.prikler@student.tugraz.at> wrote:
>
> Of course, any operation can also fail midway due to some step not
> succeeding.  In that case it would be as if one had issued the other
> command right after that, which may perhaps not be what one wanted to
> do (assuming I install package A, and some guide suggests to also build
> related, but not dependency-connected package B, so I end up installing
> B without A).  However, such cases can easily be fixed by either
> installing a fixed version of A later, using B on its own if it can be,
> or rolling back.
>
> Of course, both solutions are flawed in the way that they assume user
> intent either way.  Perhaps a better one would be to let the user
> specify whether they want to wait or not through a command line
> parameter, using the current behaviour as the default approach.
>
> WDYT?

I might be missing something.  Guix install etc. act on a "hidden"
description of the profile.  They take the current profile, modify it as
specified (adding a package, removing another or upgrading some).  When
you run two guix package in parallel, they both work on the same
profile, which creates unexpected results.

The expectation behind the lock is that users will cancel the other
command and fix it before re-running it (e.g. instead of guix install
foo & guix install bar, run guix install foo bar).

>
> Regards,
> Leo

On Tuesday, 17.12.2019, 16:50 +0100, Julien Lepiller wrote:
> On 17 December 2019 16:19:34 GMT+01:00, Leo Prikler
> <leo.prikler@student.tugraz.at> wrote:
> > Of course, any operation can also fail midway due to some step not
> > succeeding.  In that case it would be as if one had issued the
> > other command right after that, which may perhaps not be what one
> > wanted to do (assuming I install package A, and some guide suggests
> > to also build related, but not dependency-connected package B, so I
> > end up installing B without A).  However, such cases can easily be
> > fixed by either installing a fixed version of A later, using B on
> > its own if it can be, or rolling back.
> >
> > Of course, both solutions are flawed in the way that they assume
> > user intent either way.  Perhaps a better one would be to let the
> > user specify whether they want to wait or not through a command line
> > parameter, using the current behaviour as the default approach.
> >
> > WDYT?
>
> I might be missing something.  Guix install etc. act on a "hidden"
> description of the profile.  They take the current profile, modify it
> as specified (adding a package, removing another or upgrading some).
> When you run two guix package in parallel, they both work on the same
> profile, which creates unexpected results.

That's why the lock is claimed first.  This way, the second process
acts on the profile that the first generated.  I've tested this by
installing cowsay in parallel to lolcat, but it should work for bigger
packages in much the same way.

> The expectation behind the lock is that users will cancel the other
> command and fix it before re-running it (e.g. instead of guix install
> foo & guix install bar, run guix install foo bar).

That is perhaps a reasonable expectation in most cases, but may be
annoying in others.  Take any package with an absurdly long build time
(e.g. icecat) and then think "Oh, but I also wanted this" while it is
building.  Now you have to either actively wait for icecat to complete
or stop it, add the other package and suffer the same build time again
(whereas the other way, you can let icecat complete and still launch a
second process).

With the parallel builds of 0002, things become even better, as you can
use bar even before foo is completed in case it manages to grab the
lock first.  With the long build of icecat against a package with a
relatively short build, this could very well be the case and might end
up being a game changer.

Of course, one could abuse ad-hoc environments as well while waiting
for the first process to finish, but I don't think that's how people
running into this problem expect to be solving it (especially if they
do want both foo and bar in their profiles).

Regards,
Leo

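The disagreement in this exchange is about when the profile description is read relative to taking the lock. A small deterministic model can show the difference; it is a sketch under assumptions, not Guix's implementation. A manifest is reduced to a frozen set of package names, and `snapshot_then_commit` and `lock_then_read` are hypothetical helpers standing in for the two orderings.

```python
def install(manifest, package):
    """Return a new manifest (a frozenset of names) with the package added."""
    return manifest | {package}


def snapshot_then_commit(base, packages):
    """Every transaction reads the same base manifest up front; the last write wins."""
    planned = [install(base, package) for package in packages]
    current = base
    for manifest in planned:
        current = manifest  # each commit overwrites the previous one, no merge
    return current


def lock_then_read(base, packages):
    """Each transaction reads the manifest only after it holds the lock."""
    current = base
    for package in packages:  # list order = order in which the lock is obtained
        current = install(current, package)
    return current


base = frozenset({"hello"})
lost = snapshot_then_commit(base, ["cowsay", "lolcat"])
kept = lock_then_read(base, ["cowsay", "lolcat"])
```

In the first ordering `cowsay` disappears from the result because the second transaction was computed from the stale base; in the second ordering both packages end up in the manifest, whichever transaction obtains the lock first.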
Hi Leo,

Leo Prikler <leo.prikler@student.tugraz.at> writes:

> I think the current policy is wait-for-lock deferred to the user.  The
> user has to let the first task complete before they can start the
> second.  In this setup, the user can simply launch the setup and trust
> that it will complete later while taking into account the changes the
> first one has made.
>
> Let's talk about three classes of operations – installations, removals
> and upgrades – and their interactions.  I will not take into account
> roll-back, switch-generation and delete-generation, as it is
> nonsensical to perform these in parallel to any other action.  Perhaps
> we could check for their presence first and acquire the lock with
> no-wait semantics in that case.
>
> - any operation on different packages: Either succeeds first and the
>   other builds on the profile it generates.  As there is no collision
>   in the packages themselves, there will be no harm.
> - install same package twice: Either succeeds first, the other will be
>   a no-op.
> - install vs. remove same package: Non-deterministic, but why would you
>   do that?
> - install vs. upgrade same package: Upgrade will be a no-op in either
>   case.
> - remove vs. upgrade same package: Upgrade may inadvertently upgrade
>   the old package if it happens to come first, but in the final profile
>   it will be removed either way.
>
> Of course, any operation can also fail midway due to some step not
> succeeding.  In that case it would be as if one had issued the other
> command right after that, which may perhaps not be what one wanted to
> do (assuming I install package A, and some guide suggests to also build
> related, but not dependency-connected package B, so I end up installing
> B without A).  However, such cases can easily be fixed by either
> installing a fixed version of A later, using B on its own if it can be,
> or rolling back.
>
> Of course, both solutions are flawed in the way that they assume user
> intent either way.  Perhaps a better one would be to let the user
> specify whether they want to wait or not through a command line
> parameter, using the current behaviour as the default approach.

I cannot think of a useful behavior if wait-for-lock were implemented.
Really, as a user, you’d be unable to know what the end result is.  I
don’t see that as very useful.  :-)

What you describe above as potential mitigation is just that,
mitigation, and it could easily become complex (as a maintainer, I
wouldn’t want to be responsible for this kind of complexity :-)), and
again, for very questionable “gains”.

Thoughts?  Julien?

Thanks,
Ludo’.

close 38649
thanks

On Wednesday, 18.12.2019, 15:37 +0100, Ludovic Courtès wrote:
> I cannot think of a useful behavior if wait-for-lock were implemented.
> Really, as a user, you’d be unable to know what the end result is.  I
> don’t see that as very useful.  :-)

It took a while, but I feel you've convinced me.

From 336692df15e77f9d90619d0fe60e864c4d2fb37a Mon Sep 17 00:00:00 2001
From: Leo Prikler <leo.prikler@student.tugraz.at>
Date: Tue, 17 Dec 2019 14:04:12 +0100
Subject: [PATCH 2/2] guix: Build to be installed packages in parallel.

* guix/scripts/package.scm (options->buildable): New procedure.
(process-actions): Build packages before acquiring profile lock.
---
 guix/scripts/package.scm | 22 +++++++++++++++++++++-
 1 file changed, 21 insertions(+), 1 deletion(-)

diff --git a/guix/scripts/package.scm b/guix/scripts/package.scm
index 202a6d6470..1278d1a65f 100644
--- a/guix/scripts/package.scm
+++ b/guix/scripts/package.scm
@@ -587,6 +587,19 @@ the resulting manifest entry."
   (package->manifest-entry package output
                            #:properties (provenance-properties package)))
 
+(define (options->buildable opts)
+  (filter-map (match-lambda
+                (('install . (? package? p))
+                 p)
+                (('install . (? string? spec))
+                 (if (store-path? spec)
+                     #f ;; assume already interned
+                     (specification->package spec)))
+                (('install . obj)
+                 (leave (G_ "cannot build non-package object: ~s~%")
+                        obj))
+                (_ #f))
+              opts))
 
 (define (options->installable opts manifest transaction)
   "Given MANIFEST, the current manifest, and OPTS, the result of 'args-fold',
@@ -861,8 +874,15 @@ processed, #f otherwise."
                                  (package-version item)
                                  (manifest-entry-version entry))))))
 
+  ;; First, process installations, as these can be handled in parallel.
+  (unless dry-run?
+    (let* ((drv (map (compose (lambda (drv) (drv store)) package->derivation)
+                     (options->buildable opts))))
+      (show-what-to-build store drv
+                          #:use-substitutes? substitutes?)
+      (build-derivations store drv)))
 
-  ;; First, acquire a lock on the profile, to ensure only one guix process
+  ;; Now, acquire a lock on the profile, to ensure only one guix process
   ;; is modifying it at a time.
   (format #t "Waiting for lock on ~a...~%" profile)
   (with-profile-lock profile
-- 
2.24.1
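
The ordering this patch introduces (build first, take the profile lock only for the profile update) can be sketched with threads. This is a simplified model, not the patch's behaviour in Guix: threads stand in for separate `guix package` processes, an in-process `threading.Lock` stands in for the profile lock, and `transaction`, `slow_build` and `quick_build` are hypothetical names. The package names are only labels taken from the discussion.

```python
import threading

profile_lock = threading.Lock()   # stands in for the per-profile lock
generations = []                  # package names, in the order they were committed
quick_committed = threading.Event()


def transaction(name, build):
    """Build without holding the lock, then commit to the profile under the lock."""
    build()                       # the expensive part runs unlocked
    with profile_lock:            # only the profile update is serialised
        generations.append(name)


def slow_build():
    # Stands in for a long build: it is still running when the quick
    # transaction commits (the timeout only guards against hanging).
    quick_committed.wait(timeout=5)


def quick_build():
    pass                          # stands in for a short build or a substitute


def run_quick():
    transaction("cowsay", quick_build)
    quick_committed.set()


slow = threading.Thread(target=transaction, args=("icecat", slow_build))
quick = threading.Thread(target=run_quick)
slow.start()
quick.start()
slow.join()
quick.join()
```

Here the transaction that was started second commits first, because the long build never holds the lock. That is the benefit argued for in the thread, and also the source of the ordering that cannot be predicted from the order in which the commands were launched.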