Message ID | cover.1706027375.git.ludo@gnu.org |
---|---|
Series | Service for "virtual build machines" |
Hi Ludo,

On Tue, 23 Jan 2024 at 17:46, Ludovic Courtès <ludo@gnu.org> wrote:

> Lots of talk about reproducibility and how wonderful Guix is, but
> as soon as you try to build packages from v1.0.0, released less
> than 5 years ago, you hit a “time trap” in Python, in OpenSSL, or
> some other ugly build failure—assuming you managed to fetch source
> code in the first place¹.

Cool! Workarounds for the “time traps” of the current past. Note that
today is the past of the future. ;-) In other words, the same
workarounds will help detect today, and thus fix, the “time traps”
that would arise in the future. Not to mention the year 2038 bug. :-)

> This patch series defines a long-overdue
> ‘virtual-build-machine-service-type’: a service to run a virtual
> machine available for offloading. My main goal here is to
> allow users to build stuff at a past date without having to
> change their system clock. It can also be used to control other
> aspects usually not under control: the CPU model, the Linux kernel.

Yes, controlling the CPU model and the Linux kernel is worthwhile:

 + CPU model because we already have examples of failures (Python 3.7
   packaged in Guix v1.0.0, some BLAS libraries, etc.);

 + Linux kernel because its stability is one of the strong assumptions
   we are making for reproducibility.

Cheers,
simon
Simon Tournier <zimon.toutoune@gmail.com> writes:

> Yes, controlling the CPU model and the Linux kernel is worthwhile:
>
>  + CPU model because we already have examples of failures (Python 3.7
>    packaged in Guix v1.0.0, some BLAS libraries, etc.);

Yes! And I think we should maintain a catalog of these problems (build
processes influenced by date, hardware, or kernel version).

Our horizon should be to somehow ensure such packages are always built
in the right environment, automatically, whether or not it involves
using a VM.

Ludo’.
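
To illustrate why a build "influenced by hardware" stops being
reproducible, here is a minimal Python sketch. It is not how Python or
OpenBLAS actually select code paths; the function names, flag names and
the "pinned" CPU model are all hypothetical. It only shows the general
pattern: a build that inspects the CPU it runs on produces different
outputs on different hosts, while a build inside a VM with a fixed CPU
model does not.

```python
def select_kernel(cpu_flags):
    """Pick an implementation from the CPU features visible at build time."""
    if "avx512f" in cpu_flags:
        return "kernel-avx512"
    if "avx2" in cpu_flags:
        return "kernel-avx2"
    return "kernel-generic"


def build(cpu_flags):
    """Stand-in for a build whose output records the selected kernel."""
    return "libexample[" + select_kernel(cpu_flags) + "]"


def build_in_vm(host_flags, vm_model_flags):
    """Build inside a VM: the guest only sees the virtual CPU's flags
    (restricted to those the host itself supports)."""
    return build(host_flags & vm_model_flags)


# Hypothetical machines and a hypothetical fixed virtual CPU model.
newer_host = {"sse2", "avx2", "avx512f"}
older_host = {"sse2", "avx2"}
pinned_model = {"sse2"}

# Building directly on each host gives two different outputs.
print(build(newer_host), build(older_host))

# Building in a VM with the pinned model gives the same output on both.
print(build_in_vm(newer_host, pinned_model),
      build_in_vm(older_host, pinned_model))
```

The design point is simply that the set of visible CPU features becomes
an explicit, controlled input instead of an accident of the build host.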
Hello there!

Ludovic Courtès <ludo@gnu.org> writes:

> This patch series defines a long-overdue
> ‘virtual-build-machine-service-type’: a service to run a virtual
> machine available for offloading. My main goal here is to
> allow users to build stuff at a past date without having to
> change their system clock. It can also be used to control other
> aspects usually not under control: the CPU model, the Linux kernel.

Any comments on this patch series?

  https://issues.guix.gnu.org/68677

I’d like to go ahead and apply it by the end of the week if there are no
objections.

(I realize all the files being touched here are in limbo in terms of
team coverage. We should fix that!)

Ludo’.
Ludovic Courtès <ludo@gnu.org> writes:

> Any comments on this patch series?

I don't have comments regarding the code, but I do have a couple of
questions and a comment. Please excuse my limited understanding of GNU
Shepherd and Guix System. None of the questions/comments below are
deal-breakers in my opinion.

1. The documentation references GNU Shepherd. Is GNU Shepherd a hard
   requirement in order to use the facilities provided by the patch
   series? Would it be possible to use, say, Systemd on a foreign
   distribution? If so, could examples of those be documented in the
   appropriate place as well?

2. The code sets the default date to be 2020-01-01; does this date have
   any significance? It might help for the code to have a comment
   explaining whether this value is completely arbitrary or whether it
   has some significance. On a related note, it might help for the
   documentation to note dates that are less likely to work (in case
   values before a certain time aren't expected to be well supported).

Additionally, I'm not sure if this belongs in the manual or in the
cookbook (or elsewhere), but it would be helpful to have some small, but
complete, examples. The documentation in the patch series mentions two
situations (time traps, and CPU microarchitecture optimizations) and for
each it would be helpful to have a self-contained full working example
referenced. For the "time trap" use-case, perhaps one of the
submissions from the Ten Years Reproducibility Challenge could be used.

Hi Suhail,

Suhail <suhail@bayesians.ca> writes:

> 1. The documentation references GNU Shepherd. Is GNU Shepherd a hard
>    requirement in order to use the facilities provided by the patch
>    series? Would it be possible to use, say, Systemd on a foreign
>    distribution? If so, could examples of those be documented in the
>    appropriate place as well?

What this patch adds is a service one can use on Guix System. Someone
who adds this service to their Guix System config can then run ‘herd
start build-vm’ to enable offloading to the virtual build machine.

It’s possible to do something similar on a distro other than Guix
System but this patch series won’t help with that. On another distro,
one would need to create a VM image and then manually start QEMU with
the right flags and set up offloading to that VM. Nothing
insurmountable, but it’s quite tedious.

> 2. The code sets the default date to be 2020-01-01; does this date have
>    any significance? It might help for the code to have a comment
>    explaining whether this value is completely arbitrary or whether it
>    has some significance. On a related note, it might help for the
>    documentation to note dates that are less likely to work (in case
>    values before a certain time aren't expected to be well supported).

I picked a date in the past because I figured this would be the most
common use case at first: being able to rebuild things “in the past”
(the manual says that the default date is “in the past”). Apart from
that, it has no significance. I’ll add a comment as you suggest.

The manual cannot really say which date “won’t work” because (1) it
depends on what one is building, and (2) we simply don’t know in most
cases.

> Additionally, I'm not sure if this belongs in the manual or in the
> cookbook (or elsewhere), but it would be helpful to have some small, but
> complete, examples. The documentation in the patch series mentions two
> situations (time traps, and CPU microarchitecture optimizations) and for
> each it would be helpful to have a self-contained full working example
> referenced. For the "time trap" use-case, perhaps one of the
> submissions from the Ten Years Reproducibility Challenge could be used.

Yes, I agree we need complete examples (maybe not in the manual, rather
as blog posts and/or Cookbook entries I’d say).

Thanks for chiming in!

Ludo’.
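
As a small, self-contained illustration of what a “time trap” is, here
is a Python sketch. It does not reproduce any real test suite: the
expiry date and the function names are hypothetical. It only shows the
mechanism discussed in this thread, namely a check that compares a
bundled certificate's expiry date against the clock, so that the same
source passes when built “in the past” and fails when built today.

```python
from datetime import datetime, timezone

# Hypothetical expiry date of a certificate shipped with a test suite.
CERT_NOT_AFTER = datetime(2021, 1, 1, tzinfo=timezone.utc)


def certificate_is_valid(now=None):
    """Return True if the bundled certificate has not expired at NOW.

    When NOW is omitted the system clock is used, which is precisely
    what turns the check into a time trap.
    """
    if now is None:
        now = datetime.now(timezone.utc)
    return now < CERT_NOT_AFTER


def run_test_suite(now=None):
    """Stand-in for a package's test phase."""
    return "pass" if certificate_is_valid(now) else "fail"


# Clock pinned to 2020-01-01, like the default date mentioned above.
print(run_test_suite(datetime(2020, 1, 1, tzinfo=timezone.utc)))

# Clock left at today's date: the certificate has expired.
print(run_test_suite())
```

Running the build in an environment whose clock starts before the
expiry date plays the same role as passing the pinned date here, with
the difference that the package's own code does not need to change.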
Ludovic Courtès <ludo@gnu.org> writes:

>   services: secret-service: Make the endpoint configurable.
>   vm: Add ‘date’ field to <virtual-machine>.
>   vm: Export <virtual-machine> accessors.
>   vm: Add ‘cpu-count’ field to <virtual-machine>.
>   marionette: Add #:peek? to ‘wait-for-tcp-port?’.
>   services: Add ‘virtual-build-machine’ service.

Pushed as 9edbb2d7a40c9da7583a1046e39b87633459f656 with an extra comment
explaining how the default date was chosen.

Ludo’.
Hi,

Thanks for your feedback.

On Mon, 05 Feb 2024 at 15:45, Suhail via Guix-patches via <guix-patches@gnu.org> wrote:

> 1. The documentation references GNU Shepherd. Is GNU Shepherd a hard
>    requirement in order to use the facilities provided by the patch
>    series? Would it be possible to use, say, Systemd on a foreign
>    distribution? If so, could examples of those be documented in the
>    appropriate place as well?

From my understanding, for now, it is for Guix System, so it uses
Shepherd. It might be possible to use the ’vm’ on foreign distros but
some details must be configured by hand, whereas they are set up
automatically by the “extended service”. More or less. :-)

> 2. The code sets the default date to be 2020-01-01; does this date have
>    any significance? It might help for the code to have a comment
>    explaining whether this value is completely arbitrary or whether it
>    has some significance. On a related note, it might help for the
>    documentation to note dates that are less likely to work (in case
>    values before a certain time aren't expected to be well supported).

For this date, nothing specific I guess. The oldest commit that one can
reach using “guix time-machine” is from May 2019.

As an aside, it is hard to maintain a list of dates that “work”,
because nothing is written in stone and the passing of time cannot be
frozen. For instance, 6 months ago, a jump of ~4 years was just
working [1]. And now, it is broken [2].

Somehow, Guix provides features that demonstrate a real-world
experience which was simply impossible before. Therefore, things are
fluctuating toward more robustness.

That said, based on my experience playing with “guix time-machine”, my
rule of thumb is: 2-3 years old is OK most of the time. Older than 3
years is… fingers crossed.

1: https://simon.tournier.info/posts/2023-06-23-hackathon-repro.html
2: https://issues.guix.gnu.org/69058

> Additionally, I'm not sure if this belongs in the manual or in the
> cookbook (or elsewhere), but it would be helpful to have some small, but
> complete, examples. The documentation in the patch series mentions two
> situations (time traps, and CPU microarchitecture optimizations) and for
> each it would be helpful to have a self-contained full working example
> referenced. For the "time trap" use-case, perhaps one of the
> submissions from the Ten Years Reproducibility Challenge could be used.

The issue with time traps is documented in the manual, see:

        Due to ‘guix time-machine’ relying on the “inferiors” mechanism
     (*note Inferiors::), the oldest commit it can travel to is commit
     ‘6298c3ff’ (“v1.0.0”), dated May 1st, 2019, which is the first
     release that included the inferiors mechanism.  An error is
     returned when attempting to navigate to older commits.

          Note: Although it should technically be possible to travel to
          such an old commit, the ease to do so will largely depend on
          the availability of binary substitutes.  When traveling to a
          distant past, some packages may not easily build from source
          anymore.  One such example are old versions of Python 2 which
          had time bombs in its test suite, in the form of expiring SSL
          certificates.  This particular problem can be worked around
          by setting the hardware clock to a value in the past before
          attempting the build.

https://guix.gnu.org/manual/devel/en/guix.html#Invoking-guix-time_002dmachine

However, it seems hard to me to maintain a list of all the known time
traps. For now, we are not rebuilding the past, therefore most of the
time traps go unnoticed.

About CPU microarchitecture, I know of only two: Python [3] and
OpenBLAS [4].

All in all, we are at the infancy of this work and any help is
welcome. :-)

Cheers,
simon

3: Try “guix time-machine --commit=v1.0.0 -- describe”

4: Investigating a reproducibility failure
   Konrad Hinsen <konrad.hinsen@fastmail.net>
   Tue, 01 Feb 2022 15:05:40 +0100
   id:m1a6fahebv.fsf@fastmail.net
   https://lists.gnu.org/archive/html/guix-devel/2022-02
   https://yhetil.org/guix/m1a6fahebv.fsf@fastmail.net

   Follow-up:
   Re: Investigating a reproducibility failure
   zimoun <zimon.toutoune@gmail.com>
   Wed, 02 Feb 2022 21:35:06 +0100
   id:871r0l9fd1.fsf@gmail.com
   https://lists.gnu.org/archive/html/guix-devel/2022-02
   https://yhetil.org/guix/871r0l9fd1.fsf@gmail.com