Message ID | 20190829231653.7607-1-ludo@gnu.org |
---|---|
Hello,

Ludovic Courtès <ludo@gnu.org> skribis:

> tests: 'with-http-server' accepts multiple responses.
> swh: Add hooks for rate limiting handling.
> swh: Make 'commit-id?' public.
> lint: Add 'archival' checker.

I went ahead and pushed these at commit 55549c7b9b778a79d3e1f3d085861ef36aabdca6.

I asked for feedback on #swh-devel and olasd (Nicolas Dandrimont), one of the SWH developers, replied:

--8<---------------cut here---------------start------------->8---
<olasd> civodul: this seems like a sensible design to me; Does `guix lint` automatically call other network services? maybe the save request should be an optional flag [13:55]
<olasd> (automatically _checking_ is fine; automatically _saving_, I don't know)
<civodul> olasd: there's a 'refresh' checker that calls out to services to determine whether a newer version of the package is available, for instance [14:01]
<civodul> initially i thought about not saving at all, and just writing "you should save this"
<civodul> but then i thought it's more convenient to just do it right away
<civodul> it's unlikely to send garbage anyway, and it'll necessarily send only public code, and very likely only free code [14:02]
<civodul> or did you have other concerns?
<olasd> I don't think it's going to be an issue for us [14:08]
<olasd> I would just (personally) be surprised if a lint tool I'm using started to have side effects on somewhat unrelated systems :) [14:09]
[...]
<civodul> olasd: ah true, though i guess we just got used to that ;-) [14:12]
<civodul> anyway, thanks for your feedback!
<olasd> civodul: feel free to quote me by mail if you want to keep it archived
--8<---------------cut here---------------end--------------->8---

Ludo’.
Hi,

Nice! And it is so aligned with their recent announcement [1] ;-)

[1] https://www.softwareheritage.org/2019/08/05/saving-and-referencing-research-software-in-software-heritage/

On Fri, 30 Aug 2019 at 01:18, Ludovic Courtès <ludo@gnu.org> wrote:

> Currently, only 25% of our packages are not fetched with ‘url-fetch’.
> For the remaining 75%, this checker can only report whether the tarball
> is missing (and apart from ftp.gnu.org and a few other exceptions, it
> usually _is_ missing) and cannot actually save it.

Maybe I am missing something, but for example guile-2.0 is not yet archived. I am not able to find it with their search resources. And `guix lint -c archival guile@2.0' reports "guile@2.0.14: source not archived on Software Heritage".

> Anyway, it’s a first step in that direction. Feedback welcome!

I agree with what olasd (Nicolas Dandrimont) from SWH said on #swh-devel: the automatic "save" should be optional (even if the default is save=true).

> The second step will be to write a “lister” for Software Heritage that
> grabs the list of source code URLs from
> <https://guix.gnu.org/packages.json>. That code would run at SWH
> and it could potentially grab the tarballs, not just the VCS checkouts.
> Here are examples:
>
> https://forge.softwareheritage.org/source/swh-lister/browse/master/swh/lister/packagist/lister.py
> https://forge.softwareheritage.org/source/swh-lister/browse/master/swh/lister/gnu/lister.py
>
> It should be quite easy for a Pythonista to write something similar
> for our ‘packages.json’. Any takers? :-)

I am not sure I understand everything, but I will take a look... I am reading their GSoC page about this topic [2].

[2] https://wiki.softwareheritage.org/wiki/Google_Summer_of_Code_2019/Increase_archive_coverage

All the best,
simon
Hello!

zimoun <zimon.toutoune@gmail.com> skribis:

> On Fri, 30 Aug 2019 at 01:18, Ludovic Courtès <ludo@gnu.org> wrote:
>
>> Currently, only 25% of our packages are not fetched with ‘url-fetch’.
>> For the remaining 75%, this checker can only report whether the tarball
>> is missing (and apart from ftp.gnu.org and a few other exceptions, it
>> usually _is_ missing) and cannot actually save it.
>
> Maybe I am missing something, but for example guile-2.0 is not yet archived.
> I am not able to find it with their search resources. And `guix lint
> -c archival guile@2.0' reports "guile@2.0.14: source not archived on
> Software Heritage".

Yeah, most not-too-recent tarballs from ftp.gnu.org are archived, so I don’t know why this one is missing. We’d have to check with them.

> I agree with what olasd (Nicolas Dandrimont) from SWH said on
> #swh-devel: the automatic "save" should be optional (even if the
> default is save=true).

Maybe we could have a flag somewhere to turn it off? The good thing about having it on (or opt-out) is that we increase the chances that the code we care about is archived. :-)

>> The second step will be to write a “lister” for Software Heritage that
>> grabs the list of source code URLs from
>> <https://guix.gnu.org/packages.json>. That code would run at SWH
>> and it could potentially grab the tarballs, not just the VCS checkouts.
>> Here are examples:
>>
>> https://forge.softwareheritage.org/source/swh-lister/browse/master/swh/lister/packagist/lister.py
>> https://forge.softwareheritage.org/source/swh-lister/browse/master/swh/lister/gnu/lister.py
>>
>> It should be quite easy for a Pythonista to write something similar
>> for our ‘packages.json’. Any takers? :-)
>
> I am not sure I understand everything, but I will take a look... I am
> reading their GSoC page about this topic [2].

Awesome, thank you! Having a “guix” lister in place would be perfect.

Ludo’.
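To make the “lister” idea above a bit more concrete, here is a minimal Python sketch of its first step: turning a ‘packages.json’-style document into a flat list of source origins. The JSON layout used here (a list of entries with "name", "version" and a "source" list whose items carry "type", "urls" or "git_url") is an assumption made for illustration, and the second sample entry is hypothetical; the real file may use different field names. This is not the swh-lister API, only the extraction logic such a lister might start from.

```python
import json

# Hypothetical sample mimicking an ASSUMED layout of packages.json;
# the real field names may differ.
SAMPLE = """
[
  {"name": "guile", "version": "2.0.14",
   "source": [{"type": "url",
               "urls": ["https://ftp.gnu.org/gnu/guile/guile-2.0.14.tar.xz"]}]},
  {"name": "example-vcs", "version": "1.0",
   "source": [{"type": "git",
               "git_url": "https://example.org/example.git",
               "git_ref": "v1.0"}]}
]
"""


def source_origins(packages):
    """Yield (name, version, kind, url) for every source of every package."""
    for package in packages:
        for source in package.get("source", []):
            kind = source.get("type")
            if kind == "url":
                # A tarball source may list several mirrors.
                for url in source.get("urls", []):
                    yield package["name"], package["version"], "tarball", url
            elif kind == "git":
                yield package["name"], package["version"], "git", source["git_url"]
            # Other source types are skipped in this sketch.


origins = list(source_origins(json.loads(SAMPLE)))
for origin in origins:
    print(*origin)
```

A real lister would fetch the document over HTTP and hand each origin to the archive's scheduling machinery; both parts are left out here on purpose.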
Hi Ludo,

On Thu, 12 Sep 2019 at 09:41, Ludovic Courtès <ludo@gnu.org> wrote:

> zimoun <zimon.toutoune@gmail.com> skribis:
>
> > On Fri, 30 Aug 2019 at 01:18, Ludovic Courtès <ludo@gnu.org> wrote:
> >
> >> Currently, only 25% of our packages are not fetched with ‘url-fetch’.
> >> For the remaining 75%, this checker can only report whether the tarball
> >> is missing (and apart from ftp.gnu.org and a few other exceptions, it
> >> usually _is_ missing) and cannot actually save it.

And it is interesting that Nix has the same stats. ;-)

https://sympa.inria.fr/sympa/arc/swh-devel/2019-08/msg00024.html

> > Maybe I am missing something, but for example guile-2.0 is not yet archived.
> > I am not able to find it with their search resources. And `guix lint
> > -c archival guile@2.0' reports "guile@2.0.14: source not archived on
> > Software Heritage".
>
> Yeah, most not-too-recent tarballs from ftp.gnu.org are archived, so I
> don’t know why this one is missing. We’d have to check with them.

Maybe I am wrong, but a bunch of GNU packages seem to be missing. :-)

> > I agree with what olasd (Nicolas Dandrimont) from SWH said on
> > #swh-devel: the automatic "save" should be optional (even if the
> > default is save=true).
>
> Maybe we could have a flag somewhere to turn it off? The good thing about
> having it on (or opt-out) is that we increase the chances that the code
> we care about is archived. :-)

I agree. :-)

Speaking of UI, I would expect 2 different commands:

 - one to check if the package is in SWH, say:
     guix package <name> --is-in-swh
 - one to send a "save" request:
     guix lint <name> -c archival

And adding an option to turn "the push" off, say:

  guix lint <name> --no-archival

Because when linting, the process is generally iterative:

  guix lint <name>
  # fix mistake
  guix lint <name>
  # fix other mistake
  etc.

and it will save network resources (latency, etc.) by avoiding checking again and again during this lint process, I guess.
Or even something in this flavour would be a better UI:

  guix lint <name> --checkers=description,synopsis --no-checkers=license,archival

What do you think?

Cheers,
simon
Hi!

zimoun <zimon.toutoune@gmail.com> skribis:

> Or even something in this flavour should be a better UI:
>
>   guix lint <name> --checkers=description,synopsis
>   --no-checkers=license,archival
>
> What do you think?

Good idea, this would be simple and effective!

Thanks,
Ludo’.
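For illustration only, the include/exclude semantics proposed in this thread could look like the following Python sketch. The `--checkers` and `--no-checkers` option names are the ones suggested above, not necessarily options that `guix lint` provides; the `lint-sketch` program name, the `select_checkers` helper and the four-checker list are hypothetical and exist only for this example.

```python
import argparse

# Illustrative subset of checker names taken from the thread.
ALL_CHECKERS = ["description", "synopsis", "license", "archival"]


def select_checkers(argv):
    """Return the checkers to run: the requested ones minus the excluded ones."""
    parser = argparse.ArgumentParser(prog="lint-sketch")
    parser.add_argument("package")
    parser.add_argument("--checkers", default="",
                        help="comma-separated checkers to run (default: all)")
    parser.add_argument("--no-checkers", default="",
                        help="comma-separated checkers to skip")
    args = parser.parse_args(argv)

    wanted = args.checkers.split(",") if args.checkers else ALL_CHECKERS
    excluded = set(args.no_checkers.split(",")) if args.no_checkers else set()
    # Exclusion wins over inclusion, and the requested order is preserved.
    return [name for name in wanted if name not in excluded]


print(select_checkers(["guile", "--no-checkers=license,archival"]))
```

With this shape, an iterative linting session can skip the network-bound 'archival' checker until the final run, which is the latency concern raised earlier in the thread.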