Message ID | cover.1706287537.git.ludo@gnu.org |
---|---|
Series | Content-addressed downloads from Software Heritage |
Oops, I forgot to Cc: the fine people for the cover letter; fixed!

See <https://issues.guix.gnu.org/68741>.

Ludovic Courtès <ludo@gnu.org> skribis:

> Hello Guix!
>
> For those who’ve been following along, you might remember that the
> main impedance mismatch between SWH and Guix is that SWH uses Git
> tree SHA1 hashes to identify directories whereas Guix uses nar SHA256
> hashes (and possibly other hash functions in the future):
>
>   https://guix.gnu.org/en/blog/2019/connecting-reproducible-deployment-to-a-long-term-source-code-archive/
>
> Because of this, the SWH fallback path for ‘git-download’ had two
> options:
>
>   1. If ‘git-reference’ specifies a full SHA1 commit ID, it would
>      look it up on SWH and fetch it.
>
>   2. If ‘git-reference’ specifies a tag, which is perhaps the
>      majority of cases, Guix would ask SWH the commit that once
>      corresponded to that tag at that URL, and then fetch it.
>
> Case #1 is ideal: it’s content-addressed. Case #2 is brittle: we’re
> hoping that the tag hasn’t been modified and that the URL hasn’t been
> reused for something else; if that’s not the case, SWH might return
> the “wrong” commit and we end up fetching something unrelated.
>
> The good news is that our friends at SWH have just deployed a new
> version of their code that lets us look up directories by some
> “external identifier” (“ExtID”), among which there’s ‘nar-sha256’:
>
>   https://archive.softwareheritage.org/api/1/extid/doc/
>
> And that, my friends, makes a huge difference: the impedance mismatch
> is gone, we can now use content-addressing to fetch our stuff from SWH!!
> And that works not just for Git, but also for Mercurial, SVN, CVS, etc.
>
> Well, there’s a caveat: currently the ‘nar-sha256’ is added only on
> new visits and it’s apparently not being added yet for Mercurial for
> unclear reasons. So right now, we can get guile-sqlite3 0.1.3 (Git) by
> nar-sha256, but we cannot get guile-wisp (hg) nor in fact most things.
> That’ll improve over time though, and SWH comrades are open to adding
> those ExtIDs retroactively.
>
> The patches that follow do several things:
>
>   1. Follow redirects in the Vault: (guix swh) previously did not
>      do that (oops!) but the newly-deployed Vault now responds with
>      302 redirects so we have to handle that.
>
>   2. Add bindings for the ExtID HTTP interface.
>
>   3. Add ‘swh-download-directory-by-nar-hash’, which does what it
>      says.
>
>   4. Use that as the preferred fallback method for ‘git-fetch’.
>
> Here’s a REPLshot:
>
> scheme@(guile-user)> (lookup-external-id "nar-sha256" (content-hash-value(origin-hash (package-source (@ (gnu packages guile) guile-sqlite3)))) )
> $43 = #<<external-id> value: "0b56ba94c2b83b8f74e3772887c1109135802eb3e8962b628377987fe97e1e63" type: "nar-sha256" version: 0 target: "swh:1:dir:84a8b34591712c0a90bab0af604188bcd1fe3153" target-url: "https://archive.softwareheritage.org/swh:1:dir:84a8b34591712c0a90bab0af604188bcd1fe3153">
> scheme@(guile-user)> (swh-download-directory-by-nar-hash (content-hash-value(origin-hash (package-source (@ (gnu packages guile) guile-sqlite3)))) 'sha256 "/tmp/gsql")
> SWH: found directory with nar-sha256 hash 0b56ba94c2b83b8f74e3772887c1109135802eb3e8962b628377987fe97e1e63 at 'swh:1:dir:84a8b34591712c0a90bab0af604188bcd1fe3153'
> swh:1:dir:84a8b34591712c0a90bab0af604188bcd1fe3153/
> swh:1:dir:84a8b34591712c0a90bab0af604188bcd1fe3153/.gitignore
> swh:1:dir:84a8b34591712c0a90bab0af604188bcd1fe3153/AUTHORS
> swh:1:dir:84a8b34591712c0a90bab0af604188bcd1fe3153/COPYING
> swh:1:dir:84a8b34591712c0a90bab0af604188bcd1fe3153/COPYING.LESSER
> swh:1:dir:84a8b34591712c0a90bab0af604188bcd1fe3153/ChangeLog
> swh:1:dir:84a8b34591712c0a90bab0af604188bcd1fe3153/Makefile.am
> swh:1:dir:84a8b34591712c0a90bab0af604188bcd1fe3153/NEWS
> swh:1:dir:84a8b34591712c0a90bab0af604188bcd1fe3153/README
> swh:1:dir:84a8b34591712c0a90bab0af604188bcd1fe3153/build-aux/
> swh:1:dir:84a8b34591712c0a90bab0af604188bcd1fe3153/build-aux/guile.am
> swh:1:dir:84a8b34591712c0a90bab0af604188bcd1fe3153/build-aux/test-driver.scm
> swh:1:dir:84a8b34591712c0a90bab0af604188bcd1fe3153/configure.ac
> swh:1:dir:84a8b34591712c0a90bab0af604188bcd1fe3153/env.in
> swh:1:dir:84a8b34591712c0a90bab0af604188bcd1fe3153/sqlite3.scm.in
> swh:1:dir:84a8b34591712c0a90bab0af604188bcd1fe3153/tests/
> swh:1:dir:84a8b34591712c0a90bab0af604188bcd1fe3153/tests/basic.scm
> $46 = #t
>
> Huge thanks to everyone over at #swh-devel for helping me out
> over the past few days!
>
> Next tasks: implement download fallback for ‘hg-fetch’, change
> ‘guix lint -c archival’ to make ‘save-origin’ requests not just
> for Git repos, assess the situation with SVN and sub-directories
> to see what can be done.
>
> Thoughts?
>
> Ludo’.
>
> PS: Apologies for the wall of text!
>
> Ludovic Courtès (6):
>   swh: ‘vault-fetch’ follows redirects.
>   swh: Add bindings for the “ExtID” API.
>   swh: Add ‘swh-download-directory-by-nar-hash’.
>   lint: archival: Check with ‘lookup-directory-by-nar-hash’.
>   git-download: Download from SWH by nar hash when possible.
>   swh: Fix docstring of ‘lookup-directory’.
>
>  guix/build/git.scm                |  20 ++++--
>  guix/git-download.scm             |   4 +-
>  guix/lint.scm                     |  28 +++++---
>  guix/scripts/perform-download.scm |   4 +-
>  guix/swh.scm                      | 113 ++++++++++++++++++++++++++----
>  tests/lint.scm                    |  33 +++++++--
>  tests/swh.scm                     |  21 +++++-
>  7 files changed, 189 insertions(+), 34 deletions(-)
>
>
> base-commit: 8bee6bb9aaaf35c36fe325675d1eb2daebd69c25
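For readers skimming the thread, the fallback order described in the cover letter (content-addressed lookup by nar hash first, then the older commit-or-tag lookup) can be illustrated with a small Guile sketch. This is not the code from the patches: ‘full-sha1?’, ‘fallback-plan’ and ‘try-in-order’ are hypothetical names made up for this illustration, the strategies are plain thunks rather than calls into (guix swh), and the tag name "v0.1.3" is only an assumed example.

```scheme
(use-modules (srfi srfi-1))               ;for 'any'

;; Hypothetical helper: does REFERENCE look like a full 40-digit SHA1
;; commit ID (case #1 in the cover letter) rather than a tag (case #2)?
(define (full-sha1? reference)
  (and (= (string-length reference) 40)
       (string-every char-set:hex-digit reference)))

;; Return the strategies to try, most reliable first, as a list of
;; symbols.  NAR-HASH is the nar hash of the origin, or #f when unknown.
(define (fallback-plan nar-hash reference)
  (append (if nar-hash '(by-nar-hash) '())
          (list (if (full-sha1? reference) 'by-commit 'by-tag))))

;; STRATEGIES is a list of (NAME . THUNK) pairs.  Call each thunk in
;; order and return the name of the first one that returns true, or #f
;; if they all fail.
(define (try-in-order strategies)
  (any (lambda (strategy)
         (and ((cdr strategy)) (car strategy)))
       strategies))

;; Illustrative use: a known nar hash plus an (assumed) tag name.
(fallback-plan
 "0b56ba94c2b83b8f74e3772887c1109135802eb3e8962b628377987fe97e1e63"
 "v0.1.3")
;; ⇒ (by-nar-hash by-tag)
```

In this sketch the brittle tag lookup is only reached when the nar-hash lookup is unavailable or fails, which, per the caveat in the cover letter, can still happen for origins whose ‘nar-sha256’ ExtID has not been recorded yet.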
Hi,

Ludovic Courtès <ludo@gnu.org> skribis:

>   swh: ‘vault-fetch’ follows redirects.
>   swh: Add bindings for the “ExtID” API.
>   swh: Add ‘swh-download-directory-by-nar-hash’.
>   lint: archival: Check with ‘lookup-directory-by-nar-hash’.
>   git-download: Download from SWH by nar hash when possible.
>   swh: Fix docstring of ‘lookup-directory’.

Pushed as 5a61ce6bcfbd0882956e40457232da737776abe7.

> Next tasks: implement download fallback for ‘hg-fetch’, change
> ‘guix lint -c archival’ to make ‘save-origin’ requests not just
> for Git repos, assess the situation with SVN and sub-directories
> to see what can be done.

Let’s make it happen!

Ludo’.