Message ID | 5c72bcb9c86934deda97d952eb5cd459e615b313.camel@student.kuleuven.be |
---|---|
Dear,

Thank you for the patch. My questions are totally naive since I do not know much about GNUnet.

On Sat, 24 Oct 2020 at 21:47, Maxime Devos <maxime.devos@student.kuleuven.be> wrote:

> This patch defines a `gnunet-fetch' method, allowing for downloading
> files from GNUnet by their GNUnet chk-URI.
>
> This patch does not provide:
> - a service configuration
> - downloading substitutes from GNUnet
> - fall-back to non-P2P (e.g. http://) or other P2P (e.g. ipfs://)
>   systems
> - downloading directories over GNUnet

This means it only works for archives as tarball, right?

> - actual packages definitions using this method
>
> Some issues and questions:

[...]

> - Would it be possible somehow for url-fetch to support gnunet://fs/chk
>   URIs? That way we could fall-back unto non-P2P URLs, which would be
>   useful to bootstrap a P2P distribution from a non-P2P system.

Who is the “we”? What do you mean by “url-fetch supports gnunet:// and fall-back unto non-P2P”?

Some recent discussions are about content-address and fallback. For example, roughly speaking ’git-fetch’ tries upstream, then the Guix build farms, then Software Heritage (swh). For Git repo, it works because the address from Guix side to SWH is straightforward. The 2 other VCS –hg and svn– supported by SWH should be implemented soon… who knows! ;-)

The story about archives as tarball is a bit more complicated. The main issue –as I understand it– can be summarized as: Guix knows the URL, the integrity checksum and only at package time the content of the tarball. Later in time, it is difficult to lookup because of this very address; and some are around: nar, swh-id, ipfs, gnunet, etc.

Bridges to reassemble the content are currently discussed, e.g.,

<https://git.ngyro.com/disarchive-db/>
<https://git.ngyro.com/disarchive>

Well, today the fallback of tarball archive to SWH is not reliable.

What is your question? ;-)

> Then publish the source tarball of the package to the GNUnet FS system:
> $ guix environment --ad-hoc wget -- wget https://ftp.gnu.org/gnu/hello/hello-2.10.tar.gz
> $ gnunet-publish hello-2.10.tar.gz

Naive question: are packages only available on GNUnet?

All the best,
simon
[CC'd to Timothy Sample because of discussion of defining a new format for disarchive, and to gnunet-developers because of obvious reasons]

A small status update!

zimoun wrote on Tue 27-10-2020 at 14:39 [+0100]:
> [...]
>
> The story about archives as tarball is a bit more complicated. The main
> issue –as I understand it– can be summarized as: Guix knows the URL, the
> integrity checksum and only at package time the content of the tarball.
> Later in time, it is difficult to lookup because of this very address;
> and some are around: nar, swh-id, ipfs, gnunet, etc.
>
> Bridges to reassemble the content are currently discussed, e.g.,
>
> <https://git.ngyro.com/disarchive-db/>
> <https://git.ngyro.com/disarchive>
>
> Well, today the fallback of tarball archive to SWH is not reliable.
>
>
> What is your question? ;-)

I looked a bit into the GNUnet FS code and disarchive discussions. The part about tarballs seemed particularly relevant, as well as some older discussion on preserving the executable bit when using IPFS.

Some issues to address before GNUnet's directory format can be used for Guix substitutes:

* directory entries are not placed in any particular order.
  Solution: sort by file-name

* there is no executable bit.
  Solution: define a new metadata property (*). This should only take a small patch to libextractor.

  (*) Not sure about the correct terminology

* GNUnet sometimes inlines small files in directories, but strictly speaking when to do so is left up to the implementation.
  Solution: pick a fixed reference implementation.

* By default, when publishing, gnunet-publish uses libextractor to figure out some meta-data (e.g. title, mime-type, album name), which may return different meta-data depending on the implementation.
  Solution: disable the use of libextractor, at least when GNUnet is used by Guix.

I'm currently porting the directory creation code of GNUnet to Scheme (but not any other GNUnet code), to be used by Guix (for publishing substitutes) and disarchive (for reconstructing GNUnet directories). After addressing these issues, I believe I will end up with a fairly well-defined archive format.

<friendly-footer/>
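
The first two points in the list above (fixed entry order, explicit executable bit) are what make a directory hash reproducible. The Python sketch below shows the idea only. It does not implement GNUnet's directory format: the `Entry` type, the `executable` field and the JSON encoding are assumptions chosen for the illustration.

```python
import hashlib
import json
from typing import Iterable, NamedTuple


class Entry(NamedTuple):
    name: str
    content: bytes
    executable: bool  # hypothetical flag; stands in for the proposed metadata property


def canonical_directory(entries: Iterable[Entry]) -> bytes:
    """Serialize entries in a fixed order so that equal trees give equal bytes."""
    # Sort by the encoded file name, so the order does not depend on the locale.
    ordered = sorted(entries, key=lambda e: e.name.encode("utf-8"))
    records = [
        {
            "name": e.name,
            "sha256": hashlib.sha256(e.content).hexdigest(),
            "executable": e.executable,
        }
        for e in ordered
    ]
    # sort_keys and fixed separators keep the JSON encoding itself stable.
    return json.dumps(records, sort_keys=True, separators=(",", ":")).encode("utf-8")


a = [Entry("bin/hello", b"\x7fELF", True), Entry("README", b"hi\n", False)]
b = list(reversed(a))  # same tree, entries listed in another order
digest_a = hashlib.sha256(canonical_directory(a)).hexdigest()
digest_b = hashlib.sha256(canonical_directory(b)).hexdigest()
```

Both digests are equal because the entries are sorted before serialization, while flipping the `executable` flag of any entry changes the digest. Anything left to the implementation, such as extracted metadata or inlining decisions, would have to be pinned down in the same way.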
Hi Maxime,

Maxime Devos <maxime.devos@student.kuleuven.be> writes:

> This patch defines a `gnunet-fetch' method, allowing for downloading
> files from GNUnet by their GNUnet chk-URI.

While I think this is a laudable goal, I’m reluctant to including GNUnet support just yet because, as stated in recent release announcements, GNUnet is still in flux and not considered “production ready”.

So I think we should keep it around and revisit this issue when GNUnet is considered “stable”. WDYT?

Thanks,
Ludo’.
Hi Maxim,

Thanks for your detailed answer.

You might be interested by the coming online Guix Days conference on Sun. 22nd 2020. A session is specifically dedicated to a related topic: How to distribute P2P? Please browse this week the blog guix.gnu.org for the details.

On Tue, 27 Oct 2020 at 19:50, Maxime Devos <maxime.devos@student.kuleuven.be> wrote:

> Q: How does Guix figure out the GNUnet URI from the Guix (or nar, I
>    guess) hash?
> A: Not automatically. The gnunet-fetch method as defined in this
>    patch needs to be passed the URI manually. However, an additional
>    service for GNUnet can be written that uses the DHT to map Guix
>    (or nar, or something else) hashes to corresponding GNUnet URI's.

From my understanding, this is a show stopper. It has to be solved first going further, IMHO. It is not possible to write manually the URI for all the packages. And as perhaps you read with the project ’disassemble’, it is not straightforward.

> Q: What about automatically generated tarballs (e.g. from git
>    repositories)?
> A: Not addressed by this patch. The intention is to be able to replace
>    a http://*/*.tar.gz URL with a gnunet://fs/chk URI in package
>    definitions; source code repositories aren't supported by this
>    patch. (But perhaps a future patch could support this!)

I think this is the main issue: it is not affordable to replace for some packages the current http:// by gnunet://. Especially when GNUnet is not “stable”.

> * Is package *source code* only available, on *GNUnet*?
>
>   If someone published the source code (e.g. as a tarball) on GNUnet
>   with `gnunet-publish hello-2.10.tar.gz`, it is only published(*) on
>   GNUnet, and not somewhere else as well.
>
>   (*) I don't know exactly when one can be reasonably sure the file
>   will *remain* available for some time when published, except for
>   keeping the GNUnet daemon running continuously.
>
>   However, in practice, $MAINTAINER will publish the source code
>   somewhere else as well (e.g. <https://ftp.gnu.org> or perhaps ipfs).
>
>   This patch doesn't automatically publish source code of built or
>   downloaded packages on GNUnet, although that seems a useful service
>   to run as a daemon.

Therefore the corollary question is: how many tarballs currently used as source by Guix are also available on GNUnet?

Thank you for your interest. And again I invite you to join the discussion about P2P and Guix this Sunday 22nd. Read on the Guix blog. :-)

All the best,
simon
Hi,

(btw it's Maxim*e*, not Maxim. The ‘e’ isn't pronounced but it's still written.)

I'll try to address the various issues in separate e-mails.

zimoun wrote on Mon 16-11-2020 at 01:35 [+0100]:
> [snip]
> From my understanding, this is a show stopper. It has to be solved
> first going further, IMHO. It is not possible to write manually the
> URI for all the packages. And as perhaps you read with the project
> ’disassemble’, it is not straightforward.

I agree! I see three straightforward answers to this.

a) Fancy

Write a GNUnet service using the DHT to map the hashes used in origin specifications (*) to URI's for the FS system. To let the local contribution to the DHT survive peer restarts, maintain a database (e.g. SQLite) of (Guix hash -> GNUnet hash) (^), that is expanded with each successful source (or binary) substitution or build.

(Alternatively, as the DHT isn't anonymous, place hash -> GNUnet hash references into some well-known name space. Then hash lookup + FS should automatically be anonymous when desired.)

Possible issues: time out behaviour, the DHT is not anonymous.

Annoyance: probably requires extending the build daemon. Perhaps try regular downloads (e.g. via HTTP/S, ftp, ...) in parallel with the GNUnet download after a configurable delay? Perhaps use a well-known GNUnet FS namespace instead of the DHT for anonymous downloads?

(*) Also usable for package outputs, if the hash of the output is used and not the hash of the outputs

(^) In case the database is full, delete some old entries

b) Simple, slow introduction (no additional GNUnet services required)

Extend (origin ...) with an optional gnunet-hash field. Adjust ‘guix download’, ‘guix refresh’ and ‘guix import’ to emit the gnunet-hash (%) field. Plumb this field to the guix daemon somehow. Same approach is possible for IPFS.

As packages are updated and new packages are defined, given sufficient time, there will be more packages with a gnunet-hash field than not.

(%) Computing the gnunet-hash of a directory doesn't require a full-fledged GNUnet installation. My scheme-gnunet repository is not very far from the point where it can convert file trees + libextractor metadata into bytevectors, without depending on C gnunet.

A TODO: different zlib's would produce different bytevectors --> different GNUnet hash --> perhaps always use a single version.

A TODO (for nix archives on GNUnet): define EXTRACTOR_METATYPE_EXECUTABLE (or mimetype: application/x-executable). Perhaps use mimetype: x-inode/symlink (or something like that) as well?

Repository URL: https://notabug.org/mdevos/scheme-gnunet

c) Not scalable, but may reduce network traffic to ci.guix.gnu.org & co

Like in a) keep a database of known (Guix hash -> GNUnet FS URI). Perhaps make this available through a web interface or git repository ... wait, this sounds familiar ... this seems to fit well into the ‘disarchive’ project!

Greetings,
Maxime
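
The local database from options a) and c), including the rule “in case the database is full, delete some old entries”, could look roughly like the following Python sketch using SQLite. The class name `HashMap`, the table layout and the eviction rule (drop the oldest insertions first) are assumptions for illustration. The hashes and `gnunet://fs/chk/...` strings in the usage lines are placeholders, not real identifiers.

```python
import sqlite3
from typing import Optional


class HashMap:
    """Bounded (Guix hash -> GNUnet URI) store; a hypothetical layout."""

    def __init__(self, path: str = ":memory:", max_entries: int = 1000) -> None:
        self.db = sqlite3.connect(path)
        self.max_entries = max_entries
        self.db.execute(
            "CREATE TABLE IF NOT EXISTS mapping ("
            " guix_hash TEXT PRIMARY KEY,"
            " gnunet_uri TEXT NOT NULL,"
            " seq INTEGER NOT NULL)"
        )

    def add(self, guix_hash: str, gnunet_uri: str) -> None:
        """Record a mapping, then evict the oldest rows beyond the limit."""
        with self.db:  # commits on success, rolls back on error
            (next_seq,) = self.db.execute(
                "SELECT COALESCE(MAX(seq), 0) + 1 FROM mapping"
            ).fetchone()
            self.db.execute(
                "INSERT OR REPLACE INTO mapping VALUES (?, ?, ?)",
                (guix_hash, gnunet_uri, next_seq),
            )
            self.db.execute(
                "DELETE FROM mapping WHERE guix_hash NOT IN ("
                " SELECT guix_hash FROM mapping ORDER BY seq DESC LIMIT ?)",
                (self.max_entries,),
            )

    def lookup(self, guix_hash: str) -> Optional[str]:
        row = self.db.execute(
            "SELECT gnunet_uri FROM mapping WHERE guix_hash = ?", (guix_hash,)
        ).fetchone()
        return row[0] if row else None


# Placeholder hashes and URIs, for illustration only.
store = HashMap(max_entries=2)
store.add("hash-a", "gnunet://fs/chk/EXAMPLE-A")
store.add("hash-b", "gnunet://fs/chk/EXAMPLE-B")
store.add("hash-c", "gnunet://fs/chk/EXAMPLE-C")
```

With a limit of two entries, the third insertion evicts `hash-a`, so a later lookup for it returns nothing and the caller would fall back to a regular download. A real service would also need to verify any URI obtained from other peers against the Guix hash before trusting it.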
Hi Maxime,

On Wed, 18 Nov 2020 at 20:14, Maxime Devos <maxime.devos@student.kuleuven.be> wrote:

> Ludovic Courtès wrote on Sun 15-11-2020 at 22:13 [+0100]:
>> [snip]
>> While I think this is a laudable goal, I’m reluctant to including GNUnet
>> support just yet because, as stated in recent release announcements,
>> GNUnet is still in flux and not considered “production ready”.
>>
>> So I think we should keep it around and revisit this issue when GNUnet
>> is considered “stable”. WDYT?
>
> Sounds reasonable to me. There are also a lot of missing parts: a
> service definition for Guix System, findings substitutes, finding
> sources by hash (the one Guix uses, not the GNUnet hash) ..., so it
> isn't like my rudimentary patch was usable on large scale anyway.

Therefore, I am closing. Feel free to reopen once GNUnet is considered as (more) “stable”.

Thank you for your contribution.

All the best,
simon
Hi Matias (and Guix, which I've CC'ed),

To Matias: a follow-up message will follow. Unfortunately, I've just taken a pause from Guix+GNUnet hacking (though I'll probably resume hacking once in a while).

Some things that work now:

* The rehash service itself seems to work (https://notabug.org/mdevos/rehash). This is the service where peers add SHA512<->GNUnet FS URI mappings they discover (replace SHA512 by whatever Guix uses).

* Unless I broke anything, the ‘remirror’ service (actually just a daemon implementing a web server to run locally) can proxy http: downloads. Proxying https: is a little difficult, as ‘remirror’ would need to play man-in-the-middle, but may be implemented eventually. Or maybe guix can be patched to (optionally) not use the CONNECT method for proxying https: downloads.

  There is no ‘offloading’ to GNUnet yet, though.

* Perhaps a better approach for substitutes: In the ‘scheme-gnunet’ repository (https://notabug.org/mdevos/scheme-gnunet/src/master/ROADMAP.org), I've written a publish-store.scm and download-store.scm script, that respectively upload and download an item from the store/GNUnet FS (using the gnunet-publish and gnunet-download binaries).

  It's not plugged into the guix substituter and guix publish yet, though.

I'm a bit at a loss how to do this properly, so I'm more-or-less waiting until (a future revision of) the IPFS patch is merged, and then I'll try to add GNUnet as ‘just another p2p system’.

Greetings,
Maxime