Message ID | 87a6czbzvh.fsf@kitej |
---|---|
State | Accepted |
Headers | show |
Series | [bug#54723] Check URI when verifying narinfo validity. | expand |
Hi, Guillaume Le Vaillant <glv@posteo.net> skribis: > When trying to upgrade a machine using a substitute server on the same > LAN, I get this crash a lot: > > # guix system reconfigure --substitute-urls="http://192.168.0.22:8080 https://ci.guix.gnu.org" /etc/guix/config.scm > substitute: mise à jour des substituts depuis « http://192.168.0.22:8080 »... 100.0 % [...] > It looks like the 'narinfo-uri' field is an empty list instead of a list > of URIs. Is that supposed to be possible? I don’t think so. Could you grab a narinfo and share it? wget -qO - http://192.168.0.22:8080/HASH.narinfo where HASH is the hash part of a store item. What could happen though is a situation where ‘guix publish’ only offers a compression method not supported by the client. In that case, ‘narinfo-best-uri’ throws a match-error because ‘choices’ is the empty list. We should fix that. > Does the the attached patch adding a check for the validity of > this field in the 'valid-narinfo?' function make sense? Maybe, but I’d like to make sure we understand the issue. Thanks, Ludo’.
Ludovic Courtès <ludo@gnu.org> skribis: > Hi, > > [...] > > Could you grab a narinfo and share it? > > wget -qO - http://192.168.0.22:8080/HASH.narinfo > > where HASH is the hash part of a store item. > > What could happen though is a situation where ‘guix publish’ only offers > a compression method not supported by the client. In that case, > ‘narinfo-best-uri’ throws a match-error because ‘choices’ is the empty > list. We should fix that. I tried downloading a few narinfos and they don't look broken (2 of them in attachment). However for one of them it looks like the guix-publish server got stuck on the request for several minutes (the second attempt worked): --8<---------------cut here---------------start------------->8--- wget http://192.168.0.22:8080/184b50qkcrkchc6dpwwcd7ypqb5yvrm3.narinfo --2022-04-05 19:33:56-- http://192.168.0.22:8080/184b50qkcrkchc6dpwwcd7ypqb5yvrm3.narinfo Connexion à 192.168.0.22:8080… connecté. requête HTTP transmise, en attente de la réponse… Erreur de lecture (Connexion ré-initialisée par le correspondant) dans les en-têtes. Nouvel essai. --2022-04-05 19:36:40-- (essai : 2) http://192.168.0.22:8080/184b50qkcrkchc6dpwwcd7ypqb5yvrm3.narinfo Connexion à 192.168.0.22:8080… connecté. requête HTTP transmise, en attente de la réponse… 200 OK Taille : 7428 (7,3K) [application/x-nix-narinfo] Sauvegarde en : « 184b50qkcrkchc6dpwwcd7ypqb5yvrm3.narinfo » 184b50qkcrkchc6dpwwcd7ypqb5yvrm3.narinfo 100%[=====================================================================================================>] 7,25K --.-KB/s ds 0,02s 2022-04-05 19:36:40 (391 KB/s) — « 184b50qkcrkchc6dpwwcd7ypqb5yvrm3.narinfo » sauvegardé [7428/7428] --8<---------------cut here---------------end--------------->8--- Could that be the cause of the issue?
Hi, Guillaume Le Vaillant <glv@posteo.net> skribis: > However for one of them it looks like the guix-publish server got stuck > on the request for several minutes (the second attempt worked): > > wget http://192.168.0.22:8080/184b50qkcrkchc6dpwwcd7ypqb5yvrm3.narinfo > --2022-04-05 19:33:56-- http://192.168.0.22:8080/184b50qkcrkchc6dpwwcd7ypqb5yvrm3.narinfo > Connexion à 192.168.0.22:8080… connecté. > requête HTTP transmise, en attente de la réponse… Erreur de lecture (Connexion ré-initialisée par le correspondant) dans les en-têtes. > Nouvel essai. > > --2022-04-05 19:36:40-- (essai : 2) http://192.168.0.22:8080/184b50qkcrkchc6dpwwcd7ypqb5yvrm3.narinfo > Connexion à 192.168.0.22:8080… connecté. > requête HTTP transmise, en attente de la réponse… 200 OK > Taille : 7428 (7,3K) [application/x-nix-narinfo] > Sauvegarde en : « 184b50qkcrkchc6dpwwcd7ypqb5yvrm3.narinfo » > > 184b50qkcrkchc6dpwwcd7ypqb5yvrm3.narinfo 100%[=====================================================================================================>] 7,25K --.-KB/s ds 0,02s > > 2022-04-05 19:36:40 (391 KB/s) — « 184b50qkcrkchc6dpwwcd7ypqb5yvrm3.narinfo » sauvegardé [7428/7428] > > Could that be the cause of the issue? Yes, it could be the issue. Is there any clue as to why ‘guix publish’ hung up first? Something in its log maybe? Ludo’.
Ludovic Courtès <ludo@gnu.org> skribis: > Hi, > > Guillaume Le Vaillant <glv@posteo.net> skribis: > >> However for one of them it looks like the guix-publish server got stuck >> on the request for several minutes (the second attempt worked): >> >> wget http://192.168.0.22:8080/184b50qkcrkchc6dpwwcd7ypqb5yvrm3.narinfo >> --2022-04-05 19:33:56-- http://192.168.0.22:8080/184b50qkcrkchc6dpwwcd7ypqb5yvrm3.narinfo >> Connexion à 192.168.0.22:8080… connecté. >> requête HTTP transmise, en attente de la réponse… Erreur de lecture (Connexion ré-initialisée par le correspondant) dans les en-têtes. >> Nouvel essai. >> >> --2022-04-05 19:36:40-- (essai : 2) http://192.168.0.22:8080/184b50qkcrkchc6dpwwcd7ypqb5yvrm3.narinfo >> Connexion à 192.168.0.22:8080… connecté. >> requête HTTP transmise, en attente de la réponse… 200 OK >> Taille : 7428 (7,3K) [application/x-nix-narinfo] >> Sauvegarde en : « 184b50qkcrkchc6dpwwcd7ypqb5yvrm3.narinfo » >> >> 184b50qkcrkchc6dpwwcd7ypqb5yvrm3.narinfo >> 100%[=====================================================================================================>] >> 7,25K --.-KB/s ds 0,02s >> >> 2022-04-05 19:36:40 (391 KB/s) — « 184b50qkcrkchc6dpwwcd7ypqb5yvrm3.narinfo » sauvegardé [7428/7428] >> >> Could that be the cause of the issue? > > Yes, it could be the issue. Is there any clue as to why ‘guix publish’ > hung up first? Something in its log maybe? > > Ludo’. There are 2 errors that occur a lot in the guix-publish log files: - "In procedure fport_write: Broken pipe" It happens when trying to write to a socket apparently. - "In procedure sign: gcrypt: Cannot allocate memory" The machine has 64 GiB of RAM, of which at least 50 GiB is free, so gcrypt should have enough to make a signature... Note: full log file with complete error messages in attachment.
Guillaume Le Vaillant <glv@posteo.net> skribis: > There are 2 errors that occur a lot in the guix-publish log files: > > - "In procedure fport_write: Broken pipe" > It happens when trying to write to a socket apparently. > > - "In procedure sign: gcrypt: Cannot allocate memory" > The machine has 64 GiB of RAM, of which at least 50 GiB is free, so > gcrypt should have enough to make a signature... I captured the network traffic between the "guix publish" server and a "guix upgrade" client to see if the "broken pipe" errors could come from real networking issues. According to wireshark the problem doesn't come from there, the TCP stream didn't have any error. However, looking at the full TCP stream in wireshark I saw that the "guix publish" server sends some bad narinfo responses. Sometimes some parts of the response are missing (here, Signature incomplete, URL and Compression fields missing): --8<---------------cut here---------------start------------->8--- HTTP/1.1 200 OK Content-Length: 959 Content-Type: application/x-nix-narinfo;charset=UTF-8 StorePath: /gnu/store/dxpaqmix7zixm8pwcvvmq8q969q50jpp-pngload-2.0.0-2.91f1d70-checkout NarHash: sha256:0s94fdbrbqj12qvgyn2g4lfwvz7qhhzbclrpz5ni7adwxgrmvxl1 NarSize: 245224 References: Deriver: ybdimrfjs090kzmimf5j1x5hs8y4d24p-pngload-2.0.0-2.91f1d70-checkout.drv Signature: 1;kitej;KHNpZ25hdHVyZSAKIChkYXRhIAogIChmbGFncyByZmM2OTc5KQogIChoYXNoIHNoYTI1NiAjNDY3NDk2RTJEOTZBMzc0QzFGN0M1MzJCNjc3MTM1NzVFOTkyRjQ0Qzc3MzQwRDUwQTcyRTkyMDJGRURDQkQxMyMpCiAgKQogKHNpZy12YWwgCiAgKGVjZHNhIAogICAociAjMDZEQTAwMkQyNjE3MEQ3ODVDNkM3NkMyMUEwM0UzNDlCMkUwMDc4MTUyQzFBQURFNjhFMEZGOUJDRkUyMUFDNSMpCiAgIChzICMwNjNDM0UyNjg2MEU2OTIzNDdEMjNGNTQ4RUM3RDJGRUZGQjc0Q0I4NjNEMjlDMUE3QjA4REFCQjEzQjZDRjAxIykKICAgKQogICkKIChwdWJsaWMta2V5IAogIC --8<---------------cut here---------------end--------------->8--- Sometimes the response looks like almost complete garbage: --8<---------------cut here---------------start------------->8--- HTTP/1.1 200 OK Content-Length: 970 Content-Type: application/x-nix-narinfo;charsetcharsetHTTP/=UTF-8 1 1 1 .S --8<---------------cut here---------------end--------------->8--- When the client receives these bad narinfos, it often makes it crash with errors like: - Wrong type (expecting exact integer): #f - unmatched line "1\r" - Wrong type argument in position 1 (expecting pair): () So it looks like the broken pipe problem comes from the "guix publish" server, or from Guile... And making the code reconstructing narinfos from HTTP responses more robust in case of bad input would be useful.
Hi, Guillaume Le Vaillant <glv@posteo.net> skribis: > However, looking at the full TCP stream in wireshark I saw that the > "guix publish" server sends some bad narinfo responses. > Sometimes some parts of the response are missing (here, Signature > incomplete, URL and Compression fields missing): > > HTTP/1.1 200 OK > Content-Length: 959 > Content-Type: application/x-nix-narinfo;charset=UTF-8 > > StorePath: /gnu/store/dxpaqmix7zixm8pwcvvmq8q969q50jpp-pngload-2.0.0-2.91f1d70-checkout > NarHash: sha256:0s94fdbrbqj12qvgyn2g4lfwvz7qhhzbclrpz5ni7adwxgrmvxl1 > NarSize: 245224 > References: > Deriver: ybdimrfjs090kzmimf5j1x5hs8y4d24p-pngload-2.0.0-2.91f1d70-checkout.drv > Signature: 1;kitej;KHNpZ25hdHVyZSAKIChkYXRhIAogIChmbGFncyByZmM2OTc5KQogIChoYXNoIHNoYTI1NiAjNDY3NDk2RTJEOTZBMzc0QzFGN0M1MzJCNjc3MTM1NzVFOTkyRjQ0Qzc3MzQwRDUwQTcyRTkyMDJGRURDQkQxMyMpCiAgKQogKHNpZy12YWwgCiAgKGVjZHNhIAogICAociAjMDZEQTAwMkQyNjE3MEQ3ODVDNkM3NkMyMUEwM0UzNDlCMkUwMDc4MTUyQzFBQURFNjhFMEZGOUJDRkUyMUFDNSMpCiAgIChzICMwNjNDM0UyNjg2MEU2OTIzNDdEMjNGNTQ4RUM3RDJGRUZGQjc0Q0I4NjNEMjlDMUE3QjA4REFCQjEzQjZDRjAxIykKICAgKQogICkKIChwdWJsaWMta2V5IAogIC > > > Sometimes the response looks like almost complete garbage: > > HTTP/1.1 200 OK > Content-Length: 970 > Content-Type: application/x-nix-narinfo;charsetcharsetHTTP/=UTF-8 > > 1 > 1 > > 1 > .S > > When the client receives these bad narinfos, it often makes it crash > with errors like: > - Wrong type (expecting exact integer): #f > - unmatched line "1\r" > - Wrong type argument in position 1 (expecting pair): () Woow. How do you build and run ‘guix publish’? Is it a distro package or is it coming straight from Guix? What command-line options are you passing? I’ve never seen this, although we have it running on several servers, notably ci.guix. I wonder what could cause this. Thanks, Ludo’.
Ludovic Courtès <ludo@gnu.org> skribis: > Woow. How do you build and run ‘guix publish’? Is it a distro package > or is it coming straight from Guix? What command-line options are you > passing? > > I’ve never seen this, although we have it running on several servers, > notably ci.guix. I wonder what could cause this. > > Thanks, > Ludo’. I'm using guix-publish-service-type in the operating-system definition to manage the "guix publish" server, using the on-the-fly mode and fast Zstandard compression: --8<---------------cut here---------------start------------->8--- (service guix-publish-service-type (guix-publish-configuration (host "0.0.0.0") (port 8080) (compression '(("zstd" 3))) (advertise? #t))) --8<---------------cut here---------------end--------------->8--- When booting the machine, shepherd starts the server with the following command-line options: --8<---------------cut here---------------start------------->8--- /gnu/store/059svbd32i4s0l9s5i7z0krcnl666bjy-guix-1.3.0-24.2fb4304/libexec/guix/guile \ /gnu/store/059svbd32i4s0l9s5i7z0krcnl666bjy-guix-1.3.0-24.2fb4304/bin/guix publish -u guix-publish -p 8080 -C zstd:3 --nar-path=nar --listen=0.0.0.0 --advertise --8<---------------cut here---------------end--------------->8--- There's another report about this at <https://issues.guix.gnu.org/53668> I had forgotten about, where Simon Streit and Maxim Cournoyer indicate that they have seen this issue too.
From 8d9a45b2f38809fb3acfacf6f83532b7b556e78c Mon Sep 17 00:00:00 2001 From: Guillaume Le Vaillant <glv@posteo.net> Date: Tue, 5 Apr 2022 11:50:48 +0200 Subject: [PATCH] narinfo: Check URI when verifying narinfo validity. * guix/narinfo.scm (valid-narinfo?): Check if the 'uri' field is valid. --- guix/narinfo.scm | 8 +++++--- 1 file changed, 5 insertions(+), 3 deletions(-) diff --git a/guix/narinfo.scm b/guix/narinfo.scm index 4fc550aa6c..466ce20deb 100644 --- a/guix/narinfo.scm +++ b/guix/narinfo.scm @@ -209,11 +209,13 @@ (define %mandatory-fields (define* (valid-narinfo? narinfo #:optional (acl (current-acl)) #:key verbose?) - "Return #t if NARINFO's signature is not valid." + "Return #t if NARINFO's signature is valid." (let ((hash (narinfo-sha256 narinfo)) (signature (narinfo-signature narinfo)) - (uri (uri->string (first (narinfo-uris narinfo))))) - (and hash signature + (uri (if (null? (narinfo-uris narinfo)) + #f + (uri->string (first (narinfo-uris narinfo)))))) + (and hash signature uri (signature-case (signature hash acl) (valid-signature #t) (invalid-signature -- 2.35.1