Message ID | 20220503114301.9524-7-attila@lendvai.name |
---|---|
State | New |
Headers | show |
Series | None | expand |
Attila Lendvai schreef op di 03-05-2022 om 13:42 [+0200]: > + (let ((port (current-warning-port))) > + (format port "Unexpected error, will skip ~S.~%reason: " > + package-name) > + ;; Printing a backtrace here is not very useful: it is > + ;; cut off because GUARD unwinds. > + (print-exception port (stack-ref (make-stack #t) 1) > + c (exception-args c)) > + (display-backtrace (make-stack #t) port)) Why are unknown errors being catched here? Greetings, Maxime.
Attila Lendvai schreef op di 03-05-2022 om 13:42 [+0200]: > + (with-exception-handler > + (lambda (c) > + (when report-all-errors? > + (let ((port (current-warning-port))) > + (format port "*** exception while importing:~%") > + (print-exception port (stack-ref (make-stack #t) 1) > + c (exception-args c)) > + (format port "*** printing backtrace:~%") > + (display-backtrace (make-stack #t) port) > + ;; DISPLAY-BACKTRACE can fail, so it's better to make its > + ;; exit also visible. > + (format port "*** done printing backtrace~%"))) > + (raise-continuable c)) What's the '-continuable' for here? Is it to avoid extra 'raise- exception' entries in the backtrace, or does this actually make use of Scheme's continuable exceptions? Greetings, Maxime.
Attila Lendvai schreef op di 03-05-2022 om 13:42 [+0200]: > * guix/import/go.scm (go-module->guix-package*): Catch exceptions and retry in > case of network errors. Add ability to enable printing a full backtrace when > debugging. > (go-module->guix-package): Tolerate the inability to fetch the synopsis and > description. Ok, but couldn't this be done more generally, for all importers using HTTP? E.g., could we have a variant 'http-fetch/retrying' of 'http- fetch' that automatically retries a number of times? Then the retrying functionality could almost trivially be used in other HTTP-using importers as well. Greetings, Maxime.
Attila Lendvai schreef op di 03-05-2022 om 13:42 [+0200]: > + (warning (G_ "Failing to import package ~S. > +reason: ~A.~%") Why a warning instead of a 'report-error'? Greetings, Maxime.
> > + (let ((port (current-warning-port))) > > + (format port "Unexpected error, will skip ~S.~%reason: " > > + package-name) > > + ;; Printing a backtrace here is not very useful: it is > > + ;; cut off because GUARD unwinds. > > + (print-exception port (stack-ref (make-stack #t) 1) > > + c (exception-args c)) > > + (display-backtrace (make-stack #t) port)) > > > Why are unknown errors being catched here? *all* errors are caught, displayed, and swallowed, so that the importing can proceed to the rest of the packages. a possible scenario: a large import can take several minutes. if there's a transient network error, then this way it may finish with 99% of the packages, and the rest can be restarted by hand after inspecting the log output. another scenario is that the importer is simply buggy, and the user gets 99% of the packages imported, and has to do only one package by hand. > What's the '-continuable' for here? Is it to avoid extra 'raise- > exception' entries in the backtrace, or does this actually make use > of Scheme's continuable exceptions? honestly, i can't remember. which translates to: i should have commented on that! *shakes head and makes a mental note* from guile's manual: "If continuable? is true, the handler is invoked in tail position relative to the raise-exception call. Otherwise if the handler returns, a non-continuable exception of type &non-continuable is raised in the same dynamic environment as the handler." i.e. it should rather be called raise-continuably not raise-continuable. and i think the reason is that the stack is not unwound that way, so that the handlers in the guard can print a backtrace showing the entire stack. does this make sense? i'm still learning scheme's stack/exception handling... or maybe what i wanted is that the handlers in the guard can return with (values #f '())? but i think that should work with vanilla RAISE, too. > Why a warning instead of a 'report-error'? because i want the importer to be able to deal with a transitive closures of dependencies with > 400 entries. if it fails at the first error, then a `guix import go -r` becomes a hopeless endeavor for larger go modules. thanks for the review Maxime! i'll address the rest, too. -- • attila lendvai • PGP: 963F 5D5F 45C7 DFCD 0A39 -- “'Emergencies' have always been the pretext on which the safeguards of individual liberty have been eroded.” — F. A. Hayek (1899–1992)
Attila Lendvai schreef op di 03-05-2022 om 17:00 [+0000]: > a possible scenario: a large import can take several minutes. > if there's a transient network error, then this way it may finish > with 99% of the packages, and the rest can be restarted by hand after > inspecting the log output. Catching (presumably) transient network errors and reporting them with 'report-network-error' and the like: no problem. > another scenario is that the importer is simply buggy, ... but if there's some kind of bug, why not just report and fix it? > and the user gets 99% of the packages imported, and has to do only > one package by hand. If the bug is fixed, then no packages need to be done by hand and other people will benefit as well. Whereas if it's only for 1% and the backtrace only happens for recursive import and not for non-recursive or manual import, then I'd think that the chance that the user actually reports or fixes the bug decreases a lot. Greetings, Maxime.
> ... but if there's some kind of bug, why not just report and fix it? realistically? because there are years old issues, and long bitrotten patches in every project's issue tracker, Guix included. i think an importer, that does proper logging, and finishes with a 99% correct result, is worth much more than one that barks at the first error, and then refuses to continue processing the 100+ remaining entries. this was true even in my own case, when i also had the maintainer hat on, i.e. i was already neck deep into fixing the importer. > > and the user gets 99% of the packages imported, and has to do only > > one package by hand. > > > If the bug is fixed, then no packages need to be done by hand and other > people will benefit as well. Whereas if it's only for 1% and the > backtrace only happens for recursive import and not for non-recursive > or manual import, then I'd think that the chance that the user actually > reports or fixes the bug decreases a lot. the tradeoff is this: what leads to a lower level of overall annoyance of all the parties involved: 1) have a tool that is less useful, and through that it annoys the users to report issues, which then (hopefully) thrusts some people towards improving the tool... or 2) have a tool that does its best to produce the most useful output. short of hard data, this can be argued forever. my intuition is clearly pointing towards 2), because i doubt that the choking point of improving the importer is due to not having enough examples on which it breaks. does it make enough sense to justify my proposed behavior? -- • attila lendvai • PGP: 963F 5D5F 45C7 DFCD 0A39 -- “Communism doesn’t work because people like to own stuff.” — Frank Zappa (1940–1993)
Attila Lendvai schreef op ma 09-05-2022 om 12:34 [+0000]: > [...] does it make enough sense to justify my proposed behavior? (For a positive note, skip to the last paragraph!) If we do something like this (*) (this = catching errors for dependencies and skipping those packages), I don't want this to be something Go-specific. Could this be done in 'recursive-import' or something like that instead, such that all importers benefit and Go isn't some weird special case? Also, given that we appear to have a number of people effectively maintaining the Go importer ("git log guix/import.go.scm" mentions a few ‘Maxim Cournoyer’, ‘Ludovic Courtès’ ‘Xinglu Chen’ and ‘Sarah Morgensen’, and more recently maybe you?) and I don't see any open old Go importer bugs/patches on issues.guix.gnu.org, I don't think we have to worry about > realistically? because there are years old issues, and long bitrotten > patches in every project's issue tracker, Guix included. in this case. > short of hard data, this can be argued forever. my intuition is > clearly pointing towards 2), because i doubt that the choking point > of improving the importer is due to not having enough examples on > which it breaks. If you know of some examples on which the Go importer broke, could you add them to tests/go, to make sure it won't break again later (regression), in case of future changes? Anyway, let's take a look at for which kind of things I think of in case of ‘buggy importer’: 1. (Maybe [PATCH 04/10]?) The parser for a file format cannot parse <something>, so it throws an exception. (I consider this more a ‘limitation’ than a ‘bug’.) For this kind of thing, maybe we can just throw a ‘oops I don't understand the file format’ exception, report an appropriate error message (package name + maybe line number, but no need for a backtrace), skip that package and continue. That way, the Go importer will be to some degree robust to future changes. That's already done in this patch series, except for the ‘no need for a backtrace’ part and this patch series catching all exceptions instead of just networking errors, ‘I don't understand the file format’ errors, ... 2. The Go importer misinterprets some kind of field or such. These kind of things don't (typically?) cause errors, so skipping. (Still a bug though!) 3. The Go importer cannot produce a package definition because it doesn't support <some Go packaging method>. Similar (1). 4. The Go importer has some kind of type error or such. This case looks like an actual bug to me, where I think that catching the resulting exceptions is just sweeping things under the rug, implicitly teaching other people that it's ok to just ignore problems without making any attempt of fixing them or reporting them. Vaguely similar to ‘broken windows’ (<https://en.wikipedia.org/wiki/Broken_windows_theory>). Sure, for a single particular user and short term this makes the importer a bit less useful, but knowing bugs is a prerequisite for fixing them, and long term, by actually knowing about the bugs and fixing them, more users will benefit. That said, like "guix build" has a "--keep-going" option to temporarily ignore to some degree packages that fail to build, maybe "guix import" (when using --recursive) can have a "--keep-going" option. Greetings, Maxime.
> Also, given that we appear to have a number of people effectively > maintaining the Go importer ("git log guix/import.go.scm" mentions a > few ‘Maxim Cournoyer’, ‘Ludovic Courtès’ ‘Xinglu Chen’ and ‘Sarah > Morgensen’, and more recently maybe you?) and I don't see any open note that the list of people *using* the go importer is probably not all that much larger. > old Go importer bugs/patches on issues.guix.gnu.org, I don't think > we have to worry about the fact that the go importer bugs were not even reported into the issue tracker doesn't change much about the fact that i have experienced several problems when i first tried to recursively import a non-trivial go project (which even included issues with http-fetch). it means that all those bug were there, not hidden by exception handling, and yet they have *not* been reported into the issue tracker. or to put it another way, i think there are just too many unverified assumptions in this discussion about actual user behavior. my strategy in those situations is to expose as much info as feasible, and let the users respond intelligently to the situation. > 4. The Go importer has some kind of type error or such. > > This case looks like an actual bug to me, where I think that > catching the resulting exceptions is just sweeping things under > the rug, implicitly teaching other people that it's ok to just > ignore problems without making any attempt of fixing them or > reporting them. Vaguely similar to ‘broken windows’ > (https://en.wikipedia.org/wiki/Broken_windows_theory). note that this patch does not just catch errors, but handles them by emitting useful information about them, and continues only after that. i consider silently swallowing errors a major sin (read: code that will certainly eat up some portion of someone's life in the future). the only place where this patch silently swallows errors is in the extraction of descriprions and such, i.e. data that the user must review anyway. > Sure, for a single particular user and short term this makes the > importer a bit less useful, but knowing bugs is a prerequisite for > fixing them, and long term, by actually knowing about the bugs and > fixing them, more users will benefit. this patch does not hide any bugs, only enables the code to continue by skipping a package. and as a bonus: i wouldn't be surprised if it emitted *more* useful information than when allowing the exception to pass through to the caller, because error handling in the guix codebase could use a bit more love. > That said, like "guix build" has a "--keep-going" option to temporarily > ignore to some degree packages that fail to build, maybe "guix import" > (when using --recursive) can have a "--keep-going" option. ok, i'll look into putting this feature behind a CLI argument, and generalize it to all importers. as time allows i'll get back to the rest of the questions/suggestions, too. thanks for the feedback Maxime! -- • attila lendvai • PGP: 963F 5D5F 45C7 DFCD 0A39 -- “A great civilization is not conquered from without until it destroys itself from within.” — Will Durant
Attila Lendvai schreef op ma 09-05-2022 om 20:02 [+0000]: > note that this patch does not just catch errors, but handles them by emitting useful information about them, and continues only after that. I think such errors are easily overlooked, but yes, and I don't have any actual data on that. > i consider silently swallowing errors a major sin (read: code that will certainly eat up some portion of someone's life in the future). > > the only place where this patch silently swallows errors is in the extraction of descriprions and such, i.e. data that the user must review anyway. > [...] I think I've overstated the ‘swallows errors etc.’ a bit, given that if "guix import go --recursive <foo>" skips a package <bar>, then the user will have to package its dependency <bar> anyway, and for that the user could first try "guix import go <bar>" (reporting errors on bug-guix@ or #guix) before fixing/extending the importer or doing things manually. Greetings, Maxime.
diff --git a/guix/import/go.scm b/guix/import/go.scm index a08005d090..6871132a70 100644 --- a/guix/import/go.scm +++ b/guix/import/go.scm @@ -23,7 +23,15 @@ ;;; You should have received a copy of the GNU General Public License ;;; along with GNU Guix. If not, see <http://www.gnu.org/licenses/>. +;;; goproxy protocol: +;;; https://golang.org/ref/mod#module-proxy +;;; https://docs.gomods.io/intro/protocol/ +;;; https://roberto.selbach.ca/go-proxies/ + (define-module (guix import go) + #:use-module (git) + #:use-module (git structs) + #:use-module (git errors) #:use-module (guix build-system go) #:use-module (guix git) #:use-module (guix hash) @@ -51,6 +59,8 @@ (define-module (guix import go) #:autoload (guix build utils) (mkdir-p dump-port) #:autoload (gcrypt hash) (hash-algorithm sha256) #:use-module (ice-9 format) + #:use-module (ice-9 control) + #:use-module (ice-9 exceptions) #:use-module (ice-9 match) #:use-module (ice-9 peg) #:use-module (ice-9 rdelim) @@ -629,10 +639,14 @@ (define (go-import->module-meta content-text) (make-module-meta root-path (string->symbol vcs) (strip-.git-suffix/maybe repo-url))))) ;; <meta name="go-import" content="import-prefix vcs repo-root"> - (let* ((meta-data (http-fetch* (format #f "https://~a?go-get=1" module-path))) + (let* ((url (format #f "https://~a?go-get=1" module-path)) + (meta-data (http-fetch* url #:accept-all-response-codes? #true)) (select (sxpath `(// (meta (@ (equal? (name "go-import")))) - // content)))) - (match (select (html->sxml meta-data #:strict? #t)) + // content))) + (sxml (html->sxml meta-data #:strict? #t)) + (selected (select sxml))) + (log.debug "The fetched meta-data from ~A is:~%~S" url selected) + (match selected (() #f) ;nothing selected ((('content content-text) ..1) (or @@ -809,22 +823,76 @@ (define* (go-module->guix-package module-path #:key dependencies+versions dependencies)))) -(define go-module->guix-package* - (lambda args - ;; Disable output buffering so that the following warning gets printed - ;; consistently. - (setvbuf (current-error-port) 'none) - (let ((package-name (match args ((name _ ...) name)))) - (guard (c ((http-get-error? c) - (warning (G_ "Failed to import package ~s. -reason: ~s could not be fetched: HTTP error ~a (~s). +(define (go-module->guix-package* . args) + ;; Disable output buffering so that the following warning gets printed + ;; consistently. + (setvbuf (current-error-port) 'none) + (setvbuf (current-warning-port) 'none) + (let* ((package-name (match args ((name _ ...) name))) + ;; Use report-all-errors? for debugging purposes only, because + ;; e.g. getaddrinfo is not reentrant and therefore we must unwind + ;; before retrying. + (report-all-errors? #false) + (report-network-error + (lambda (reason) + (warning (G_ "Failing to import package ~S. +reason: ~A.~%") + package-name reason)))) + (let loop ((attempts 0)) + (when (> attempts 0) + (sleep 3) + (log.info "~%Retrying, attempt ~s." attempts)) + (cond + ((> attempts 60) + (warning (G_ "Giving up on importing package ~s. This package and its dependencies won't be imported.~%") - package-name - (uri->string (http-get-error-uri c)) - (http-get-error-code c) - (http-get-error-reason c)) - (values #f '()))) - (apply go-module->guix-package args))))) + package-name) + (values #f '())) + (else + (guard (c ((http-get-error? c) + (report-network-error + (format #f "~s could not be fetched: HTTP error ~a (~s)" + (uri->string (http-get-error-uri c)) + (http-get-error-code c) + (http-get-error-reason c))) + (loop (+ 1 attempts))) + ((eq? (exception-kind c) + 'getaddrinfo-error) + (report-network-error "DNS lookup failed") + (loop (+ 1 attempts))) + ((and (eq? (exception-kind c) + 'git-error) + (eq? (git-error-class (first (exception-args c))) + GITERR_NET)) + (report-network-error "network error coming from git") + (loop (+ 1 attempts))) + (else + (let ((port (current-warning-port))) + (format port "Unexpected error, will skip ~S.~%reason: " + package-name) + ;; Printing a backtrace here is not very useful: it is + ;; cut off because GUARD unwinds. + (print-exception port (stack-ref (make-stack #t) 1) + c (exception-args c)) + (display-backtrace (make-stack #t) port)) + ;; give up on this entry + (values #f '()))) + (with-exception-handler + (lambda (c) + (when report-all-errors? + (let ((port (current-warning-port))) + (format port "*** exception while importing:~%") + (print-exception port (stack-ref (make-stack #t) 1) + c (exception-args c)) + (format port "*** printing backtrace:~%") + (display-backtrace (make-stack #t) port) + ;; DISPLAY-BACKTRACE can fail, so it's better to make its + ;; exit also visible. + (format port "*** done printing backtrace~%"))) + (raise-continuable c)) + (lambda () + (apply go-module->guix-package args)) + #:unwind? (not report-all-errors?)))))))) (define* (go-module-recursive-import package-name #:key (goproxy "https://proxy.golang.org")