Message ID | 20220303211326.19884-1-ludo@gnu.org |
---|---|
Headers | Subject: [bug#54241] [PATCH 0/4] 'github' importer gracefully handles rate limiting; From: Ludovic Courtès <ludo@gnu.org>; To: 54241@debbugs.gnu.org; Cc: Nicolas Goaziou <mail@nicolasgoaziou.fr>; Date: Thu, 3 Mar 2022 22:13:26 +0100 |
Series | 'github' importer gracefully handles rate limiting |
Message
Ludovic Courtès
March 3, 2022, 9:13 p.m. UTC
Hi Guix!

These patches address a famous complaint about “the GitHub problem” that
affects ‘guix refresh’¹, shown here in its naked awfulness:

--8<---------------cut here---------------start------------->8---
$ ./pre-inst-env guix refresh
gnu/packages/zig.scm:32:13: zig would be upgraded from 0.9.0 to 0.9.1
[...]
In guix/scripts/refresh.scm:
    578:14  5 (_ _)
In srfi/srfi-1.scm:
    634:9  4 (for-each #<procedure 7fe85c9a8e00 at guix/scripts/refresh.scm:578:24 (t-916fdc98f4be2f1-1d48)> _)
In guix/scripts/refresh.scm:
    378:2  3 (check-for-package-update #<package redshift-wayland@1.12-1.7da875d gnu/packages/xdisorg.scm:1425 7fe85879e790> (#<<upstream…>) …)
In guix/import/github.scm:
    232:12  2 (latest-release _)
In ice-9/boot-9.scm:
  1685:16  1 (raise-exception _ #:continuable? _)
  1685:16  0 (raise-exception _ #:continuable? _)

ice-9/boot-9.scm:1685:16: In procedure raise-exception:
Error downloading release information through the GitHub API. This may be fixed by using an access token and setting the environment variable GUIX_GITHUB_TOKEN, for instance one procured from https://github.com/settings/tokens
--8<---------------cut here---------------end--------------->8---

With this change, ‘guix refresh’ warns you when the GitHub rate limit
is reached, but it keeps going, falling back to the ‘generic-git’
updater if it’s among the applicable updaters:

--8<---------------cut here---------------start------------->8---
$ ./pre-inst-env guix refresh -t github,generic-git
[...]
guix refresh: warning: GitHub rate limit exceeded; disallowing requests for 1477 seconds
hint: You can waive the rate limit by setting the `GUIX_GITHUB_TOKEN' environment variable to a token obtained from
`https://github.com/settings/tokens' with your GitHub account.

Alternatively, you can wait until your rate limit is reset, or use the `generic-git' updater instead.

gnu/packages/zile.scm:113:14: warning: no tags were found for zile-on-guile
gnu/packages/zig.scm:32:13: zig would be upgraded from 0.9.0 to 0.9.1
gnu/packages/xorg.scm:2830:7: warning: no valid tags found for xf86-video-freedreno
gnu/packages/xml.scm:2132:13: java-kxml2 would be upgraded from 2.4.2 to 2.5.0
--8<---------------cut here---------------end--------------->8---

The GitHub updater becomes functional again once the rate limit has
been reset.  The code to deal with rate limiting is similar to that
in (guix swh).

Thoughts?

Thanks,
Ludo’.

¹ https://issues.guix.gnu.org/53818#50

Ludovic Courtès (4):
  http-client: Add response headers to '&http-get-error'.
  import: github: Gracefully handle rate limit exhaustion.
  http-client: Correctly handle redirects when #:keep-alive? #t.
  import: github: Reuse HTTP connection for the /tags URL fallback.

 .dir-locals.el         |   1 +
 guix/http-client.scm   |  39 +++++++++-----
 guix/import/github.scm | 119 +++++++++++++++++++++++++++++------------
 3 files changed, 112 insertions(+), 47 deletions(-)

base-commit: be84fb701bf7a36a0eb50147ccbb988aa3f41209
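
For readers who want the control flow of the cover letter spelled out, the following is a minimal Python sketch, not the Guile code from the patch series. It assumes the API reports exhaustion through a 403 or 429 status together with the `x-ratelimit-remaining` and `x-ratelimit-reset` response headers (lower-cased here), which is how GitHub documents its REST API. The names `RateLimitGate` and `pick_updater` are hypothetical and exist only for this illustration.

```python
import time


class RateLimitGate:
    """Remembers until when requests to a rate-limited API are disallowed."""

    def __init__(self, clock=time.time):
        self._clock = clock      # injectable clock, handy for testing
        self._reset_at = 0.0     # epoch seconds; 0 means "never limited"

    def note_response(self, status, headers):
        """Close the gate if the response signals rate-limit exhaustion.

        Return the number of seconds to wait, or 0 when not rate limited.
        """
        remaining = headers.get("x-ratelimit-remaining")
        reset = headers.get("x-ratelimit-reset")
        if status in (403, 429) and remaining == "0" and reset is not None:
            self._reset_at = float(reset)
            return max(0, int(self._reset_at - self._clock()))
        return 0

    def is_open(self):
        """True once the recorded reset time has passed."""
        return self._clock() >= self._reset_at


def pick_updater(applicable, gate):
    """Return the first applicable updater that may be used right now."""
    for name in applicable:
        if name == "github" and not gate.is_open():
            continue  # rate limited: let a later updater take over
        return name
    return None


if __name__ == "__main__":
    now = [1000.0]
    gate = RateLimitGate(clock=lambda: now[0])
    wait = gate.note_response(
        403, {"x-ratelimit-remaining": "0", "x-ratelimit-reset": "2477"})
    print(f"warning: rate limit exceeded; disallowing requests for {wait} seconds")
    print(pick_updater(["github", "generic-git"], gate))
```

The gate only records a timestamp, so the ‘github’ updater becomes eligible again as soon as the clock passes the reset time, without any explicit re-enabling step.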
Comments
Ludovic Courtès wrote on Thu 03-03-2022 at 22:13 [+0100]:
> With this change, ‘guix refresh’ warns you when the GitHub rate limit
> is reached, but it keeps going, falling back to the ‘generic-git’
> updater if it’s among the applicable updaters:
> [...]

WDYT of avoiding the rate limit by caching, using 'http-fetch/cached'?
GitHub does not count requests setting If-Modified-Since against the
rate limit (assuming the answer hasn't changed).

Greetings,
Maxime.
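
The conditional-request idea in the message above can be sketched as follows. This is an illustrative Python sketch under stated assumptions, not how ‘http-fetch/cached’ is implemented: `CachedResponse`, `conditional_headers` and `resolve` are hypothetical names, and the sketch simply echoes back the `Last-Modified` value the server sent earlier. Whether a 304 answer is exempt from the rate limit is taken from the message, not verified here.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class CachedResponse:
    body: bytes
    last_modified: Optional[str]  # verbatim Last-Modified header, if any


def conditional_headers(cached):
    """Extra request headers that make the request conditional."""
    if cached is not None and cached.last_modified:
        return {"If-Modified-Since": cached.last_modified}
    return {}


def resolve(status, headers, body, cached):
    """Combine the server's answer with the cache entry.

    304 Not Modified: reuse the cached entry unchanged.
    200 OK: build a fresh entry to store in place of the old one.
    """
    if status == 304 and cached is not None:
        return cached
    if status == 200:
        return CachedResponse(body, headers.get("Last-Modified"))
    raise RuntimeError(f"unexpected HTTP status {status}")
```

A caller would send `conditional_headers(entry)` along with the request and then pass the status, response headers and body to `resolve`, storing whatever comes back.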
Hi,

Maxime Devos <maximedevos@telenet.be> writes:

> Ludovic Courtès wrote on Thu 03-03-2022 at 22:13 [+0100]:
>> With this change, ‘guix refresh’ warns you when the GitHub rate limit
>> is reached, but it keeps going, falling back to the ‘generic-git’
>> updater if it’s among the applicable updaters:
>> [...]
>
> WDYT of avoiding the rate limit by caching, using 'http-fetch/cached'?
> GitHub does not count requests setting If-Modified-Since against the
> rate limit (assuming the answer hasn't changed).

My concern is that we’d end up caching one or two little files in
~/.cache for each candidate package, and (rate limit aside) the overhead
of dealing with the cache might outweigh the benefits.  I’d rather use
‘http-fetch/cached’ for bigger files, like in (guix cve).

WDYT?

My goal here was to ensure the ‘github’ updater doesn’t get in the way
of those who don’t want to specify a token.

Thanks,
Ludo’.
Ludovic Courtès wrote on Fri 04-03-2022 at 21:45 [+0100]:
> My concern is that we’d end up caching one or two little files in
> ~/.cache for each candidate package, and (rate limit aside) the overhead
> of dealing with the cache might outweigh the benefits.  I’d rather use
> ‘http-fetch/cached’ for bigger files, like in (guix cve).
>
> WDYT?

If the overhead of caching little files is a concern, then perhaps a
SQLite (or GDBM) database could be used instead of the filesystem-based
cache?  The number of packages in Guix was about 150 000 IIRC; if we
assume something around the magnitude of 200 bytes per package, then
we end up with about 29 MiB for the entirety of Guix.  And there might
be some opportunities for compression, reducing this number.

Something like this could be left for later though.

Greetings,
Maxime.
Hi,

Maxime Devos <maximedevos@telenet.be> writes:

> Ludovic Courtès wrote on Fri 04-03-2022 at 21:45 [+0100]:
>> My concern is that we’d end up caching one or two little files in
>> ~/.cache for each candidate package, and (rate limit aside) the overhead
>> of dealing with the cache might outweigh the benefits.  I’d rather use
>> ‘http-fetch/cached’ for bigger files, like in (guix cve).
>>
>> WDYT?
>
> If the overhead of caching little files is a concern, then perhaps a
> SQLite (or GDBM) database could be used instead of the filesystem-based
> cache?  The number of packages in Guix was about 150 000 IIRC; if we
> assume something around the magnitude of 200 bytes per package, then
> we end up with about 29 MiB for the entirety of Guix.  And there might
> be some opportunities for compression, reducing this number.

I think this would be going overboard in terms of complexity :-), and it
wouldn’t radically change the run-time overhead (you still potentially
have to do an HTTP round trip with ‘If-Modified-Since’; you’re just
saving a few hundred bytes on the response in the best case).

> Something like this could be left for later though.

Yup!

Ludo’.
Ludovic Courtès wrote on Sat 05-03-2022 at 22:58 [+0100]:
> [...] and it wouldn’t radically change the run-time overhead (you still
> potentially have to do an HTTP round trip with ‘If-Modified-Since’,
> you’re just saving a few hundred bytes on the response in the best case.)

IIUC, when the TTL hasn't been exceeded, the file from the file
system is served without contacting the remote server at all.  So in the
best case, you only ‘round-trip’ to the disk instead of the HTTP server.
So I think there are some potential benefits to be had here.  That
assumes a sufficiently large TTL though.

Greetings,
Maxime.
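
The best case described in the message above (an entry younger than the TTL is served from disk with no network access) can be illustrated with a small Python sketch. It is not the ‘http-fetch/cached’ implementation; `fetch_cached`, its parameters, and the one-file-per-URL layout keyed by a SHA-256 hash are assumptions made for the example, and `fetch` stands for whatever function actually performs the HTTP request.

```python
import hashlib
import os
import time


def fetch_cached(url, fetch, cache_dir, ttl, now=time.time):
    """Return the body for URL, contacting the server only when needed.

    An entry younger than TTL seconds is read back from CACHE_DIR without
    calling FETCH at all; otherwise FETCH(url) is called and its result
    (bytes) replaces the cache entry.
    """
    name = hashlib.sha256(url.encode("utf-8")).hexdigest()
    path = os.path.join(cache_dir, name)
    try:
        age = now() - os.path.getmtime(path)
    except FileNotFoundError:
        age = None  # nothing cached yet

    if age is not None and age < ttl:
        with open(path, "rb") as port:
            return port.read()  # fresh enough: disk round trip only

    data = fetch(url)
    os.makedirs(cache_dir, exist_ok=True)
    with open(path, "wb") as port:
        port.write(data)
    return data
```

With a large `ttl`, repeated runs over the same packages would touch only the disk; with a small one, nearly every call falls through to `fetch`, which is the trade-off the message points out.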
Ludovic Courtès wrote on Sat 05-03-2022 at 22:58 [+0100]:
> I think this would be going overboard in terms of complexity :-)

There's some complexity here, but assuming a sufficient amount of
tests, I believe it would be worth it if it allows side-stepping the
rate limit to some degree.  And the extra complexity would mostly
disappear if the overhead of tiny files was accepted (*).

There are also some other benefits, e.g. a kind of ‘download
resumption’ but for linters, reducing network traffic after retrying
"guix lint" on a lossy network (or because the terminal tab was closed
too early, etc.).

All stuff that can be left for later though!

Greetings,
Maxime.

(*) Assuming 150 000 packages and 1 KiB per package (this would be
file-system dependent!), I end up with 150 MiB.  That's a bit on the
large side though ...
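
The two size estimates given in this thread (200 bytes per package for a database-backed cache, 1 KiB per package for one small file each) follow from simple arithmetic, reproduced here as a short Python check. The package count of 150 000 is the figure quoted in the messages and is used as-is; `cache_size_mib` is a hypothetical helper name.

```python
MIB = 2**20  # bytes per mebibyte


def cache_size_mib(packages, bytes_per_package):
    """Total cache size in MiB for a given per-package cost."""
    return packages * bytes_per_package / MIB


if __name__ == "__main__":
    # Database-style cache: about 200 bytes per package.
    print(round(cache_size_mib(150_000, 200), 1))    # 28.6, i.e. about 29 MiB
    # One small file per package: about 1 KiB each.
    print(round(cache_size_mib(150_000, 1024), 1))   # 146.5, i.e. about 150 MiB
```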
Maxime Devos <maximedevos@telenet.be> writes:

> Ludovic Courtès wrote on Sat 05-03-2022 at 22:58 [+0100]:
>> I think this would be going overboard in terms of complexity :-)
>
> There's some complexity here, but assuming a sufficient amount of
> tests, I believe it would be worth it if it allows side-stepping the
> rate limit to some degree.

What should also be taken into account is the usefulness of the ‘github’
updater: investment should be proportionate.  I suspect it’s much less
useful now that we have the ‘generic-git’ updater.  Maybe, maybe it
gives slightly more accurate data in some cases, maybe it can be
slightly faster, but that’s not entirely clear to me.