From patchwork Fri Oct 20 16:15:12 2023
X-Patchwork-Submitter: Ludovic Courtès
X-Patchwork-Id: 55077
From: Ludovic Courtès
Date: Fri, 20 Oct 2023 18:15:12 +0200
Subject: [bug#66650] [PATCH] git: Shell out to ‘git gc’ when necessary.
To: 66650@debbugs.gnu.org
Cc: Ludovic Courtès, 65720@debbugs.gnu.org, Josselin Poiret,
 Simon Tournier, Christopher Baines, Josselin Poiret, Ludovic Courtès,
 Mathieu Othacehe, Ricardo Wurmus, Simon Tournier, Tobias Geerinckx-Rice
In-Reply-To: <87jzswsrlt.fsf@gnu.org>
References: <87jzswsrlt.fsf@gnu.org>
X-Mailer: git-send-email 2.41.0

Fixes .

This fixes a bug whereby libgit2-managed checkouts would keep growing as
we fetch.

* guix/git.scm (packs-in-git-repository, maybe-run-git-gc): New
procedures.
(update-cached-checkout): Use it.
---
 guix/git.scm | 39 ++++++++++++++++++++++++++++++++++++---
 1 file changed, 36 insertions(+), 3 deletions(-)

Hi!

This is a radical fix/workaround for the unbounded Git checkout growth
problem, shelling out to ‘git gc’ when it’s likely needed (“too many”
pack files around).

I thought we might be able to implement a ‘git gc’ approximation using
the libgit2 “packbuilder” interface, but I haven’t got around to doing
it: .

Once again, shelling out is not my favorite option, but it’s a bug we
should fix sooner rather than later, hence this compromise.

Thoughts?

Ludo’.

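To check by hand whether a given checkout has crossed the threshold used
in the patch (more than 25 packs), something like the sketch below can be
pasted at a Guile REPL.  It is not part of the patch: ‘count-packs’ is a
hypothetical helper name, it uses ‘scandir’ from (ice-9 ftw) instead of
the ‘opendir’ loop in the patch, and the checkout path is a placeholder to
replace with a real one.

```scheme
(use-modules (ice-9 ftw))               ;for 'scandir'

;; Hypothetical helper, for illustration only; not part of the patch.
(define (count-packs checkout)
  "Return the number of '*.pack' files under CHECKOUT/.git/objects/pack,
or 0 when that directory cannot be read."
  (let ((files (scandir (string-append checkout "/.git/objects/pack")
                        (lambda (file)
                          (string-suffix? ".pack" file)))))
    ;; 'scandir' returns #f when the directory is unreadable or missing.
    (if files
        (length files)
        0)))

;; Usage: pass the top-level directory of a checkout (placeholder path).
;; (count-packs "/path/to/checkout")
```

A result above 25 means the checkout would trigger ‘git gc’ the next time
‘update-cached-checkout’ runs with this patch applied.
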
base-commit: 6b0a32196982a0a2f4dbb59d35e55833a5545ac6

diff --git a/guix/git.scm b/guix/git.scm
index b7182305cf..d704b62333 100644
--- a/guix/git.scm
+++ b/guix/git.scm
@@ -1,6 +1,6 @@
 ;;; GNU Guix --- Functional package management for GNU
 ;;; Copyright © 2017, 2020 Mathieu Othacehe
-;;; Copyright © 2018-2022 Ludovic Courtès
+;;; Copyright © 2018-2023 Ludovic Courtès
 ;;; Copyright © 2021 Kyle Meyer
 ;;; Copyright © 2021 Marius Bakke
 ;;; Copyright © 2022 Maxime Devos
@@ -29,15 +29,16 @@ (define-module (guix git)
   #:use-module (guix cache)
   #:use-module (gcrypt hash)
   #:use-module ((guix build utils)
-                #:select (mkdir-p delete-file-recursively))
+                #:select (mkdir-p delete-file-recursively invoke/quiet))
   #:use-module (guix store)
   #:use-module (guix utils)
   #:use-module (guix records)
   #:use-module (guix gexp)
   #:autoload   (guix git-download)
   (git-reference-url git-reference-commit git-reference-recursive?)
+  #:autoload   (guix config) (%git)
   #:use-module (guix sets)
-  #:use-module ((guix diagnostics) #:select (leave warning))
+  #:use-module ((guix diagnostics) #:select (leave warning info))
   #:use-module (guix progress)
   #:autoload   (guix swh) (swh-download commit-id?)
   #:use-module (rnrs bytevectors)
@@ -428,6 +429,35 @@ (define (delete-checkout directory)
     (rename-file directory trashed)
     (delete-file-recursively trashed)))
 
+(define (packs-in-git-repository directory)
+  "Return the number of pack files under DIRECTORY, a Git checkout."
+  (catch 'system-error
+    (lambda ()
+      (let ((directory (opendir (in-vicinity directory ".git/objects/pack"))))
+        (let loop ((count 0))
+          (match (readdir directory)
+            ((? eof-object?)
+             (closedir directory)
+             count)
+            (str
+             (loop (if (string-suffix? ".pack" str)
+                       (+ 1 count)
+                       count)))))))
+    (const 0)))
+
+(define (maybe-run-git-gc directory)
+  "Run 'git gc' in DIRECTORY if needed."
+  ;; XXX: As of libgit2 1.3.x (used by Guile-Git), there's no support for GC.
+  ;; Each time a checkout is pulled, a new pack is created, which eventually
+  ;; takes up a lot of space (lots of small, poorly-compressed packs).  As a
+  ;; workaround, shell out to 'git gc' when the number of packs in a
+  ;; repository has become "too large", potentially wasting a lot of space.
+  ;; See .
+  (when (> (packs-in-git-repository directory) 25)
+    (info (G_ "compressing cached Git repository at '~a'...~%")
+          directory)
+    (invoke/quiet %git "-C" directory "gc")))
+
 (define* (update-cached-checkout url
                                  #:key
                                  (ref '())
@@ -515,6 +545,9 @@ (define* (update-cached-checkout url
                                      seconds seconds
                                      nanoseconds nanoseconds))))
 
+     ;; Run 'git gc' if needed.
+     (maybe-run-git-gc cache-directory)
+
      ;; When CACHE-DIRECTORY is a sub-directory of the default cache
      ;; directory, remove expired checkouts that are next to it.
      (let ((parent (dirname cache-directory)))