From: Ludovic Courtès
Date: Tue, 15 Dec 2020 10:38:30 +0100
Subject: [bug#45253] [PATCH 0/6] Pipeline substitute integrity check, deduplication, and canonicalization
To: 45253@debbugs.gnu.org
Message-Id: <20201215093830.10322-1-ludo@gnu.org>

Hello Guix!

This is a followup to .
It is meant to be applied on top of .

Until now, guix-daemon would check the hash of store items just
substituted, reset timestamps/permissions, and deduplicate.  This would
lead to extra I/O: the whole set of files is traversed three times by
the daemon and read two times.

This patch series is about delegating that work to ‘guix substitute’,
which can do it directly as it restores files, thereby reducing I/O to
the minimum necessary.

I tested with substitutes that contain many files:

  guix build pipewire@0.2 ffmpeg ungoogled-chromium vim-full \
    emacs-no-x emacs-no-x-toolkit

On my laptop with an SSD, the wall-clock time is almost unchanged when
fetching lzip substitutes.  You can see that the throughput displayed
while downloading is slightly lower than before, which is consistent
with lzip downloads being CPU-bound¹, but this is compensated by the
lack of processing time between substitutes.

With gzip substitutes, I see a 10% speedup on the wall-clock time on my
laptop.

Ludo’.

¹ https://lists.gnu.org/archive/html/guix-devel/2020-12/msg00177.html

Ludovic Courtès (6):
  tests: Check the build trace for hash mismatches on substitutes.
  daemon: Let 'guix substitute' perform hash checks.
  tests: Check the mtime and permissions of substituted items.
  daemon: Do not reset timestamps and permissions on substituted items.
  tests: Make sure substituted items are deduplicated.
  daemon: Delegate deduplication to 'guix substitute'.

 guix/scripts/substitute.scm | 70 +++++++++++++++++++++++++-----
 guix/serialization.scm      |  8 +++-
 nix/libstore/build.cc       | 85 ++++++++++++++++++++-----------------
 tests/store.scm             | 82 +++++++++++++++++++++++++++++++++++
 tests/substitute.scm        | 58 ++++++++++++++++++++++---
 5 files changed, 248 insertions(+), 55 deletions(-)
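
For readers unfamiliar with the idea, here is a rough sketch of what
"check the hash, canonicalize, and deduplicate while restoring" can look
like.  It is an illustration only, written in Python rather than taken
from the Guile code in guix/scripts/substitute.scm.  The function name,
the (path, data, executable) entry format, the links directory layout,
and the choice of 1 as the canonical mtime are all assumptions made for
the example, and the overall hash is computed over a simplified stream,
not the nar format.

```python
import hashlib
import os
import tempfile


def restore_single_pass(entries, dest, links_dir):
    """Restore ENTRIES under DEST, handling each file's data exactly once.

    ENTRIES yields hypothetical (relative_path, data, executable) tuples.
    Return the hex SHA-256 of everything restored; this is a stand-in for
    the hash of the nar stream, which this sketch does not implement.
    """
    os.makedirs(links_dir, exist_ok=True)
    overall = hashlib.sha256()

    for rel, data, executable in entries:
        target = os.path.join(dest, rel)
        os.makedirs(os.path.dirname(target), exist_ok=True)

        # Integrity: feed the bytes to the running hash as they arrive,
        # so no second read of the restored files is needed.
        overall.update(rel.encode() + b"\0" + data)

        # Deduplication: key on contents plus the executable flag, since
        # hard links share one inode and therefore one set of permissions.
        key = hashlib.sha256((b"x" if executable else b"-") + data).hexdigest()
        canonical = os.path.join(links_dir, key)
        if os.path.exists(canonical):
            # An identical file is already known: link to it, write nothing.
            os.link(canonical, target)
            continue

        with open(target, "wb") as port:
            port.write(data)

        # Canonicalization: read-only permissions and a fixed mtime
        # (assumed to be 1 here), applied right after the file is written.
        os.chmod(target, 0o555 if executable else 0o444)
        os.utime(target, (1, 1))
        os.link(target, canonical)

    return overall.hexdigest()


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        item = os.path.join(tmp, "item")
        links = os.path.join(tmp, "links")
        files = [("bin/tool", b"#!/bin/sh\n", True),
                 ("share/a.txt", b"same\n", False),
                 ("share/b.txt", b"same\n", False)]
        print(restore_single_pass(files, item, links))
```

The caller would compare the returned hash with the expected one and
discard the restored item on mismatch.  Directory timestamps and
symlinks are left out to keep the sketch short.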