From patchwork Tue Jun 22 09:08:27 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: =?utf-8?q?Ludovic_Court=C3=A8s?= X-Patchwork-Id: 30624 Return-Path: X-Original-To: patchwork@mira.cbaines.net Delivered-To: patchwork@mira.cbaines.net Received: by mira.cbaines.net (Postfix, from userid 113) id 0E4B627BC81; Tue, 22 Jun 2021 10:10:31 +0100 (BST) X-Spam-Checker-Version: SpamAssassin 3.4.2 (2018-09-13) on mira.cbaines.net X-Spam-Level: X-Spam-Status: No, score=-2.9 required=5.0 tests=BAYES_00,MAILING_LIST_MULTI, RCVD_IN_MSPIKE_H2,SPF_HELO_PASS autolearn=unavailable autolearn_force=no version=3.4.2 Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) by mira.cbaines.net (Postfix) with ESMTPS id A974527BC78 for ; Tue, 22 Jun 2021 10:10:30 +0100 (BST) Received: from localhost ([::1]:50606 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1lvcQP-000359-Pc for patchwork@mira.cbaines.net; Tue, 22 Jun 2021 05:10:29 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:59190) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1lvcQ0-0002aC-Of for guix-patches@gnu.org; Tue, 22 Jun 2021 05:10:05 -0400 Received: from debbugs.gnu.org ([209.51.188.43]:54519) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1lvcPy-0003hd-WF for guix-patches@gnu.org; Tue, 22 Jun 2021 05:10:04 -0400 Received: from Debian-debbugs by debbugs.gnu.org with local (Exim 4.84_2) (envelope-from ) id 1lvcPy-0007qc-Ro for guix-patches@gnu.org; Tue, 22 Jun 2021 05:10:02 -0400 X-Loop: help-debbugs@gnu.org Subject: [bug#49169] [PATCH 08/11] utils: Add 'go-to-location' with source location caching. Resent-From: Ludovic =?utf-8?q?Court=C3=A8s?= Original-Sender: "Debbugs-submit" Resent-CC: guix-patches@gnu.org Resent-Date: Tue, 22 Jun 2021 09:10:02 +0000 Resent-Message-ID: Resent-Sender: help-debbugs@gnu.org X-GNU-PR-Message: followup 49169 X-GNU-PR-Package: guix-patches X-GNU-PR-Keywords: patch To: 49169@debbugs.gnu.org Cc: Ludovic =?utf-8?q?Court=C3=A8s?= Received: via spool by 49169-submit@debbugs.gnu.org id=B49169.162435294530051 (code B ref 49169); Tue, 22 Jun 2021 09:10:02 +0000 Received: (at 49169) by debbugs.gnu.org; 22 Jun 2021 09:09:05 +0000 Received: from localhost ([127.0.0.1]:37823 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1lvcP3-0007oL-8h for submit@debbugs.gnu.org; Tue, 22 Jun 2021 05:09:05 -0400 Received: from eggs.gnu.org ([209.51.188.92]:32866) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1lvcOq-0007lR-BZ for 49169@debbugs.gnu.org; Tue, 22 Jun 2021 05:08:53 -0400 Received: from fencepost.gnu.org ([2001:470:142:3::e]:52824) by eggs.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1lvcOl-0002qD-6N; Tue, 22 Jun 2021 05:08:47 -0400 Received: from [2a01:e0a:1d:7270:af76:b9b:ca24:c465] (port=49370 helo=gnu.org) by fencepost.gnu.org with esmtpsa (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1lvcOk-0000B8-Ut; Tue, 22 Jun 2021 05:08:47 -0400 From: Ludovic =?utf-8?q?Court=C3=A8s?= Date: Tue, 22 Jun 2021 11:08:27 +0200 Message-Id: <20210622090830.15561-8-ludo@gnu.org> X-Mailer: git-send-email 2.32.0 In-Reply-To: <20210622090830.15561-1-ludo@gnu.org> References: <20210622090830.15561-1-ludo@gnu.org> MIME-Version: 1.0 X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list X-BeenThere: guix-patches@gnu.org List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: guix-patches-bounces+patchwork=mira.cbaines.net@gnu.org Sender: "Guix-patches" X-getmail-retrieved-from-mailbox: Patches * guix/utils.scm (%source-location-map): New variable. (go-to-location): New procedure. (edit-expression): Use it instead of custom loop. * guix/packages.scm (package-field-location)[goto]: Remove. Use 'go-to-location' instead of 'goto'. --- guix/packages.scm | 8 +----- guix/utils.scm | 66 ++++++++++++++++++++++++++++++++++++++++++++--- 2 files changed, 63 insertions(+), 11 deletions(-) diff --git a/guix/packages.scm b/guix/packages.scm index 4ac1624ce2..d15a17edc0 100644 --- a/guix/packages.scm +++ b/guix/packages.scm @@ -514,12 +514,6 @@ object." (define (package-field-location package field) "Return the source code location of the definition of FIELD for PACKAGE, or #f if it could not be determined." - (define (goto port line column) - (unless (and (= (port-column port) (- column 1)) - (= (port-line port) (- line 1))) - (unless (eof-object? (read-char port)) - (goto port line column)))) - (match (package-location package) (($ file line column) (match (search-path %load-path file) @@ -529,7 +523,7 @@ object." ;; In general we want to keep relative file names for modules. (call-with-input-file file-found (lambda (port) - (goto port line column) + (go-to-location port line column) (match (read port) (('package inits ...) (let ((field (assoc field inits))) diff --git a/guix/utils.scm b/guix/utils.scm index a13b13c4fa..f8f6672bb1 100644 --- a/guix/utils.scm +++ b/guix/utils.scm @@ -49,6 +49,7 @@ #:use-module (ice-9 match) #:use-module (ice-9 format) #:use-module ((ice-9 iconv) #:prefix iconv:) + #:use-module (ice-9 vlist) #:autoload (zlib) (make-zlib-input-port make-zlib-output-port) #:use-module (system foreign) #:re-export ( ;for backwards compatibility @@ -117,6 +118,7 @@ cache-directory readlink* + go-to-location edit-expression filtered-port @@ -337,6 +339,65 @@ a list of command-line arguments passed to the compression program." (unless (every (compose zero? cdr waitpid) pids) (error "compressed-output-port failure" pids)))))) +(define %source-location-map + ;; Maps inode/device tuples to "source location maps" used by + ;; 'go-to-location'. + (make-hash-table)) + +(define (go-to-location port line column) + "Jump to LINE and COLUMN (both one-indexed) in PORT. Maintain a source +location map such that this can boil down to seek(2) and a few read(2) calls, +which can drastically speed up repetitive operations on large files." + (let* ((stat (stat port)) + (key (list (stat:ino stat) (stat:dev stat))) + (stamp (list (stat:mtime stat) (stat:mtimensec stat) + (stat:size stat))) + + ;; Look for an up-to-date source map for KEY. The map is a vlist + ;; where each entry gives the byte offset of the beginning of a line: + ;; element 0 is the offset of the first line, element 1 the offset of + ;; the second line, etc. The map is filled lazily. + (source-map (match (hash-ref %source-location-map key) + (#f + (vlist-cons 0 vlist-null)) + ((cache-stamp ... map) + (if (equal? cache-stamp stamp) ;invalidate? + map + (vlist-cons 0 vlist-null))))) + (last (vlist-length source-map))) + ;; Jump to LINE, ideally via SOURCE-MAP. + (if (<= line last) + (seek port (vlist-ref source-map (- line 1)) SEEK_SET) + (let ((target line) + (offset (vlist-ref source-map (- last 1)))) + (seek port offset SEEK_SET) + (let loop ((source-map (vlist-reverse source-map)) + (line last)) + (if (< line target) + (match (read-char port) + (#\newline + (loop (vlist-cons (ftell port) source-map) + (+ 1 line))) + ((? eof-object?) + (error "unexpected end of file" port line)) + (chr (loop source-map line))) + (hash-set! %source-location-map key + `(,@stamp + ,(vlist-reverse source-map))))))) + + ;; Read up to COLUMN. + (let ((target column)) + (let loop ((column 1)) + (when (< column target) + (match (read-char port) + (#\newline (error "unexpected end of line" port)) + (#\tab (loop (+ 8 column))) + (chr (loop (+ 1 column))))))) + + ;; Update PORT's position info. + (set-port-line! port (- line 1)) + (set-port-column! port (- column 1)))) + (define* (edit-expression source-properties proc #:key (encoding "UTF-8")) "Edit the expression specified by SOURCE-PROPERTIES using PROC, which should be a procedure that takes the original expression in string and returns a new @@ -350,10 +411,7 @@ This procedure returns #t on success." (call-with-input-file file (lambda (in) (let* ( ;; The start byte position of the expression. - (start (begin (while (not (and (= line (port-line in)) - (= column (port-column in)))) - (when (eof-object? (read-char in)) - (error (format #f "~a: end of file~%" in)))) + (start (begin (go-to-location in (+ 1 line) (+ 1 column)) (ftell in))) ;; The end byte position of the expression. (end (begin (read in) (ftell in))))