From: Sergey Trofimov
Date: Tue, 1 Apr 2025 21:31:59 +0200
Subject: [bug#77387] [PATCH v1 1/2] man-db: Parse man macro arguments better.
To: 77387@debbugs.gnu.org
Cc: Sergey Trofimov, Ludovic Courtès, Christopher Baines, Josselin Poiret, Mathieu Othacehe, Simon Tournier, Tobias Geerinckx-Rice
Message-ID: <109f10a155f45adf26420b11787153cf96bbdc8b.1743535900.git.sarg@sarg.org.ru>
X-Patchwork-Id: 41141

* guix/man-db.scm (man-macro-tokenize): New procedure to parse man macros.
(man-page->entry): Parse macro line using man-macro-tokenize.

Change-Id: Iea0ffbc65290757df746138e0a6174646b5a3eb8
---
 guix/man-db.scm | 56 +++++++++++++++++++++++++++++++++++++++++--------
 1 file changed, 47 insertions(+), 9 deletions(-)

base-commit: 5735c278e16517d9be5e26235fe68dea9bae3527
prerequisite-patch-id: f9cc903b8048c8c6fde576fbf38ab110263020e3
prerequisite-patch-id: 220ddf11addf3a6c7ab3b349077bca6849241556
prerequisite-patch-id: fc7d254c8dc198bc2f083e1c8aea18960c73b165
prerequisite-patch-id: b6d30068ce4971d4d8e67517229916df4e76c529

diff --git a/guix/man-db.scm b/guix/man-db.scm
index bba90ed473..94231264f0 100644
--- a/guix/man-db.scm
+++ b/guix/man-db.scm
@@ -161,16 +161,52 @@ (define (read-synopsis port)
       (line
        (loop (cons line lines))))))
 
+(define (man-macro-tokenize input)
+  "Split INPUT string, a man macro invocation, into a list containing the macro's
+name followed by its arguments."
+  (let loop ((pos 0)
+             (tokens '())
+             (characters '())
+             (in-string? #f))
+    (if (>= pos (string-length input))
+        ;; End of input
+        (unless in-string?
+          (reverse (if (null? characters)
+                       tokens
+                       (cons (list->string (reverse characters)) tokens))))
+        (let ((c (string-ref input pos)))
+          (cond
+           ;; Inside a string
+           (in-string?
+            (if (char=? c #\")
+                (if (and (< (+ pos 1) (string-length input))
+                         (char=? (string-ref input (+ pos 1)) #\"))
+                    ;; Double quote inside string
+                    (loop (+ pos 2) tokens (cons #\" characters) #t)
+                    ;; End of string
+                    (loop (+ pos 1) (cons (list->string (reverse characters)) tokens) '() #f))
+                ;; Regular character in string
+                (loop (+ pos 1) tokens (cons c characters) #t)))
+
+           ;; Whitespace outside string
+           ((char-whitespace? c)
+            (if (null? characters)
+                (loop (+ pos 1) tokens '() #f)
+                (loop (+ pos 1) (cons (list->string (reverse characters)) tokens) '() #f)))
+
+           ;; Start of string
+           ((char=? c #\")
+            (if (null? characters)
+                (loop (+ pos 1) tokens '() #t)
+                (loop pos (cons (list->string (reverse characters)) tokens) '() #f)))
+
+           ;; Symbol character
+           (else
+            (loop (+ pos 1) tokens (cons c characters) #f)))))))
+
 (define* (man-page->entry file #:optional (resolve identity))
   "Parse FILE, a gzip or zstd compressed man page, and return a <mandb-entry>
 for it."
-  (define (string->number* str)
-    (if (and (string-prefix? "\"" str)
-             (> (string-length str) 1)
-             (string-suffix? "\"" str))
-        (string->number (string-drop (string-drop-right str 1) 1))
-        (string->number str)))
-
   (define call-with-input-port*
     (cond
      ((gzip-compressed? file) call-with-gzip-input-port)
@@ -189,8 +225,10 @@ (define* (man-page->entry file #:optional (resolve identity))
           (if (eof-object? line)
               (mandb-entry file name (or section 0) (or synopsis "")
                            kind)
-              (match (string-tokenize line)
-                ((".TH" name (= string->number* section) _ ...)
+              ;; man 7 groff groff_mdoc groff_man
+              ;; look for metadata in macro invocations (lines starting with .)
+              (match (and (string-prefix? "." line) (man-macro-tokenize line))
+                ((".TH" name (= string->number section) _ ...)
                  (loop name section synopsis kind))
                 ((".SH" (or "NAME" "\"NAME\""))
                  (loop name section (read-synopsis port) kind))
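
For reviewers, here is a rough sketch of what the new tokenizer should return for a few typical macro lines. It is not part of the patch. It assumes the patch is applied and that Guile can load the Guix modules (for instance from `./pre-inst-env guile` in a checkout). The sample macro lines are made up for illustration, and since `man-macro-tokenize` is not exported, the sketch reaches it with `@@`.

```scheme
;; Sketch only: assumes this patch is applied and (guix man-db) is loadable.
;; man-macro-tokenize is private to the module, hence the @@ form.
(define man-macro-tokenize
  (@@ (guix man-db) man-macro-tokenize))

;; Quoted arguments lose their surrounding quotes, so the section can be
;; parsed with plain string->number (no string->number* helper needed).
(man-macro-tokenize ".TH \"GIT\" \"1\" \"01/01/2025\"")
;; => (".TH" "GIT" "1" "01/01/2025")

;; Whitespace inside quotes does not split the argument.
(man-macro-tokenize ".SH \"SEE ALSO\"")
;; => (".SH" "SEE ALSO")

;; A doubled quote inside a quoted argument yields one literal quote.
(man-macro-tokenize ".B \"say \"\"hi\"\"\"")
;; => (".B" "say \"hi\"")
```

One thing to keep in mind when reading the `match` in `man-page->entry`: for a line with an unterminated quote, the `unless` form does not return a token list, so such a line cannot match the `.TH` or `.SH` patterns.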