From patchwork Mon Jul 25 12:16:31 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
X-Patchwork-Submitter: Philip McGrath
X-Patchwork-Id: 40936
Subject: [bug#56759] [PATCH 16/20] gnu: Add anystyle.
To: 56759@debbugs.gnu.org
Cc: Philip McGrath
From: Philip McGrath
Date: Mon, 25 Jul 2022 08:16:31 -0400
Message-Id: <272fe1f23b2adaf54276b105268ca7179e00dcd5.1658750358.git.philip@philipmcgrath.com>
X-Mailer: git-send-email 2.32.0

* gnu/packages/ruby.scm (anystyle): New variable.
(ruby-anystyle)[description]: Mention it.
---
 gnu/packages/ruby.scm | 125 +++++++++++++++++++++++++++++++++++++++++-
 1 file changed, 124 insertions(+), 1 deletion(-)

diff --git a/gnu/packages/ruby.scm b/gnu/packages/ruby.scm
index 90f269e247..3feb07dcdc 100644
--- a/gnu/packages/ruby.scm
+++ b/gnu/packages/ruby.scm
@@ -66,6 +66,7 @@ (define-module (gnu packages ruby)
   #:use-module (gnu packages libidn)
   #:use-module (gnu packages linux)
   #:use-module (gnu packages lsof)
+  #:use-module (gnu packages man)
   #:use-module (gnu packages maths)
   #:use-module (gnu packages ncurses)
   #:use-module (gnu packages networking)
@@ -13547,5 +13548,127 @@ (define-public ruby-anystyle
      "AnyStyle is a very fast and smart parser for academic reference lists
 and bibliographies.  AnyStyle uses powerful machine learning heuristics based
 on Conditional Random Fields and aims to make it easy to train the model with
-data that is relevant to your parsing needs.")
+data that is relevant to your parsing needs.
+
+This package provides the Ruby module @code{AnyStyle}.  AnyStyle can also be
+used via the @command{anystyle} command-line utility or a web application,
+though the latter has not yet been packaged for Guix.")
     (license license:bsd-2))))
+
+(define-public anystyle
+  (package
+    (name "anystyle")
+    (version "1.3.1")
+    (source (origin
+              (method git-fetch)
+              (uri (git-reference
+                    (url "https://github.com/inukshuk/anystyle-cli")
+                    (commit version)))
+              (sha256
+               (base32
+                "1bazzms04cra8516q7vydmcm31yd0a7si1pxk4waffqy7lh0pksg"))
+              (file-name (git-file-name name version))))
+    (build-system ruby-build-system)
+    (propagated-inputs
+     (list ruby-anystyle
+           ruby-bibtex-ruby
+           ruby-gli))
+    (native-inputs
+     (list txt2man))
+    (arguments
+     (list
+      #:modules
+      `((guix build ruby-build-system)
+        (ice-9 popen)
+        (guix build utils))
+      #:phases
+      #~(modify-phases %standard-phases
+          (add-after 'extract-gemspec 'less-strict-dependencies
+            (lambda args
+              (substitute* "anystyle-cli.gemspec"
+                (("'bibtex-ruby', '[^']*'")
+                 "'bibtex-ruby'"))))
+          (delete 'check) ;; there are no upstream tests
+          (add-after 'wrap 'check-cli
+            (lambda* (#:key tests? outputs #:allow-other-keys)
+              (when tests?
+                (with-output-to-file "check-cli.in"
+                  (lambda ()
+                    (for-each
+                     display
+                     '("Derrida, J. (1967). L’écriture et la différence "
+                       "(1 éd.). Paris: Éditions du Seuil.\n"))))
+                (invoke (search-input-file outputs "/bin/anystyle")
+                        "parse"
+                        "check-cli.in"))))
+          (add-after 'wrap 'generate-man-page
+            ;; generating a man page also tests that the command actually runs
+            (lambda args
+              (define (run-with-output-file file command . args)
+                (format (current-output-port)
+                        "running: ~s\nwith output to: ~s\n"
+                        (cons command args)
+                        file)
+                (unless (zero?
+                         (with-output-to-file file
+                           (lambda ()
+                             (status:exit-val
+                              (close-pipe
+                               (apply open-pipe* OPEN_WRITE command args))))))
+                  (error "command failed")))
+              (let ((anystyle (string-append #$output "/bin/anystyle")))
+                (run-with-output-file "intro.txt"
+                                      anystyle "--help")
+                (for-each (lambda (cmd)
+                            (let ((file (string-append cmd ".txt")))
+                              (run-with-output-file file
+                                                    anystyle cmd "--help")
+                              ;; indent headings to create subsections
+                              (substitute* file
+                                (("^[A-Z]" orig)
+                                 (string-append " " orig)))
+                              ;; generate a section heading
+                              (call-with-output-file
+                                  (string-append "section-" file)
+                                (lambda (out)
+                                  (format out "\n\n~a COMMAND\n\n"
+                                          (string-upcase cmd))))))
+                          '("check" "find" "parse" "train"))
+                (substitute* `("intro.txt"
+                               "check.txt" "find.txt" "parse.txt" "train.txt")
+                  ;; format "tag list" for txt2man
+                  ((" - ")
+                   " ")
+                  ;; restore formatting of the "name" sections
+                  (("(anystyle|check|find|parse|train) ([A-Z])" _ cmd post)
+                   (string-append cmd " - " post)))
+                (run-with-output-file "anystyle.txt"
+                                      "cat"
+                                      "intro.txt"
+                                      "section-check.txt" "check.txt"
+                                      "section-find.txt" "find.txt"
+                                      "section-parse.txt" "parse.txt"
+                                      "section-train.txt" "train.txt")
+                (run-with-output-file
+                 "anystyle.1"
+                 "txt2man"
+                 "-v" "General Commands Manual" "-t" "anystyle" "-s" "1"
+                 "-r" #$(string-append "anystyle-cli "
+                                       (package-version this-package))
+                 "-B" "check" "-B" "find" "-B" "parse" "-B" "train"
+                 "anystyle.txt")
+                (install-file "anystyle.1"
+                              (string-append #$output "/share/man/man1"))))))))
+    (home-page "https://anystyle.io")
+    (synopsis "Fast and smart citation reference parsing")
+    (description
+     "AnyStyle is a very fast and smart parser for academic reference lists
+and bibliographies.  AnyStyle uses powerful machine learning heuristics based
+on Conditional Random Fields and aims to make it easy to train the model with
+data that is relevant to your parsing needs.
+
+This package provides the @command{anystyle} command-line utility.  AnyStyle
+can also be used as a Ruby library or as a web application, though the latter
+has not yet been packaged for Guix.")
+    (license license:bsd-2)
+    (properties `((upstream-name . "anystyle-cli")))))