[bug#47930] gnu: Add pbgzip.

Message ID ad09ffe1-b06b-8c25-0cf8-a6dbf094bacb@gnu.org
State Accepted
Series [bug#47930] gnu: Add pbgzip.

Checks

Context                     Check    Description
cbaines/submitting builds   success
cbaines/comparison          success  View comparison
cbaines/git branch          success  View Git branch
cbaines/applying patch      fail     View Laminar job
cbaines/issue               success  View issue

Commit Message

Roel Janssen April 29, 2021, 12:22 p.m. UTC
On 4/29/21 9:29 AM, Efraim Flashner wrote:
> On Thu, Apr 22, 2021 at 06:40:46PM +0200, Maxime Devos wrote:
>> Xinglu Chen schreef op wo 21-04-2021 om 23:45 [+0200]:
>>> On Wed, Apr 21 2021, Roel Janssen wrote:
>>>
>>>> [...]
>>>> +      (arguments
>>>> +       `(#:phases
>>>> +         (modify-phases %standard-phases
>>>> +           (add-after 'unpack 'autogen
>>>> +             (lambda _
>>>> +               (zero? (system* "sh" "autogen.sh")))))))
>>> IIRC, phases don’t have to return #t, so you could remove ‘zero?’.
>> Try running (system* "does-not-exist").  It will fail by returning
>> something non-zero.  If I recall how to call "invoke" correctly,
>> I would recommend (invoke "sh" "autogen.sh") here.  "invoke" raises
>> an exception when the command fails, instead of returning something.
> While we're at it, can this phase replace 'bootstrap? It seems to me we
> shouldn't need both phases.
This indeed seems to be the best thing to do.  I attached a new patch.

I had to leave autoconf and automake in the native-inputs because
otherwise the commands "aclocal" and "autom4te" couldn't be found.
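
For illustration only, the two suggestions quoted above (use "invoke"
rather than "system*", and replace the 'bootstrap phase instead of adding
a separate 'autogen phase) could be combined roughly as in the sketch
below.  This is not what the attached patch contains: the new patch has
no "arguments" field at all.  As far as I know, the default 'bootstrap
phase of gnu-build-system already runs an "autogen.sh" script when no
"configure" script is present, which would make an explicit phase
unnecessary; treat that as an assumption to check against the Guix
manual.

```scheme
;; Hypothetical fragment of a package definition; it is a sketch of the
;; reviewers' suggestions, not part of the submitted patch.
(arguments
 `(#:phases
   (modify-phases %standard-phases
     ;; Replace the default 'bootstrap phase rather than adding a
     ;; second, separate 'autogen phase after 'unpack.
     (replace 'bootstrap
       (lambda _
         ;; `invoke' raises an exception when the command exits with a
         ;; non-zero status, so no `zero?' wrapper is needed.
         (invoke "sh" "autogen.sh"))))))
```

Either way, autoconf and automake still have to be available at build
time, which is why they stay in native-inputs.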

Thanks all for the feedback!  I hope this new patch is fine.

Kind regards,
Roel Janssen

Comments

Xinglu Chen April 30, 2021, 8:30 a.m. UTC | #1
On Thu, Apr 29 2021, Roel Janssen wrote:

> +(define-public pbgzip
> +  (let ((commit "2b09f97b5f20b6d83c63a5c6b408d152e3982974"))
> +    (package
> +      (name "pbgzip")
> +      (version (string-take commit 7))

Maybe you missed my previous suggestions?

  https://issues.guix.gnu.org/47930#2
  
> +      (source (origin
> +                (method git-fetch)
> +                (uri (git-reference
> +                      (url "https://github.com/nh13/pbgzip")
> +                      (commit commit)))
> +                (file-name (string-append name "-" version))
> +                (sha256
> +                 (base32
> +                  "1mlmq0v96irbz71bgw5zcc43g1x32zwnxx21a5p1f1ch4cikw1yd"))))
> +      (build-system gnu-build-system)
> +      (native-inputs
> +       `(("autoconf" ,autoconf)
> +         ("automake" ,automake)))
> +      (inputs
> +       `(("zlib" ,zlib)))
> +      (home-page "https://github.com/nh13/pbgzip")
> +      (synopsis "Parallel Block GZIP")
> +      (description "This package implements parallel block gzip.  For many
> +formats, in particular genomics data formats, data are compressed in
> +fixed-length blocks such that they can be easily indexed based on a (genomic)
> +coordinate order, since typically each block is sorted according to this order.
> +This allows for each block to be individually compressed (deflated), or more
> +importantly, decompressed (inflated), with the latter enabling random retrieval
> +of data in large files (gigabytes to terabytes).  @code{pbgzip} is not limited
> +to any particular format, but certain features are tailored to genomics data
> +formats when enabled.  Parallel decompression is somewhat faster, but truly the
                                                                     ^^^^^^^^^^^^^
> +speedup comes during compression.")
   ^^^^^^^

“but the true speedup” instead?

Patch

From b03f8d8926cdd6a28502f2bdc6db74854144f050 Mon Sep 17 00:00:00 2001
From: Roel Janssen <roel@gnu.org>
Date: Thu, 29 Apr 2021 14:18:30 +0200
Subject: [PATCH] gnu: Add pbgzip.

* gnu/packages/bioinformatics.scm (pbgzip): New variable.
---
 gnu/packages/bioinformatics.scm | 36 ++++++++++++++++++++++++++++++++-
 1 file changed, 35 insertions(+), 1 deletion(-)

diff --git a/gnu/packages/bioinformatics.scm b/gnu/packages/bioinformatics.scm
index 83ebfc2d8f..8c4d0fc649 100644
--- a/gnu/packages/bioinformatics.scm
+++ b/gnu/packages/bioinformatics.scm
@@ -3,7 +3,7 @@ 
 ;;; Copyright © 2015, 2016, 2017, 2018 Ben Woodcroft <donttrustben@gmail.com>
 ;;; Copyright © 2015, 2016, 2018, 2019, 2020 Pjotr Prins <pjotr.guix@thebird.nl>
 ;;; Copyright © 2015 Andreas Enge <andreas@enge.fr>
-;;; Copyright © 2016, 2020 Roel Janssen <roel@gnu.org>
+;;; Copyright © 2016, 2020, 2021 Roel Janssen <roel@gnu.org>
 ;;; Copyright © 2016, 2017, 2018, 2019, 2020, 2021 Efraim Flashner <efraim@flashner.co.il>
 ;;; Copyright © 2016, 2020 Marius Bakke <mbakke@fastmail.com>
 ;;; Copyright © 2016, 2018 Raoul Bonnal <ilpuccio.febo@gmail.com>
@@ -571,6 +571,40 @@  input and output BAMs must adhere to the PacBio BAM format specification.
 Non-PacBio BAMs will cause exceptions to be thrown.")
     (license license:bsd-3)))
 
+(define-public pbgzip
+  (let ((commit "2b09f97b5f20b6d83c63a5c6b408d152e3982974"))
+    (package
+      (name "pbgzip")
+      (version (string-take commit 7))
+      (source (origin
+                (method git-fetch)
+                (uri (git-reference
+                      (url "https://github.com/nh13/pbgzip")
+                      (commit commit)))
+                (file-name (string-append name "-" version))
+                (sha256
+                 (base32
+                  "1mlmq0v96irbz71bgw5zcc43g1x32zwnxx21a5p1f1ch4cikw1yd"))))
+      (build-system gnu-build-system)
+      (native-inputs
+       `(("autoconf" ,autoconf)
+         ("automake" ,automake)))
+      (inputs
+       `(("zlib" ,zlib)))
+      (home-page "https://github.com/nh13/pbgzip")
+      (synopsis "Parallel Block GZIP")
+      (description "This package implements parallel block gzip.  For many
+formats, in particular genomics data formats, data are compressed in
+fixed-length blocks such that they can be easily indexed based on a (genomic)
+coordinate order, since typically each block is sorted according to this order.
+This allows for each block to be individually compressed (deflated), or more
+importantly, decompressed (inflated), with the latter enabling random retrieval
+of data in large files (gigabytes to terabytes).  @code{pbgzip} is not limited
+to any particular format, but certain features are tailored to genomics data
+formats when enabled.  Parallel decompression is somewhat faster, but truly the
+speedup comes during compression.")
+      (license license:expat))))
+
 (define-public blasr-libcpp
   (package
     (name "blasr-libcpp")
-- 
2.31.1