From patchwork Fri Apr 30 11:48:48 2021
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
X-Patchwork-Submitter: Roel Janssen
X-Patchwork-Id: 29007
Return-Path:
X-Original-To: patchwork@mira.cbaines.net
Delivered-To: patchwork@mira.cbaines.net
Received: by mira.cbaines.net (Postfix, from userid 113) id 9DA5127BC7D; Fri, 30 Apr 2021 14:34:00 +0100 (BST)
X-Spam-Checker-Version: SpamAssassin 3.4.2 (2018-09-13) on mira.cbaines.net
X-Spam-Level:
X-Spam-Status: No, score=-2.9 required=5.0 tests=BAYES_00,MAILING_LIST_MULTI, RCVD_IN_MSPIKE_H4,RCVD_IN_MSPIKE_WL,SPF_HELO_PASS,URIBL_BLOCKED autolearn=unavailable autolearn_force=no version=3.4.2
Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) by mira.cbaines.net (Postfix) with ESMTPS id 0FDA227BC7C for ; Fri, 30 Apr 2021 14:34:00 +0100 (BST)
Received: from localhost ([::1]:50272 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1lcTHL-0003XH-2w for patchwork@mira.cbaines.net; Fri, 30 Apr 2021 09:33:59 -0400
Received: from eggs.gnu.org ([2001:470:142:3::10]:46718) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1lcRej-0004YP-Vz for guix-patches@gnu.org; Fri, 30 Apr 2021 07:50:03 -0400
Received: from debbugs.gnu.org ([209.51.188.43]:46133) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1lcRej-0008KH-Nn for guix-patches@gnu.org; Fri, 30 Apr 2021 07:50:01 -0400
Received: from Debian-debbugs by debbugs.gnu.org with local (Exim 4.84_2) (envelope-from ) id 1lcRej-0006bH-Lh for guix-patches@gnu.org; Fri, 30 Apr 2021 07:50:01 -0400
X-Loop: help-debbugs@gnu.org
Subject: [bug#47930] [PATCH] gnu: Add pbgzip.
Resent-From: Roel Janssen
Original-Sender: "Debbugs-submit"
Resent-CC: guix-patches@gnu.org
Resent-Date: Fri, 30 Apr 2021 11:50:01 +0000
Resent-Message-ID:
Resent-Sender: help-debbugs@gnu.org
X-GNU-PR-Message: followup 47930
X-GNU-PR-Package: guix-patches
X-GNU-PR-Keywords: patch
To: Xinglu Chen , Efraim Flashner , Maxime Devos
Cc: 47930@debbugs.gnu.org
Received: via spool by 47930-submit@debbugs.gnu.org id=B47930.161978334625303 (code B ref 47930); Fri, 30 Apr 2021 11:50:01 +0000
Received: (at 47930) by debbugs.gnu.org; 30 Apr 2021 11:49:06 +0000
Received: from localhost ([127.0.0.1]:57679 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1lcRdl-0006Zz-TG for submit@debbugs.gnu.org; Fri, 30 Apr 2021 07:49:06 -0400
Received: from eggs.gnu.org ([209.51.188.92]:38184) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1lcRdj-0006ZV-Ob for 47930@debbugs.gnu.org; Fri, 30 Apr 2021 07:49:00 -0400
Received: from fencepost.gnu.org ([2001:470:142:3::e]:42837) by eggs.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1lcRdd-0007h7-2J; Fri, 30 Apr 2021 07:48:53 -0400
Received: from 2001-1c02-0b18-2900-222e-248e-9586-e52d.cable.dynamic.v6.ziggo.nl ([2001:1c02:b18:2900:222e:248e:9586:e52d]:46314) by fencepost.gnu.org with esmtpsa (TLS1.2:RSA_AES_256_CBC_SHA1:256) (Exim 4.82) (envelope-from ) id 1lcRdc-0003nY-FJ; Fri, 30 Apr 2021 07:48:52 -0400
Message-ID: <052ee880cea08e4e1627a2181f7173ab9587b6c8.camel@gnu.org>
From: Roel Janssen
Date: Fri, 30 Apr 2021 13:48:48 +0200
In-Reply-To: <87czucnphi.fsf@yoctocell.xyz>
References: <874kfz71ni.fsf@yoctocell.xyz> <87czucnphi.fsf@yoctocell.xyz>
User-Agent: Evolution 3.40.0 (3.40.0-1.fc34)
MIME-Version: 1.0
X-BeenThere: debbugs-submit@debbugs.gnu.org
X-Mailman-Version: 2.1.18
Precedence: list
X-BeenThere: guix-patches@gnu.org
List-Id:
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
Errors-To: guix-patches-bounces+patchwork=mira.cbaines.net@gnu.org
Sender:
 "Guix-patches"
X-getmail-retrieved-from-mailbox: Patches

On Fri, 2021-04-30 at 10:30 +0200, Xinglu Chen wrote:
> On Thu, Apr 29 2021, Roel Janssen wrote:
>
> > +(define-public pbgzip
> > +  (let ((commit "2b09f97b5f20b6d83c63a5c6b408d152e3982974"))
> > +    (package
> > +      (name "pbgzip")
> > +      (version (string-take commit 7))
>
> Maybe you missed my previous suggestions?
>
>   https://issues.guix.gnu.org/47930#2

I'm sorry, I forgot to adapt.

> > +      (source (origin
> > +                (method git-fetch)
> > +                (uri (git-reference
> > +                      (url "https://github.com/nh13/pbgzip")
> > +                      (commit commit)))
> > +                (file-name (string-append name "-" version))
> > +                (sha256
> > +                 (base32
> > +                  "1mlmq0v96irbz71bgw5zcc43g1x32zwnxx21a5p1f1ch4cikw1yd"))))
> > +      (build-system gnu-build-system)
> > +      (native-inputs
> > +       `(("autoconf" ,autoconf)
> > +         ("automake" ,automake)))
> > +      (inputs
> > +       `(("zlib" ,zlib)))
> > +      (home-page "https://github.com/nh13/pbgzip")
> > +      (synopsis "Parallel Block GZIP")
> > +      (description "This package implements parallel block gzip.  For many
> > +formats, in particular genomics data formats, data are compressed in
> > +fixed-length blocks such that they can be easily indexed based on a (genomic)
> > +coordinate order, since typically each block is sorted according to this order.
> > +This allows for each block to be individually compressed (deflated), or more
> > +importantly, decompressed (inflated), with the latter enabling random retrieval
> > +of data in large files (gigabytes to terabytes).  @code{pbgzip} is not limited
> > +to any particular format, but certain features are tailored to genomics data
> > +formats when enabled.  Parallel decompression is somewhat faster, but truly the
>                                                                       ^^^^^^^^^^^^^
> > +speedup comes during compression.")
>    ^^^^^^^
>
> “but the true speedup” instead?

Sure. I usually don't change descriptions as given by the creators of the
software, but I applied your suggestion.

Thank you for the elaborate suggestions! I attached another version of the
patch, which I hope is fine now. :)

Kind regards,
Roel Janssen

From 1af29f66980ba19740e05a27135f141e23b7fd3f Mon Sep 17 00:00:00 2001
From: Roel Janssen
Date: Fri, 30 Apr 2021 13:47:43 +0200
Subject: [PATCH] gnu: Add pbgzip.

* gnu/packages/bioinformatics.scm (pbgzip): New variable.
---
 gnu/packages/bioinformatics.scm | 36 ++++++++++++++++++++++++++++++++++++-
 1 file changed, 35 insertions(+), 1 deletion(-)

diff --git a/gnu/packages/bioinformatics.scm b/gnu/packages/bioinformatics.scm
index 83ebfc2d8f..cd2dae05d5 100644
--- a/gnu/packages/bioinformatics.scm
+++ b/gnu/packages/bioinformatics.scm
@@ -3,7 +3,7 @@
 ;;; Copyright © 2015, 2016, 2017, 2018 Ben Woodcroft
 ;;; Copyright © 2015, 2016, 2018, 2019, 2020 Pjotr Prins
 ;;; Copyright © 2015 Andreas Enge
-;;; Copyright © 2016, 2020 Roel Janssen
+;;; Copyright © 2016, 2020, 2021 Roel Janssen
 ;;; Copyright © 2016, 2017, 2018, 2019, 2020, 2021 Efraim Flashner
 ;;; Copyright © 2016, 2020 Marius Bakke
 ;;; Copyright © 2016, 2018 Raoul Bonnal
@@ -571,6 +571,40 @@ input and output BAMs must adhere to the PacBio BAM format specification.
 Non-PacBio BAMs will cause exceptions to be thrown.")
     (license license:bsd-3)))
 
+(define-public pbgzip
+  (let ((commit "2b09f97b5f20b6d83c63a5c6b408d152e3982974"))
+    (package
+      (name "pbgzip")
+      (version (git-version "0.0.0" "0" commit))
+      (source (origin
+                (method git-fetch)
+                (uri (git-reference
+                      (url "https://github.com/nh13/pbgzip")
+                      (commit commit)))
+                (file-name (git-file-name name version))
+                (sha256
+                 (base32
+                  "1mlmq0v96irbz71bgw5zcc43g1x32zwnxx21a5p1f1ch4cikw1yd"))))
+      (build-system gnu-build-system)
+      (native-inputs
+       `(("autoconf" ,autoconf)
+         ("automake" ,automake)))
+      (inputs
+       `(("zlib" ,zlib)))
+      (home-page "https://github.com/nh13/pbgzip")
+      (synopsis "Parallel Block GZIP")
+      (description "This package implements parallel block gzip.  For many
+formats, in particular genomics data formats, data are compressed in
+fixed-length blocks such that they can be easily indexed based on a (genomic)
+coordinate order, since typically each block is sorted according to this order.
+This allows for each block to be individually compressed (deflated), or more
+importantly, decompressed (inflated), with the latter enabling random retrieval
+of data in large files (gigabytes to terabytes).  @code{pbgzip} is not limited
+to any particular format, but certain features are tailored to genomics data
+formats when enabled.  Parallel decompression is somewhat faster, but the true
+speedup comes during compression.")
+      (license license:expat))))
+
 (define-public blasr-libcpp
   (package
     (name "blasr-libcpp")
-- 
2.31.1