diff mbox series

[bug#38505] gnu: Add fast-screen.

Message ID 20191205212114.4971-1-madalinionel.patrascu@mdc-berlin.de
State Accepted
Series [bug#38505] gnu: Add fast-screen.

Commit Message

Mădălin Ionel Patrașcu Dec. 5, 2019, 9:21 p.m. UTC
* gnu/packages/bioinformatics.scm (fast-screen): New variable.
---
 gnu/packages/bioinformatics.scm | 53 +++++++++++++++++++++++++++++++++
 1 file changed, 53 insertions(+)

Comments

Efraim Flashner Dec. 11, 2019, 2:16 a.m. UTC | #1
It's not clear to me what perl is used for in this package.
Ricardo Wurmus Dec. 16, 2019, 10:41 p.m. UTC | #2
Hi Mădălin,

> * gnu/packages/bioinformatics.scm (fast-screen): New variable.

I don’t think this package actually produces a usable output.

Frustratingly, this is a Perl script which calls out to tools that
happen to be on the user’s PATH, such as Bismark (which is written in
the same style, so it may be enlightening to read its package
definition), bwa, or bowtie.

Simply copying the script to the store won’t yield a usable tool I’m
afraid.

For Bismark I talked to the authors in the past in the hopes of
simplifying configuration at build time, but they were not interested in
changing the tool to accommodate any other case than the one Bismark was
designed for: to be unpacked in an already suitable environment.

This means that we can’t count on upstream to change this and have to
patch the source file ourselves.  One way is to patch every invocation
of an external command; another is to wrap the script itself (with
“wrap-script”) in PATH and PERL5LIB to provide a suitable environment
at runtime.  (Wrapping PERL5LIB may be necessary anyway to ensure that
Perl can find the required modules.)
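
A minimal sketch of the wrapping approach described above, not taken from
this thread or from the Bismark package.  It assumes the builder has
already installed the script into BIN, and that the package declares
inputs labelled "guile", "bowtie", "bwa" and "perl-gdgraph"; all four
labels are hypothetical, and the patch as posted declares none of them.
It sets PERL5LIB, the variable Perl consults for extra module
directories.

```scheme
;; Sketch only: meant to sit inside the builder's `let*' body, after
;; (install-file "fastq_screen" bin).  The input labels used below are
;; assumptions made for illustration.
(let ((input-dir (lambda (label sub)
                   ;; Store path of the input LABEL, with SUB appended.
                   (string-append (assoc-ref %build-inputs label) sub))))
  (wrap-script (string-append bin "/fastq_screen")
    ;; wrap-script prepends a Guile preamble, so it needs a Guile binary.
    #:guile (input-dir "guile" "/bin/guile")
    ;; Let the script find the aligners it shells out to.
    `("PATH" ":" prefix
      (,(input-dir "bowtie" "/bin")
       ,(input-dir "bwa" "/bin")))
    ;; Let Perl find non-core modules at run time.
    `("PERL5LIB" ":" prefix
      (,(input-dir "perl-gdgraph" "/lib/perl5/site_perl")))))
```

Which aligners and Perl modules the script really needs would have to be
checked against its source; the lists above are placeholders.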

Good luck!

--
Ricardo
Maxim Cournoyer March 18, 2022, 4:13 a.m. UTC | #3
Hello,

Ricardo Wurmus <rekado@elephly.net> writes:

> Hi Mădălin,
>
>> * gnu/packages/bioinformatics.scm (fast-screen): New variable.
>
> I don’t think this package actually produces a usable output.
>
> Frustratingly, this is a Perl script which calls out to tools that
> happen to be on the user’s PATH, such as Bismark (which is written in
> the same style, so it may be enlightening to read its package
> definition), bwa, or bowtie.
>
> Simply copying the script to the store won’t yield a usable tool I’m
> afraid.
>
> For Bismark I talked to the authors in the past in the hopes of
> simplifying configuration at build time, but they were not interested in
> changing the tool to accommodate any other case than the one Bismark was
> designed for: to be unpacked in an already suitable environment.
>
> This means that we can’t count on upstream to change this and have to
> patch the source file ourselves.  One way is to patch every invocation
> of an external command; another is to wrap the script itself (with
> “wrap-script”) in PATH and PERL5LIB to provide a suitable environment
> at runtime.  (Wrapping PERL5LIB may be necessary anyway to ensure that
> Perl can find the required modules.)
>
> Good luck!

Some 2 years later, are you still up to the challenge hinted at by
Ricardo? :-)

Thanks,

Maxim

Patch

diff --git a/gnu/packages/bioinformatics.scm b/gnu/packages/bioinformatics.scm
index 74a44874ee..53e4c7296f 100644
--- a/gnu/packages/bioinformatics.scm
+++ b/gnu/packages/bioinformatics.scm
@@ -15341,3 +15341,56 @@  methylation metrics from them.  MethylDackel requires an indexed fasta file
 containing the reference genome as well.")
     ;; See https://github.com/dpryan79/MethylDackel/issues/85
     (license license:expat)))
+
+(define-public fastq-screen
+  (package
+    (name "fastq-screen")
+    (version "0.14.0")
+    (source
+     (origin
+       (method url-fetch)
+       (uri (string-append "https://www.bioinformatics.babraham.ac.uk/projects/"
+                           "fastq_screen/fastq_screen_v" version ".tar.gz"))
+       (sha256
+        (base32
+         "0m7n9b1pr8rk1pd3va0mr69pd7gddcsvrvlk2s7907i02wkc1say"))))
+    (build-system trivial-build-system)
+    (arguments
+     ;; This is just an extraction process.
+     `(#:modules ((guix build utils))
+       #:builder
+       (begin
+         (use-modules (guix build utils))
+         (let* ((tar (assoc-ref %build-inputs "tar"))
+                (gzip (assoc-ref %build-inputs "gzip"))
+                (out (assoc-ref %outputs "out"))
+                (doc (string-append out "/share/doc"))
+                (bin (string-append out "/bin")))
+           (setenv "PATH" (string-append tar "/bin:" gzip "/bin"))
+           (invoke "tar" "xvf" (assoc-ref %build-inputs "source"))
+           (chdir (string-append "fastq_screen_v" ,version))
+           (install-file "fastq_screen" bin)
+           (install-file "fastq_screen.conf.example" doc)
+           (install-file "fastq_screen_documentation.md" doc)
+           (install-file "RELEASE_NOTES.txt" doc)
+           #t))))
+    (inputs
+     `(("perl" ,perl)))
+    (native-inputs
+     `(("gzip" ,gzip)
+       ("tar" ,tar)))
+    (home-page "https://www.bioinformatics.babraham.ac.uk/projects/fastq_screen/")
+    (synopsis "Search a large sequence dataset against a panel of databases")
+    (description
+     "FastQ Screen is an application which allows you to search a large sequence
+dataset against a panel of different databases to build up a picture of where
+the sequences in your data originate.  The program was built as a quality
+control check for sequencing pipelines but may also have uses in metagenomics
+studies where mixed samples are expected.  The application generates both text
+and graphical output to inform you what proportion of your library was able to
+map, either uniquely or to more than one location, against each of your
+specified reference genomes.  The user should therefore be able to identify a
+clean sequencing experiment in which the overwhelming majority of reads are
+probably derived from a single genomic origin.")
+    (license license:gpl3+)))
+