diff mbox series

[bug#51198] gnu: Add b2sum.

Message ID YXlyYGZAOg7GhC5m@jasmine.lan
State Accepted
Headers show
Series [bug#51198] gnu: Add b2sum. | expand

Checks

Context Check Description
cbaines/comparison success View comparision
cbaines/git branch success View Git branch
cbaines/applying patch fail View Laminar job
cbaines/issue success View issue

Commit Message

Leo Famulari Oct. 27, 2021, 3:38 p.m. UTC
On Wed, Oct 27, 2021 at 10:44:45AM +0200, Nicolò Balzarotti wrote:
> This blog post [fn:1] on guix-hpc address the "Pre-built binaries
> vs. performance" dilemma.
>
> [fn:1] https://hpc.guix.info/blog/2018/01/pre-built-binaries-vs-performance/

It's helpful, but it doesn't look like we are using that method in Guix
anymore. It was removed in this commit:

https://git.savannah.gnu.org/cgit/guix.git/commit/?id=969adb235ee34decb65255e1ea821ff0e221ed3d

I guess it learned how to do runtime feature detection?

> I guess the easiest way is to provide a variant (b2sum-avx or something
> like that) with avx enabled.  Else, I'd just go with the unoptimized
> version as it happens for many other packages, but let's hear from
> others.

The easiest thing is require local building, since it's an extremely
cheap build. Only 1.5 seconds on my laptop, total (not just the build
phase). With blis, one had to consider the lengthy build time.

Because performance is critical for a tool like this, and because it's
cheap to build, I've attached a patch to require local building. This is
easier for me than creating a set of package variants that will need to
be expanded for each new generation of CPUs :)
From 8b862425310cf631c30c823eb0fa2bfd79d36823 Mon Sep 17 00:00:00 2001
From: Leo Famulari <leo@famulari.name>
Date: Wed, 27 Oct 2021 11:29:40 -0400
Subject: [PATCH] gnu: b2sum: Build on the local machine.

* gnu/packages/crypto.scm (b2sum)[arguments]: Set '#:substitutable? #f'.
---
 gnu/packages/crypto.scm | 13 +++++++++++--
 1 file changed, 11 insertions(+), 2 deletions(-)

Comments

Ludovic Courtès Oct. 27, 2021, 3:56 p.m. UTC | #1
Hi!

Leo Famulari <leo@famulari.name> skribis:

> On Wed, Oct 27, 2021 at 10:44:45AM +0200, Nicolò Balzarotti wrote:
>> This blog post [fn:1] on guix-hpc address the "Pre-built binaries
>> vs. performance" dilemma.
>>
>> [fn:1] https://hpc.guix.info/blog/2018/01/pre-built-binaries-vs-performance/
>
> It's helpful, but it doesn't look like we are using that method in Guix
> anymore. It was removed in this commit:
>
> https://git.savannah.gnu.org/cgit/guix.git/commit/?id=969adb235ee34decb65255e1ea821ff0e221ed3d
>
> I guess it learned how to do runtime feature detection?

BLIS has run-time/load-time feature detection now (which is the main
approach the blog argues for).

>> I guess the easiest way is to provide a variant (b2sum-avx or something
>> like that) with avx enabled.  Else, I'd just go with the unoptimized
>> version as it happens for many other packages, but let's hear from
>> others.
>
> The easiest thing is require local building, since it's an extremely
> cheap build. Only 1.5 seconds on my laptop, total (not just the build
> phase). With blis, one had to consider the lengthy build time.
>
> Because performance is critical for a tool like this, and because it's
> cheap to build, I've attached a patch to require local building. This is
> easier for me than creating a set of package variants that will need to
> be expanded for each new generation of CPUs :)
>
>>From 8b862425310cf631c30c823eb0fa2bfd79d36823 Mon Sep 17 00:00:00 2001
> From: Leo Famulari <leo@famulari.name>
> Date: Wed, 27 Oct 2021 11:29:40 -0400
> Subject: [PATCH] gnu: b2sum: Build on the local machine.
>
> * gnu/packages/crypto.scm (b2sum)[arguments]: Set '#:substitutable? #f'.
> ---
>  gnu/packages/crypto.scm | 13 +++++++++++--
>  1 file changed, 11 insertions(+), 2 deletions(-)
>
> diff --git a/gnu/packages/crypto.scm b/gnu/packages/crypto.scm
> index 3acd147f25..de8bcf1d70 100644
> --- a/gnu/packages/crypto.scm
> +++ b/gnu/packages/crypto.scm
> @@ -858,9 +858,18 @@ (define-public b2sum
>                   (base32 "04z631v0vzl52g73v390ask5fnzi5wg83lcjkjhpmmymaz0jn152"))))
>        (build-system gnu-build-system)
>        (arguments
> -       `(#:make-flags (list (string-append "CC=" ,(cc-for-target))
> -                            (string-append "PREFIX=" (assoc-ref %outputs "out")))
> +       `(;; By default, b2sum uses the compiler to generate instructions
> +         ;; tailored to the CPU of the running machine, using "-march=native".
> +         ;; This gives a ~1.5x speedup on a Core i5-6300U with a large dataset
> +         ;; paged in, whereas compilation of b2sum takes ~1.5 seconds.
> +         ;; b2sum does not support run-time feature detection:
> +         ;; https://github.com/BLAKE2/BLAKE2/issues/1
> +         ;; For more information, see the discussion beginning here:
> +         ;; https://debbugs.gnu.org/cgi/bugreport.cgi?bug=51198#13
> +         #:substitutable? #f

It’s okay but not entirely sufficient: on a cluster setup, you typically
talk to a daemon that’s on another machine, so the CPU features it’ll
detect there may be different from those you’ll use.  Likewise, as
discussed on IRC, you’d also need #:local-build? #t.

Anyway, I’d suggest using ‘guix hash -H blake2s-256’ or similar (it uses
libgcrypt, which does the right thing).  I think the latest Coreutils
provide a generic ‘cksum’, too, that probably does the right thing.

Ludo’.
diff mbox series

Patch

diff --git a/gnu/packages/crypto.scm b/gnu/packages/crypto.scm
index 3acd147f25..de8bcf1d70 100644
--- a/gnu/packages/crypto.scm
+++ b/gnu/packages/crypto.scm
@@ -858,9 +858,18 @@  (define-public b2sum
                  (base32 "04z631v0vzl52g73v390ask5fnzi5wg83lcjkjhpmmymaz0jn152"))))
       (build-system gnu-build-system)
       (arguments
-       `(#:make-flags (list (string-append "CC=" ,(cc-for-target))
-                            (string-append "PREFIX=" (assoc-ref %outputs "out")))
+       `(;; By default, b2sum uses the compiler to generate instructions
+         ;; tailored to the CPU of the running machine, using "-march=native".
+         ;; This gives a ~1.5x speedup on a Core i5-6300U with a large dataset
+         ;; paged in, whereas compilation of b2sum takes ~1.5 seconds.
+         ;; b2sum does not support run-time feature detection:
+         ;; https://github.com/BLAKE2/BLAKE2/issues/1
+         ;; For more information, see the discussion beginning here:
+         ;; https://debbugs.gnu.org/cgi/bugreport.cgi?bug=51198#13
+         #:substitutable? #f
          #:tests? #f ; No test suite
+         #:make-flags (list (string-append "CC=" ,(cc-for-target))
+                            (string-append "PREFIX=" (assoc-ref %outputs "out")))
          #:phases
          (modify-phases %standard-phases
            (add-before 'build 'change-directory