Message ID | YXlyYGZAOg7GhC5m@jasmine.lan |
---|---|
State | Accepted |
Headers | show |
Series | [bug#51198] gnu: Add b2sum. | expand |
Context | Check | Description |
---|---|---|
cbaines/comparison | success | View comparision |
cbaines/git branch | success | View Git branch |
cbaines/applying patch | fail | View Laminar job |
cbaines/issue | success | View issue |
Hi! Leo Famulari <leo@famulari.name> skribis: > On Wed, Oct 27, 2021 at 10:44:45AM +0200, Nicolò Balzarotti wrote: >> This blog post [fn:1] on guix-hpc address the "Pre-built binaries >> vs. performance" dilemma. >> >> [fn:1] https://hpc.guix.info/blog/2018/01/pre-built-binaries-vs-performance/ > > It's helpful, but it doesn't look like we are using that method in Guix > anymore. It was removed in this commit: > > https://git.savannah.gnu.org/cgit/guix.git/commit/?id=969adb235ee34decb65255e1ea821ff0e221ed3d > > I guess it learned how to do runtime feature detection? BLIS has run-time/load-time feature detection now (which is the main approach the blog argues for). >> I guess the easiest way is to provide a variant (b2sum-avx or something >> like that) with avx enabled. Else, I'd just go with the unoptimized >> version as it happens for many other packages, but let's hear from >> others. > > The easiest thing is require local building, since it's an extremely > cheap build. Only 1.5 seconds on my laptop, total (not just the build > phase). With blis, one had to consider the lengthy build time. > > Because performance is critical for a tool like this, and because it's > cheap to build, I've attached a patch to require local building. This is > easier for me than creating a set of package variants that will need to > be expanded for each new generation of CPUs :) > >>From 8b862425310cf631c30c823eb0fa2bfd79d36823 Mon Sep 17 00:00:00 2001 > From: Leo Famulari <leo@famulari.name> > Date: Wed, 27 Oct 2021 11:29:40 -0400 > Subject: [PATCH] gnu: b2sum: Build on the local machine. > > * gnu/packages/crypto.scm (b2sum)[arguments]: Set '#:substitutable? #f'. > --- > gnu/packages/crypto.scm | 13 +++++++++++-- > 1 file changed, 11 insertions(+), 2 deletions(-) > > diff --git a/gnu/packages/crypto.scm b/gnu/packages/crypto.scm > index 3acd147f25..de8bcf1d70 100644 > --- a/gnu/packages/crypto.scm > +++ b/gnu/packages/crypto.scm > @@ -858,9 +858,18 @@ (define-public b2sum > (base32 "04z631v0vzl52g73v390ask5fnzi5wg83lcjkjhpmmymaz0jn152")))) > (build-system gnu-build-system) > (arguments > - `(#:make-flags (list (string-append "CC=" ,(cc-for-target)) > - (string-append "PREFIX=" (assoc-ref %outputs "out"))) > + `(;; By default, b2sum uses the compiler to generate instructions > + ;; tailored to the CPU of the running machine, using "-march=native". > + ;; This gives a ~1.5x speedup on a Core i5-6300U with a large dataset > + ;; paged in, whereas compilation of b2sum takes ~1.5 seconds. > + ;; b2sum does not support run-time feature detection: > + ;; https://github.com/BLAKE2/BLAKE2/issues/1 > + ;; For more information, see the discussion beginning here: > + ;; https://debbugs.gnu.org/cgi/bugreport.cgi?bug=51198#13 > + #:substitutable? #f It’s okay but not entirely sufficient: on a cluster setup, you typically talk to a daemon that’s on another machine, so the CPU features it’ll detect there may be different from those you’ll use. Likewise, as discussed on IRC, you’d also need #:local-build? #t. Anyway, I’d suggest using ‘guix hash -H blake2s-256’ or similar (it uses libgcrypt, which does the right thing). I think the latest Coreutils provide a generic ‘cksum’, too, that probably does the right thing. Ludo’.
diff --git a/gnu/packages/crypto.scm b/gnu/packages/crypto.scm index 3acd147f25..de8bcf1d70 100644 --- a/gnu/packages/crypto.scm +++ b/gnu/packages/crypto.scm @@ -858,9 +858,18 @@ (define-public b2sum (base32 "04z631v0vzl52g73v390ask5fnzi5wg83lcjkjhpmmymaz0jn152")))) (build-system gnu-build-system) (arguments - `(#:make-flags (list (string-append "CC=" ,(cc-for-target)) - (string-append "PREFIX=" (assoc-ref %outputs "out"))) + `(;; By default, b2sum uses the compiler to generate instructions + ;; tailored to the CPU of the running machine, using "-march=native". + ;; This gives a ~1.5x speedup on a Core i5-6300U with a large dataset + ;; paged in, whereas compilation of b2sum takes ~1.5 seconds. + ;; b2sum does not support run-time feature detection: + ;; https://github.com/BLAKE2/BLAKE2/issues/1 + ;; For more information, see the discussion beginning here: + ;; https://debbugs.gnu.org/cgi/bugreport.cgi?bug=51198#13 + #:substitutable? #f #:tests? #f ; No test suite + #:make-flags (list (string-append "CC=" ,(cc-for-target)) + (string-append "PREFIX=" (assoc-ref %outputs "out"))) #:phases (modify-phases %standard-phases (add-before 'build 'change-directory