[bug#43679,0/5] Add '--with-toolchain' package transformation option

Message ID 20200928195305.30096-1-ludo@gnu.org

Message

Ludovic Courtès Sept. 28, 2020, 7:53 p.m. UTC
From: Ludovic Courtès <ludovic.courtes@inria.fr>

Hello!

This patch series adds the ‘--with-toolchain’ option.  I’ve
tested it with gcc-toolchain@10 and clang-toolchain, and I can say
it works as advertised.  :-)

One thing I wasn’t entirely sure about: ‘--with-toolchain’ changes
the toolchain of the specified package, not that of its dependents.
This assumes that the toolchains all follow the same ABI.  This is
the case for C, apparently, maybe not for C++.  Should it instead
change the toolchain of the package’s dependents as well?

Something like:

  guix build guile --with-toolchain=guile@3.0.4=clang-toolchain 

generates working code.

Another issue is that since we use ‘package-input-rewriting/spec’,
we can’t change the toolchain of core packages like Guile or Perl
without rebuilding the world.  For example, if we omit “@3.0.4”
in the example above, we rebuild a “guile” package deep down and
everything that follows (aka. “the world”).

Another option I considered was to graft the package that
‘--with-toolchain’ targets instead of rebuilding its dependents.
Again that’d only work if the resulting binaries are ABI-compatible,
but maybe that’s a reasonable assumption.  It would definitely save
build time.  Should it be grafted, or should there be a separate
option to do that?  Thoughts?

Last, when doing ‘--with-toolchain=foo=gcc-toolchain’, I noticed
that ‘foo’ would keep a reference to ‘gcc-toolchain’ for some obscure
reasons:

--8<---------------cut here---------------start------------->8---
$ ./pre-inst-env guix build hello --with-toolchain=hello=gcc-toolchain
/gnu/store/qi7pqqsxhbwmy75hl43j7l0aw1xr7r42-hello-2.10
$ grep -r $(guix build gcc-toolchain | head -2 |tail -1) /gnu/store/qi7pqqsxhbwmy75hl43j7l0aw1xr7r42-hello-2.10
Binary file /gnu/store/qi7pqqsxhbwmy75hl43j7l0aw1xr7r42-hello-2.10/bin/hello matches
$ strings /gnu/store/qi7pqqsxhbwmy75hl43j7l0aw1xr7r42-hello-2.10/bin/hello | grep $(guix build gcc-toolchain | head -2 |tail -1)
/gnu/store/fa6wj5bxkj5ll1d7292a70knmyl7a0cr-glibc-2.31/lib:/gnu/store/qj38f3vi4q1d7z30hkpaxyajv49rwamb-gcc-10.2.0-lib/lib:/gnu/store/qj38f3vi4q1d7z30hkpaxyajv49rwamb-gcc-10.2.0-lib/lib/gcc/x86_64-unknown-linux-gnu/10.2.0/../../..:/gnu/store/pknm43xsza6nlc7bn27djip8fis92akd-gcc-toolchain-10.2.0/lib
--8<---------------cut here---------------end--------------->8---

Not a showstopper but would be nice to address.

Feedback welcome!

Ludo’.

Ludovic Courtès (5):
  gnu: gcc-toolchain: Add 'GUIX_LOCPATH' to the search paths.
  gnu: clang-toolchain: Add 'GUIX_LOCPATH' to the search paths.
  gnu: clang-toolchain: Create 'cc' and 'c++' symlinks.
  packages: Add 'package-with-toolchain'.
  guix build: Add '--with-toolchain'.

 doc/guix.texi                 | 61 +++++++++++++++++++++++++++++++++++
 gnu/packages/commencement.scm |  8 +++--
 gnu/packages/llvm.scm         | 12 ++++++-
 guix/build-system.scm         | 35 ++++++++++++++++++--
 guix/packages.scm             |  9 ++++++
 guix/scripts/build.scm        | 40 +++++++++++++++++++++++
 tests/packages.scm            | 20 ++++++++++++
 tests/scripts-build.scm       | 30 +++++++++++++++++
 8 files changed, 210 insertions(+), 5 deletions(-)

Comments

Simon Tournier Sept. 29, 2020, 10:44 a.m. UTC | #1
Hi,

On Mon, 28 Sep 2020 at 21:53, Ludovic Courtès <ludo@gnu.org> wrote:
> From: Ludovic Courtès <ludovic.courtes@inria.fr>

> One thing I wasn’t entirely sure about: ‘--with-toolchain’ changes
> the toolchain of the specified package, not that of its dependents.
> This assumes that the toolchains all follow the same ABI.  This is
> the case for C, apparently, maybe not for C++.  Should it instead
> change the toolchain of the package’s dependents as well?
>
> Something like:
>
>   guix build guile --with-toolchain=guile@3.0.4=clang-toolchain 
>
> generates working code.

Really cool!  While playing yesterday with the new ’package-mapping’ &
co. (checking ’package-with-explicit-ocaml’), I came to the conclusion
that a kind of new ’--with-toolchain’ option was needed. :-)


However, ’--with-toolchain’ can be misleading since it is specific to
’gnu-build-system’ and C/C++ software.  I mean, patch #4, which adds
’build-system-with-toolchain’, contains:

--8<---------------cut here---------------start------------->8---
+  (define toolchain-packages
+    ;; These are the GNU toolchain packages pulled in by GNU-BUILD-SYSTEM and
+    ;; all the build systems that inherit from it.  Keep the list in sync with
+    ;; 'standard-packages' in (guix build-system gnu).
+    '("gcc" "binutils" "libc" "libc:static" "ld-wrapper"))
+
+  (define (lower* . args)
+    (let ((lowered (apply lower args)))
+      (bag
+        (inherit lowered)
+        (build-inputs
+         (append (fold alist-delete
+                       (bag-build-inputs lowered)
+                       toolchain-packages)
+                 toolchain)))))
--8<---------------cut here---------------end--------------->8---
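
For readers less familiar with SRFI-1, the ‘fold alist-delete’ idiom above
can be tried in isolation.  The following is only a minimal sketch in plain
Guile, not the code from the patch: the input list and the
‘replace-toolchain’ name are made up for illustration, and symbols stand in
for the package objects that a real bag would hold.

```scheme
(use-modules (srfi srfi-1))   ;fold, alist-delete

;; Hypothetical build inputs: (LABEL VALUE) entries.  Symbols stand in for
;; real package objects.
(define build-inputs
  '(("tar" tar)
    ("gcc" gcc)
    ("binutils" binutils)
    ("libc" glibc)
    ("make" make)))

;; Same labels as in the patch excerpt above.
(define toolchain-packages
  '("gcc" "binutils" "libc" "libc:static" "ld-wrapper"))

;; Hypothetical replacement toolchain entry.
(define replacement
  '(("toolchain" clang-toolchain)))

(define (replace-toolchain inputs toolchain)
  ;; Delete every entry whose label is in TOOLCHAIN-PACKAGES, then append
  ;; the entries of TOOLCHAIN.  Labels absent from INPUTS are ignored.
  (append (fold alist-delete inputs toolchain-packages)
          toolchain))

(define result
  (replace-toolchain build-inputs replacement))

;; Print the labels that remain, one per line.
(for-each (lambda (input)
            (display (car input))
            (newline))
          result)
```

Running it prints ‘tar’, ‘make’, and ‘toolchain’: only the labels listed in
‘toolchain-packages’ are dropped, so inputs of other build systems would be
left alone.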

For example, this code will not remove ’default-ocaml’ and
’default-findlib’ in the ’ocaml-build-system’, even though it would be easy
to specify the options “--with-input=ocaml=ocaml-variant
--with-input=findlib=findlib-variant”.  For the
’clojure-build-system’, though, there are 3 packages.

Another example, a bit out of scope, is rebuilding the whole Emacs stack
using the package ’emacs-next’ instead of ’emacs’.  The
’emacs-build-system’ depends on ’emacs-minimal’ but some packages (see
’emacs-magit’) rewrite that to use ’emacs-no-x’ instead.  It would be
nice to be able to write:

  guix build -m manifest.m --with-toolchain=emacs-next-toolchain



In summary, would it make sense to either:

  - change ’--with-toolchain’ to ’--with-gcc-toolchain’
or
  - tweak ’build-system-with-toolchain’ to pass ’toolchain-packages’ as
  a parameter somehow and be able to run:
  
     guix build coq --with-toolchain=coq=ocaml-toolchain4.07
     
?     


> Another issue is that since we use ‘package-input-rewriting/spec’,
> we can’t change the toolchain of core packages like Guile or Perl
> without rebuilding the world.  For example, if we omit “@3.0.4”
> in the example above, we rebuild a “guile” package deep down and
> everything that follows (aka. “the world”).

Yeah but that’s maybe what people want: rebuild the world with another
toolchain, probably optimized for some specific machine (HPC cluster).


> Another option I considered was to graft the package that
> ‘--with-toolchain’ targets instead of rebuilding its dependents.
> Again that’d only work if the resulting binaries are ABI-compatible,
> but maybe that’s a reasonable assumption.  It would definitely save
> build time.  Should it be grafted, or should there be a separate
> option to do that?  Thoughts?

From my perspective, it should be another option.  For example, I
imagine people want to rebuild the whole stack with Name-It© compiler.  Or
the Name-It© compiler might not be ABI-compatible.


All the best,
simon

Ludovic Courtès Sept. 30, 2020, 8:46 a.m. UTC | #2
Hi,

zimoun <zimon.toutoune@gmail.com> wrote:

> On Mon, 28 Sep 2020 at 21:53, Ludovic Courtès <ludo@gnu.org> wrote:
>> From: Ludovic Courtès <ludovic.courtes@inria.fr>
>
>> One thing I wasn’t entirely sure about: ‘--with-toolchain’ changes
>> the toolchain of the specified package, not that of its dependents.
>> This assumes that the toolchains all follow the same ABI.  This is
>> the case for C, apparently, maybe not for C++.  Should it instead
>> change the toolchain of the package’s dependents as well?
>>
>> Something like:
>>
>>   guix build guile --with-toolchain=guile@3.0.4=clang-toolchain 
>>
>> generates working code.

[...]

> However, ’--with-toolchain’ can be misleading since it is specific to
> ’gnu-build-system’ and C/C++ software.  I mean, patch #4, which adds
> ’build-system-with-toolchain’, contains:
>
> +  (define toolchain-packages
> +    ;; These are the GNU toolchain packages pulled in by GNU-BUILD-SYSTEM and
> +    ;; all the build systems that inherit from it.  Keep the list in sync with
> +    ;; 'standard-packages' in (guix build-system gnu).
> +    '("gcc" "binutils" "libc" "libc:static" "ld-wrapper"))
> +
> +  (define (lower* . args)
> +    (let ((lowered (apply lower args)))
> +      (bag
> +        (inherit lowered)
> +        (build-inputs
> +         (append (fold alist-delete
> +                       (bag-build-inputs lowered)
> +                       toolchain-packages)
> +                 toolchain)))))

Yeah this option is meant for C/C++ as I wrote above and (I think) in
the documentation.

> Another example, a bit out of scope, is rebuilding the whole Emacs stack
> using the package ’emacs-next’ instead of ’emacs’.  The
> ’emacs-build-system’ depends on ’emacs-minimal’ but some packages (see
> ’emacs-magit’) rewrite that to use ’emacs-no-x’ instead.  It would be
> nice to be able to write:
>
>   guix build -m manifest.m --with-toolchain=emacs-next-toolchain

Here you’d use ‘--with-input’, though package transformation options
have no effect when using a manifest.

> In summary, would it make sense to either:
>
>   - change ’--with-toolchain’ to ’--with-gcc-toolchain’

‘--with-gcc-toolchain=clang-toolchain’ would look strange.  :-)

>   - tweak ’build-system-with-toolchain’ to pass ’toolchain-packages’ as
>   parameter somehow and be able to run:
>   
>      guix build coq --with-toolchain=coq=ocaml-toolchain4.07

Can’t you use ‘--with-input=ocamlX.Y=ocamlA.B’ in this case?  If not, we
could devise a separate option rather than overload this one.

>> Another issue is that since we use ‘package-input-rewriting/spec’,
>> we can’t change the toolchain of core packages like Guile or Perl
>> without rebuilding the world.  For example, if we omit “@3.0.4”
>> in the example above, we rebuild a “guile” package deep down and
>> everything that follows (aka. “the world”).
>
> Yeah but that’s maybe what people want: rebuild the world with another
> toolchain, probably optimized for some specific machine (HPC cluster).

Yes, though it doesn’t necessarily make sense.  :-)

But yeah, perhaps rebuilding everything above the given package would be
more in line with what people expect.

>> Another option I considered was to graft the package that
>> ‘--with-toolchain’ targets instead of rebuilding its dependents.
>> Again that’d only work if the resulting binaries are ABI-compatible,
>> but maybe that’s a reasonable assumption.  It would definitely save
>> build time.  Should it be grafted, or should there be a separate
>> option to do that?  Thoughts?
>
> From my perspective, it should be another option.  For example, I
> imagine people want to rebuild the whole stack with Name-It© compiler.  Or
> the Name-It© compiler might not be ABI-compatible.

I’m not interested in proprietary compilers if that’s what you have in
mind.  Besides, the SysV ABI is defined for C, so normally all C
compilers produce ABI-compatible code.  There are exceptions such as
OpenMP (Clang is moving to their own libomp, I think, whereas GCC has
libgomp.)

Thanks for your feedback!

Ludo’.

Simon Tournier Sept. 30, 2020, 1:32 p.m. UTC | #3
Hi,

On Wed, 30 Sep 2020 at 10:46, Ludovic Courtès <ludovic.courtes@inria.fr> wrote:

> > However, ’--with-toolchain’ can be misleading since it is specific to
> > ’gnu-build-system’ and C/C++ software.  I mean, patch #4, which adds
> > ’build-system-with-toolchain’, contains:

[...]

> Yeah this option is meant for C/C++ as I wrote above and (I think) in
> the documentation.

Yes, in the manual, but not in the command-line help.

Without bikeshedding, I find '--with-toolchain' a bad name since it is
only 'gnu-build-system' related.  And from my point of view, it is
also a bad name for the procedures 'build-system-with-toolchain' and
'package-with-toolchain' -- but it does not matter since they are not
written in stone, unlike command-line options, which are harder to change.


> >   - change ’--with-toolchain’ to ’--with-gcc-toolchain’
>
> ‘--with-gcc-toolchain=clang-toolchain’ would look strange.  :-)

Why not? :-)
My point is: '--with-toolchain' is not specific enough.  Maybe
'--with-gnu-toolchain'?


> >   - tweak ’build-system-with-toolchain’ to pass ’toolchain-packages’ as
> >   parameter somehow and be able to run:
> >
> >      guix build coq --with-toolchain=coq=ocaml-toolchain4.07
>
> Can’t you use ‘--with-input=ocamlX.Y=ocamlA.B’ in this case?  If not, we
> could devise a separate option rather than overload this one.

No, in this case one should use:

   guix build coq \
          --with-input=ocaml=ocaml@4.07 \
          --with-input=ocaml-findlib=ocaml4.07-findlib

to recompile the package 'coq' with the 4.07 'ocaml-build-system'.
For the 'clojure-build-system', there are 3 inputs to specify.  (I
have not checked all the build systems :-)).  And note it works only
if the tools used by the build system are not hidden.

For consistency, it appears to me easier to have one "toolchain" per
build system, say ocaml-toolchain, gcc-toolchain, haskell-toolchain,
and then provide this toolchain to the '--with-toolchain' option.
However, it is complicated to remove the 'build-inputs' since they are
not hard coded -- as is the case in 'build-system-with-toolchain'.
Another option is to have one command line option per build system:
--with-gnu-toolchain, --with-ocaml-toolchain, --with-cargo-toolchain,
etc.


> > Yeah but that’s maybe what people want: rebuild the world with another
> > toolchain, probably optimized for some specific machine (HPC cluster).
>
> Yes, though it doesn’t necessarily make sense.  :-)

Sadly!

> But yeah, perhaps rebuilding everything above the given package would be
> more in line with what people expect.

Yeah...maybe providing "what people expect" could reduce the gap in
the HPC community.


> >> Another option I considered was to graft the package that
> >> ‘--with-toolchain’ targets instead of rebuilding its dependents.
> >> Again that’d only work if the resulting binaries are ABI-compatible,
> >> but maybe that’s a reasonable assumption.  It would definitely save
> >> build time.  Should it be grafted, or should there be a separate
> >> option to do that?  Thoughts?
> >
> > From my perspective, it should be another option.  For example, I
> > imagine people want to rebuild all the stack with Name-It© compiler.  Or
> > the Name-It© compiler could be not-ABI compatible.
>
> I’m not interested in proprietary compilers if that’s what you have in
> mind.  Besides, the SysV ABI is defined for C, so normally all C
> compilers produce ABI-compatible code.  There are exceptions such as
> OpenMP (Clang is moving to their own libomp, I think, whereas GCC has
> libgomp.)

That was what I had in mind. :-)
But don't the exceptions you point out imply another option?

All the best,
simon

Ludovic Courtès Sept. 30, 2020, 4:58 p.m. UTC | #4
zimoun <zimon.toutoune@gmail.com> wrote:

> On Wed, 30 Sep 2020 at 10:46, Ludovic Courtès <ludovic.courtes@inria.fr> wrote:
>
>> > However, ’--with-toolchain’ can be misleading since it is specific to
>> > ’gnu-build-system’ and C/C++ software.  I mean, patch #4, which adds
>> > ’build-system-with-toolchain’, contains:
>
> [...]
>
>> Yeah this option is meant for C/C++ as I wrote above and (I think) in
>> the documentation.
>
> Yes, in the manual, but not in the command-line help.
>
> Without bikeshedding, I find '--with-toolchain' a bad name since it is
> only 'gnu-build-system' related.  And from my point of view, it is
> also a bad name for the procedures 'build-system-with-toolchain' and
> 'package-with-toolchain' -- but it does not matter since they are not
> written in stone, unlike command-line options, which are harder to change.

I agree that C/C++ don’t have a monopoly on tool chains, no argument
here.  The term “tool chain” is widely used for C/C++ though, much less
for other languages (often the “tool chain” is a single package for
other languages).

We could change the name to ‘--with-c-toolchain’ maybe?  Then someone
might come and suggest that this doesn’t account for C++, Objective-C,
and FORTRAN.

>> Can’t you use ‘--with-input=ocamlX.Y=ocamlA.B’ in this case?  If not, we
>> could devise a separate option rather than overload this one.
>
> No, in this case one should use:
>
>    guix build coq \
>           --with-input=ocaml=ocaml@4.07 \
>           --with-input=ocaml-findlib=ocaml4.07-findlib

Hmm I think the second one is unnecessary since
‘--with-input=ocaml=ocaml@4.07’ effectively gives an ‘ocaml-findlib’
built against OCaml 4.07.

Anyway, we’re drifting off-topic; let’s address OCaml separately if
something needs to be addressed.

> For consistency, it appears to me easier to have one "toolchain" per
> build system, say ocaml-toolchain, gcc-toolchain, haskell-toolchain,
> and then provide this toolchain to the '--with-toolchain' option.
> However, it is complicated to remove the 'build-inputs' since they are
> not hard coded -- as is the case in 'build-system-with-toolchain'.
> Another option is to have one command line option per build system:
> --with-gnu-toolchain, --with-ocaml-toolchain, --with-cargo-toolchain,
> etc.

If there’s a need for that, yes.  We’ll see!

>> I’m not interested in proprietary compilers if that’s what you have in
>> mind.  Besides, the SysV ABI is defined for C, so normally all C
>> compilers produce ABI-compatible code.  There are exceptions such as
>> OpenMP (Clang is moving to their own libomp, I think, whereas GCC has
>> libgomp.)
>
> That was what I had in mind. :-)
> But don't the exceptions you point out imply another option?

We can’t completely prevent people from shooting themselves in the foot
with transformations, but yeah, maybe we should rebuild everything
higher in the stack with the same toolchain.

Thanks,
Ludo’.

Ludovic Courtès Oct. 9, 2020, 9:12 a.m. UTC | #5
Hi!

This is v2 of this patch, with these changes:

  1. ‘with-toolchain’ is replaced by ‘with-c-toolchain’ everywhere,
     with the understanding that it’s about the C/C++ toolchain
     in practice.  In the end I’m sympathetic with the argument
     that C/C++ don’t have a monopoly on toolchains.  ;-)

  2. ‘--with-c-toolchain=PACKAGE=TOOLCHAIN’ rebuilds not just
     PACKAGE with TOOLCHAIN, but also everything above PACKAGE
     with TOOLCHAIN (in v1, only PACKAGE was rebuilt with TOOLCHAIN
     but everything above it had to be rebuilt anyway.)

     The main motivation here is to reduce the chances that we’re
     introducing ABI incompatibilities that users would have to work
     around by passing on ‘--with-c-toolchain’ for each package in
     the chain.  I think it also more closely matches user
     expectations: when you see things are being rebuilt, you’re
     likely to think that’s because they’re rebuilt with the new
     toolchain, not the default one.
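
The difference between the v1 and v2 semantics can be illustrated with a
toy dependency graph.  This is only a rough sketch in plain Guile that does
not use any Guix API; the graph and the procedure names (‘inputs-of’,
‘closure’, ‘depends-on?’, ‘rebuilt-with-toolchain’) are hypothetical:

```scheme
(use-modules (srfi srfi-1))   ;any, append-map, delete-duplicates

;; Hypothetical dependency graph: each entry is (PACKAGE DIRECT-INPUT ...).
(define graph
  '(("app"    "libfoo" "zlib")
    ("libfoo" "guile")
    ("guile"  "libffi")
    ("zlib")
    ("libffi")))

(define (inputs-of name)
  ;; Direct inputs of NAME, or the empty list if NAME is unknown.
  (or (assoc-ref graph name) '()))

(define (closure root)
  ;; ROOT followed by everything it depends on, without duplicates.
  (delete-duplicates
   (cons root (append-map closure (inputs-of root)))))

(define (depends-on? name target)
  ;; True if NAME is TARGET or reaches TARGET through its inputs.
  (or (string=? name target)
      (any (lambda (input) (depends-on? input target))
           (inputs-of name))))

(define (rebuilt-with-toolchain root target)
  ;; Model of the v2 behavior: TARGET and everything above it in ROOT's
  ;; closure get the new toolchain.
  (filter (lambda (name) (depends-on? name target))
          (closure root)))

(display (rebuilt-with-toolchain "app" "guile"))
(newline)
```

With ‘app’ as the package being built and ‘guile’ as PACKAGE, v1 would
apply TOOLCHAIN to ‘guile’ only, whereas in this model v2 applies it to
‘app’, ‘libfoo’, and ‘guile’, and leaves ‘zlib’ and ‘libffi’ alone.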

Feedback welcome!

Ludo’.

Ludovic Courtès (5):
  gnu: gcc-toolchain: Add 'GUIX_LOCPATH' to the search paths.
  gnu: clang-toolchain: Add 'GUIX_LOCPATH' to the search paths.
  gnu: clang-toolchain: Create 'cc' and 'c++' symlinks.
  packages: Add 'package-with-c-toolchain'.
  guix build: Add '--with-c-toolchain'.

 doc/guix.texi                 | 70 +++++++++++++++++++++++++++++
 gnu/packages/commencement.scm |  8 +++-
 gnu/packages/llvm.scm         | 12 ++++-
 guix/build-system.scm         | 35 ++++++++++++++-
 guix/packages.scm             |  9 ++++
 guix/scripts/build.scm        | 84 +++++++++++++++++++++++++++++++++++
 tests/packages.scm            | 20 +++++++++
 tests/scripts-build.scm       | 82 ++++++++++++++++++++++++++++++++++
 8 files changed, 315 insertions(+), 5 deletions(-)