[bug#64188,0/8] More package tuning

Message ID cover.1687247150.git.efraim@flashner.co.il

Message

Efraim Flashner June 20, 2023, 7:48 a.m. UTC
with gcc-11, gcc gained support for using -march=x86_64-v{1,2,3,4},
which I'm calling 'generic options,' as opposed to the more targeted
tuning we have with specific architectures.

Unfortunately there doesn't seem to be a way to map our current
architectures to these ones, making it harder to swap between them. The
best I've found is that any architecture which can use the Haswell
architecture optimizations can also use x86_64-v3. I suppose we could
create a mapping so that, as a fallback, anything haswell or higher
would use x86_64-v3, anything else would use x86_64-v1. This would help
with --tune=native.

For the second patch, 'Add inexact cpu matching,' I'm unsure about
needing to check for the avx512f flag; I don't have any hardware to
test with.

I tested patches 3 and 4 on my machine (with the AMD branch commented
out), and the code correctly decided I should use x86_64-v3.
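
To make the flag-based decision concrete, here is a minimal sketch in
Python (not the Scheme code from the patches). It assumes the CPU flags
are spelled as in the "flags" line of Linux's /proc/cpuinfo (for example
"pni" for SSE3 and "abm" for LZCNT); the table and the function name
psabi_level are hypothetical and only illustrate the idea of walking up
the psABI levels until a required flag is missing.

```python
# Flags each x86-64 psABI level adds on top of the previous one.
# ASSUMPTION: spellings follow Linux /proc/cpuinfo ("pni" = SSE3,
# "abm" = LZCNT); this table is illustrative, not taken from the patches.
PSABI_LEVEL_FLAGS = [
    ("x86_64-v2", {"cx16", "lahf_lm", "popcnt", "pni",
                   "sse4_1", "sse4_2", "ssse3"}),
    ("x86_64-v3", {"avx", "avx2", "bmi1", "bmi2", "f16c",
                   "fma", "abm", "movbe", "xsave"}),
    ("x86_64-v4", {"avx512f", "avx512bw", "avx512cd",
                   "avx512dq", "avx512vl"}),
]


def psabi_level(flags):
    """Return the highest level whose flags, and those of every lower
    level, are all present in FLAGS; the baseline is "x86_64-v1"."""
    available = set(flags)
    level = "x86_64-v1"
    for name, required in PSABI_LEVEL_FLAGS:
        if not required <= available:
            break  # a lower level is incomplete, so stop climbing
        level = name
    return level
```

A Haswell-class flag set would come out as x86_64-v3 here, while a CPU
that advertises avx512f but lacks one of the v3 flags would stay at the
lower level, which is the conservative behaviour one would want from a
fallback.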

go cpu tuning targets: I mostly used the chart¹ on the go website, and I
also checked the source code for go-1.18. I put in arm{5,6,7} as arm and
not armhf since armhf only works with armv7 and with go programs, since
they're statically linked, they can just be copied to other machines.

The 6th patch, which adjusts the transformations to also handle go
packages, I tested with syncthing. With '--tune' it failed during the
added phase, trying to tune for znver2; with '--tune=x86_64-v3' it
worked without problems.
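
The znver2 failure can be illustrated with a small Python sketch of the
translation from a tuning name to a Go setting. GOAMD64 (v1 to v4),
GOARM (5, 6, 7) and GOPPC64 (power8, power9) are the variables Go
itself reads; the tuning names on the left ("x86_64-v3", "armv7", and so
on) and the function name go_tuning_env are assumptions made for this
example, not the names used in the patches.

```python
import re


def go_tuning_env(tune):
    """Return a {variable: value} dict for the Go toolchain, or None
    when TUNE has no Go equivalent.

    ASSUMPTION: the accepted spellings of TUNE are hypothetical."""
    match = re.fullmatch(r"x86_64-v([1-4])", tune)
    if match:
        return {"GOAMD64": "v" + match.group(1)}
    match = re.fullmatch(r"armv([5-7])", tune)
    if match:
        return {"GOARM": match.group(1)}
    if tune in ("power8", "power9"):
        return {"GOPPC64": tune}
    # Specific micro-architectures such as "znver2" end up here: Go
    # only knows the coarse levels, so there is nothing to pass.
    return None
```

With this shape, "x86_64-v3" yields a GOAMD64 setting, whereas "znver2"
yields nothing, so a caller has to either fail or first reduce the
micro-architecture to one of the coarse levels.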

Having written this out, I think our best bet would be to use a
generalized >=haswell -> x86_64-v3, else x86_64-v1, and not worry about
individual micro-architectures and specific chipsets. That would ensure
that '--tune=native' just works with go packages.

As far as tuning go packages, my understanding is that pretty much every
non-trivial go package can benefit from tuning.

EDIT:

I added two more patches on top of the initial 6 to implement
gcc-architecture->generic-architecture and then use it. For patch 7,
the alternative to returning gcc-architecture on no-match would be to
return "generic".
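
The two no-match behaviours can be compared in a short Python sketch.
It is not the Scheme procedure from patch 7: the table below is a
deliberately incomplete, hypothetical list of GCC micro-architecture
names, filled in according to the ">=haswell -> x86_64-v3, else
x86_64-v1" rule described above, and gcc_to_generic is an illustrative
name.

```python
# ASSUMPTION: illustrative, incomplete table.  Names at or above
# Haswell map to x86_64-v3; older ones map to x86_64-v1, following the
# coarse rule in this message rather than the exact psABI feature sets.
GENERIC_ARCHITECTURES = {
    "haswell": "x86_64-v3",
    "broadwell": "x86_64-v3",
    "skylake": "x86_64-v3",
    "znver2": "x86_64-v3",
    "znver3": "x86_64-v3",
    "core2": "x86_64-v1",
    "nehalem": "x86_64-v1",
    "sandybridge": "x86_64-v1",
}


def gcc_to_generic(gcc_architecture, no_match=None):
    """Map a GCC micro-architecture name to a generic level.

    When the name is not in the table, return it unchanged (the
    behaviour chosen for patch 7) unless NO_MATCH is given, in which
    case return NO_MATCH (for example "generic")."""
    generic = GENERIC_ARCHITECTURES.get(gcc_architecture)
    if generic is not None:
        return generic
    return gcc_architecture if no_match is None else no_match
```

Returning the input unchanged keeps names that need no translation
usable as they are, while returning "generic" makes the lack of a
mapping explicit to the caller; which is preferable depends on what the
consumer does with an unknown name.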

¹ https://github.com/golang/go/wiki/MinimumRequirements#microarchitecture-support

Efraim Flashner (8):
  gnu: %gcc-11-x86_64-micro-architectures: Add generic options.
  guix: cpu: Add inexact CPU matching.
  guix: cpu: Rewrite fallback for x86_64 cpu->gcc-architecture.
  guix: cpu: Refactor cpu->gcc-architecture.
  gnu: go: Add CPU tuning targets.
  transformations: Allow tuning go packages.
  guix: cpu: Add gcc-architecture->generic-architecture mapping.
  transformations: Allow autotuning for go packages.

 gnu/packages/gcc.scm     |   4 +-
 gnu/packages/golang.scm  |  23 ++++++-
 guix/cpu.scm             | 128 ++++++++++++++++++++++-----------------
 guix/transformations.scm |  43 +++++++++++--
 4 files changed, 135 insertions(+), 63 deletions(-)


base-commit: d884fc9e2efecfba09af4694f5a13ad7fc6f704f

Comments

Ludovic Courtès June 25, 2023, 8:47 p.m. UTC | #1
Hello Efraim,

Efraim Flashner <efraim@flashner.co.il> skribis:

> with gcc-11, gcc gained support for using -march=x86_64-v{1,2,3,4},
> which I'm calling 'generic options,' as opposed to the more targeted
> tuning we have with specific architectures.

I don’t think these x86_64 psABI “architecture levels” should be treated
specially:

  • From the point of view of ‘--tune’, they’re just another value that
    may be passed to ‘-march’.

  • My understanding is that those levels don’t match reality: as
    discussed in the original ‘--tune’ patch¹, CPUs actually produced
    don’t follow a pattern of strictly including features of one set.
    They’re really just a simplification to get more memorizable names,
    but it’s hard to tell whether a given CPU really covers the set of
    features of a given level.

Overall, my take on this would be to add supported levels to
‘%gcc-11-x86_64-micro-architectures’ & co., without going further.

WDYT?

¹ https://issues.guix.gnu.org/52283#0-lineno48

[...]

> go cpu tuning targets: I mostly used the chart¹ on the go website, and I
> also checked the source code for go-1.18. I put in arm{5,6,7} as arm and
> not armhf since armhf only works with armv7 and with go programs, since
> they're statically linked, they can just be copied to other machines.

Now if Go uses those names, (guix cpu) can provide helpers.

Thanks,
Ludo’.

Efraim Flashner June 26, 2023, 8:34 a.m. UTC | #2
On Sun, Jun 25, 2023 at 10:47:42PM +0200, Ludovic Courtès wrote:
> Hello Efraim,
> 
> Efraim Flashner <efraim@flashner.co.il> skribis:
> 
> > with gcc-11, gcc gained support for using -march=x86_64-v{1,2,3,4},
> > which I'm calling 'generic options,' as opposed to the more targeted
> > tuning we have with specific architectures.
> 
> I don’t think these x86_64 psABI “architecture levels” should be treated
> specially:
> 
>   • From the point of view of ‘--tune’, they’re just another value that
>     may be passed to ‘-march’.
> 
>   • My understanding is that those levels don’t match reality: as
>     discussed in the original ‘--tune’ patch¹, CPUs actually produced
>     don’t follow a pattern of strictly including features of one set.
>     They’re really just a simplification to get more memorizable names,
>     but it’s hard to tell whether a given CPU really covers the set of
>     features of a given level.

They're also useful for glibc-hwcaps, so that we could build each
package multiple times and install libraries built for the psABI levels
into $prefix/lib/glibc-hwcaps/x86-64-v[234]/, but I agree that, for our
uses so far, they're not really useful.
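
For illustration, the directory layout mentioned above can be sketched
in Python. This only builds the list of directories in the order one
would expect them to be preferred (most specific level first, plain lib
directory last); it does not query glibc, and the function name
hwcaps_search_dirs and its integer level argument are assumptions made
for the example.

```python
import posixpath


def hwcaps_search_dirs(prefix, supported_level):
    """List library directories for a CPU supporting psABI level
    SUPPORTED_LEVEL (an integer from 1 to 4), most specific first.

    ASSUMPTION: the ordering mirrors the expected preference of the
    dynamic loader; nothing here is read from glibc itself."""
    dirs = [
        posixpath.join(prefix, "lib", "glibc-hwcaps", f"x86-64-v{n}")
        for n in range(supported_level, 1, -1)  # e.g. 3, then 2
    ]
    dirs.append(posixpath.join(prefix, "lib"))  # baseline build
    return dirs
```

For a v3-capable machine this gives the x86-64-v3 directory, then
x86-64-v2, then the plain lib directory, which is why each package would
have to be built once per level to make use of it.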

> Overall, my take on this would be to add supported levels to
> ‘%gcc-11-x86_64-micro-architectures’ & co., without going further.
> 
> WDYT?

I could see keeping the code from cpu->generic-architecture (renamed
cpu->psABI) either as a non-exported function or simply moved into the
fallback for x86_64.

> ¹ https://issues.guix.gnu.org/52283#0-lineno48
> 
> [...]
> 
> > go cpu tuning targets: I mostly used the chart¹ on the go website, and I
> > also checked the source code for go-1.18. I put in arm{5,6,7} as arm and
> > not armhf since armhf only works with armv7 and with go programs, since
> > they're statically linked, they can just be copied to other machines.
> 
> Now if Go uses those names, (guix cpu) can provide helpers.

Go also uses power8 and power9 as PPC64(le) options, so that's another
possible use-case I was trying to prepare for. That was my plan for the
gcc-architecture->generic-architecture function: to allow --tune to
work without needing to pass different values to different packages.