[bug#54539,0/6] Start breaking up import cycles

Message ID 5a87d6f772ff7424cb6fccea7c45276bef7797aa.camel@telenet.be

Message

M March 23, 2022, 6:46 p.m. UTC
Import cycles make some packaging things harder and prevent some
proposed optimisations to "guix pull", let's start eliminating them.
TBC ...

Comments

M March 23, 2022, 6:49 p.m. UTC | #1
Maxime Devos wrote on Wed 23-03-2022 at 19:46 [+0100]:
> Import cycles make some packaging things harder and prevent some
> proposed optimisations to "guix pull", let's start eliminating them.
> TBC ...

The copyright lines are based on a mix of "git blame" and
"git log --grep the-package"; I might have missed some people ...
Liliana Marie Prikler March 24, 2022, 7:22 a.m. UTC | #2
Hi Maxime,

On Wednesday, 23.03.2022 at 19:49 +0100, Maxime Devos wrote:
> Maxime Devos wrote on Wed 23-03-2022 at 19:46 [+0100]:
> > Import cycles make some packaging things harder and prevent some
> > proposed optimisations to "guix pull", let's start eliminating
> > them.
> > TBC ...
> 
> The copyright lines are based on a mix of "git blame" and
> "git log --grep the-package"; I might have missed some people ...
I agree that breaking up cycles is a good thing, but I disagree with
some of the decisions you've made here.  For instance, I oppose the use
of single-package modules, because those more often than not simply
clutter the file system.

I'm not sure if Guile's #:autoload could do anything to fix these
issues (I suppose not), but long term I think guile modules should
support a style that is basically (resolve-interface) + (module-ref) in
the manner Guix needs, but declaratively.  While we do not have that in
place yet, I suggest something like the following:

(define (check-package-ref pkg)
  (module-ref (resolve-interface '(gnu packages check)) pkg))
...

or 

(define check-package-ref
  (let ((iface (resolve-interface '(gnu packages check))))
    (lambda (pkg) (module-ref iface pkg))))

I'm not sure if the second will have the intended effect.  It appears
to me as though the key to breaking these cycles is moving them into a
context that is not evaluated on the top of the file, e.g. a thunked
field or in the case of my first suggestion a procedure.
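
For illustration, a minimal sketch of both variants in plain Guile.  It
assumes (srfi srfi-1) as a stand-in for (gnu packages check) so that it
runs without Guix; the names srfi-1-ref and srfi-1-ref/cached are
hypothetical.  In the second variant as written above, the interface is
resolved when the definition itself is evaluated, i.e. at the top level;
the sketch wraps it in a promise to defer that until the first call.

```scheme
;; Hypothetical helpers; (srfi srfi-1) stands in for (gnu packages check)
;; so that the sketch runs in plain Guile without Guix.

;; Variant 1: resolve the interface on every call.
(define (srfi-1-ref name)
  (module-ref (resolve-interface '(srfi srfi-1)) name))

;; Variant 2: resolve the interface once, on first use, via a promise.
;; Evaluating this definition does not call resolve-interface.
(define srfi-1-ref/cached
  (let ((iface (delay (resolve-interface '(srfi srfi-1)))))
    (lambda (name)
      (module-ref (force iface) name))))

;; Both return the procedure bound to 'fold in (srfi srfi-1).
(define total ((srfi-1-ref 'fold) + 0 '(1 2 3)))               ; => 6
(define total/cached ((srfi-1-ref/cached 'fold) + 0 '(1 2 3))) ; => 6
```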

In either case, declaring all these lazy dependencies near the module
definition would have the added benefit, that people could see them
being lazily imported and thus no longer need the #:use-modules
comment.

WDYT?
M March 24, 2022, 3:05 p.m. UTC | #3
Liliana Marie Prikler wrote on Thu 24-03-2022 at 08:22 [+0100]:
> I agree that breaking up cycles is a good thing, but I disagree with
> some of the decisions you've made here.  For instance, I oppose the use
> of single-package modules, because those more often than not simply
> clutter the file system.

There are some other sound applications in (gnu packages audio)
and (gnu packages music), so maybe I can make a
(gnu packages audio-apps) module where 'audacity' and other
applications like 'calf' can reside?

For reducing the contribution of (gnu packages compression) to the
cycle issue, I've (in not yet submitted patches) separated many things
into (gnu packages compression-xyz), perhaps I can merge
(gnu packages patools) into (gnu packages compression-xyz)?

Greetings,
Maxime.
Liliana Marie Prikler March 24, 2022, 3:38 p.m. UTC | #4
On Thursday, 24.03.2022 at 16:05 +0100, Maxime Devos wrote:
> Liliana Marie Prikler wrote on Thu 24-03-2022 at 08:22 [+0100]:
> > I agree that breaking up cycles is a good thing, but I disagree
> > with some of the decisions you've made here.  For instance, I
> > oppose the use of single-package modules, because those more often
> > than not simply clutter the file system.
> 
> There are some other sound applications in (gnu packages audio)
> and (gnu packages music), so maybe I can make a (gnu packages audio-
> apps) module where 'audacity' and other applications like 'calf' can
> reside?
I'm not sure.  IIUC, audio should be for audio systems, codecs, etc.
whereas music sounds like a particular niche containing music players
etc.  Perhaps the cycle could more appropriately be broken by moving
stuff from music to audio?  Alternatively, declaring music as a lazy
import as shown in my previous mail might help make these
interdependencies both "cycle-free" and visible.  (I'm pretty sure we
also have "sound" lying around somewhere to put more oil into the
fire.)

> For reducing the contribution of (gnu packages compression) on the
> cycle issue, I've (in not yet submitted patches) separated many
> things into (gnu packages compression-xyz), perhaps I can merge
> (gnu packages patools) into (gnu packages compression-xyz)?
Here too, I think a classification into compression algorithms in
compression and backup/archival tools in another file (we do have
backup IIRC) would make the most sense.  Though obviously, we'd have to
put compression algorithms implemented in Rust in a special
rust-compression file to avoid cycles, or use the cycle killer lambda trick.
I'm not sure if "compression-xyz" would be a helpful label, and it
might just become the next root of circular dependencies if abused.

Cheers
M March 24, 2022, 3:46 p.m. UTC | #5
Liliana Marie Prikler wrote on Thu 24-03-2022 at 16:38 [+0100]:
> > For reducing the contribution of (gnu packages compression) on the
> > cycle issue, I've (in not yet submitted patches) separated many
> > things into (gnu packages compression-xyz), perhaps I can merge
> > (gnu packages patools) into (gnu packages compression-xyz)?
> Here too, I think a classification into compression algorithms in
> compression and backup/archival tools in another file (we do have
> backup IIRC) would make the most sense.  Though obviously, we'd have
> to do compression algorithms implemented in Rust in a special
> rust-compression file to avoid circles or use the cycle killer lambda
> trick.
> I'm not sure if "compression-xyz" would be a helpful label, and it
> might just become the next root of circular dependencies if abused.

Except for zpaq, there don't appear to be any backup tools in (gnu
packages compression).  There are no rust compression things in
(gnu packages compression) (yet!).   But yes, there's some opportunity
for abuse here.  There _are_ archival tools in (gnu packages
compression), e.g. tar and unzip.   Moving them to (gnu packages
backup) wouldn't bring any benefit though, since they are used by
practically everything.

Anyway, I'll continue trying to break cycles ...

Greetings,
Maxime.
Simon Tournier March 24, 2022, 4:58 p.m. UTC | #6
Hi,

On Thu, 24 Mar 2022 at 08:23, Liliana Marie Prikler
<liliana.prikler@ist.tugraz.at> wrote:

> I agree that breaking up cycles is a good thing, but I disagree with
> some of the decisions you've made here.  For instance, I oppose the use
> of single-package modules, because those more often than not simply
> clutter the file system.

Well, instead of trading opinions in a vacuum, we need to profile
and decide based on performance reports.  The number of files and the
number of packages per file should take into account the performance of:

 - compilation by developer
 - guix pull
 - guix search (or any other)
 - guix search --load-path
 - etc.

> In either case, declaring all these lazy dependencies near the module
> definition would have the added benefit, that people could see them
> being lazily imported and thus no longer need the #:use-modules
> comment.

I agree that laziness is a good thing and a good direction.  However,
let's be pragmatic with what we have now. :-)  What is the performance
comparison between breaking many cycles as Maxime is proposing vs
using many 'module-ref' + 'resolve-interface' instead of breaking them?


Cheers,
simon
Leo Famulari March 24, 2022, 5:05 p.m. UTC | #7
On Thu, Mar 24, 2022 at 08:22:09AM +0100, Liliana Marie Prikler wrote:
> I agree that breaking up cycles is a good thing, but I disagree with
> some of the decisions you've made here.  For instance, I oppose the use
> of single-package modules, because those more often than not simply
> clutter the file system.

The file system can hold many files. The Guix codebase is nowhere near
the limit...
M March 24, 2022, 6:07 p.m. UTC | #8
zimoun wrote on Thu 24-03-2022 at 17:58 [+0100]:
> I agree that laziness is a good thing and a good direction.  However,
> let's be pragmatic with what we have now. :-)  What is the performance
> comparison between breaking many cycles as Maxime is proposing vs
> using many 'module-ref' + 'resolve-interface' instead of breaking them?

Currently this patch series does not improve anything much, according
to "guix graph --type=module hello | wc --lines".  I'm now introducing
some module-ref+resolve-interface --- it's very convenient, but I'm not
yet at a significant result.

Anyway, restructuring modules and lazy loading can be complementary,
whatever's convenient.

Greetings,
Maxime.
M March 24, 2022, 9:49 p.m. UTC | #9
Maxime Devos wrote on Wed 23-03-2022 at 19:46 [+0100]:
> Import cycles make some packaging things harder and prevent some
> proposed optimisations to "guix pull", let's start eliminating them.
> TBC ...

Status update: previously,
"guix graph -t module hello --max-depth=9 | wc --lines" reported ~1040
lines.  With WIP patches, it now reports 280 lines.  I'm gradually
increasing the max-depth and each time investigating how to leave xorg,
gtk, ... out of the output.

Greetings,
Maxime.
Liliana Marie Prikler March 25, 2022, 8:44 a.m. UTC | #10
On Thursday, 24.03.2022 at 19:07 +0100, Maxime Devos wrote:
> zimoun wrote on Thu 24-03-2022 at 17:58 [+0100]:
> > I agree that laziness is a good thing and a good direction.
> > However, let's be pragmatic with what we have now. :-)  What is the
> > performance comparison between breaking many cycles as Maxime is
> > proposing vs using many 'module-ref' + 'resolve-interface' instead
> > of breaking them?
> 
> Currently this patch series does not improve anything much, according
> to "guix graph --type=module hello | wc --lines".  I'm now
> introducing some module-ref+resolve-interface --- it's very
> convenient, but I'm not yet at a significant result.
For the record, my suggestion to declare lazily loaded modules near the
top is based on the fact that Maxime's current patch set uses them to
break up cycles in a manner that also requires a comment in the
define-module clause for the sake of clarity.  As a nice side effect, it makes
it so that two-liners in the inputs field become one-liners.

The question is (on a per-module basis) whether we consider this cheat
fine or whether we want to move things into different files (and
which).  I so far haven't heard a good argument for the case of
audacity I raised.  "It breaks cycles" is not good enough when we
consider the potential existence of other cuts (e.g. "audio-apps",
although perhaps a more specific "audio-editors" similar to how we have
"image-viewers" might make more sense), as well as the cheat of lazy
imports.

simon, you raise some important performance metrics, but there is such
a thing as optimizing for the wrong metric.  There are other variables
to consider, like time to grep, "does it make sense that X belongs to Y
and Z doesn't", etc., when it comes to ease of contributing.  Declaring
some modules banned for a given other module has an adverse effect
here, in my opinion, and thus I claim that we need easily accessible
ways of using those supposedly banned modules.

Btw. regarding style, I think declaring a function @PACKAGE_MODULE,
i.e. a literal '@ followed by the last symbol in the module's name,
would be the easiest, as one could read (@PACKAGE_MODULE arg) as a
shorthand for (@ (gnu packages PACKAGE_MODULE) arg).  Somewhat
off-topic, what's the rationale behind not using @ syntax?  Does @ have
different semantics from resolve-interface + module-ref?

Cheers
Liliana Marie Prikler March 25, 2022, 8:51 a.m. UTC | #11
On Thursday, 24.03.2022 at 13:05 -0400, Leo Famulari wrote:
> On Thu, Mar 24, 2022 at 08:22:09AM +0100, Liliana Marie Prikler
> wrote:
> > I agree that breaking up cycles is a good thing, but I disagree
> > with some of the decisions you've made here.  For instance, I
> > oppose the use of single-package modules, because those more often
> > than not simply clutter the file system.
> 
> The file system can hold many files. The Guix codebase is nowhere
> near the limit...
That's not a good argument, though.  Theoretically, you could have one
file per package, or even one folder per package as the Gentoo folks
do.  Clearly, this is not an appeal to file system limits, but to
limitations of humans and their tools.  Have you tried exploring
/gnu/store?  Merely loading it takes a considerable amount of time.

Cheers
M March 25, 2022, 10:26 a.m. UTC | #12
Liliana Marie Prikler wrote on Thu 24-03-2022 at 16:38 [+0100]:
> On Thursday, 24.03.2022 at 16:05 +0100, Maxime Devos wrote:
> > Liliana Marie Prikler wrote on Thu 24-03-2022 at 08:22 [+0100]:
> > > I agree that breaking up cycles is a good thing, but I disagree
> > > with some of the decisions you've made here.  For instance, I
> > > oppose the use of single-package modules, because those more often
> > > than not simply clutter the file system.
> > 
> > There are some other sound applications in (gnu packages audio)
> > and (gnu packages music), so maybe I can make a (gnu packages audio-
> > apps) module where 'audacity' and other applications like 'calf' can
> > reside?
> I'm not sure.  IIUC, audio should be for audio systems, codecs, etc.
> whereas music sounds like a particular niche containing music players
> etc.
> [...]

I'm not sure where the ‘I'm not sure’ comes from --- audacity is not an
audio system like pulseaudio or alsa, not a codec implementation like
libvorbis, and it is a sound player (and editor), so IIUC, audacity
does not fit into (gnu packages audio).

While it can be used for modifying and playing music, it is more
general than that, hence the suggestion of (gnu packages audio-apps)
(where some other packages can be moved to as well, maybe 'gnaural'?).

Though granted, it's difficult to make a strict distinction between
audio and music.

>   Perhaps the cycle could more appropriately be broken by moving
> stuff from music to audio?

(gnu packages music) is full of applications using gtk+ or gles or the
like.  Moving them to (gnu packages audio) would make (gnu packages
audio) depend on (gnu packages gtk) and friends, which seems counter-
productive to me.

Greetings,
Maxime.
Liliana Marie Prikler March 25, 2022, 11:47 a.m. UTC | #13
On Friday, 25.03.2022 at 11:26 +0100, Maxime Devos wrote:
> Liliana Marie Prikler wrote on Thu 24-03-2022 at 16:38 [+0100]:
> > On Thursday, 24.03.2022 at 16:05 +0100, Maxime Devos wrote:
> > 
> > > Liliana Marie Prikler wrote on Thu 24-03-2022 at 08:22 [+0100]:
> > > > I agree that breaking up cycles is a good thing, but I disagree
> > > > with some of the decisions you've made here.  For instance, I
> > > > oppose the use of single-package modules, because those more
> > > > often than not simply clutter the file system.
> > > 
> > > There are some other sound applications in (gnu packages audio)
> > > and (gnu packages music), so maybe I can make a (gnu packages
> > > audio-apps) module where 'audacity' and other applications like
> > > 'calf' can reside?
> > I'm not sure.  IIUC, audio should be for audio systems, codecs,
> > etc. whereas music sounds like a particular niche containing music
> > players etc.
> > [...]
> 
> I'm not sure where the ‘I'm not sure’ comes from --- audacity is not
> an audio system like pulseaudio or alsa, not a codec implementation
> like libvorbis, and it is a sound player (and editor), so IIUC,
> audacity does not fit into (gnu packages audio).
It's not particularly specific to audacity in this case, it's that I
think that "-app" does not make for a useful distinction.  Consider
fluidsynth.  Is it an app, a library, something else?  If fluidsynth
was causing circular imports and moving it to audio-synthesizers fixed
things, that'd be fine by me.  gnaural could also fit into
synthesizers.  However, putting both in the same file might still be an
issue from a cycle perspective, because the latter needs gtk, whereas
the former is content with having just glib (although that one appears
benign on the surface, as at the very least gtk does not import audio directly).

> While it can be used for modifying and playing music, it is more
> general than that, hence the suggestion of (gnu packages audio-apps)
> (where some other packages can be moved to as well, maybe
> 'gnaural'?).
> 
> Though granted, it's difficult to make a strict distinction between
> audio and music.
> 
> >   Perhaps the cycle could more appropriately be broken by moving
> > stuff from music to audio?
> 
> (gnu packages music) is full of applications using gtk+ or gles or
> the like.  Moving them to (gnu packages audio) would make (gnu
> packages audio) depend on (gnu packages gtk) and friends, which seems
> counter-productive to me.
That's the status quo (through gnaural and audacity for example).  To
make a more educated guess, which cycle do we aim to address here?  Is
there a meaningful cut that can be made (e.g. the offending packages
are all "audio editors", "audio synthesizers", etc.) or do we have to
separate good and bad based on their inputs?

Cheers
M March 25, 2022, 2:12 p.m. UTC | #14
Liliana Marie Prikler wrote on Fri 25-03-2022 at 12:47 [+0100]:
> That's the status quo (through gnaural and audacity for example).  To
> make a more educated guess, which cycle do we aim to address here?  Is
> there a meaningful cut that can be made (e.g. the offending packages
> are all "audio editors", "audio synthesizers", etc.) or do we have to
> separate good and bad based on their inputs?

Some cycles:

(gnu packages pulseaudio) -> (gnu packages audio) -> (gnu packages gtk)
/(gnu packages qt)-> (gnu packages pulseaudio) + the world

(gnu packages pulseaudio) -> (gnu packages audio) -> (gnu packages
webkit) -> (gnu packages gstreamer) + the world -> (gnu packages
pulseaudio) + the world

Suggested cut: audio libraries like flac, libogg, libvorbis, opus,
wildmidi, vo-aacenc, tinyalsa ... can go in (gnu packages audio), other
things go somewhere else, especially if they need expensive imports.
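
To make the first cycle concrete, here is a rough sketch of how such a
cycle can be found mechanically.  It is not how "guix graph" works; the
alist below is a hypothetical, heavily simplified import graph whose edges
loosely follow the first cycle above, and find-cycle is a made-up helper.
It uses a plain depth-first walk without a visited set, which is fine for
toy graphs but would be slow on the real module graph.

```scheme
(use-modules (srfi srfi-1))

;; Hypothetical import graph: each entry maps a module to the modules it
;; imports.  The edges are illustrative, not extracted from Guix.
(define imports
  '((pulseaudio . (audio))
    (audio      . (gtk))
    (gtk        . (pulseaudio glib))
    (glib       . ())))

(define (find-cycle graph start)
  "Return a list of modules forming an import cycle reachable from START
in GRAPH, or #f if there is none."
  (let visit ((node start) (path '()))
    (if (memq node path)
        ;; NODE is already on the current path: a cycle is closed.
        ;; PATH lists ancestors most recent first, so reverse it.
        (cons node
              (reverse (cons node
                             (take-while (lambda (n) (not (eq? n node)))
                                         path))))
        ;; Otherwise try each import in turn; `any' stops at the first hit.
        (any (lambda (next) (visit next (cons node path)))
             (or (assq-ref graph node) '())))))

(find-cycle imports 'pulseaudio) ; => (pulseaudio audio gtk pulseaudio)
(find-cycle imports 'glib)       ; => #f
```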

Greetings,
Maxime.
Liliana Marie Prikler March 25, 2022, 2:27 p.m. UTC | #15
On Friday, 25.03.2022 at 15:12 +0100, Maxime Devos wrote:
> Liliana Marie Prikler wrote on Fri 25-03-2022 at 12:47 [+0100]:
> > That's the status quo (through gnaural and audacity for example). 
> > To make a more educated guess, which cycle do we aim to address
> > here? 
> > Is there a meaningful cut that can be made (e.g. the offending
> > packages are all "audio editors", "audio synthesizers", etc.) or do
> > we have to separate good and bad based on their inputs?
> 
> Some cycles:
> 
> (gnu packages pulseaudio) -> (gnu packages audio) -> (gnu packages
> gtk)
> /(gnu packages qt)-> (gnu packages pulseaudio) + the world
> 
> (gnu packages pulseaudio) -> (gnu packages audio) -> (gnu packages
> webkit) -> (gnu packages gstreamer) + the world -> (gnu packages
> pulseaudio) + the world
> 
> Suggested cut: audio libraries like flac, libogg, libvorbis, opus,
> wildmidi, vo-aacenc, tinyalsa ... can go in (gnu packages audio),
> other things go somewhere else, especially if they need expensive
> imports.
Hmm, is it all codecs?  In that case, I'd suggest making a smaller
(gnu packages audio-codecs), that can be used by
(gnu packages audio-systems) [including (tiny)alsa, pulseaudio, jack, ...]
that can be used by the rest of the world.  Having that, we could move
"the rest" into audio-xyz (or let it simply remain "audio").  Would that
be actionable?

(Note: There might still be debate w.r.t. the above split when
considering synthesizers, as they are technically not codecs, but we
still need to distinguish between low-level synths like fluidsynth and
wildmidi vs. full-on sound stations.)

(Note2: Of course, this assumes that neither audio-codecs nor
audio-systems will ever need to import any of the rust stuff.  *sigh*)


Cheers
Simon Tournier March 25, 2022, 5:05 p.m. UTC | #16
Hi Liliana and Maxime,

On Fri, 25 Mar 2022 at 09:44, Liliana Marie Prikler
<liliana.prikler@ist.tugraz.at> wrote:

> The question is (on a per-module basis) whether we consider this cheat
> fine or whether we want to move things into different files (and
> which).  I so far haven't heard a good argument for the case of
> audacity I raised.  "It breaks cycles" is not good enough when we
> consider the potential existence of other cuts (e.g. "audio-apps",
> although perhaps a more specific "audio-editors" similar to how we have
> "image-viewers" might make more sense), as well as the cheat of lazy
> imports.
>
> simon, you raise some important performance metrics, but there is such
> a thing as optimizing for the wrong metric.  There are other variables
> to consider, like time to grep, "does it make sense that X belongs to Y
> and Z doesn't", etc., when it comes to ease of contributing.  Declaring
> some modules banned for a given other module has an adverse effect
> here, in my opinion, and thus I claim that we need easily accessible
> ways of using those supposedly banned modules.

To be honest, I am not sure I understand the aim of reorganizing the
modules... I mean, to me, the only important metric is the
performance for the end-user.  If there is no performance improvement
when cutting cycles, then it appears to me pointless to cut cycles. :-)

Moreover, setting an arbitrary boundary between packages is... arbitrary.
You can spend close to an eternity discussing "does it make sense
that X belongs to Y and Z doesn't".  To me, such an activity is like
"tagging" (assigning a specific word belonging to a finite set of words):
it is usually a lot of effort and energy for, in the end, few, if
any, pragmatic outcomes.

Last, classification (assigning a package to one module depending on
its affinity with the other packages of that module) could be
almost arbitrary (manual, depending on human choice) as it is now, or it
could be self-organized depending on the data themselves.  From my point
of view, it could be interesting to apply some kind of self-organizing
map (SOM) and other related things.  It could help with many other
issues such as "search".  Pointers for what they are worth:

https://lists.gnu.org/archive/html/guix-devel/2019-07/msg00252.html
https://lists.gnu.org/archive/html/guix-devel/2019-12/msg00160.html


Cheers,
simon
M March 25, 2022, 5:46 p.m. UTC | #17
zimoun wrote on Fri 25-03-2022 at 18:05 [+0100]:
> To be honest, I am not sure I understand the aim of reorganizing the
> modules... I mean, to me, the only important metric is the
> performance for the end-user.  If there is no performance improvement
> when cutting cycles, then it appears to me pointless to cut cycles. :-)

FWIW, there are three goals here:

  * Allowing writing stuff like

    (use-modules (gnu packages ncurses))
    (package
      (name "some-terminal-app")
      [...]
      ;; Work-around the ‘search path of dependencies not propagated’ bug.
      (native-search-paths (package-native-search-paths ncurses)))

    without getting 'unbound variable' errors.
    Alternatively, the search-path-specification could be defined in,
    say, (guix search-paths) next to $PATH but that proposal seems
    to have been rejected.

  * Making "compute-guix-derivation" faster by reducing the number of
    (uncompiled!) package module files it needs to load.
    
  * Eventually making the ‘incremental compilation’ by fine-grained derivations
    proposal from the ‘Faster "guix pull" by incremental compilation and
    non-circular modules?’ thread [0] feasible.

As a side benefit, it makes the output of "guix graph --type=module foo"
less cluttered and presumably reduces the module closure (a likely
side-effect of breaking cycles), making "guix show foo" (*) a bit faster.

(*) FWIW, on my machine, "guix show guix" takes 1.6s.

[0]: <https://lists.nongnu.org/archive/html/guix-devel/2022-02/msg00193.html>

Greetings,
Maxime.
Simon Tournier March 25, 2022, 7:33 p.m. UTC | #18
Hi Maxime,

On Fri, 25 Mar 2022 at 18:46, Maxime Devos <maximedevos@telenet.be> wrote:
> zimoun wrote on Fri 25-03-2022 at 18:05 [+0100]:

> > To be honest, I am not sure I understand the aim of reorganizing the
> > modules... I mean, to me, the only important metric is the
> > performance for the end-user.  If there is no performance improvement
> > when cutting cycles, then it appears to me pointless to cut cycles. :-)
>
> FWIW, there are three goals here:
>
>   * Allowing writing stuff like
>
>     (use-modules (gnu packages ncurses))
>     (package
>       (name "some-terminal-app")
>       [...]
>       ;; Work-around the ‘search path of dependencies not propagated’ bug.
>       (native-search-paths (package-native-search-paths ncurses)))
>
>     without getting 'unbound variable' errors.
>     Alternatively, the search-path-specification could be defined in,
>     say, (guix search-paths) next to $PATH but that proposal seems
>     to have been rejected.
>
>   * Making "compute-guix-derivation" faster by reducing the number of
>     (uncompiled!) package module files it needs to load.
>
>   * Eventually making the ‘incremental compilation’ by fine-grained derivations
>     proposal from the ‘Faster "guix pull" by incremental compilation and
>     non-circular modules?’ thread [0] feasible.

Thanks for the detailed and clear explanations.  It was my initial
understanding that cutting cycles can improve performance, and
IMHO, timings are required for comparing apples to apples, as I tried to
explain [1].  Then the thread left me with the impression that
performance improvement was not the aim -- thanks for clarifying.

1: <https://issues.guix.gnu.org/54539#12>

> (*) FWIW, on my machine, "guix show guix" takes 1.6s.

To be precise, "guix show guix" could be drastically improved by
adapting the already existing package.cache, i.e., resuming the lengthy
work of <https://issues.guix.gnu.org/39258>. :-)

However, such a cache would be useless for "guix show -L path/to/others
foo", where performance can be really poor, especially on a spinning hard
disk.  Well, thanks for working on that by trying to tackle the module
cycles.


Cheers,
simon
Ludovic Courtès April 19, 2022, 9:17 a.m. UTC | #19
Hi Maxime,

Maxime Devos <maximedevos@telenet.be> wrote:

> Import cycles make some packaging things harder and prevent some
> proposed optimisations to "guix pull", let's start eliminating them.
> TBC ...

Sorry for the late reply.

Some of the changes you propose may make sense (and should be applied),
but we shouldn’t overplay the role of such changes.

If you follow the logic, breaking up import cycles would mean, in the
end, having one file per package.

But would that be enough?  Probably not, because low-level packages are
bound to depend on high-level packages—e.g., glibc depends on Python,
some other low-level tool might depend on Pandoc (GHC), librsvg depends
on Rust, and so on.

IOW, since the graph of build dependency really is a graph, and not a
tree, there’ll always be import cycles.

(guix self), the module that ‘guix pull’ uses, already automatically
splits package modules into two groups.  It’s not as modular as we’d
like, but it’s a start.  What would be useful is to come up with metrics
and tools to reduce the closure of the “guix-packages-base” group.

WDYT?

Thanks,
Ludo’.
M April 19, 2022, 9:40 a.m. UTC | #20
Ludovic Courtès wrote on Tue 19-04-2022 at 11:17 [+0200]:
> If you follow the logic, breaking up import cycles would mean, in the
> end, having one file per package.

Not necessarily, (gnu packages minetest) has multiple packages
(minetest and some of its mods) but it doesn't cause any cycles (no
other module, except sort-of (guix build-system minetest), imports it.)

That one appears to be, at least currently, a bit of a special case
though.

> But would that be enough?  Probably not, because low-level packages
> are bound to depend on high-level packages—e.g., glibc depends on
> Python, some other low-level tool might depend on Pandoc (GHC),
> librsvg depends on Rust, and so on.
>
> IOW, since the graph of build dependency really is a graph, and not a
> tree, there’ll always be import cycles.

The graph of build dependencies (in terms of derivations) is a tree,
the build daemon doesn't allow cyclic derivations.  So I think that we
could let the module graph be a coarser version of the derivation graph
but still a tree (except for the bootstrap packages gcc, sed, ... whose
modules may import each other).

> (guix self), the module that ‘guix pull’ uses, already automatically
> splits package modules into two groups.  It’s not as modular as we’d
> like, but it’s a start.  What would be useful is to come up with metrics
> and tools to reduce the closure of the “guix-packages-base” group.
> 
> WDYT?

Maybe:

  a tool that determines a minimal set of (importing module ->
  imported module tuples) that needs to be lazified to reduce the
  closure size (in number of modules) in guix-packages-base by N

and:

  extend "guix style" to perform these changes

Maybe the ‘number of imports lazified -> number of modules in guix-
packages-base’ function has some sweet spot somewhere.
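
As a rough sketch of the metric such a tool would need: the closure of a
module (everything it transitively imports), computed while skipping the
import edges that have been made lazy.  The graph and the procedure name
closure are hypothetical, and the edges are made up for illustration
rather than taken from Guix.

```scheme
(use-modules (srfi srfi-1))

;; Made-up import graph: module -> modules it imports eagerly.
(define imports
  '((hello . (base gtk))
    (base  . ())
    (gtk   . (xorg))
    (xorg  . ())))

(define (closure graph roots lazy-edges)
  "Return the modules reachable from ROOTS in GRAPH, skipping every
(importer . imported) pair listed in LAZY-EDGES."
  (let loop ((todo roots) (seen '()))
    (cond ((null? todo) (reverse seen))
          ((memq (car todo) seen) (loop (cdr todo) seen))
          (else
           (let* ((node (car todo))
                  ;; Keep only the imports that are not lazified.
                  (eager (remove (lambda (next)
                                   (member (cons node next) lazy-edges))
                                 (or (assq-ref graph node) '()))))
             (loop (append eager (cdr todo)) (cons node seen)))))))

(closure imports '(hello) '())              ; => (hello base gtk xorg)
(closure imports '(hello) '((hello . gtk))) ; => (hello base)
```

In this toy graph, lazifying the single edge hello -> gtk halves the
closure; a real tool would search for the smallest set of such edges.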

I think it would be easier though to work our way up before going to
"guix pull" -- first "hello", then "util-linux", then "guile-avahi",
then "guile-ssh", then "sqlite" ... and only eventually guix itself.

Also, even if the closure of "guix-packages-base" cannot be reduced,
making it (mostly) a tree would allow splitting the group into multiple
parts (see ‘Faster "guix pull" by incremental compilation and non-
circular modules?’).

Alternative:

  * make _all_ package module imports lazy -- #:autoload everything!

    guix-packages-base might then need to be set manually though ...
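
For reference, a minimal sketch of what an autoloaded import looks like in
Guile.  The module name (example lazy-imports) is hypothetical and
(srfi srfi-1) stands in for a package module, so that the sketch runs
without Guix.

```scheme
;; Hypothetical module; (srfi srfi-1) stands in for a package module.
(define-module (example lazy-imports)
  ;; Loading (srfi srfi-1) is deferred until `fold' is first referenced.
  #:autoload (srfi srfi-1) (fold)
  #:export (sum))

(define (sum lst)
  ;; The first call here is what triggers the autoload.
  (fold + 0 lst))

(sum '(1 2 3)) ; => 6
```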

Greetings,
Maxime.
M April 19, 2022, 3:31 p.m. UTC | #21
Ludovic Courtès wrote on Tue 19-04-2022 at 11:17 [+0200]:
> (guix self), the module that ‘guix pull’ uses, already automatically
> splits package modules into two groups.  It’s not as modular as we’d
> like, but it’s a start.  What would be useful is to come up with metrics
> and tools to reduce the closure of the “guix-packages-base” group.
> 
> WDYT?

Maybe as a first step, "guix style" could be taught to trim unused
imports?  When writing the patches it turned out that some imports were
unnecessary and could therefore be removed ...

Greetings,
Maxime.
Ludovic Courtès April 27, 2022, 8:59 p.m. UTC | #22
Hi!

Maxime Devos <maximedevos@telenet.be> wrote:

> Ludovic Courtès wrote on Tue 19-04-2022 at 11:17 [+0200]:
>> (guix self), the module that ‘guix pull’ uses, already automatically
>> splits package modules into two groups.  It’s not as modular as we’d
>> like, but it’s a start.  What would be useful is to come up with metrics
>> and tools to reduce the closure of the “guix-packages-base” group.
>> 
>> WDYT?
>
> Maybe as a first step, "guix style" could be taught to trim unused
> imports?

Yes, that would be nice.

Ludo’.
Ludovic Courtès April 27, 2022, 9:04 p.m. UTC | #23
Maxime Devos <maximedevos@telenet.be> wrote:

> Ludovic Courtès wrote on Tue 19-04-2022 at 11:17 [+0200]:
>> If you follow the logic, breaking up import cycles would mean, in the
>> end, having one file per package.
>
> Not necessarily, (gnu packages minetest) has multiple packages
> (minetest and some of its mods) but it doesn't cause any cycles (no
> other module, except sort-of (guix build-system minetest), imports it.)
>
> That one appears to be, at least currently, a bit of a special case
> though.

I think so.  All the historical package modules started that way.

>> But would that be enough?  Probably not, because low-level packages
>> are bound to depend on high-level packages—e.g., glibc depends on
>> Python, some other low-level tool might depend on Pandoc (GHC),
>> librsvg depends on Rust, and so on.
>>
>> IOW, since the graph of build dependency really is a graph, and not a
>> tree, there’ll always be import cycles.
>
> The graph of build dependencies (in terms of derivations) is a tree,

It’s a directed acyclic graph (DAG), not a tree.
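
A small sketch of the difference, using made-up dependency edges (not
real Guix packages): in a tree every node has at most one dependent,
whereas in a DAG a node may be shared by several dependents without any
cycle being involved.  The names edges, in-degree and shared are
hypothetical.

```scheme
(use-modules (srfi srfi-1))

;; Made-up edges, each (dependent . dependency).  `libc' is shared by
;; two dependents, so the graph is acyclic but not a tree.
(define edges
  '((app . libfoo) (app . libbar) (libfoo . libc) (libbar . libc)))

(define (in-degree node)
  "Return the number of edges pointing at NODE."
  (count (lambda (edge) (eq? (cdr edge) node)) edges))

;; Nodes with more than one dependent; a tree would have none.
(define shared
  (filter (lambda (node) (> (in-degree node) 1))
          (delete-duplicates (map cdr edges))))

shared ; => (libc)
```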

> the build daemon doesn't allow cyclic derivations.  So I think that we
> could let the module graph be a coarser version of the derivation graph
> but still a tree (except for the bootstrap packages gcc, sed, ... whose
> modules may import each other).

I thought so, but came to the conclusion that it’s hardly feasible in
practice.

>> (guix self), the module that ‘guix pull’ uses, already automatically
>> splits package modules into two groups.  It’s not as modular as we’d
>> like, but it’s a start.  What would be useful is to come up with metrics
>> and tools to reduce the closure of the “guix-packages-base” group.
>> 
>> WDYT?
>
> Maybe:
>
>   a tool that determines a minimal set of (importing module ->
>   imported module tuples) that needs to be lazified to reduce the
>   closure size (in number of modules) in guix-packages-base by N

Currently ‘source-module-closure’ considers #:autoloaded modules as part
of the closure; we could change that though and indeed, that might prove
helpful in this case.

> and:
>
>   extend "guix style" to perform these changes
>
> Maybe the ‘number of imports lazified -> number of modules in guix-
> packages-base’ function has some sweet spot somewhere.

Could be.

> I think it would be easier though to work our way up before going to
> "guix pull" -- first "hello", then "util-linux", then "guile-avahi",
> then "guile-ssh", then "sqlite" ... and only eventually guix itself.
>
> Also, even if the closure of "guix-packages-base" cannot be reduced,
> making it (mostly) a tree would allow splitting the group into multiple
> parts (see ‘Faster "guix pull" by incremental compilation and non-
> circular modules?’).
>
> Alternative:
>
>   * make _all_ package module imports lazy -- #:autoload everything!
>
>     guix-packages-base might then need to be set manually though ...

I don’t know, having spent some time on this, I feel like there’s no
easy solution.  But it could be that using autoloads at least in the
right places would help shrink ‘guix-packages-base’.  Worth a try!

Ludo’.