diff mbox series

[bug#50755,v3] import: Generate list of importers based on available modules

Message ID 87mtnxswwq.fsf@airmail.cc
State New
Headers show
Series [bug#50755,v3] import: Generate list of importers based on available modules | expand

Checks

Context Check Description
cbaines/comparison success View comparision
cbaines/git branch success View Git branch
cbaines/applying patch success View Laminar job
cbaines/issue success View issue
cbaines/comparison success View comparision
cbaines/git branch success View Git branch
cbaines/applying patch success View Laminar job
cbaines/issue success View issue
cbaines/comparison success View comparision
cbaines/git branch success View Git branch
cbaines/applying patch success View Laminar job
cbaines/issue success View issue

Commit Message

pinoaffe Sept. 27, 2021, 6:20 p.m. UTC
User-agent: mu4e 1.4.15; emacs 27.2
* guix/scripts/import.scm (importers): Generate a list of all importers by
looping over available guile modules, allowing for extensibility.
---
 guix/scripts/import.scm | 13 ++++++++++---
 1 file changed, 10 insertions(+), 3 deletions(-)

Comments

Simon Tournier Sept. 27, 2021, 8:09 p.m. UTC | #1
Hi,

On Mon, 27 Sept 2021 at 20:21, pinoaffe <pinoaffe@airmail.cc> wrote:

> +(define importers (delete-duplicates

This fixes my first point...

> +                   (filter-map (lambda (module)
> +                                 (match (module-name module)
> +                                   (`(guix scripts import ,importer)
> +                                    (symbol->string importer))
> +                                   ( #t #f)))
> +                               (all-modules (map (lambda (entry)
> +                                                   `(,entry . "guix/scripts/import"))
> +                                                 %load-path)))))

...and it means it is walking more than needed.  Therefore, what is
the performance loss?

For instance, on my machine and hot cache, it is 4x slower.  And, this
readibility improvement is not worth, IMHO.
On cold cache, I do not have meaningful numbers because it requires to
run it several times and then compute an average.  What are the
numbers of your machine?

All the best,
simon
M Sept. 28, 2021, 9:51 a.m. UTC | #2
zimoun schreef op ma 27-09-2021 om 22:09 [+0200]:
> Hi,
> 
> On Mon, 27 Sept 2021 at 20:21, pinoaffe <pinoaffe@airmail.cc> wrote:
> 
> > +(define importers (delete-duplicates
> 
> This fixes my first point...
> 
> > +                   (filter-map (lambda (module)
> > +                                 (match (module-name module)
> > +                                   (`(guix scripts import ,importer)
> > +                                    (symbol->string importer))
> > +                                   ( #t #f)))
> > +                               (all-modules (map (lambda (entry)
> > +                                                   `(,entry . "guix/scripts/import"))
> > +                                                 %load-path)))))
> 
> ...and it means it is walking more than needed.  Therefore, what is
> the performance loss?
> 
> For instance, on my machine and hot cache, it is 4x slower.

FWIW, calling ./pre-inst-env guix ... will always be slower than guix ...,
because the former will have a longer %load-path (IIRC) and possibly
other reasons.

Using "guix pull --profile=..." "time .../guix import ..." might be a better test.

To only measure the time required for defiing 'importers', wrap delete-duplicates
in a call to 'time' from (ice-9 time).

Greetings,
Maxime.
pinoaffe Sept. 28, 2021, 2:39 p.m. UTC | #3
Hi, 

Maxime Devos writes:
> To only measure the time required for defiing 'importers', wrap
> delete-duplicates in a call to 'time' from (ice-9 time).

Running

(time (for-each (lambda (_)
                  (delete-duplicates (filter-map (lambda (module)
                    (match (module-name module)
                           (`(guix scripts import ,importer)
                            (symbol->string importer))
                            (#t #f)))
                    (all-modules (map (lambda (entry)
                                   `(,entry . "guix/scripts/import"))
                                   %load-path)))))
                (iota 1000)))

in a guix repl on my system results in

clock utime stime cutime cstime gctime
 0.96  1.67  0.07   0.00   0.00   1.19

If I'm interpreting that correctly that would amount to a couple of
thousands of a second per run

Kind regards,
pinoaffe
M Sept. 29, 2021, 8:59 p.m. UTC | #4
pinoaffe schreef op di 28-09-2021 om 16:39 [+0200]:
> Hi, 
> 
> Maxime Devos writes:
> > To only measure the time required for defiing 'importers', wrap
> > delete-duplicates in a call to 'time' from (ice-9 time).
> 
> Running
> 
> (time (for-each (lambda (_)
>                   (delete-duplicates (filter-map (lambda (module)
>                     (match (module-name module)
>                            (`(guix scripts import ,importer)
>                             (symbol->string importer))
>                             (#t #f)))
>                     (all-modules (map (lambda (entry)
>                                    `(,entry . "guix/scripts/import"))
>                                    %load-path)))))
>                 (iota 1000)))
> 
> in a guix repl on my system results in
> 
> clock utime stime cutime cstime gctime
>  0.96  1.67  0.07   0.00   0.00   1.19
> 
> If I'm interpreting that correctly that would amount to a couple of
> thousands of a second per run

These numbers turn out to be misleading, because 'scheme-modules'
(indirectly called from all-modules) calls 'resolve-interface' on every module
name.  For a module name, the first 'resolve-module' incurs disk I/O and some
CPU for loading the module, but the second 'resolve-module' on the same module
name would be free, as the module is already loaded.

Greetings,
Maxime
pinoaffe Sept. 30, 2021, 8:17 a.m. UTC | #5
Maxime Devos writes:
> These numbers turn out to be misleading, because 'scheme-modules'
> (indirectly called from all-modules) calls 'resolve-interface' on every module
> name.  For a module name, the first 'resolve-module' incurs disk I/O and some
> CPU for loading the module, but the second 'resolve-module' on the same module
> name would be free, as the module is already loaded.
okay, the first incantation of 

(time (for-each (lambda (_)
                   (delete-duplicates (filter-map (lambda (module)
                     (match (module-name module)
                            (`(guix scripts import ,importer)
                             (symbol->string importer))
                             (#t #f)))
                     (all-modules (map (lambda (entry)
                                    `(,entry . "guix/scripts/import"))
                                    %load-path)))))
                 (iota 1)))

on a "fresh" guix repl on my system results in 

clock utime stime cutime cstime gctime
 1.28  0.76  0.13   0.00   0.00   0.16

which is indeed a significant amount of time, though I don't think it'd
be much of an issue considering that it's not likely that users will run
lots of `guix import` shell commands in rapid succession.

kind regards,
pinoaffe
M Sept. 30, 2021, 8:37 a.m. UTC | #6
pinoaffe schreef op do 30-09-2021 om 10:17 [+0200]:
> Maxime Devos writes:
> > These numbers turn out to be misleading, because 'scheme-modules'
> > (indirectly called from all-modules) calls 'resolve-interface' on every module
> > name.  For a module name, the first 'resolve-module' incurs disk I/O and some
> > CPU for loading the module, but the second 'resolve-module' on the same module
> > name would be free, as the module is already loaded.
> okay, the first incantation of 
> 
> (time (for-each (lambda (_)
>                    (delete-duplicates (filter-map (lambda (module)
>                      (match (module-name module)
>                             (`(guix scripts import ,importer)
>                              (symbol->string importer))
>                              (#t #f)))
>                      (all-modules (map (lambda (entry)
>                                     `(,entry . "guix/scripts/import"))
>                                     %load-path)))))
>                  (iota 1)))
> 
> on a "fresh" guix repl on my system results in 
> 
> clock utime stime cutime cstime gctime
>  1.28  0.76  0.13   0.00   0.00   0.16

On my fresh guix repl, it's a bit longer:

clock utime stime cutime cstime gctime
 9.54  1.79  0.31   0.00   0.00   0.53

(9 or 10 seconds)

If I restart the guix repl and run it again,
I get about half a second:

clock utime stime cutime cstime gctime
 0.47  0.57  0.02   0.00   0.00   0.19

> which is indeed a significant amount of time, though I don't think it'd
> be much of an issue considering that it's not likely that users will run
> lots of `guix import` shell commands in rapid succession.

The list of importers is only needed for two purposes, right?

   1. to print a list of importers when "guix import --help" is run
   2. to verify the string actually specifies an importer

Then 'guix import SOME-IMPORTER STUFF' could be optimised:

reolve-importer and guix-import could be modified to skip the validation
step and let resolve-importer print the error if the module couldn't be
found.  Possibly (resolve-module '(the possibly undefined module) #:ensure #f)
might be useful. Then 'importers' would only be required for purpose (1),
so it could be wrapped in a promise, such that if "guix import some-importer stuff"
is called, only the required importer module is loaded.

Greetings,
Maxime.
Simon Tournier Oct. 11, 2021, 11:51 a.m. UTC | #7
Hi,

On Thu, 30 Sept 2021 at 10:37, Maxime Devos <maximedevos@telenet.be> wrote:

> The list of importers is only needed for two purposes, right?
>
>    1. to print a list of importers when "guix import --help" is run
>    2. to verify the string actually specifies an importer
>
> Then 'guix import SOME-IMPORTER STUFF' could be optimised:
>
> reolve-importer and guix-import could be modified to skip the validation
> step and let resolve-importer print the error if the module couldn't be
> found.  Possibly (resolve-module '(the possibly undefined module) #:ensure #f)
> might be useful. Then 'importers' would only be required for purpose (1),
> so it could be wrapped in a promise, such that if "guix import some-importer stuff"
> is called, only the required importer module is loaded.

My comment is about the elegance vs the performance loss.  On old
machines, Guix is becoming unpractical for many operations (almost all
the operations indeed) and I would not add another slowness.  I am
fine to sacrifice some performances if it is worth.  However, the
balance is always: what is gain and what is loss?  Here the gain is
small code elegance against the performance lost for end-user.  The
question: does it worth?  From my point of view, no, this change is
not worth.  For what my opinion is worth here. ;-)

Cheers,
simon
diff mbox series

Patch

diff --git a/guix/scripts/import.scm b/guix/scripts/import.scm
index 40fa6759ae..71ee8cc00b 100644
--- a/guix/scripts/import.scm
+++ b/guix/scripts/import.scm
@@ -23,6 +23,7 @@ 
 
 (define-module (guix scripts import)
   #:use-module (guix ui)
+  #:use-module (guix discovery)
   #:use-module (guix scripts)
   #:use-module (guix utils)
   #:use-module (srfi srfi-1)
@@ -78,9 +79,15 @@  rather than \\n."
 ;;; Entry point.
 ;;;
 
-(define importers '("gnu" "pypi" "cpan" "hackage" "stackage" "egg" "elpa"
-                    "gem" "go" "cran" "crate" "texlive" "json" "opam"
-                    "minetest"))
+(define importers (delete-duplicates
+                   (filter-map (lambda (module)
+                                 (match (module-name module)
+                                   (`(guix scripts import ,importer)
+                                    (symbol->string importer))
+                                   ( #t #f)))
+                               (all-modules (map (lambda (entry)
+                                                   `(,entry . "guix/scripts/import"))
+                                                 %load-path)))))
 
 (define (resolve-importer name)
   (let ((module (resolve-interface