diff mbox series

[bug#51838,v2,05/26] guix: node-build-system: Add #:absent-dependencies argument.

Message ID 20211120043406.952350-5-philip@philipmcgrath.com
State Accepted
Headers show
Series [bug#51838,v2,01/26] gnu: node: Avoid duplicating build phases. | expand

Checks

Context Check Description
cbaines/comparison success View comparision
cbaines/git branch success View Git branch
cbaines/applying patch success View Laminar job
cbaines/issue success View issue

Commit Message

Philip McGrath Nov. 20, 2021, 4:33 a.m. UTC
Many of Guix's Node.js packages are built without some of the
dependencies they specify in their "package-lock.json" files,
either because we don't have them packaged yet (e.g. test
utilities) or because we don't want them (e.g. to reduce the
closure size). Previously, Guix package definitions would work
around this situation by deleting the `'configure`
phase (i.e. the initial `npm install`).

This commit adds an optional #:absent-dependencies argument to
`node-build-system` to list Node.js packages that should be
removed from the "package.json" file.Retaining the `'configure`
phase avoids skipping checks for the dependencies that are
intended to be present and other actions performed by `npm
install`, such as automatically building native add-ons with
`node-gyp` when the "gypfile" key is present.

* guix/build-system/node.scm (lower, node-build): Add optional
argument #:absent-dependencies with default of ''(). Pass it on
to the build-side code.
* guix/build/node-build-system.scm (patch-dependencies): Respect
the #:absent-dependencies argument, removing specified npm
packages from the "dependencies" or "devDependencies" tables
in "package.json". Also, strictly follow the linearity rules
for `assoc-set!` and friends.
---
 guix/build-system/node.scm       |  3 ++
 guix/build/node-build-system.scm | 55 ++++++++++++++++++++++----------
 2 files changed, 41 insertions(+), 17 deletions(-)

Comments

Liliana Marie Prikler Nov. 20, 2021, 7:41 a.m. UTC | #1
Hi,

Am Freitag, den 19.11.2021, 23:33 -0500 schrieb Philip McGrath:
> Many of Guix's Node.js packages are built without some of the
> dependencies they specify in their "package-lock.json" files,
> either because we don't have them packaged yet (e.g. test
> utilities) or because we don't want them (e.g. to reduce the
> closure size). Previously, Guix package definitions would work
> around this situation by deleting the `'configure`
> phase (i.e. the initial `npm install`).
> 
> This commit adds an optional #:absent-dependencies argument to
> `node-build-system` to list Node.js packages that should be
> removed from the "package.json" file.Retaining the `'configure`
> phase avoids skipping checks for the dependencies that are
> intended to be present and other actions performed by `npm
> install`, such as automatically building native add-ons with
> `node-gyp` when the "gypfile" key is present.
> 
> [...]
This is a somewhat decent approach, but I wonder whether we could
improve this.  Does nodejs complain if a dependency exists in the code,
but is not present in the packages.json?

In the resolve-dependencies subprocedure, we could check whether we
have a matching input somewhere and only include the dependency if we
do.  WDYT?
Philip McGrath Nov. 20, 2021, 5:04 p.m. UTC | #2
Hi,

On 11/20/21 02:41, Liliana Marie Prikler wrote:
> Hi,
> 
> Am Freitag, den 19.11.2021, 23:33 -0500 schrieb Philip McGrath:
>> Many of Guix's Node.js packages are built without some of the
>> dependencies they specify in their "package-lock.json" files,
>> either because we don't have them packaged yet (e.g. test
>> utilities) or because we don't want them (e.g. to reduce the
>> closure size). Previously, Guix package definitions would work
>> around this situation by deleting the `'configure`
>> phase (i.e. the initial `npm install`).
>>
>> This commit adds an optional #:absent-dependencies argument to
>> `node-build-system` to list Node.js packages that should be
>> removed from the "package.json" file.Retaining the `'configure`
>> phase avoids skipping checks for the dependencies that are
>> intended to be present and other actions performed by `npm
>> install`, such as automatically building native add-ons with
>> `node-gyp` when the "gypfile" key is present.
>>
>> [...]
> This is a somewhat decent approach, but I wonder whether we could
> improve this.  Does nodejs complain if a dependency exists in the code,
> but is not present in the packages.json?

To the best of my understanding, npm doesn't inspect the js files at all 
by default, so no. (A package.json file might define a build script that 
does more checks.) I think a missing and undeclared dependency manifests 
itself as a runtime error when the require() function is evaluated.

> In the resolve-dependencies subprocedure, we could check whether we
> have a matching input somewhere and only include the dependency if we
> do.  WDYT?

I thought about this, but it seems to me there are two problems. First, 
I'm very unsure about this, but, in the last few days, I've gotten the 
impression there may be some "package.json" packages that don't 
correspond directly to Guix packages. I don't really understand how that 
works, but I think it may have something to do with npm packages that 
can run either on Node.js or on the web and need to use some 
functionality that's part of the Node.js core but not the web platform 
(or vice versa?). I could be wrong about that, but I've tried, 
especially in v2, to only have the build-side code do things I'm 
confident are correct in all cases.

The other issue is that deleting packages with no matching input by 
default would replicate some of the drawbacks of the current `(delete 
'configure)` approach. I think it's better to have an explicit list of 
dependencies that Guix is deleting. If eventually we package all of the 
missing dependencies for Guix, it would be much easier to find the 
packages that ought to use it. And, in the highly dynamic JavaScript 
world, I'm reluctant to give up one of the few static checks we have. If 
a missing package that really was required were automatically deleted 
from "package.json", it seems the failure mode would by a mysterious 
runtime error, potentially many steps down a dependency chain.
Timothy Sample Nov. 20, 2021, 8:04 p.m. UTC | #3
Hello,

Philip McGrath <philip@philipmcgrath.com> writes:

> On 11/20/21 02:41, Liliana Marie Prikler wrote:
>
>> In the resolve-dependencies subprocedure, we could check whether we
>> have a matching input somewhere and only include the dependency if we
>> do.  WDYT?
>
> [...]
>
> The other issue is that deleting packages with no matching input by
> default would replicate some of the drawbacks of the current `(delete 
> 'configure)` approach. I think it's better to have an explicit list of
> dependencies that Guix is deleting. If eventually we package all of
> the missing dependencies for Guix, it would be much easier to find the 
> packages that ought to use it. And, in the highly dynamic JavaScript
> world, I'm reluctant to give up one of the few static checks we
> have. If a missing package that really was required were automatically
> deleted from "package.json", it seems the failure mode would by a
> mysterious runtime error, potentially many steps down a dependency
> chain.

This is well put.  I actually experimented with a similar approach when
we updated the Node.js build system.  This is a big improvement over
deleting the configure phase, which would never scale to more than a
handful of packages.  Having a build-time check that ensures all the
developer-declared dependencies are available (save the “absent” ones)
will be very helpful when we are maintaining hundreds of JavaScript
packages.  :)

For the patch itself, it would be better to move a lot of your commit
message into a comment somewhere in the build system code.  If we had a
section in the manual for Node packages, it would go there, but alas....
I think most people would be happy to see a comment in the build system
code and be saved from having to poke around with ‘git blame’.


-- Tim
Liliana Marie Prikler Nov. 20, 2021, 8:24 p.m. UTC | #4
Hi,

Am Samstag, den 20.11.2021, 12:04 -0500 schrieb Philip McGrath:
> To the best of my understanding, npm doesn't inspect the js files at
> all by default, so no. (A package.json file might define a build
> script that does more checks.) I think a missing and undeclared
> dependency manifests itself as a runtime error when the require()
> function is evaluated.
Sheesh.

> > In the resolve-dependencies subprocedure, we could check whether we
> > have a matching input somewhere and only include the dependency if
> > we do.  WDYT?
> 
> I thought about this, but it seems to me there are two problems.
> First, I'm very unsure about this, but, in the last few days, I've
> gotten the impression there may be some "package.json" packages that
> don't correspond directly to Guix packages. I don't really understand
> how that works, but I think it may have something to do with npm
> packages that can run either on Node.js or on the web and need to use
> some functionality that's part of the Node.js core but not the web
> platform (or vice versa?). I could be wrong about that, but I've
> tried, especially in v2, to only have the build-side code do things
> I'm confident are correct in all cases.
What does "not directly correspond to Guix packages" mean here?  At the
very least Node must have a way of finding them, probably somewhere in
lib/, no?

While I did say we may want to rewrite these dependencies in
‘configure’, perhaps we can find a middle ground in which we only do so
when a given argument is there.  ant-build-system does that for
instance, so as to cope with packages that actually have no ant build
spec.  WDYT?

> The other issue is that deleting packages with no matching input by 
> default would replicate some of the drawbacks of the current
> `(delete 'configure)` approach. I think it's better to have an
> explicit list of dependencies that Guix is deleting. If eventually we
> package all of the missing dependencies for Guix, it would be much
> easier to find the packages that ought to use it. And, in the highly
> dynamic JavaScript world, I'm reluctant to give up one of the few
> static checks we have. If a missing package that really was required
> were automatically deleted from "package.json", it seems the failure
> mode would by a mysterious runtime error, potentially many steps down
> a dependency chain.
I get where you're coming from, but OTOH if you were to change parts of
the package to suit your own needs – like adding or omitting some
inputs, you now have two places to change, and they're in a tricky
relationship with another.  I think this is setting up some very error-
prone boilerplate.

I hope that in most cases the test suite will cover the important use
cases, but that's not always possible (e.g. with flaky tests,
networking, missing test inputs...)  Furthermore, for better or for
worse, stack traces did become the gold standard of debugging long ago,
so even packages missing "deep down" would probably soon be dug out if
they lead to hard errors.  Soft errors in Javascript though...
Timothy Sample Nov. 28, 2021, 7:27 p.m. UTC | #5
Hi Philip,

Philip McGrath <philip@philipmcgrath.com> writes:

> -  (define (resolve-dependencies package-meta meta-key)
> -    (fold (lambda (key+value acc)
> -            (match key+value
> -              ('@ acc)
> -              ((key . value) (acons key (hash-ref index key value) acc))))
> +  (define (resolve-dependencies meta-alist meta-key)
> +    (match (assoc-ref meta-alist meta-key)
> +      (#f
> +       '())
> +      (('@ . orig-deps)
> +       (fold (match-lambda*
> +               (('@ acc)
> +                acc)
> +                (((key . value) acc)
> +                 (if (member key absent-dependencies)
> +                     acc
> +                     (acons key (hash-ref index key value) acc))))
>            '()
> -          (or (assoc-ref package-meta meta-key) '())))
> +          orig-deps))))

There’s a lot of refactoring going here in addition to change at hand.
To me, it needlessly obscures the actual change (which is just the check
for membership in 'absent-dependencies', AFAICS).

>    (with-atomic-file-replacement "package.json"
>      (lambda (in out)
> -      (let ((package-meta (read-json in)))
> -        (assoc-set! package-meta "dependencies"
> -                    (append
> -                     '(@)
> -                     (resolve-dependencies package-meta "dependencies")
> -                     (resolve-dependencies package-meta "peerDependencies")))
> -        (assoc-set! package-meta "devDependencies"
> -                    (append
> -                     '(@)
> -                     (resolve-dependencies package-meta "devDependencies")))
> +      ;; It is unsafe to rely on 'assoc-set!' to update an
> +      ;; existing assosciation list variable:
> +      ;; see 'info "(guile)Adding or Setting Alist Entries"'.

Good catch!

> +      (let* ((package-meta (read-json in))
> +             (alist (match package-meta
> +                      ((@ . alist) alist)))

Why do we have to unwrap (and then re-wrap later on) the alist?  It
would be nice to explain that in the changelog.  Maybe it’s just me, but
I can’t figure it out!  :)

Thanks!


-- Tim
Philip McGrath Dec. 2, 2021, 9:50 p.m. UTC | #6
Hi!

On 11/28/21 14:27, Timothy Sample wrote:
> Philip McGrath <philip@philipmcgrath.com> writes:
> 
>> -  (define (resolve-dependencies package-meta meta-key)
>> -    (fold (lambda (key+value acc)
>> -            (match key+value
>> -              ('@ acc)
>> -              ((key . value) (acons key (hash-ref index key value) acc))))
>> +  (define (resolve-dependencies meta-alist meta-key)
>> +    (match (assoc-ref meta-alist meta-key)
>> +      (#f
>> +       '())
>> +      (('@ . orig-deps)
>> +       (fold (match-lambda*
>> +               (('@ acc)
>> +                acc)
>> +                (((key . value) acc)
>> +                 (if (member key absent-dependencies)
>> +                     acc
>> +                     (acons key (hash-ref index key value) acc))))
>>             '()
>> -          (or (assoc-ref package-meta meta-key) '())))
>> +          orig-deps))))
> 
> There’s a lot of refactoring going here in addition to change at hand.
> To me, it needlessly obscures the actual change (which is just the check
> for membership in 'absent-dependencies', AFAICS).

I could split it into two commits, if that would be more clear. But see 
below.

>>     (with-atomic-file-replacement "package.json"
>>       (lambda (in out)
>> -      (let ((package-meta (read-json in)))
>> -        (assoc-set! package-meta "dependencies"
>> -                    (append
>> -                     '(@)
>> -                     (resolve-dependencies package-meta "dependencies")
>> -                     (resolve-dependencies package-meta "peerDependencies")))
>> -        (assoc-set! package-meta "devDependencies"
>> -                    (append
>> -                     '(@)
>> -                     (resolve-dependencies package-meta "devDependencies")))
>> +      ;; It is unsafe to rely on 'assoc-set!' to update an
>> +      ;; existing assosciation list variable:
>> +      ;; see 'info "(guile)Adding or Setting Alist Entries"'.
> 
> Good catch!
> 
>> +      (let* ((package-meta (read-json in))
>> +             (alist (match package-meta
>> +                      ((@ . alist) alist)))
> 
> Why do we have to unwrap (and then re-wrap later on) the alist?  It
> would be nice to explain that in the changelog.  Maybe it’s just me, but
> I can’t figure it out!  :)

The (guix build json) module represents a JSON object as a Scheme pair 
with the symbol @ in the car and an alist in the cdr. I'd guess this is 
because, if JSON objects were represented as Scheme alists, it would be 
impossible to distinguish the empty JSON object from the empty JSON 
array. (Racket's json library instead represents JSON objects with 
immutable hash tables, a disjoint type.)

I found it confusing that:

  1. this format does not seem to be documented;

  2. `resolve-dependencies` has the same shape as `assoc-ref`, but
     returns an unwrapped alist where `assoc-ref` would return
     a wrapped pair with @; and

  3. prior to this patch, the implementation of `resolve-dependencies`
     was written as though the symbol @ could appear at arbitrary
     positions in the list, which made it less than obvious that it would
     actually occur exactly once, at the head.

I tried to address the last point when refactoring, but I see that the 
patch I sent made matters extra confusing by leaving in dead code from 
the old approach (maybe due to a rebase conflict I vaguely remember?). I 
should fix that, and probably also explain some of this in comments.

-Philip
Philip McGrath Dec. 2, 2021, 10:02 p.m. UTC | #7
Hi,

On 11/20/21 15:04, Timothy Sample wrote:
> Hello,
> 
> Philip McGrath <philip@philipmcgrath.com> writes:
> 
>> On 11/20/21 02:41, Liliana Marie Prikler wrote:
>>
>>> In the resolve-dependencies subprocedure, we could check whether we
>>> have a matching input somewhere and only include the dependency if we
>>> do.  WDYT?
>>
>> [...]
>>
>> The other issue is that deleting packages with no matching input by
>> default would replicate some of the drawbacks of the current `(delete
>> 'configure)` approach. I think it's better to have an explicit list of
>> dependencies that Guix is deleting. If eventually we package all of
>> the missing dependencies for Guix, it would be much easier to find the
>> packages that ought to use it. And, in the highly dynamic JavaScript
>> world, I'm reluctant to give up one of the few static checks we
>> have. If a missing package that really was required were automatically
>> deleted from "package.json", it seems the failure mode would by a
>> mysterious runtime error, potentially many steps down a dependency
>> chain.
> 
> This is well put.  I actually experimented with a similar approach when
> we updated the Node.js build system.  This is a big improvement over
> deleting the configure phase, which would never scale to more than a
> handful of packages.  Having a build-time check that ensures all the
> developer-declared dependencies are available (save the “absent” ones)
> will be very helpful when we are maintaining hundreds of JavaScript
> packages.  :)

It's probably also worth noting that many (most?) of our Node.js 
packages currently don't run tests: an unfortunate number don't have any 
tests, and then there's the issue that we haven't packaged JavaScript 
test frameworks for Guix yet.


> For the patch itself, it would be better to move a lot of your commit
> message into a comment somewhere in the build system code.  If we had a
> section in the manual for Node packages, it would go there, but alas....
> I think most people would be happy to see a comment in the build system
> code and be saved from having to poke around with ‘git blame’.

Yes, I'll add comments.

-Philip
diff mbox series

Patch

diff --git a/guix/build-system/node.scm b/guix/build-system/node.scm
index 98f63f87ef..75ae34508f 100644
--- a/guix/build-system/node.scm
+++ b/guix/build-system/node.scm
@@ -44,6 +44,7 @@  (define (default-node)
 (define* (lower name
                 #:key source inputs native-inputs outputs system target
                 (node (default-node))
+                (absent-dependencies ''())
                 #:allow-other-keys
                 #:rest arguments)
   "Return a bag for NAME."
@@ -73,6 +74,7 @@  (define* (node-build store name inputs
                      (tests? #t)
                      (phases '(@ (guix build node-build-system)
                                  %standard-phases))
+                     (absent-dependencies ''())
                      (outputs '("out"))
                      (search-paths '())
                      (system (%current-system))
@@ -94,6 +96,7 @@  (define builder
                    #:test-target ,test-target
                    #:tests? ,tests?
                    #:phases ,phases
+                   #:absent-dependencies ,absent-dependencies
                    #:outputs %outputs
                    #:search-paths ',(map search-path-specification->sexp
                                          search-paths)
diff --git a/guix/build/node-build-system.scm b/guix/build/node-build-system.scm
index 70a367618e..32d6807e3e 100644
--- a/guix/build/node-build-system.scm
+++ b/guix/build/node-build-system.scm
@@ -69,30 +69,51 @@  (define (list-modules directory)
               input-paths)
     index))
 
-(define* (patch-dependencies #:key inputs #:allow-other-keys)
+(define* (patch-dependencies #:key inputs absent-dependencies
+                             #:allow-other-keys)
 
   (define index (index-modules (map cdr inputs)))
 
-  (define (resolve-dependencies package-meta meta-key)
-    (fold (lambda (key+value acc)
-            (match key+value
-              ('@ acc)
-              ((key . value) (acons key (hash-ref index key value) acc))))
+  (define (resolve-dependencies meta-alist meta-key)
+    (match (assoc-ref meta-alist meta-key)
+      (#f
+       '())
+      (('@ . orig-deps)
+       (fold (match-lambda*
+               (('@ acc)
+                acc)
+                (((key . value) acc)
+                 (if (member key absent-dependencies)
+                     acc
+                     (acons key (hash-ref index key value) acc))))
           '()
-          (or (assoc-ref package-meta meta-key) '())))
+          orig-deps))))
 
   (with-atomic-file-replacement "package.json"
     (lambda (in out)
-      (let ((package-meta (read-json in)))
-        (assoc-set! package-meta "dependencies"
-                    (append
-                     '(@)
-                     (resolve-dependencies package-meta "dependencies")
-                     (resolve-dependencies package-meta "peerDependencies")))
-        (assoc-set! package-meta "devDependencies"
-                    (append
-                     '(@)
-                     (resolve-dependencies package-meta "devDependencies")))
+      ;; It is unsafe to rely on 'assoc-set!' to update an
+      ;; existing assosciation list variable:
+      ;; see 'info "(guile)Adding or Setting Alist Entries"'.
+      (let* ((package-meta (read-json in))
+             (alist (match package-meta
+                      ((@ . alist) alist)))
+             ;; Other relevant keys may include peerDependenciesMeta
+             ;; and optionalDependencies, but it seems to work out fine
+             ;; just to leave those alone.
+             (alist
+              (assoc-set!
+               alist "dependencies"
+               (append
+                '(@)
+                (resolve-dependencies alist "dependencies")
+                (resolve-dependencies alist "peerDependencies"))))
+             (alist
+              (assoc-set!
+               alist "devDependencies"
+               (append
+                '(@)
+                (resolve-dependencies alist "devDependencies"))))
+             (package-meta (cons '@ alist)))
         (write-json package-meta out))))
   #t)