diff mbox series

[bug#51838,v5,07/45] guix: node-build-system: Add #:absent-dependencies argument.

Message ID 20211217020325.520821-8-philip@philipmcgrath.com
State Accepted
Headers show
Series guix: node-build-system: Support compiling add-ons with node-gyp. | expand

Checks

Context Check Description
cbaines/comparison success View comparision
cbaines/git branch success View Git branch
cbaines/applying patch success View Laminar job
cbaines/issue success View issue

Commit Message

Philip McGrath Dec. 17, 2021, 2:02 a.m. UTC
* guix/build-system/node.scm (lower, node-build): Add optional
argument #:absent-dependencies with default of ''(). Pass it on
to the build-side code.
* guix/build/node-build-system.scm (patch-dependencies): Respect
the #:absent-dependencies argument, removing specified npm
packages from the "dependencies" or "devDependencies" tables
in "package.json".
---
 guix/build-system/node.scm       | 19 ++++++++++++++++++-
 guix/build/node-build-system.scm |  7 +++++--
 2 files changed, 23 insertions(+), 3 deletions(-)

Comments

Liliana Marie Prikler Dec. 17, 2021, 4:43 a.m. UTC | #1
Hi,

Am Donnerstag, dem 16.12.2021 um 21:02 -0500 schrieb Philip McGrath:
> * guix/build-system/node.scm (lower, node-build): Add optional
> argument #:absent-dependencies with default of ''(). Pass it on
> to the build-side code.
> * guix/build/node-build-system.scm (patch-dependencies): Respect
> the #:absent-dependencies argument, removing specified npm
> packages from the "dependencies" or "devDependencies" tables
> in "package.json".
> [...]
> +                (absent-dependencies ''())
> [...]
> +                     (absent-dependencies ''())
> [...]
> -(define* (patch-dependencies #:key inputs #:allow-other-keys)
> +(define* (patch-dependencies #:key inputs absent-dependencies
> +                             #:allow-other-keys)
>  
>    (define index (index-modules (map cdr inputs)))
>  
> @@ -86,7 +87,9 @@ (define (resolve-dependencies meta-alist meta-key)
>        (('@ . orig-deps)
>         (fold (match-lambda*
>                 (((key . value) acc)
> -                (acons key (hash-ref index key value) acc)))
> +                (if (member key absent-dependencies)
> +                    acc
> +                    (acons key (hash-ref index key value) acc))))
>               '()
>               orig-deps))))
I might be sounding like a broken record here, but again for the sake
of being able to add or remove inputs on the user side without
duplicated efforts, I'd use something else in lieu of absent-
dependencies.  For the time being, I think a switch between "fail" and
"warn about missing dependencies" ought to be enough.  In the future,
we might want to make it so that packages can specify which
dependencies they absolutely require and which they don't (or if
possible infer that from dependencies).  WDYT?
Timothy Sample Dec. 17, 2021, 3:46 p.m. UTC | #2
Hi Liliana,

Liliana Marie Prikler <liliana.prikler@gmail.com> writes:

> I might be sounding like a broken record here, but again for the sake
> of being able to add or remove inputs on the user side without
> duplicated efforts, I'd use something else in lieu of absent-
> dependencies.  For the time being, I think a switch between "fail" and
> "warn about missing dependencies" ought to be enough.  In the future,
> we might want to make it so that packages can specify which
> dependencies they absolutely require and which they don't (or if
> possible infer that from dependencies).  WDYT?

The way I see it is that we are basically copying the
“--disable-X”/“--without-Y” features of a GNU configure script.  With
the GNU build system, if one of our packages specifies “--without-zlib”
for whatever reason, and you as a user want to modify the package to use
zlib, you would have to

  1. remove the “--without-zlib” configure flag; and
  2. add the zlib package to the package’s inputs.

That’s a “duplicated effort”, right?

Similarly, for a Node package (with a pretend Node zlib package), you
would have to

  1. remove “zlib” from absent dependencies; and
  2. add the “node-zlib” package to the package’s inputs.

That seems okay to me, but I’m worried I’ve missed what you’re trying to
communicate.


-- Tim
Liliana Marie Prikler Dec. 17, 2021, 7:40 p.m. UTC | #3
Hi Timothy,

Am Freitag, dem 17.12.2021 um 10:46 -0500 schrieb Timothy Sample:
> Hi Liliana,
> 
> Liliana Marie Prikler <liliana.prikler@gmail.com> writes:
> 
> > I might be sounding like a broken record here, but again for the
> > sake of being able to add or remove inputs on the user side without
> > duplicated efforts, I'd use something else in lieu of absent-
> > dependencies.  For the time being, I think a switch between "fail"
> > and "warn about missing dependencies" ought to be enough.  In the
> > future, we might want to make it so that packages can specify which
> > dependencies they absolutely require and which they don't (or if
> > possible infer that from dependencies).  WDYT?
> 
> The way I see it is that we are basically copying the
> “--disable-X”/“--without-Y” features of a GNU configure script.  With
> the GNU build system, if one of our packages specifies “--without-
> zlib”
> for whatever reason, and you as a user want to modify the package to
> use zlib, you would have to
> 
>   1. remove the “--without-zlib” configure flag; and
>   2. add the zlib package to the package’s inputs.
> 
> That’s a “duplicated effort”, right?
> 
> Similarly, for a Node package (with a pretend Node zlib package), you
> would have to
> 
>   1. remove “zlib” from absent dependencies; and
>   2. add the “node-zlib” package to the package’s inputs.
> 
> That seems okay to me, but I’m worried I’ve missed what you’re trying
> to communicate.
For the GNU build system (and likewise meson-build-system), the default
behaviour if you haven't specified anything as per upstream conventions
is typically to error if the package is required and omit it if it's
not.  The default behaviour of node-build-system (and likewise cargo
and most other build systems that come with the advertisement of "we
know package managers better than people who actually produce useful
package managers) is "Oh my god, you don't have an exact copy of the
machine that built this stuff locally, I am going to barf huge walls of
noise at you".  Therefore, we can't meaningfully compare those build
systems in terms of strategies.

On a slightly related note, I recently had the pleasure of working on a
patch for a package that fails python-build-systems new sanity-check
phase.  If we really want some static verification for node-build-
system, I think we should take that as an approach rather than hard-
coding (absent) dependencies literally everywhere.  Although in writing
that I must concede that Python is still too sane to be a Web standard.

I'm sorry I can't explain my reasoning in a nicer manner.  Your point
would be totally valid if node-build-system did act like the GNU build
system or meson in that dependency resolution is DWIM by default, but
sadly we can't have nice things with "modern" programming languages and
their build systems.

TL;DR: "--without-zlib" is usually implied in packages based on the GNU
build system unless zlib really is a hard requirement of the package. 
The same can't be said for node-anything.

Cheers
Timothy Sample Dec. 18, 2021, 2:48 a.m. UTC | #4
Liliana Marie Prikler <liliana.prikler@gmail.com> writes:

> For the GNU build system (and likewise meson-build-system), the default
> behaviour if you haven't specified anything as per upstream conventions
> is typically to error if the package is required and omit it if it's
> not.  The default behaviour of node-build-system (and likewise cargo
> and most other build systems that come with the advertisement of "we
> know package managers better than people who actually produce useful
> package managers) is "Oh my god, you don't have an exact copy of the
> machine that built this stuff locally, I am going to barf huge walls of
> noise at you".  Therefore, we can't meaningfully compare those build
> systems in terms of strategies.

NPM packages tend to wildly over-specify their dependencies.  We already
remove dependency version checking, and before this change, many of our
packages skipped any kind of dependency checking by skipping the
configure phase altogether.  To me, the ‘#:absent-dependencies’ approach
tries to work around the dependency over-specification by listing
exactly those things that are only there to elicit a useless “Oh my god
[...], I’m going to barf huge walls of noise” message.  The rest of the
dependencies are those that the Guix package author deemed required (or
at least important).  Basically, ‘#:absent-dependencies’ helps us
translate between the NPM culture of over-specification (which is really
a culture of prioritizing package author over package user) and the GNU
culture of “DWIM” dependencies.

> If we really want some static verification for node-build-system, I
> think we should take that as an approach rather than hard-coding
> (absent) dependencies literally everywhere.

We need some way to know what to statically verify.  We can’t magically
know what’s important and what isn’t.  The two options in this thread
are ‘#:absent-dependencies’, and only checking what’s already in the
package’s inputs.  What worries me about the second approach is that it
offers no help when updating a package.  With ‘#:absent-dependencies’,
if the developer adds a new dependency that really is required, we will
get a build-time failure letting us know.  Whoever is updating the
package can fix it before even committing anything.  If we just check
the inputs, that’s not the case, and we might end up with Philip’s
“mysterious runtime error, potentially many steps down a dependency
chain.”  Hopefully tests would catch it, but I like the extra assurance.

Another benefit is that it would help us gain knowledge as to which NPM
packages are often used but not actually required (e.g., NPM publishing
tools).  With this knowledge, we could write a clever NPM importer that
ignored obviously inessential dependencies.

I guess I’m starting to sound like a broken record now – this is
basically what we covered before!  :)  Maybe we’re in need of a fresh
perspective.  (If anyone is reading along and has thoughts, feel free to
chime in!)


-- Tim
Liliana Marie Prikler Dec. 18, 2021, 8:30 a.m. UTC | #5
Hi,

Am Freitag, dem 17.12.2021 um 21:48 -0500 schrieb Timothy Sample:
> Liliana Marie Prikler <liliana.prikler@gmail.com> writes:
> 
> > For the GNU build system (and likewise meson-build-system), the
> > default behaviour if you haven't specified anything as per upstream
> > conventions is typically to error if the package is required and
> > omit it if it's not.  The default behaviour of node-build-system
> > (and likewise cargo and most other build systems that come with the
> > advertisement of "we know package managers better than people who
> > actually produce usefulpackage managers) is "Oh my god, you don't
> > have an exact copy of the machine that built this stuff locally, I
> > am going to barf huge walls of noise at you".  Therefore, we can't
> > meaningfully compare those build systems in terms of strategies.
> 
> NPM packages tend to wildly over-specify their dependencies.  We
> already remove dependency version checking, and before this change,
> many of our packages skipped any kind of dependency checking by
> skipping the configure phase altogether.  To me, the ‘#:absent-
> dependencies’ approach tries to work around the dependency over-
> specification by listing exactly those things that are only there to
> elicit a useless “Oh my god [...], I’m going to barf huge walls of
> noise” message.  The rest of the dependencies are those that the Guix
> package author deemed required (or at least important).  Basically,
> ‘#:absent-dependencies’ helps us translate between the NPM culture of
> over-specification (which is really a culture of prioritizing package
> author over package user) and the GNU culture of “DWIM” dependencies.
Except that it's not.  The current workaround is "I know this thing's
going to barf at me, so I prepare an umbrella and hope it has no
holes".

> > If we really want some static verification for node-build-system, I
> > think we should take that as an approach rather than hard-coding
> > (absent) dependencies literally everywhere.
> 
> We need some way to know what to statically verify.  We can’t
> magically know what’s important and what isn’t.  The two options in
> this thread are ‘#:absent-dependencies’, and only checking what’s
> already in the package’s inputs.  What worries me about the second
> approach is that it offers no help when updating a package.  With
> ‘#:absent-dependencies’, if the developer adds a new dependency that
> really is required, we will get a build-time failure letting us
> know.  Whoever is updating the package can fix it before even
> committing anything.  If we just check the inputs, that’s not the
> case, and we might end up with Philip’s “mysterious runtime error,
> potentially many steps down a dependency chain.”  Hopefully tests
> would catch it, but I like the extra assurance.
That's why I didn't want to default to "do nothing", but to *warn*
about missing dependencies in configure.  Then whoever bumps the
package will at least know which warnings are produced if they do so
and they can cross-check by manually building past versions.  Including
#:absent-dependencies is no safe-guard against a failure here anyway. 
A dependency that was optional in V1 might become required in V2. 

> Another benefit is that it would help us gain knowledge as to which
> NPM packages are often used but not actually required (e.g., NPM
> publishing tools).  With this knowledge, we could write a clever NPM
> importer that ignored obviously inessential dependencies.
> 
> I guess I’m starting to sound like a broken record now – this is
> basically what we covered before!  :)  Maybe we’re in need of a fresh
> perspective.  (If anyone is reading along and has thoughts, feel free
> to chime in!)
I think the NPM convention is to put everything you need "at build
time, but not at runtime" into dev-dependencies, no?  In any case, one
approach I could offer is to sanity-check by searching for require()
statements and trying them in a controlled node environment.  This
could look something like

eval("try { var dep = require('" + dependency + "'); true }
catch (e) { false; }")

Once we know where require statements are made and whether they
succeed, we can start estimating the impact of a missing dependency. 
For this, it'd be nice to have a full function call graph, in which a
node is coloured dirty if it has a failing require statement, lies
within a module that has one or calls into a dirty node.  However, as a
primitive approximation we can also count the node modules with failing
requires against those that don't.  We set an arbitrary threshold of
allowed failures, e.g. 0.42, and then check that whatever value we
obtain from the above is lower than that.

While that would be nice and all, I think the overall issue with the
current node implementation in Guix is that 'configure' and 'sanity-
check' are the same phase, so you have to disable both or none.  I
think we could easily do with a configure phase that does nothing, not
even warn about a missing dependency, and a sanity-check phase that
requires every dependency mentioned in package.json to be met. 
Packagers would then outright delete sanity-check as they do for python
and as they did for configure (but not have configure fail due to it!)
or deliberately rewrite the package.json for the sanity check and
dropping absent dependencies, i.e. what you do minus the keyword.  If
later needed for the purposes of an importer, we would then still have
that database and could at some point introduce the key #:insane-
requirements.  WDYT?
Philip McGrath Dec. 18, 2021, 6:31 p.m. UTC | #6
Hi,

On 12/18/21 03:30, Liliana Marie Prikler wrote:
> Hi,
> 
> Am Freitag, dem 17.12.2021 um 21:48 -0500 schrieb Timothy Sample:
>> Liliana Marie Prikler <liliana.prikler@gmail.com> writes:
>>
>>> For the GNU build system (and likewise meson-build-system), the
>>> default behaviour if you haven't specified anything as per upstream
>>> conventions is typically to error if the package is required and
>>> omit it if it's not.  The default behaviour of node-build-system
>>> (and likewise cargo and most other build systems that come with the
>>> advertisement of "we know package managers better than people who
>>> actually produce usefulpackage managers) is "Oh my god, you don't
>>> have an exact copy of the machine that built this stuff locally, I
>>> am going to barf huge walls of noise at you".  Therefore, we can't
>>> meaningfully compare those build systems in terms of strategies.
>>
>> NPM packages tend to wildly over-specify their dependencies.  We
>> already remove dependency version checking, and before this change,
>> many of our packages skipped any kind of dependency checking by
>> skipping the configure phase altogether.  To me, the ‘#:absent-
>> dependencies’ approach tries to work around the dependency over-
>> specification by listing exactly those things that are only there to
>> elicit a useless “Oh my god [...], I’m going to barf huge walls of
>> noise” message.  The rest of the dependencies are those that the Guix
>> package author deemed required (or at least important).  Basically,
>> ‘#:absent-dependencies’ helps us translate between the NPM culture of
>> over-specification (which is really a culture of prioritizing package
>> author over package user) and the GNU culture of “DWIM” dependencies.
> Except that it's not.  The current workaround is "I know this thing's
> going to barf at me, so I prepare an umbrella and hope it has no
> holes".
> 
>>> If we really want some static verification for node-build-system, I
>>> think we should take that as an approach rather than hard-coding
>>> (absent) dependencies literally everywhere.
>>
>> We need some way to know what to statically verify.  We can’t
>> magically know what’s important and what isn’t.  The two options in
>> this thread are ‘#:absent-dependencies’, and only checking what’s
>> already in the package’s inputs.  What worries me about the second
>> approach is that it offers no help when updating a package.  With
>> ‘#:absent-dependencies’, if the developer adds a new dependency that
>> really is required, we will get a build-time failure letting us
>> know.  Whoever is updating the package can fix it before even
>> committing anything.  If we just check the inputs, that’s not the
>> case, and we might end up with Philip’s “mysterious runtime error,
>> potentially many steps down a dependency chain.”  Hopefully tests
>> would catch it, but I like the extra assurance.
> That's why I didn't want to default to "do nothing", but to *warn*
> about missing dependencies in configure.  Then whoever bumps the
> package will at least know which warnings are produced if they do so
> and they can cross-check by manually building past versions.  Including
> #:absent-dependencies is no safe-guard against a failure here anyway.
> A dependency that was optional in V1 might become required in V2.

I also feel like I'm missing something, though maybe I just disagree.

To try to be concrete, here's a real example. Between v3 and v4 of this 
patch series, I discovered that leaving out node-debug could actually 
cause runtime problems for some of the node-serialport packages. (It 
turns out it's really a logging library, which wasn't what I'd thought 
at first.) After I added the node-debug, I could easily search 
node-xyz.scm for the packages that listed it among their 
#:absent-dependencies and add it to them as an input.

It seems like this would be much less convenient if node-build-system 
were to silently delete such dependencies and simply print a warning. I 
guess I would have to search through the build logs for all Node.js 
packages, right?

More generally, I think truly "optional dependencies" (in the Node.js 
sense, to the extent I understand it) and dependencies we actively don't 
want (e.g. because node-sqlite3 shouldn't transitively depend on 
node-aws-sdk) are relatively rare cases.

The most common cases seem to be dependencies we really would like to 
have, such as test frameworks or Typescript, but haven't packaged yet, 
and thus need to work around. Many packages that have 
#:absent-dependencies for such reasons also need to use `#:tests? #f` or 
to replace the build phase with some kind of manual alternative.

I guess I don't understand what case the warn-and-drop approach is 
optimizing for. For both the case when dependencies aren't packaged for 
Guix yet and the case when Guix packagers have actively decided to skip 
some upstream dependencies, I think #:absent-dependencies is more 
useful. Having to look for that information in warnings in the build log 
seems much less ergonomic.

>> Another benefit is that it would help us gain knowledge as to which
>> NPM packages are often used but not actually required (e.g., NPM
>> publishing tools).  With this knowledge, we could write a clever NPM
>> importer that ignored obviously inessential dependencies.

I agree with this, too: likewise, we could see packages that are often 
wanted but aren't in Guix and prioritize adding them! Part of the 
benefit of #:absent-dependencies, IMO, is as a form of communication 
with humans.

>> I guess I’m starting to sound like a broken record now – this is
>> basically what we covered before!  :)  Maybe we’re in need of a fresh
>> perspective.  (If anyone is reading along and has thoughts, feel free
>> to chime in!)
> I think the NPM convention is to put everything you need "at build
> time, but not at runtime" into dev-dependencies, no?  In any case, one
> approach I could offer is to sanity-check by searching for require()
> statements and trying them in a controlled node environment.  This
> could look something like
> 
> eval("try { var dep = require('" + dependency + "'); true }
> catch (e) { false; }")
> 
> Once we know where require statements are made and whether they
> succeed, we can start estimating the impact of a missing dependency.
> For this, it'd be nice to have a full function call graph, in which a
> node is coloured dirty if it has a failing require statement, lies
> within a module that has one or calls into a dirty node.  However, as a
> primitive approximation we can also count the node modules with failing
> requires against those that don't.  We set an arbitrary threshold of
> allowed failures, e.g. 0.42, and then check that whatever value we
> obtain from the above is lower than that.

This could be interesting, and I think some of the JavaScript blunders 
we don't have packaged for Guix yet try to do something like this, but 
such analysis is not tractable in the general case, especially with 
CommonJS `require`, which is just a function and can be given arbitrary 
arguments computed at runtime. (And some packages really use it that way!)

Also, currently node-build-system doesn't seem to be removing those 
files which `npm pack` is supposed to exclude, which would probably be a 
prerequisite for addressing this.

> While that would be nice and all, I think the overall issue with the
> current node implementation in Guix is that 'configure' and 'sanity-
> check' are the same phase, so you have to disable both or none.  I
> think we could easily do with a configure phase that does nothing, not
> even warn about a missing dependency, and a sanity-check phase that
> requires every dependency mentioned in package.json to be met.
> Packagers would then outright delete sanity-check as they do for python
> and as they did for configure (but not have configure fail due to it!)
> or deliberately rewrite the package.json for the sanity check and
> dropping absent dependencies, i.e. what you do minus the keyword.  If
> later needed for the purposes of an importer, we would then still have
> that database and could at some point introduce the key #:insane-
> requirements.  WDYT?

I don't understand the benefit of this, and I'm also confused about the 
proposed implementation specifics. Why even have "a configure phase that 
does nothing"? What phase would run `npm install`? Presumably, we would 
have to delete all missing dependencies before that.

I think there is room for improvement in node-build-system. One thing 
I've been thinking about is some articles I've seen (but not fully 
thought through yet) from the developers of PNPM, an alternative package 
manager for Node.js, which seems to have some similarities to Guix in 
symlinking things to a "store".[1][2][3] (It could be especially 
interesting for bootstrapping npm! And the approach to "monorepos" also 
seems relevant.) I also think an importer is very important, even if 
it's an imperfect one: `guix import npm-binary` was indispensable in 
developing these patches. I have some ideas about improving it, in 
particular that we should assume the newer "^" semantics for 
dependencies everywhere (i.e. that major versions and only major 
versions have incompatible changes: a common case recently seems to be 
moving from CommonJS modules to ES6 modules).

As I understand it, node-build-system is undocumented, with no 
guarantees of compatibility. If #:absent-dependencies is at least an 
improvement over the status quo---which I think it is, since the new 
packages wouldn't build with the status quo---could we apply this and 
replace it later if someone implements a better strategy? I don't think 
I can implement control-flow analysis for JavaScript within the scope of 
this patch series.

-Philip

[1]: https://pnpm.io/blog/2020/05/27/flat-node-modules-is-not-the-only-way
[2]: https://pnpm.io/symlinked-node-modules-structure
[3]: https://pnpm.io/how-peers-are-resolved
Liliana Marie Prikler Dec. 18, 2021, 8:49 p.m. UTC | #7
Hi,

Am Samstag, dem 18.12.2021 um 13:31 -0500 schrieb Philip McGrath:
> I also feel like I'm missing something, though maybe I just disagree.
> 
> To try to be concrete, here's a real example. Between v3 and v4 of
> this patch series, I discovered that leaving out node-debug could
> actually cause runtime problems for some of the node-serialport
> packages. (It turns out it's really a logging library, which wasn't
> what I'd thought at first.) After I added the node-debug, I could
> easily search node-xyz.scm for the packages that listed it among
> their #:absent-dependencies and add it to them as an input.
Wouldn't we get a huge stack trace of doom with the failed require in
that case if it was truly non-negotiable, though?

> It seems like this would be much less convenient if node-build-system
> were to silently delete such dependencies and simply print a warning.
> I guess I would have to search through the build logs for all Node.js
> packages, right?
Since node packages do have the tendency to pull in the entire
internet, you might not be too far off with that statement, but again
if you have a particular issue in a single package due to an omitted
dependency, the offending package ought to be close to the most recent
call on the stack.

An alternative to searching through the build logs would also be to
build all the sources and grepping their package.jsons ;)

> More generally, I think truly "optional dependencies" (in the Node.js
> sense, to the extent I understand it) and dependencies we actively
> don't want (e.g. because node-sqlite3 shouldn't transitively depend
> on node-aws-sdk) are relatively rare cases.
> 
> The most common cases seem to be dependencies we really would like to
> have, such as test frameworks or Typescript, but haven't packaged
> yet, and thus need to work around. Many packages that have 
> #:absent-dependencies for such reasons also need to use `#:tests? #f`
> or to replace the build phase with some kind of manual alternative.
> 
> I guess I don't understand what case the warn-and-drop approach is 
> optimizing for. For both the case when dependencies aren't packaged
> for Guix yet and the case when Guix packagers have actively decided
> to skip some upstream dependencies, I think #:absent-dependencies is
> more useful. Having to look for that information in warnings in the
> build log seems much less ergonomic.
That's why my original suggestion was to have a switch between "strict"
meaning we fail as before and "warn" meaning "we know we have unmet
dependencies".  The "warn" case would be annotated similar to #:tests?
#f and the strict case default.  So you could still communicate
important missing packages like typescript et al., but people who hack
on their own channels with lower standards can wing it without having
to delete configure.

> > > Another benefit is that it would help us gain knowledge as to
> > > which NPM packages are often used but not actually required
> > > (e.g., NPM publishing tools).  With this knowledge, we could
> > > write a clever NPM importer that ignored obviously inessential
> > > dependencies.
> 
> I agree with this, too: likewise, we could see packages that are
> often wanted but aren't in Guix and prioritize adding them! Part of
> the benefit of #:absent-dependencies, IMO, is as a form of
> communication with humans.
We do have a wishlist...

> > > I guess I’m starting to sound like a broken record now – this is
> > > basically what we covered before!  :)  Maybe we’re in need of a
> > > fresh perspective.  (If anyone is reading along and has thoughts,
> > > feel free to chime in!)
> > I think the NPM convention is to put everything you need "at build
> > time, but not at runtime" into dev-dependencies, no?  In any case,
> > one approach I could offer is to sanity-check by searching for
> > require() statements and trying them in a controlled node
> > environment.  Thiscould look something like
> > 
> > eval("try { var dep = require('" + dependency + "'); true }
> > catch (e) { false; }")
> > 
> > Once we know where require statements are made and whether they
> > succeed, we can start estimating the impact of a missing
> > dependency.
> > For this, it'd be nice to have a full function call graph, in which
> > a node is coloured dirty if it has a failing require statement,
> > lies within a module that has one or calls into a dirty node. 
> > However, as a primitive approximation we can also count the node
> > modules with failing requires against those that don't.  We set an
> > arbitrary threshold of allowed failures, e.g. 0.42, and then check
> > that whatever value we obtain from the above is lower than that.
> 
> This could be interesting, and I think some of the JavaScript
> blunders we don't have packaged for Guix yet try to do something like
> this, but such analysis is not tractable in the general case,
> especially with CommonJS `require`, which is just a function and can
> be given arbitrary arguments computed at runtime. (And some packages
> really use it that way!)
Of course I forgot about computed goto...

> Also, currently node-build-system doesn't seem to be removing those 
> files which `npm pack` is supposed to exclude, which would probably
> be a  prerequisite for addressing this.
For the record, which files would that be?  Could we do emacs-build-
system style #:include and #:exclude lists?

> > While that would be nice and all, I think the overall issue with
> > the current node implementation in Guix is that 'configure' and
> > 'sanity-check' are the same phase, so you have to disable both or
> > none.  I think we could easily do with a configure phase that does
> > nothing, not even warn about a missing dependency, and a sanity-
> > check phase that requires every dependency mentioned in
> > package.json to be met. Packagers would then outright delete
> > sanity-check as they do for python and as they did for configure
> > (but not have configure fail due to it!)
> > or deliberately rewrite the package.json for the sanity check and
> > dropping absent dependencies, i.e. what you do minus the keyword. 
> > If later needed for the purposes of an importer, we would then
> > still have that database and could at some point introduce the key
> > #:insane-requirements.  WDYT?
> 
> I don't understand the benefit of this, and I'm also confused about
> the proposed implementation specifics. Why even have "a configure
> phase that does nothing"? What phase would run `npm install`?
> Presumably, we would have to delete all missing dependencies before
> that.
I meant "does nothing" in the sense of "does nothing with missing
dependencies", i.e. does not even warn.  It would of course still run
npm whatever.

On that note, I did typo there, so it would be patch-dependencies.  So
patch-dependencies would be implemented using and=> or and-let* to
decide whether to patch an entry or not.

Another implementation of the sanity check phase would be to read the
package json once more as we already do for patch-dependencies, but
this time only to check that we have an input that provides it.  The
benefit of doing that would the same as the strict/warn switch from
before, but without adding a new keyword at all.  The downside would be
that it still invites (delete 'sanity-check) without the comments we'd
tend to write around #:tests? #f.

> I think there is room for improvement in node-build-system. One thing
> I've been thinking about is some articles I've seen (but not fully 
> thought through yet) from the developers of PNPM, an alternative
> package manager for Node.js, which seems to have some similarities to
> Guix in symlinking things to a "store".[1][2][3] (It could be
> especially interesting for bootstrapping npm! And the approach to
> "monorepos" also seems relevant.) I also think an importer is very
> important, even if it's an imperfect one: `guix import npm-binary`
> was indispensable in developing these patches. I have some ideas
> about improving it, in particular that we should assume the newer "^"
> semantics for dependencies everywhere (i.e. that major versions and
> only major versions have incompatible changes: a common case recently
> seems to be moving from CommonJS modules to ES6 modules).
> 
> As I understand it, node-build-system is undocumented, with no 
> guarantees of compatibility. If #:absent-dependencies is at least an 
> improvement over the status quo---which I think it is, since the new 
> packages wouldn't build with the status quo---could we apply this and
> replace it later if someone implements a better strategy? I don't
> think I can implement control-flow analysis for JavaScript within the
> scope of this patch series.
It's sadly not that easy.  See XKCD 1172.

Given the reality of XKCD 1172, I do think that implementing a sanity-
check phase that checks for existence of all packages is the sanest
option here, together with safeguarding against missing dependencies in
patch-dependencies.  Does that sound reasonable to everyone else here?

Cheers
diff mbox series

Patch

diff --git a/guix/build-system/node.scm b/guix/build-system/node.scm
index 735f8dd06e..330d10dca5 100644
--- a/guix/build-system/node.scm
+++ b/guix/build-system/node.scm
@@ -47,6 +47,7 @@  (define (default-node)
 (define* (lower name
                 #:key source inputs native-inputs outputs system target
                 (node (default-node))
+                (absent-dependencies ''())
                 #:allow-other-keys
                 #:rest arguments)
   "Return a bag for NAME."
@@ -77,6 +78,7 @@  (define* (node-build name inputs
                      (test-target "test")
                      (tests? #t)
                      (phases '%standard-phases)
+                     (absent-dependencies ''())
                      (outputs '("out"))
                      (search-paths '())
                      (system (%current-system))
@@ -84,7 +86,21 @@  (define* (node-build name inputs
                      (imported-modules %node-build-system-modules)
                      (modules '((guix build node-build-system)
                                 (guix build utils))))
-  "Build SOURCE using NODE and INPUTS."
+  "Build SOURCE using NODE and INPUTS.
+
+The builder will remove Node.js packages listed in ABSENT-DEPENCENCIES from
+the 'package.json' file's 'dependencies' and 'devDependencies' tables.  This
+mechanism can be used both avoid dependencies we don't want (e.g. optional
+features that would increase closure size) and to work around dependencies
+that haven't been packaged for Guix yet (e.g. test utilities)."
+  ;; Before #:absent-dependencies existed, this scenario was often handled by
+  ;; deleting the 'configure phase. Using #:absent-dependencies, instead,
+  ;; retains the check that no dependencies are silently missing and other
+  ;; actions performed by 'npm install', such as building native
+  ;; addons. Having an explicit list of absent dependencies in the package
+  ;; definition should also facilitate future maintenence: for example, if we
+  ;; add a package for a test framework, it should be easy to find all the
+  ;; other packages that use it and enable their tests.
   (define builder
     (with-imported-modules imported-modules
       #~(begin
@@ -96,6 +112,7 @@  (define builder
                       #:test-target #$test-target
                       #:tests? #$tests?
                       #:phases #$phases
+                      #:absent-dependencies #$absent-dependencies
                       #:outputs #$(outputs->gexp outputs)
                       #:search-paths '#$(sexp->gexp
                                          (map search-path-specification->sexp
diff --git a/guix/build/node-build-system.scm b/guix/build/node-build-system.scm
index b74e593838..892104b6d2 100644
--- a/guix/build/node-build-system.scm
+++ b/guix/build/node-build-system.scm
@@ -69,7 +69,8 @@  (define (list-modules directory)
               input-paths)
     index))
 
-(define* (patch-dependencies #:key inputs #:allow-other-keys)
+(define* (patch-dependencies #:key inputs absent-dependencies
+                             #:allow-other-keys)
 
   (define index (index-modules (map cdr inputs)))
 
@@ -86,7 +87,9 @@  (define (resolve-dependencies meta-alist meta-key)
       (('@ . orig-deps)
        (fold (match-lambda*
                (((key . value) acc)
-                (acons key (hash-ref index key value) acc)))
+                (if (member key absent-dependencies)
+                    acc
+                    (acons key (hash-ref index key value) acc))))
              '()
              orig-deps))))