diff mbox series

[bug#49946,08/31] gnu: node: Patch /usr/bin/env in node-gyp.

Message ID 20210808233354.6745-8-pierre.langlois@gmx.com
State New
Headers show
Series Tree-sitter, node-gyp addon support and emacs-tree-sitter | expand

Commit Message

Pierre Langlois Aug. 8, 2021, 11:33 p.m. UTC
* gnu/packages/node.scm (node)[arguments]: Fix /usr/bin/env shebang in
node-gyp.js.
(node-lts)[arguments]: Ditto.
---
 gnu/packages/node.scm | 10 +++++++++-
 1 file changed, 9 insertions(+), 1 deletion(-)

--
2.32.0

Comments

M Aug. 10, 2021, 6:28 p.m. UTC | #1
Pierre Langlois schreef op ma 09-08-2021 om 00:33 [+0100]:
> @@ -120,6 +120,10 @@
>                 (("'/usr/bin/env'")
>                  (string-append "'" (which "env") "'")))
> 
> +             ;; Fix /usr/bin/env shebang in node-gyp.
> +             (substitute* "deps/npm/node_modules/node-gyp/bin/node-gyp.js"
> +               (("#!/usr/bin/env") (string-append "#!" (which "env"))))

For cross-compilation, this should most likely be
(string-append (assoc-ref inputs "coreutils") "/bin/env")
or something like that instead.  Likewise in other places.
The old code uses (which "env") in some cases, but those
are probably wrong (except where it is patched in tests).

Greetings,
Maxime.
Pierre Langlois Aug. 11, 2021, 3:36 p.m. UTC | #2
Hi Maxime,

Maxime Devos <maximedevos@telenet.be> writes:

> [[PGP Signed Part:Undecided]]
> Pierre Langlois schreef op ma 09-08-2021 om 00:33 [+0100]:
>> @@ -120,6 +120,10 @@
>>                 (("'/usr/bin/env'")
>>                  (string-append "'" (which "env") "'")))
>> 
>> +             ;; Fix /usr/bin/env shebang in node-gyp.
>> +             (substitute* "deps/npm/node_modules/node-gyp/bin/node-gyp.js"
>> +               (("#!/usr/bin/env") (string-append "#!" (which "env"))))
>
> For cross-compilation, this should most likely be
> (string-append (assoc-ref inputs "coreutils") "/bin/env")
> or something like that instead.  Likewise in other places.
> The old code uses (which "env") in some cases, but those
> are probably wrong (except where it is patched in tests).

Good point, I didn't consider cross-compilation.  Actually, trying it,
it looks like our node package doesn't currently cross-compile
correctly.  I just managed to get it to cross-compile though, I'll
submit another patch for it!  In the meantime, with this series I agree
it's better for the new code to be correct from the begining though.

Thanks for taking a look!
Pierre
Philip McGrath Sept. 23, 2021, 9:18 a.m. UTC | #3
I'm interested in the node-gyp part of this, which has come up in some 
other software I'm trying to package. These comments come with the 
caveat that my experience with node.js and npm is fairly shallow.

On 8/10/21 2:28 PM, Maxime Devos wrote:
> Pierre Langlois schreef op ma 09-08-2021 om 00:33 [+0100]:
>> @@ -120,6 +120,10 @@
>>                  (("'/usr/bin/env'")
>>                   (string-append "'" (which "env") "'")))
>>
>> +             ;; Fix /usr/bin/env shebang in node-gyp.
>> +             (substitute* "deps/npm/node_modules/node-gyp/bin/node-gyp.js"
>> +               (("#!/usr/bin/env") (string-append "#!" (which "env"))))
> 
> For cross-compilation, this should most likely be
> (string-append (assoc-ref inputs "coreutils") "/bin/env")
> or something like that instead.  Likewise in other places.

Since the shebang line for node-gyp is specifically "#!/usr/bin/env 
node", I wonder if it should use the node built by this package, rather 
than a dynamic node.

More generally, I see that there are 355 directories installed under 
"lib/node_modules/npm/node_modules" (which corresponds to the "deps" 
path above). Most of them don't seem to be available as Guix packages 
that could be depended upon by other Guix node packages. I'd guess 
node-gyp may not be the only one with shebangs that ought to be patched.

On 8/8/21 6:29 PM, Pierre Langlois wrote:

 >    ... `node-gyp' needs

 >    node headers to compile against, packaged as a tarball, which it tries

 >    to download.  Instead, we can run a `node-gyp --tarball <> configure'

 >    step to manually provide the tarball, which we can package separately

 >    for any given node version.

There is also a --nodedir option, which I found could work with 
something like:

     (string-append "--nodedir=" (assoc-ref inputs "node"))

That seems like it might be better, though I don't know all the 
considerations for cross-compilation and such.

-Philip
Pierre Langlois Sept. 25, 2021, 10:24 a.m. UTC | #4
Hi Philip,

Philip McGrath <philip@philipmcgrath.com> writes:

> I'm interested in the node-gyp part of this, which has come up in some other
> software I'm trying to package. These comments come with the caveat that my
> experience with node.js and npm is fairly shallow.

Thanks for your feedback!

>
> On 8/10/21 2:28 PM, Maxime Devos wrote:
>> Pierre Langlois schreef op ma 09-08-2021 om 00:33 [+0100]:
>>> @@ -120,6 +120,10 @@
>>>                  (("'/usr/bin/env'")
>>>                   (string-append "'" (which "env") "'")))
>>>
>>> +             ;; Fix /usr/bin/env shebang in node-gyp.
>>> +             (substitute* "deps/npm/node_modules/node-gyp/bin/node-gyp.js"
>>> +               (("#!/usr/bin/env") (string-append "#!" (which "env"))))
>> For cross-compilation, this should most likely be
>> (string-append (assoc-ref inputs "coreutils") "/bin/env")
>> or something like that instead.  Likewise in other places.
>
> Since the shebang line for node-gyp is specifically "#!/usr/bin/env node", I
> wonder if it should use the node built by this package, rather than a dynamic
> node.

Yeah we could do that, although I generally prefer to follow whatever
the script already does, there could be a good reason for them to use
`env' no?

> More generally, I see that there are 355 directories installed under
> "lib/node_modules/npm/node_modules" (which corresponds to the "deps" 
> path above). Most of them don't seem to be available as Guix packages that could
> be depended upon by other Guix node packages.

Yeah that's tricky, ideally we should remove all the node_modules deps
and package them separately, I wonder if anybody tried to do that
already. I would suspect it to be quite a lot of work, sometimes
unbundling stops being worth and when it's hard to maintain dependencies
manually.

Hopefully we can get there one day though! I don't want to deter anybody
from trying :-), I might give it a go on a raindy day.

> I'd guess node-gyp may not be the only one with shebangs that ought to
> be patched.

Yeah there could be others, although normally the patching phase from
the gnu build system should have taken care of most of them, hopefully
all, I'm not sure why it didn't work for /usr/bin/env though.

I would suggest we patch things as we encounter them, did you find
anymore issues when working on your package?

For instance, while working on a newer version of one of the packages in
this series, I saw we may need to patch GYP's python reference as well,
like so:

(substitute* "deps/npm/node_modules/node-gyp/gyp/gyp_main.py"
  (("#!/usr/bin/env python")
   (string-append "#!" (assoc-ref inputs "python") "/bin/python3")))

Only for node 14+. The reason seems to be that gyp still refers to
"python", but python2 is no longer a dependency for newer nodes. And it
seems GYP is perfectly happy with python3, and the shebang is fixed
upstream so a never node will be fine:
https://github.com/nodejs/node-gyp/pull/2355/files

Maybe updating node would be better than this fix though.

> On 8/8/21 6:29 PM, Pierre Langlois wrote:
>
>>    ... `node-gyp' needs
>
>>    node headers to compile against, packaged as a tarball, which it tries
>
>>    to download.  Instead, we can run a `node-gyp --tarball <> configure'
>
>>    step to manually provide the tarball, which we can package separately
>
>>    for any given node version.
>
> There is also a --nodedir option, which I found could work with something like:
>
>     (string-append "--nodedir=" (assoc-ref inputs "node"))
>
> That seems like it might be better, though I don't know all the considerations
> for cross-compilation and such.

Oh that's a good idea, I didn't really like having to download the
headers separately from the main package, especially given we run
snippet on the source to remove bundled dependencies.

Trying this out this approach does work, but I needed to:

  - Create a union directory with both node and libuv. The node package
    only has headers for V8/node, but we also need libuv, so doing
    something like this works:

    (union-build node-sources
                (list (assoc-ref inputs "node")
                      (assoc-ref inputs "libuv"))
                #:create-all-directories? #t
                #:log-port (%make-void-port "w"))

  - For some reason, --nodedir didn't really "configure" gyp to use that
    node directory, I think it's meant to be passed everytime you run
    any gyp command. Instead I found that you can use and environment
    variable:

    (setenv "npm_config_nodedir" node-sources)

And that works for the packages in this series!  That'll be much better
than before, I'll do it this way.

Thanks again for taking a look, I'll see if I can send updated patches
sometimes this weekend.

Pierre
Philip McGrath Sept. 26, 2021, 10:02 p.m. UTC | #5
Hi Pierre,

On 9/25/21 6:24 AM, Pierre Langlois wrote:
> Philip McGrath <philip@philipmcgrath.com> writes:
>> Since the shebang line for node-gyp is specifically "#!/usr/bin/env node", I
>> wonder if it should use the node built by this package, rather than a dynamic
>> node.
> 
> Yeah we could do that, although I generally prefer to follow whatever
> the script already does, there could be a good reason for them to use
> `env' no
I think it might be better to use `patch-shebang` from `(guix build 
utils)` rather than `substitute*` these by hand, and it seems that 
`patch-shebang` removed the indirection through `env`. My guess is most 
of these cases are to accommodate the fact that `node` and `python` are 
often installed to places other than `/usr/bin`.

 >> I'd guess node-gyp may not be the only one with shebangs that ought to

 >> be patched.

 >

 > Yeah there could be others, although normally the patching phase from

 > the gnu build system should have taken care of most of them, hopefully

 > all, I'm not sure why it didn't work for /usr/bin/env though.

 >

 > I would suggest we patch things as we encounter them, did you find

 > anymore issues when working on your package?


Looking at `gnu-build-system`, it seems that the `'patch-shebangs` phase 
only operates on files installed in the "/bin" and "/sbin" 
subdirectories of the package's outputs. That restriction doesn't make 
sense to me in general: for instance, what about "/libexec"? For Node 
specifically, this misses a lot of stuff under "/lib/node_modules" and 
"/lib/node_modules/npm/node_modules". I think I more general fix could 
subsume the `'patch-npm-shebang` and `'patch-node-shebang` phases in 
building Node, too.

 > For instance, while working on a newer version of one of the packages in

 > this series, I saw we may need to patch GYP's python reference as well,

 > like so:

 >

 > (substitute* "deps/npm/node_modules/node-gyp/gyp/gyp_main.py"

 >    (("#!/usr/bin/env python")

 >     (string-append "#!" (assoc-ref inputs "python") "/bin/python3")))

 >

 > Only for node 14+. The reason seems to be that gyp still refers to

 > "python", but python2 is no longer a dependency for newer nodes. And it

 > seems GYP is perfectly happy with python3, and the shebang is fixed

 > upstream so a never node will be fine:

 > https://github.com/nodejs/node-gyp/pull/2355/files


I think in some places (but perhaps not enough places) Guix uses 
`python-wrapper` to work around this ...

 >

 > Maybe updating node would be better than this fix though.

I'm not totally clear on whether the upstream fix is in 14.17.6 LTS, 
but, if so, that seems great!

> 
>> More generally, I see that there are 355 directories installed under
>> "lib/node_modules/npm/node_modules" (which corresponds to the "deps"
>> path above). Most of them don't seem to be available as Guix packages that could
>> be depended upon by other Guix node packages.
> 
> Yeah that's tricky, ideally we should remove all the node_modules deps
> and package them separately, I wonder if anybody tried to do that
> already. I would suspect it to be quite a lot of work, sometimes
> unbundling stops being worth and when it's hard to maintain dependencies
> manually.
> 
> Hopefully we can get there one day though! I don't want to deter anybody
> from trying :-), I might give it a go on a raindy day.

Since these are developed and released with Node, and apparently we can 
build them as part of the Node build process, I was thinking we could 
just make packages that point to these versions we're already building. 
It might be good to hear from someone who develops with node/npm, though 
... I just use it to install software that I can't find packaged elsewhere.

> 
>> On 8/8/21 6:29 PM, Pierre Langlois wrote:
>>
>>>     ... `node-gyp' needs
>>
>>>     node headers to compile against, packaged as a tarball, which it tries
>>
>>>     to download.  Instead, we can run a `node-gyp --tarball <> configure'
>>
>>>     step to manually provide the tarball, which we can package separately
>>
>>>     for any given node version.
>>
>> There is also a --nodedir option, which I found could work with something like:
>>
>>      (string-append "--nodedir=" (assoc-ref inputs "node"))
>>
>> That seems like it might be better, though I don't know all the considerations
>> for cross-compilation and such.
> 
> Oh that's a good idea, I didn't really like having to download the
> headers separately from the main package, especially given we run
> snippet on the source to remove bundled dependencies.
> 
> Trying this out this approach does work, but I needed to:
> 
>    - Create a union directory with both node and libuv. The node package
>      only has headers for V8/node, but we also need libuv, so doing
>      something like this works:
> 
>      (union-build node-sources
>                  (list (assoc-ref inputs "node")
>                        (assoc-ref inputs "libuv"))
>                  #:create-all-directories? #t
>                  #:log-port (%make-void-port "w"))

I found it worked to just add libuv as an input of packages built with 
node-gyp. I hadn't tried to change `node-build-system`, but I think that 
would be the place to do it.

> 
>    - For some reason, --nodedir didn't really "configure" gyp to use that
>      node directory, I think it's meant to be passed everytime you run
>      any gyp command. Instead I found that you can use and environment
>      variable:
> 
>      (setenv "npm_config_nodedir" node-sources)

That seems right. I believe there's a similar "npm_config_python" for 
the Python executable to use.

Alternatively, I think it's possible to configure these in 
$PREFIX/etc/npmrc: <https://docs.npmjs.com/cli/v7/configuring-npm/npmrc>

> 
> And that works for the packages in this series!  That'll be much better
> than before, I'll do it this way.
> 
> Thanks again for taking a look, I'll see if I can send updated patches
> sometimes this weekend.

Glad it was useful!

For patching the shebangs, here's a variant of node-lts that worked for 
me, though I think it would be even better to combine it with the 
existing phases:

```
(define-public patched-node
   (let ((node node-lts))
     (package
       (inherit node)
       (arguments
        (substitute-keyword-arguments (package-arguments node)
          ((#:phases standard-phases)
           `(modify-phases ,standard-phases
              (add-after 'patch-npm-shebang 'patch-more-shebangs
                (lambda* (#:key inputs outputs #:allow-other-keys)
                  (define (append-map f lst)
                    (apply append (map f lst)))
                  ;; from patch-shebangs
                  (define bin-directories
                    ;;(match-lambda
                    ;;  ((_ . dir)
                    (lambda (pr)
                      (let ((dir (cdr pr)))
                        (list (string-append dir "/bin")
                              (string-append dir "/sbin")))))
                  (define output-bindirs
                    (append-map bin-directories outputs))
                  (define input-bindirs
                    ;; Shebangs should refer to binaries of the target 
system---i.e., from
                    ;; "inputs", not from "native-inputs".
                    (append-map bin-directories inputs))
                  (define path
                    (append output-bindirs input-bindirs))
                  (with-directory-excursion
                      (string-append (assoc-ref outputs "out")
                                     "/lib/node_modules/npm/node_modules")
                    (for-each
                     ;;(cut patch-shebang <> path)
                     (lambda (file)
                       (patch-shebang file path))
                     ;; from patch-generated-file-shebangs
                     (find-files "."
                                 (lambda (file stat)
                                   (and (eq? 'regular (stat:type stat))
                                        (not (zero? (logand (stat:mode 
stat) #o100)))))
                                 #:stat lstat))))))))))))
```

-Philip
M Sept. 27, 2021, 10:11 a.m. UTC | #6
Philip McGrath schreef op zo 26-09-2021 om 18:02 [-0400]:
> Looking at `gnu-build-system`, it seems that the `'patch-shebangs` phase 
> only operates on files installed in the "/bin" and "/sbin" 
> subdirectories of the package's outputs. That restriction doesn't make 
> sense to me in general: for instance, what about "/libexec"?

'libexec' is included on core-updates{,-frozen}.  I believe the idea of the
restriction is to avoid patching too much.  E.g., "autoconf" has a file
share/autoconf/build-aux/config.guess with a #!/bin/sh shebang.  It should
not be patched, otherwise tarballs created with "make dist" would include
a store path and hence be Guix-specific and architecture-specific.

Greetings,
Maxime.
Philip McGrath Sept. 29, 2021, 4:45 a.m. UTC | #7
On 9/27/21 6:11 AM, Maxime Devos wrote:
> Philip McGrath schreef op zo 26-09-2021 om 18:02 [-0400]:
>> Looking at `gnu-build-system`, it seems that the `'patch-shebangs` phase
>> only operates on files installed in the "/bin" and "/sbin"
>> subdirectories of the package's outputs. That restriction doesn't make
>> sense to me in general: for instance, what about "/libexec"?
> 
> 'libexec' is included on core-updates{,-frozen}.  I believe the idea of the
> restriction is to avoid patching too much.  E.g., "autoconf" has a file
> share/autoconf/build-aux/config.guess with a #!/bin/sh shebang.  It should
> not be patched, otherwise tarballs created with "make dist" would include
> a store path and hence be Guix-specific and architecture-specific.

That makes some sense. I would have thought checking that the file is 
executable would catch most such cases, but, if this works for 
`gnu-build-system`, great.

As I look at potentially making a patch, another thing that seems odd is 
that `(gnu packages node)` exports node@10.24.0 as `node` (via 
`define-public`), but node@14.16.0 as `node-lts`. Normally, if I saw 
that there were packages `node` and `node-lts`, I'd assume that 
`node-lts` was *older*. It's especially confusing because, at the 
command line, `guix install node` refers to what in Scheme you have to 
write as `node-lts`.

I wonder if it was a mistake, and should have used `define` rather than 
`define-public`, since this code:

```
;; This should be the latest version of node that still builds without
;; depending on llhttp.
(define-public node-bootstrap
   (hidden-package node))
```

seems to be trying to hide the older node.

It looks like `node` has only a few dependents, and it seems like at 
least several of them only used it because it had the more obvious name. 
The `node-build-system` uses `node-lts` as the `(default-node)`.

Would it make sense to change the names? Or just to remove the 
`define-public` of `node`?

-Philip
Philip McGrath Sept. 29, 2021, 6:31 a.m. UTC | #8
On 9/25/21 6:24 AM, Pierre Langlois wrote:
> For instance, while working on a newer version of one of the packages in
> this series, I saw we may need to patch GYP's python reference as well,
> like so:
> 
> (substitute* "deps/npm/node_modules/node-gyp/gyp/gyp_main.py"
>    (("#!/usr/bin/env python")
>     (string-append "#!" (assoc-ref inputs "python") "/bin/python3")))
> 
> Only for node 14+. The reason seems to be that gyp still refers to
> "python", but python2 is no longer a dependency for newer nodes. And it
> seems GYP is perfectly happy with python3, and the shebang is fixed
> upstream so a never node will be fine:
> https://github.com/nodejs/node-gyp/pull/2355/files

I think this needs to be a `python` from `inputs` rather than 
`native-inputs`, for cross-compilation, IIUC.

I tried building node 14.18.0, and it ran into other issues, but there 
were still a number shebangs with `python` rather than `python3` in 
various places, though I think they'd be fine with `python-wrapper`.

-Philip
Philip McGrath Sept. 30, 2021, 10:56 p.m. UTC | #9
Hi,

I've reworked the part of the patch series dealing with node-gyp.

I'd like to find an NPM addon package to submit as part of this series, too,
basically as a test case. If I can find one that doesn't raise too many other
complications, I may send this in as a separate patch, but feel free to try it
with tree-sitter, too.

There are a few things I'm still not sure about. I haven't made
node-build-system add libuv as an implicit input, because I think some
node-gyp addons don't actually need libuv, but maybe it's common enough that
it should be done automatically.

Likewise, I haven't tried to change the issue of `node` referring to
`node-bootstrap`, but I still think it should be changed.

These patches are also on GitLab at
<https://gitlab.com/philip1/guix-patches/-/tree/wip-node-npm-gyp>.

Improvements welcome!

-Philip


Philip McGrath (3):
  gnu: node: Avoid duplicating build phases.
  gnu: node: Update to 10.24.1 for bootstrapping.
  guix: node-build-system: Support compiling addons with node-gyp.

 gnu/packages/node.scm            | 187 ++++++++++---------------------
 guix/build-system/node.scm       |   7 +-
 guix/build/node-build-system.scm |   9 ++
 3 files changed, 74 insertions(+), 129 deletions(-)
diff mbox series

Patch

diff --git a/gnu/packages/node.scm b/gnu/packages/node.scm
index 36c45e9c7a..522d4943d0 100644
--- a/gnu/packages/node.scm
+++ b/gnu/packages/node.scm
@@ -6,7 +6,7 @@ 
 ;;; Copyright © 2017 Mike Gerwitz <mtg@gnu.org>
 ;;; Copyright © 2018 Tobias Geerinckx-Rice <me@tobias.gr>
 ;;; Copyright © 2018, 2019, 2020, 2021 Marius Bakke <marius@gnu.org>
-;;; Copyright © 2020 Pierre Langlois <pierre.langlois@gmx.com>
+;;; Copyright © 2020, 2021 Pierre Langlois <pierre.langlois@gmx.com>
 ;;; Copyright © 2020 Ricardo Wurmus <rekado@elephly.net>
 ;;; Copyright © 2021 Simon Tournier <zimon.toutoune@gmail.com>
 ;;;
@@ -120,6 +120,10 @@ 
                (("'/usr/bin/env'")
                 (string-append "'" (which "env") "'")))

+             ;; Fix /usr/bin/env shebang in node-gyp.
+             (substitute* "deps/npm/node_modules/node-gyp/bin/node-gyp.js"
+               (("#!/usr/bin/env") (string-append "#!" (which "env"))))
+
              ;; FIXME: These tests fail in the build container, but they don't
              ;; seem to be indicative of real problems in practice.
              (for-each delete-file
@@ -661,6 +665,10 @@  source files.")
                  (("'/usr/bin/env'")
                   (string-append "'" (which "env") "'")))

+               ;; Fix /usr/bin/env shebang in node-gyp.
+               (substitute* "deps/npm/node_modules/node-gyp/bin/node-gyp.js"
+                 (("#!/usr/bin/env") (string-append "#!" (which "env"))))
+
                ;; FIXME: These tests fail in the build container, but they don't
                ;; seem to be indicative of real problems in practice.
                (for-each delete-file