diff mbox series

[bug#61910,v2] git-download: Apply CR/LF to Git repository from SWH.

Message ID 20230302120546.1074630-1-zimon.toutoune@gmail.com
State New
Headers show
Series [bug#61910,v2] git-download: Apply CR/LF to Git repository from SWH. | expand

Commit Message

Simon Tournier March 2, 2023, 12:05 p.m. UTC
From: Ludovic Courtès <ludo@gnu.org>

Fixes a bug whereby CR/LF convention would not be applied on Git
repositories retrieved from SWH:

  https://sympa.inria.fr/sympa/arc/swh-devel/2023-03/msg00000.html

Reported by Simon Tournier <simon.tournier@inserm.fr>.
Suggested by Valentin Lorentz <valentin.lorentz@inria.fr>.
Co-authored by Simon Tournier <simon.tournier@inserm.fr>.

* guix/git-download.scm (git-fetch)[build]: Add Git operations conditioned by
.gitattributes on the result from SWH.
---
 guix/git-download.scm | 24 ++++++++++++++++++++++--
 1 file changed, 22 insertions(+), 2 deletions(-)

Hi,

This uses 'flat' as previously but applies some Git commands for fixing CR/LF
when .gitattributes is present or not.  It works on the example hidapi.

I have not tried for other examples.

WDYT?

Cheers,
simon



base-commit: af95f2d8f98eb2c8c64954bb2fd0b70838899174

Comments

Ludovic Courtès March 2, 2023, 1:15 p.m. UTC | #1
Hello!

Simon Tournier <zimon.toutoune@gmail.com> skribis:


[...]

>                         (swh-download (getenv "git url") (getenv "git commit")
> -                                     #$output))))))))
> +                                     #$output)
> +                       ;; Perform CR/LF conversion if specificied by .gitattributes
> +                       (when (find-files "." ".gitattributes")
> +                         (invoke #+(file-append git "/bin/git") "-C" #$output
> +                                 "init")
> +                         (invoke #+(file-append git "/bin/git") "-C" #$output
> +                                 "config" "--local" "user.email" "you@example.org")
> +                         (invoke #+(file-append git "/bin/git") "-C" #$output
> +                                 "config" "--local" "user.name" "Your Name")
> +                         (invoke #+(file-append git "/bin/git") "-C" #$output
> +                                 "add" ".")
> +                         (invoke #+(file-append git "/bin/git") "-C" #$output
> +                                 "commit" "-am" "'init'")
> +                         (invoke #+(file-append git "/bin/git") "-C" #$output
> +                                 "read-tree" "--empty")
> +                         (invoke #+(file-append git "/bin/git") "-C" #$output
> +                                 "reset" "--hard")
> +                         (delete-file-recursively
> +                          (string-append #$output "/.git"))))))))))

This is much better than fetching the whole repo, well done!

Should we replace (find-files …) with (file-exists? ".gitattributes")?

I wonder if we could achieve the same without creating a repo just to
delete it afterwards, either by invoking a low-level ‘git’ command it if
exists, or using Guile-Git, or doing it ourselves in Scheme.

WDYT?

Thanks,
Ludo’.
Simon Tournier March 2, 2023, 1:50 p.m. UTC | #2
Re,

On Thu, 2 Mar 2023 at 14:15, Ludovic Courtès <ludo@gnu.org> wrote:

> Should we replace (find-files …) with (file-exists? ".gitattributes")?

Damned!  It was what I was looking for. :-)

> I wonder if we could achieve the same without creating a repo just to
> delete it afterwards, either by invoking a low-level ‘git’ command it if
> exists, or using Guile-Git, or doing it ourselves in Scheme.

Doing it ourselves in Scheme, I do not know.  I guess it would be
hard.  Basically, it is not clear for me what we would have to
implement.  For instance, we get the ("normalized") content fetched
from SWH , so we have:

--8<---------------cut here---------------start------------->8---
hidapi.swh/testgui/test.cpp:            C++ source, ASCII text
hidapi.swh/testgui/testgui.sln:         UTF-8 Unicode (with BOM) text
hidapi.swh/testgui/testgui.vcproj:      XML 1.0 document, ASCII text
--8<---------------cut here---------------end--------------->8---

and we need:

--8<---------------cut here---------------start------------->8---
hidapi.guix/testgui/test.cpp:            C++ source, ASCII text
hidapi.guix/testgui/testgui.sln:         UTF-8 Unicode (with BOM)
text, with CRLF line terminators
hidapi.guix/testgui/testgui.vcproj:      XML 1.0 document, ASCII text,
with CRLF line terminators
--8<---------------cut here---------------end--------------->8---

So the question is how to know which files require CRLF and which not?

From my understanding, this information is provided by:

--8<---------------cut here---------------start------------->8---
$ cat hidapi.guix/.gitattributes
* text=auto

*.sln text eol=crlf
*.vcproj text eol=crlf

bootstrap text eol=lf
configure.ac text eol=lf
--8<---------------cut here---------------end--------------->8---

As zack pointed yesterday, well that what I understood or
misunderstood, it is related to "smudge" filters [1].

1: <https://git-scm.com/book/en/v2/Customizing-Git-Git-Attributes>


About Guile-Git, probably the same as you answered me earlier today. ;-)

        <civodul> it's just that it was easier to implement and yeah,
shallow clones
                  are nice
        <civodul> we don't need tight integration in this context, so less of an
                  incentive to use Guile-Git
        <civodul> (it's very different from the channel code from that
perspective)

Kidding aside, I agree that relying on Guile-Git would mean being less
"fragile".  Hum, somehow this asks about the size of the binary seed
for bootstrapping.

About low-level 'git' command, maybe the same could be achieved
without "git init" + "git add" + "git commit".  Well, my gitology is
not enough skilled. :-)

Cheers,
simon
Simon Tournier March 2, 2023, 5:42 p.m. UTC | #3
Hi,

Follow up after discussing on #swh-devel. :-)

On jeu., 02 mars 2023 at 14:50, Simon Tournier <zimon.toutoune@gmail.com> wrote:

> So the question is how to know which files require CRLF and which not?
>
> From my understanding, this information is provided by:
>
> $ cat hidapi.guix/.gitattributes
> * text=auto
>
> *.sln text eol=crlf
> *.vcproj text eol=crlf
>
> bootstrap text eol=lf
> configure.ac text eol=lf

Yeah, we could parse this file.  It is not that hard but I am not so
much interested right now for implementing such.


> As zack pointed yesterday, well that what I understood or
> misunderstood, it is related to "smudge" filters [1].

No, it is not “smudge”.  These attributes are much simpler than
smudge/clean filters – well they are scary!


> Kidding aside, I agree that relying on Guile-Git would mean being less
> "fragile".  Hum, somehow this asks about the size of the binary seed
> for bootstrapping.

After some quick look, Guile-Git provides some plumbing and here it is
more about porcelain.  Well, I do not know how it could be implemented
with Guile-Git.


> About low-level 'git' command, maybe the same could be achieved
> without "git init" + "git add" + "git commit".  Well, my gitology is
> not enough skilled. :-)

Asking on #swh-devel, vlorentz confirmed using something similar.
Quoting: « I looked at git's normalization-related test cases; but they
all use git-add »

Therefore, I guess that’s probably the best tradeoff for now.

Well, I am going to run more tests.


Cheers,
simon
Ludovic Courtès March 3, 2023, 11:12 a.m. UTC | #4
Hi,

Simon Tournier <zimon.toutoune@gmail.com> skribis:

> On Thu, 2 Mar 2023 at 14:15, Ludovic Courtès <ludo@gnu.org> wrote:

[...]

>> I wonder if we could achieve the same without creating a repo just to
>> delete it afterwards, either by invoking a low-level ‘git’ command it if
>> exists, or using Guile-Git, or doing it ourselves in Scheme.
>
> Doing it ourselves in Scheme, I do not know.  I guess it would be
> hard.
[...]

> From my understanding, this information is provided by:
>
> $ cat hidapi.guix/.gitattributes
> * text=auto
>
> *.sln text eol=crlf
> *.vcproj text eol=crlf
>
> bootstrap text eol=lf
> configure.ac text eol=lf

Uh, tricky enough.

[...]

> Kidding aside, I agree that relying on Guile-Git would mean being less
> "fragile".

Libgit2 has an “attr” interface¹ that Guile-Git only partially binds.
That interface works on a repo, so we’d still have to ‘git init’ just to
be able to read those attributes.  Not great.

¹ https://libgit2.org/libgit2/#HEAD/group/attr

But anyway, skimming through gitattributes(5) has convinced me that we
should not try to implement it by ourselves.  :-)

> About low-level 'git' command, maybe the same could be achieved
> without "git init" + "git add" + "git commit".  Well, my gitology is
> not enough skilled. :-)

I did a quick search and didn’t find anything so it looks like the
strategy you chose is the right one.

Thanks,
Ludo’.
Ludovic Courtès March 3, 2023, 1:27 p.m. UTC | #5
Hi again,

I pushed a slightly modified version of your patch as commit
58f20fa8181bdcd4269671e1d3cef1268947af3a.  It still passes the hidapi
test.  :-)

Let me know if anything’s amiss!

Thanks,
Ludo’.
Simon Tournier March 3, 2023, 6:52 p.m. UTC | #6
Hi,

On Fri, 03 Mar 2023 at 14:27, Ludovic Courtès <ludo@gnu.org> wrote:

> Let me know if anything’s amiss!

The version using ’file-exists?’ works well for Julia packages.  Well, I
will try your submitted one later next week. :-)

Cheers,
simon
diff mbox series

Patch

diff --git a/guix/git-download.scm b/guix/git-download.scm
index a1566bed4d..be0b13615c 100644
--- a/guix/git-download.scm
+++ b/guix/git-download.scm
@@ -1,8 +1,9 @@ 
 ;;; GNU Guix --- Functional package management for GNU
-;;; Copyright © 2014, 2015, 2016, 2017, 2018, 2019, 2020, 2021 Ludovic Courtès <ludo@gnu.org>
+;;; Copyright © 2014-2021, 2023 Ludovic Courtès <ludo@gnu.org>
 ;;; Copyright © 2017 Mathieu Lirzin <mthl@gnu.org>
 ;;; Copyright © 2017 Christopher Baines <mail@cbaines.net>
 ;;; Copyright © 2020 Jakub Kądziołka <kuba@kadziolka.net>
+;;; Copyright © 2023 Simon Tournier <zimon.toutoune@gmail.com>
 ;;;
 ;;; This file is part of GNU Guix.
 ;;;
@@ -162,8 +163,27 @@  (define recursive?
                      (parameterize ((%verify-swh-certificate? #f))
                        (format (current-error-port)
                                "Trying to download from Software Heritage...~%")
+
                        (swh-download (getenv "git url") (getenv "git commit")
-                                     #$output))))))))
+                                     #$output)
+                       ;; Perform CR/LF conversion if specificied by .gitattributes
+                       (when (find-files "." ".gitattributes")
+                         (invoke #+(file-append git "/bin/git") "-C" #$output
+                                 "init")
+                         (invoke #+(file-append git "/bin/git") "-C" #$output
+                                 "config" "--local" "user.email" "you@example.org")
+                         (invoke #+(file-append git "/bin/git") "-C" #$output
+                                 "config" "--local" "user.name" "Your Name")
+                         (invoke #+(file-append git "/bin/git") "-C" #$output
+                                 "add" ".")
+                         (invoke #+(file-append git "/bin/git") "-C" #$output
+                                 "commit" "-am" "'init'")
+                         (invoke #+(file-append git "/bin/git") "-C" #$output
+                                 "read-tree" "--empty")
+                         (invoke #+(file-append git "/bin/git") "-C" #$output
+                                 "reset" "--hard")
+                         (delete-file-recursively
+                          (string-append #$output "/.git"))))))))))
 
   (mlet %store-monad ((guile (package->derivation guile system)))
     (gexp->derivation (or name "git-checkout") build