diff mbox series

[bug#51427] nix: libstore: Do not remove unused links when deleting specific items.

Message ID 20211027034918.4591-1-maxim.cournoyer@gmail.com
State New
Headers show
Series [bug#51427] nix: libstore: Do not remove unused links when deleting specific items. | expand

Checks

Context Check Description
cbaines/comparison success View comparision
cbaines/git branch success View Git branch
cbaines/applying patch success View Laminar job
cbaines/issue success View issue

Commit Message

Maxim Cournoyer Oct. 27, 2021, 3:49 a.m. UTC
Deleting unused links can be a very costly operation, especially on rotative
hard drives.  As removing single store items is often used for experimentation
rather than for cleaning purposes, this change allows it to run without the
links cleanup.

* nix/libstore/gc.cc (LocalStore::collectGarbage): Do not clean up links when
the specified action is GCOptions::gcDeleteSpecific.
---
 nix/libstore/gc.cc | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

Comments

Ludovic Courtès Oct. 28, 2021, 2:16 p.m. UTC | #1
Hi,

Maxim Cournoyer <maxim.cournoyer@gmail.com> skribis:

> Deleting unused links can be a very costly operation, especially on rotative
> hard drives.  As removing single store items is often used for experimentation
> rather than for cleaning purposes, this change allows it to run without the
> links cleanup.
>
> * nix/libstore/gc.cc (LocalStore::collectGarbage): Do not clean up links when
> the specified action is GCOptions::gcDeleteSpecific.
> ---
>  nix/libstore/gc.cc | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/nix/libstore/gc.cc b/nix/libstore/gc.cc
> index e1d0765154..7d872d8cc1 100644
> --- a/nix/libstore/gc.cc
> +++ b/nix/libstore/gc.cc
> @@ -771,7 +771,7 @@ void LocalStore::collectGarbage(const GCOptions & options, GCResults & results)
>      deleteGarbage(state, state.trashDir);
>  
>      /* Clean up the links directory. */
> -    if (options.action == GCOptions::gcDeleteDead || options.action == GCOptions::gcDeleteSpecific) {
> +    if (options.action == GCOptions::gcDeleteDead) {

I believe the effect is that ‘guix gc -D /gnu/store/…-disk-image’ would
remove nothing: /gnu/store/.links would still contain a copy of that big
disk image, so as a result, you’ve freed zero bytes.

Am I right?

Perhaps what we could do is, upon ‘gcDeleteSpecific’, only look at the
relevant entry in .links instead of traversing all of them.

WDYT?

Thanks,
Ludo’.
Liliana Marie Prikler Oct. 31, 2021, 8:50 a.m. UTC | #2
Hi,

Am Donnerstag, den 28.10.2021, 16:16 +0200 schrieb Ludovic Courtès:
> Hi,
> 
> Maxim Cournoyer <maxim.cournoyer@gmail.com> skribis:
> 
> > Deleting unused links can be a very costly operation, especially on
> > rotative hard drives.  As removing single store items is often used
> > for experimentation rather than for cleaning purposes, this change
> > allows it to run without the links cleanup.
> > 
> > * nix/libstore/gc.cc (LocalStore::collectGarbage): Do not clean up
> > links when
> > the specified action is GCOptions::gcDeleteSpecific.
> > ---
> >  nix/libstore/gc.cc | 2 +-
> >  1 file changed, 1 insertion(+), 1 deletion(-)
> > 
> > diff --git a/nix/libstore/gc.cc b/nix/libstore/gc.cc
> > index e1d0765154..7d872d8cc1 100644
> > --- a/nix/libstore/gc.cc
> > +++ b/nix/libstore/gc.cc
> > @@ -771,7 +771,7 @@ void LocalStore::collectGarbage(const GCOptions
> > & options, GCResults & results)
> >      deleteGarbage(state, state.trashDir);
> >  
> >      /* Clean up the links directory. */
> > -    if (options.action == GCOptions::gcDeleteDead ||
> > options.action == GCOptions::gcDeleteSpecific) {
> > +    if (options.action == GCOptions::gcDeleteDead) {
> 
> I believe the effect is that ‘guix gc -D /gnu/store/…-disk-image’
> would remove nothing: /gnu/store/.links would still contain a copy of
> that big disk image, so as a result, you’ve freed zero bytes.
> 
> Am I right?
I think that might be the point.  As Maxim said, single items are
(likely) not removed for cleaning purposes, so freeing the disk image
has little effect.  Plus, you could invoke it like

  guix gc -D dead-item dead-item live-item dead-item

It would fail at live-item and then not continue to free the links of
the two dead items prior.

So there's a few things we could do here:

1. simply fail and have the user deal with it (including the option of
doing a normal `guix gc' or `guix gc -C 1')
2. remember which paths were live and dead and always clean up the
links, only reporting errors afterwards
3. add an option to explicitly check the .links directory (which
defaults to true for the current things, but could also be used to
clean links after a liveness check or after a do-nothing `guix gc -F').
4. ...

WDYT?
Ludovic Courtès Oct. 31, 2021, 2:07 p.m. UTC | #3
Hi,

Liliana Marie Prikler <liliana.prikler@gmail.com> skribis:

> Am Donnerstag, den 28.10.2021, 16:16 +0200 schrieb Ludovic Courtès:
>> Hi,
>> 
>> Maxim Cournoyer <maxim.cournoyer@gmail.com> skribis:
>> 
>> > Deleting unused links can be a very costly operation, especially on
>> > rotative hard drives.  As removing single store items is often used
>> > for experimentation rather than for cleaning purposes, this change
>> > allows it to run without the links cleanup.
>> > 
>> > * nix/libstore/gc.cc (LocalStore::collectGarbage): Do not clean up
>> > links when
>> > the specified action is GCOptions::gcDeleteSpecific.
>> > ---
>> >  nix/libstore/gc.cc | 2 +-
>> >  1 file changed, 1 insertion(+), 1 deletion(-)
>> > 
>> > diff --git a/nix/libstore/gc.cc b/nix/libstore/gc.cc
>> > index e1d0765154..7d872d8cc1 100644
>> > --- a/nix/libstore/gc.cc
>> > +++ b/nix/libstore/gc.cc
>> > @@ -771,7 +771,7 @@ void LocalStore::collectGarbage(const GCOptions
>> > & options, GCResults & results)
>> >      deleteGarbage(state, state.trashDir);
>> >  
>> >      /* Clean up the links directory. */
>> > -    if (options.action == GCOptions::gcDeleteDead ||
>> > options.action == GCOptions::gcDeleteSpecific) {
>> > +    if (options.action == GCOptions::gcDeleteDead) {
>> 
>> I believe the effect is that ‘guix gc -D /gnu/store/…-disk-image’
>> would remove nothing: /gnu/store/.links would still contain a copy of
>> that big disk image, so as a result, you’ve freed zero bytes.
>> 
>> Am I right?
> I think that might be the point.  As Maxim said, single items are
> (likely) not removed for cleaning purposes, so freeing the disk image
> has little effect.

What do you mean?  When doing VM testing, I regularly do ‘guix gc -D
/gnu/store/…-disk-image’ precisely to save space.  Fortunately it does
have the intended effect of freeing a bunch of GiBs.

> Plus, you could invoke it like
>
>   guix gc -D dead-item dead-item live-item dead-item
>
> It would fail at live-item and then not continue to free the links of
> the two dead items prior.

Yes, and that’s annoying, but it’s unrelated.  :-)

> So there's a few things we could do here:
>
> 1. simply fail and have the user deal with it (including the option of
> doing a normal `guix gc' or `guix gc -C 1')
> 2. remember which paths were live and dead and always clean up the
> links, only reporting errors afterwards
> 3. add an option to explicitly check the .links directory (which
> defaults to true for the current things, but could also be used to
> clean links after a liveness check or after a do-nothing `guix gc -F').
> 4. ...

You seem to be proposing to remove ‘-D’ altogether.  I agree it has the
shortcomings you write, but I think it’s occasionally useful
nonetheless.

My proposal would be either the status quo, or removing just the one
link that matters from /gnu/store/.links upon ‘-D’.

Thoughts?

Ludo’.
Liliana Marie Prikler Oct. 31, 2021, 2:39 p.m. UTC | #4
Hi,

Am Sonntag, den 31.10.2021, 15:07 +0100 schrieb Ludovic Courtès:
> What do you mean?  When doing VM testing, I regularly do ‘guix gc -D
> /gnu/store/…-disk-image’ precisely to save space.  Fortunately it
> does have the intended effect of freeing a bunch of GiBs.
Fair enough, different strokes and all.

> > Plus, you could invoke it like
> > 
> >   guix gc -D dead-item dead-item live-item dead-item
> > 
> > It would fail at live-item and then not continue to free the links
> > of the two dead items prior.
> 
> Yes, and that’s annoying, but it’s unrelated.  :-)
> 
> > So there's a few things we could do here:
> > 
> > 1. simply fail and have the user deal with it (including the option
> > of doing a normal `guix gc' or `guix gc -C 1')
> > 2. remember which paths were live and dead and always clean up the
> > links, only reporting errors afterwards
> > 3. add an option to explicitly check the .links directory (which
> > defaults to true for the current things, but could also be used to
> > clean links after a liveness check or after a do-nothing `guix gc
> > -F').
> > 4. ...
> 
> You seem to be proposing to remove ‘-D’ altogether.  
I wrote no such thing.  Obviously there needs to be a way of removing
single items from the store, but what else to do is not so clear.  It
is only obvious that traversing all of .links is too expensive.

> I agree it has the shortcomings you write, but I think it’s
> occasionally useful nonetheless.
> 
> My proposal would be either the status quo, or removing just the one
> link that matters from /gnu/store/.links upon ‘-D’.
> 
> Thoughts?
There isn't "just the one link that matters" when it comes to removing
multiple items -- heck, even if tasked to free up just 5MB rather than
all of the garbage, traversing all .links is probably too expensive.

Accepting that we might want to always delete links at the end, I think
`guix gc' needs a way to record which links had their count go to 1
during garbage deletion, so that when it comes to deleting links only
those are tried (and of course checked again to make sure their count
is indeed still 1).  This should be the preferred mode if less than
some arbitrary large number of store items are affected (let's say 1024
or some multiple of it) or a total cleaning of .links has been forced.

Though perhaps there's a way to do this without manual recording. 
Let's say we notice a link count going to one as we clear the trash. 
We could just add the link behind it to the trash right away to ensure
that it is not reused by something else and then clean the trash a
second time (we would have to check for potential race conditions in
this case).

WDYT?

Liliana
Maxim Cournoyer Oct. 31, 2021, 8:51 p.m. UTC | #5
Hi,

Ludovic Courtès <ludo@gnu.org> writes:

> Hi,
>
> Liliana Marie Prikler <liliana.prikler@gmail.com> skribis:
>
>> Am Donnerstag, den 28.10.2021, 16:16 +0200 schrieb Ludovic Courtès:
>>> Hi,
>>> 
>>> Maxim Cournoyer <maxim.cournoyer@gmail.com> skribis:
>>> 
>>> > Deleting unused links can be a very costly operation, especially on
>>> > rotative hard drives.  As removing single store items is often used
>>> > for experimentation rather than for cleaning purposes, this change
>>> > allows it to run without the links cleanup.
>>> > 
>>> > * nix/libstore/gc.cc (LocalStore::collectGarbage): Do not clean up
>>> > links when
>>> > the specified action is GCOptions::gcDeleteSpecific.
>>> > ---
>>> >  nix/libstore/gc.cc | 2 +-
>>> >  1 file changed, 1 insertion(+), 1 deletion(-)
>>> > 
>>> > diff --git a/nix/libstore/gc.cc b/nix/libstore/gc.cc
>>> > index e1d0765154..7d872d8cc1 100644
>>> > --- a/nix/libstore/gc.cc
>>> > +++ b/nix/libstore/gc.cc
>>> > @@ -771,7 +771,7 @@ void LocalStore::collectGarbage(const GCOptions
>>> > & options, GCResults & results)
>>> >      deleteGarbage(state, state.trashDir);
>>> >  
>>> >      /* Clean up the links directory. */
>>> > -    if (options.action == GCOptions::gcDeleteDead ||
>>> > options.action == GCOptions::gcDeleteSpecific) {
>>> > +    if (options.action == GCOptions::gcDeleteDead) {
>>> 
>>> I believe the effect is that ‘guix gc -D /gnu/store/…-disk-image’
>>> would remove nothing: /gnu/store/.links would still contain a copy of
>>> that big disk image, so as a result, you’ve freed zero bytes.
>>> 
>>> Am I right?
>> I think that might be the point.  As Maxim said, single items are
>> (likely) not removed for cleaning purposes, so freeing the disk image
>> has little effect.
>
> What do you mean?  When doing VM testing, I regularly do ‘guix gc -D
> /gnu/store/…-disk-image’ precisely to save space.  Fortunately it does
> have the intended effect of freeing a bunch of GiBs.
>
>> Plus, you could invoke it like
>>
>>   guix gc -D dead-item dead-item live-item dead-item
>>
>> It would fail at live-item and then not continue to free the links of
>> the two dead items prior.
>
> Yes, and that’s annoying, but it’s unrelated.  :-)
>
>> So there's a few things we could do here:
>>
>> 1. simply fail and have the user deal with it (including the option of
>> doing a normal `guix gc' or `guix gc -C 1')
>> 2. remember which paths were live and dead and always clean up the
>> links, only reporting errors afterwards
>> 3. add an option to explicitly check the .links directory (which
>> defaults to true for the current things, but could also be used to
>> clean links after a liveness check or after a do-nothing `guix gc -F').
>> 4. ...
>
> You seem to be proposing to remove ‘-D’ altogether.  I agree it has the
> shortcomings you write, but I think it’s occasionally useful
> nonetheless.
>
> My proposal would be either the status quo, or removing just the one
> link that matters from /gnu/store/.links upon ‘-D’.

The second proposal makes sense.  I didn't care about freeing space, as
my use case was getting around corrupting an item in my store due to
https://issues.guix.gnu.org/51400, which the patch proposed here allowed
me to do without wasting hours of cleaning up links (nearly 1 GiB of
store on spinning drives).

Thanks,

Maxim
Simon Tournier Nov. 3, 2021, 10:45 a.m. UTC | #6
Hi,

On Sun, 31 Oct 2021 at 21:53, Maxim Cournoyer <maxim.cournoyer@gmail.com> wrote:

> > My proposal would be either the status quo, or removing just the one
> > link that matters from /gnu/store/.links upon ‘-D’.
>
> The second proposal makes sense.  I didn't care about freeing space, as
> my use case was getting around corrupting an item in my store due to
> https://issues.guix.gnu.org/51400, which the patch proposed here allowed
> me to do without wasting hours of cleaning up links (nearly 1 GiB of
> store on spinning drives).

I often use "guix gc -D" for being able to launch again a command and
then check against SWH or other upstream.  Because the "deleting
links" is too much long, I am doing ^C at this step.  Therefore, be
able to skip this step when running "guix gc -D" appears to me the
thing to do.  Well, if "guix gc -D" is not doing it, then I am forcing
it by interrupting it; which appears to me worse than status quo.
BTW, if I need space, then I do not use "guix gc -D" on few items but
instead I use the options -F or -d or -C; here I am fine it takes the
time it takes. :-)

Well, «"deleting unused links" GC phase is too slow» is not new
because it is bug#24937 [1].  Therefore, status quo is really annoying
and I would prefer to skip this GC phase when using option -D although
it is not optimal.  Or remove just the one link that matters.

1: <http://issues.guix.gnu.org/issue/24937>


Cheers,
simon
Ludovic Courtès Nov. 6, 2021, 4:57 p.m. UTC | #7
Hi Maxim and all,

Maxim Cournoyer <maxim.cournoyer@gmail.com> skribis:

> Ludovic Courtès <ludo@gnu.org> writes:

[...]

>> You seem to be proposing to remove ‘-D’ altogether.  I agree it has the
>> shortcomings you write, but I think it’s occasionally useful
>> nonetheless.
>>
>> My proposal would be either the status quo, or removing just the one
>> link that matters from /gnu/store/.links upon ‘-D’.
>
> The second proposal makes sense.

Maybe that proposal is bogus though because you’d need to know the hash
of the files being removed, which means reading them…

> I didn't care about freeing space, as my use case was getting around
> corrupting an item in my store due to
> https://issues.guix.gnu.org/51400, which the patch proposed here
> allowed me to do without wasting hours of cleaning up links (nearly 1
> GiB of store on spinning drives).

The ideal solution as zimoun writes would be to address
<https://issues.guix.gnu.org/24937>.

Perhaps that phase needs to be implemented using a different strategy,
say an sqlite database that records the current link count (hoping that
‘SELECT * FROM links WHERE NLINKS = 1’ would be faster than traversing
all of ‘.links’) as well as a mapping from store item to file hashes.

BTW, those using Btrfs can probably use ‘--disable-deduplication’ and be
done with it.

Thanks,
Ludo’.
Maxim Cournoyer Nov. 9, 2021, 4:11 a.m. UTC | #8
Hi,

Ludovic Courtès <ludo@gnu.org> writes:

> Hi Maxim and all,
>
> Maxim Cournoyer <maxim.cournoyer@gmail.com> skribis:
>
>> Ludovic Courtès <ludo@gnu.org> writes:
>
> [...]
>
>>> You seem to be proposing to remove ‘-D’ altogether.  I agree it has the
>>> shortcomings you write, but I think it’s occasionally useful
>>> nonetheless.
>>>
>>> My proposal would be either the status quo, or removing just the one
>>> link that matters from /gnu/store/.links upon ‘-D’.
>>
>> The second proposal makes sense.
>
> Maybe that proposal is bogus though because you’d need to know the hash
> of the files being removed, which means reading them…

Oops :-).

>> I didn't care about freeing space, as my use case was getting around
>> corrupting an item in my store due to
>> https://issues.guix.gnu.org/51400, which the patch proposed here
>> allowed me to do without wasting hours of cleaning up links (nearly 1
>> GiB of store on spinning drives).
>
> The ideal solution as zimoun writes would be to address
> <https://issues.guix.gnu.org/24937>.

Seems there's some improvement ready, but which needs more
testing/measurements?  I'd suggest simply invoking GNU sort; if it has
many pages of program for doing what it does, it's probably doing
something fancier/faster than we can (are ready to) emulate -- for free!

> Perhaps that phase needs to be implemented using a different strategy,
> say an sqlite database that records the current link count (hoping that
> ‘SELECT * FROM links WHERE NLINKS = 1’ would be faster than traversing
> all of ‘.links’) as well as a mapping from store item to file hashes.

Hmm.  I'll need to dive in the problem a bit more before I can comment
on this.

> BTW, those using Btrfs can probably use ‘--disable-deduplication’ and be
> done with it.

I erroneously used to think that Btrfs could do live deduplication, but
it doesn't.  There are external tools to do out of band / batch
deduplication though [0]; so if they perform better than the guix daemon's
own dedup, perhaps we could document this way out for our Btrfs users.

[0]  https://btrfs.wiki.kernel.org/index.php/Deduplication

Thank you,

Maxim
Jack Hill Nov. 9, 2021, 4:57 a.m. UTC | #9
On Mon, 8 Nov 2021, Maxim Cournoyer wrote:

> Ludovic Courtès <ludo@gnu.org> writes:
>
>> BTW, those using Btrfs can probably use ‘--disable-deduplication’ and be
>> done with it.
>
> I erroneously used to think that Btrfs could do live deduplication, but
> it doesn't.  There are external tools to do out of band / batch
> deduplication though [0]; so if they perform better than the guix daemon's
> own dedup, perhaps we could document this way out for our Btrfs users.
>
> [0]  https://btrfs.wiki.kernel.org/index.php/Deduplication

A little while ago I had hoped to test btrfs with --disable-deduplication 
and bees [1] as the deduplication agent, but wasn't able to successfully 
run a system with --disable-deduplication because I needed the 
deduplication to cover up problem with grafts [2]. Until we resolve the 
second issue, I don't think we should recommend folks run the daemon with 
--disable-deduplication.

[1] https://issues.guix.gnu.org/47983 (still missing a service)
[2] https://issues.guix.gnu.org/47115

Best,
Jack
Ludovic Courtès Nov. 9, 2021, 12:56 p.m. UTC | #10
Hi,

Jack Hill <jackhill@jackhill.us> skribis:

> On Mon, 8 Nov 2021, Maxim Cournoyer wrote:
>
>> Ludovic Courtès <ludo@gnu.org> writes:
>>
>>> BTW, those using Btrfs can probably use ‘--disable-deduplication’ and be
>>> done with it.
>>
>> I erroneously used to think that Btrfs could do live deduplication, but
>> it doesn't.  There are external tools to do out of band / batch
>> deduplication though [0]; so if they perform better than the guix daemon's
>> own dedup, perhaps we could document this way out for our Btrfs users.
>>
>> [0]  https://btrfs.wiki.kernel.org/index.php/Deduplication
>
> A little while ago I had hoped to test btrfs with
> --disable-deduplication and bees [1] as the deduplication agent, but
> wasn't able to successfully run a system with --disable-deduplication
> because I needed the deduplication to cover up problem with grafts
> [2]. Until we resolve the second issue, I don't think we should
> recommend folks run the daemon with --disable-deduplication.
>
> [1] https://issues.guix.gnu.org/47983 (still missing a service)
> [2] https://issues.guix.gnu.org/47115

Oh, right.  We didn’t quite get to the bottom of #2.  Is it still an
issue?  Some questions remained opened.

Thanks,
Ludo’.
Simon Tournier Nov. 9, 2021, 6:10 p.m. UTC | #11
Hi Ludo,

On Sat, 06 Nov 2021 at 17:57, Ludovic Courtès <ludo@gnu.org> wrote:

> The ideal solution as zimoun writes would be to address
> <https://issues.guix.gnu.org/24937>.

Cool for your last reply with a plan for mitigating the issue.

Even if the phase is drastically speed up, it would be probably still
too slow when using the option ’-D’ remove only one <item>; or just
some.

I have not checked the code, maybe I should start by that. ;-) Is it not
possible to simply skip the deleting phase when the option ’-D’ is used?


Cheers,
simon
Ludovic Courtès Nov. 17, 2021, 10:02 a.m. UTC | #12
Hi,

zimoun <zimon.toutoune@gmail.com> skribis:

> I have not checked the code, maybe I should start by that. ;-) Is it not
> possible to simply skip the deleting phase when the option ’-D’ is used?

No; like I wrote, it would have the effect of not deleting anything:

  https://issues.guix.gnu.org/51427#1

Needs more thought…

Ludo’.
Simon Tournier Nov. 17, 2021, 11:49 a.m. UTC | #13
Hi Ludo,

On Wed, 17 Nov 2021 at 11:02, Ludovic Courtès <ludo@gnu.org> wrote:
> zimoun <zimon.toutoune@gmail.com> skribis:
>
>> I have not checked the code, maybe I should start by that. ;-) Is it not
>> possible to simply skip the deleting phase when the option ’-D’ is used?
>
> No; like I wrote, it would have the effect of not deleting anything:

After giving a look at the code, yeah it is not so simple. :-)


>   https://issues.guix.gnu.org/51427#1
>
> Needs more thought…

The logic is complicated, thus adding this guard…

--8<---------------cut here---------------start------------->8---
    if (options.maxFreed > 0) {      
      /* Clean up the links directory. */
      if (options.action == GCOptions::gcDeleteDead || options.action == GCOptions::gcDeleteSpecific) {
        printMsg(lvlError, format("deleting unused links..."));
        removeUnusedLinks(state);
      }      
    }
--8<---------------cut here---------------end--------------->8---

…is probably dumb.  From my understanding, it should bypass the phase
’removeUnusedLinks’ when using “guix gc -D”.  Well, I have not tested
it.


Cheers,
simon
Ludovic Courtès July 18, 2022, 1:57 p.m. UTC | #14
Hello,

With commit 472a0e82a52a3d5d841e1dfad6b13e26082a5750 (Nov. 2021),
partially fixing <https://issues.guix.gnu.org/24937>, there is hopefully
less pressure to skip the remove-unused-links phase.

Should we close this issue?

Ludo’.
Liliana Marie Prikler July 18, 2022, 5:03 p.m. UTC | #15
Am Montag, dem 18.07.2022 um 15:57 +0200 schrieb Ludovic Courtès:
> Hello,
> 
> With commit 472a0e82a52a3d5d841e1dfad6b13e26082a5750 (Nov. 2021),
> partially fixing <https://issues.guix.gnu.org/24937>, there is
> hopefully less pressure to skip the remove-unused-links phase.
> 
> Should we close this issue?
As a hard disk user, I'm leaning towards "no".  In fact, I recently
encountered a case where I think I might want to skip it even if not
deleting "specific items".  For context, my machine has troubles with
sudden power outages during builds (courtesy of a certain graphics card
manufacturer), so if one of those happens during `guix package' or
`guix system' invocation, the sanest thing to do is to run `guix gc'
after reboot and retry whatever command I wanted to run.  However,
since I'm not really deleting much here, I'd probably be fine with
accumulating trash and collecting it at a later date.  Deleting unused
links is also something that can on some machines be postponed to a
time when they'd otherwise be idle, though I don't think it matters too
much in the context of CI since the global lock is no longer held at
that point, while the lack of storage is still blocking builds.

Cheers
Ludovic Courtès July 19, 2022, 8:34 a.m. UTC | #16
Hi,

Liliana Marie Prikler <liliana.prikler@gmail.com> skribis:

> Am Montag, dem 18.07.2022 um 15:57 +0200 schrieb Ludovic Courtès:
>> Hello,
>> 
>> With commit 472a0e82a52a3d5d841e1dfad6b13e26082a5750 (Nov. 2021),
>> partially fixing <https://issues.guix.gnu.org/24937>, there is
>> hopefully less pressure to skip the remove-unused-links phase.
>> 
>> Should we close this issue?
> As a hard disk user, I'm leaning towards "no".

At the REPL, could you do:

  ,use(ice-9 ftw)
  ,t (length (scandir "/gnu/store/.links"))

?

On my SSD I get:

  $4 = 438356
  ;; 24.613712s real time, 10.195698s run time.  1.805636s spent in GC.

This ‘scandir’ call is a good approximation of the cost of the
remove-unused-links phase.

Ludo’.
Liliana Marie Prikler July 19, 2022, 6:42 p.m. UTC | #17
Am Dienstag, dem 19.07.2022 um 10:34 +0200 schrieb Ludovic Courtès:
> Hi,
> 
> Liliana Marie Prikler <liliana.prikler@gmail.com> skribis:
> 
> > Am Montag, dem 18.07.2022 um 15:57 +0200 schrieb Ludovic Courtès:
> > > Hello,
> > > 
> > > With commit 472a0e82a52a3d5d841e1dfad6b13e26082a5750 (Nov. 2021),
> > > partially fixing <https://issues.guix.gnu.org/24937>, there is
> > > hopefully less pressure to skip the remove-unused-links phase.
> > > 
> > > Should we close this issue?
> > As a hard disk user, I'm leaning towards "no".
> 
> At the REPL, could you do:
> 
>   ,use(ice-9 ftw)
>   ,t (length (scandir "/gnu/store/.links"))
> 
> ?
> 
> On my SSD I get:
> 
>   $4 = 438356
>   ;; 24.613712s real time, 10.195698s run time.  1.805636s spent in GC.
scheme@(guile-user)> ,use (ice-9 ftw)
scheme@(guile-user)> ,t (length (scandir "/gnu/store/.links"))
$1 = 213027
;; 1417.872747s real time, 28.514293s run time.  1.284866s spent in GC.
scheme@(guile-user)> (/ 1417.872747 60)
$2 = 23.63121245

So yeah, assuming that scandir scales linearly, if my store was as big
as yours, I could eat lunch and GC still wouldn't be finished (for
context, lunch breaks in my country are only 30 minutes).
Tobias Geerinckx-Rice July 19, 2022, 7:25 p.m. UTC | #18
Liliana Marie Prikler 写道:
> scheme@(guile-user)> ,t (length (scandir "/gnu/store/.links"))
> $1 = 213027
> ;; 1417.872747s real time, 28.514293s run time.  1.284866s spent 
> in GC.

[…]

> So yeah, assuming that scandir scales linearly

…your rotational drive is beyond ridiculously slower than mine (an 
ST1000DM010-2EP102):

athena.tobias.gr:~ λ echo 3 | sudo tee /proc/sys/vm/drop_caches 
3
athena.tobias.gr:~ λ guix repl
[…]
scheme@(guix-user)> ,use (ice-9 ftw)
scheme@(guix-user)> ,t (length (scandir "/gnu/store/.links"))
$1 = 164437
;; 7.081361s real time, 2.569773s run time.  0.199963s spent in 
   GC.

Kind regards,

T G-R
Ludovic Courtès July 21, 2022, 9:21 a.m. UTC | #19
Hi,

Tobias Geerinckx-Rice <me@tobias.gr> skribis:

> Liliana Marie Prikler 写道:
>> scheme@(guile-user)> ,t (length (scandir "/gnu/store/.links"))
>> $1 = 213027
>> ;; 1417.872747s real time, 28.514293s run time.  1.284866s spent in
>> GC.
>
> […]
>
>> So yeah, assuming that scandir scales linearly
>
> …your rotational drive is beyond ridiculously slower than mine (an
> ST1000DM010-2EP102):
>
> athena.tobias.gr:~ λ echo 3 | sudo tee /proc/sys/vm/drop_caches 3
> athena.tobias.gr:~ λ guix repl
> […]
> scheme@(guix-user)> ,use (ice-9 ftw)
> scheme@(guix-user)> ,t (length (scandir "/gnu/store/.links"))
> $1 = 164437
> ;; 7.081361s real time, 2.569773s run time.  0.199963s spent in    GC.

It’s crazy that there are two orders of magnitude of difference between
these two hard disks.

Liliana, is your hard disk old or low-end?

I agree that we should strive to have good performance on that kind of
hardware too, but I don’t know how to get there.

Ludo’.
Liliana Marie Prikler July 21, 2022, 6:02 p.m. UTC | #20
Am Donnerstag, dem 21.07.2022 um 11:21 +0200 schrieb Ludovic Courtès:
> Hi,
> 
> Tobias Geerinckx-Rice <me@tobias.gr> skribis:
> 
> > Liliana Marie Prikler 写道:
> > > scheme@(guile-user)> ,t (length (scandir "/gnu/store/.links"))
> > > $1 = 213027
> > > ;; 1417.872747s real time, 28.514293s run time.  1.284866s spent
> > > in
> > > GC.
> > 
> > […]
> > 
> > > So yeah, assuming that scandir scales linearly
> > 
> > …your rotational drive is beyond ridiculously slower than mine (an
> > ST1000DM010-2EP102):
> > 
> > athena.tobias.gr:~ λ echo 3 | sudo tee /proc/sys/vm/drop_caches 3
> > athena.tobias.gr:~ λ guix repl
> > […]
> > scheme@(guix-user)> ,use (ice-9 ftw)
> > scheme@(guix-user)> ,t (length (scandir "/gnu/store/.links"))
> > $1 = 164437
> > ;; 7.081361s real time, 2.569773s run time.  0.199963s spent in   
> > GC.
> 
> It’s crazy that there are two orders of magnitude of difference
> between these two hard disks.
> 
> Liliana, is your hard disk old or low-end?
I'm not too sure about age, but it's probably low-end in terms of
speed.  There's room for 2TB data after all.

> I agree that we should strive to have good performance on that kind
> of hardware too, but I don’t know how to get there.
I don't think deleting links will ever be fast on that disk.  But what
I've been saying the whole time is that I don't always need the links
deleted.  I think adding "expert" switches to skip these phases might
actually be enough – after all, if I ever do want to run a full GC, the
information ought to be the same, no?
Ludovic Courtès July 22, 2022, 12:14 p.m. UTC | #21
Hi,

Liliana Marie Prikler <liliana.prikler@gmail.com> skribis:

> I don't think deleting links will ever be fast on that disk.  But what
> I've been saying the whole time is that I don't always need the links
> deleted.  I think adding "expert" switches to skip these phases might
> actually be enough – after all, if I ever do want to run a full GC, the
> information ought to be the same, no? 

The expert will have to know that skipping that phase will have the
effect of *not* freeing space on the device, so…

Ludo’.
M July 22, 2022, 1:39 p.m. UTC | #22
On 22-07-2022 14:14, Ludovic Courtès wrote:
> Hi,
>
> Liliana Marie Prikler<liliana.prikler@gmail.com>  skribis:
>
>> I don't think deleting links will ever be fast on that disk.  But what
>> I've been saying the whole time is that I don't always need the links
>> deleted.  I think adding "expert" switches to skip these phases might
>> actually be enough – after all, if I ever do want to run a full GC, the
>> information ought to be the same, no?
> The expert will have to know that skipping that phase will have the
> effect of *not* freeing space on the device, so…

I believe the word "expert" implies that the expert knows that, 
otherwise they are, by definition, not an expert, so I don't see your 
point. So ... what does the ... after the 'so' hide here? I don't 
understand what point you are trying to make here.

The idea is to, when deleting specific items, just do that, and not 
start iterating over all (*) the other things in the store.

This is important for, say, testing substitution code efficiently (or 
SWH code as mentioned previously, etc).

There, the lack of freeing space is not a concern.  This appears, after 
reading debbugs, to be already mentioned at 
https://debbugs.gnu.org/cgi/bugreport.cgi?bug=51427#20.

Maybe something that would be acceptable to all parties: When deleting 
specific store items, don't remove _all_ the unused links, but only 
remove the unused links that correspond to deleted files. Which after 
reading 51427 appears to already have been proposed.

> Maybe that proposal is bogus though because you’d need to know the hash
> of the files being removed, which means reading them…
I don't see the problem -- when deleting a specific store item, read the 
files one-by-one, hash them one-by-one, and delete the link if appropriate.

 > Things about https://debbugs.gnu.org/cgi/bugreport.cgi?bug=24937 
lessening the need

Sure, but as informally mentioned by, say, Liliana, even after that 
things remain ~ O(n) (or probably O(n lg n) if the file system uses some 
tree structure) where n=size of the store, which in any realistic 
situation is going to be way slower than O(m), where m = the number of 
individual store items to delete, for reasonable implementations of 
"delete individual store item". (*)

The point isn't to work-around slow "deleting unused links" 
implementation, but rather to avoid inherit slowness of deleting 
everything when deleting a few things suffice.

Summarised, I don't understand the reluctance to merge an implementation 
of "delete individual store item" -- yes, the delete link phase is slow 
and could possibly be improved, yes when using certain implementations 
very little disk is freed, but those aren't the point of the patch 
AFAICT, they are orthogonal concerns.

Greetings,
Maxime.

(*) Yes, I'm neglecting the difference between number of store items and 
links and size of store items here, but those don't make a difference to 
the conclusion here.
Ludovic Courtès July 22, 2022, 11:07 p.m. UTC | #23
Hi,

Maxime Devos <maximedevos@telenet.be> skribis:

> On 22-07-2022 14:14, Ludovic Courtès wrote:
>> Hi,
>>
>> Liliana Marie Prikler<liliana.prikler@gmail.com>  skribis:
>>
>>> I don't think deleting links will ever be fast on that disk.  But what
>>> I've been saying the whole time is that I don't always need the links
>>> deleted.  I think adding "expert" switches to skip these phases might
>>> actually be enough – after all, if I ever do want to run a full GC, the
>>> information ought to be the same, no?
>> The expert will have to know that skipping that phase will have the
>> effect of *not* freeing space on the device, so…
>
> I believe the word "expert" implies that the expert knows that,

Apologies for being elliptic.  My point here, as has been discussed
earlier in this thread, is that we can’t just skip that phase or we’d
simply leave files around without actually deleting them.

Thus, a command-line switch to skip the phase doesn’t seem valuable to
me because it’d let users run the GC in a way that doesn’t actually
collect garbage.

I hope this is clearer!

Thanks,
Ludo’.
Liliana Marie Prikler July 23, 2022, 6:52 a.m. UTC | #24
Am Samstag, dem 23.07.2022 um 01:07 +0200 schrieb Ludovic Courtès:
> Hi,
> 
> Maxime Devos <maximedevos@telenet.be> skribis:
> 
> > On 22-07-2022 14:14, Ludovic Courtès wrote:
> > > Hi,
> > > 
> > > Liliana Marie Prikler<liliana.prikler@gmail.com>  skribis:
> > > 
> > > > I don't think deleting links will ever be fast on that disk. 
> > > > But what I've been saying the whole time is that I don't always
> > > > need the links deleted.  I think adding "expert" switches to
> > > > skip these phases might actually be enough – after all, if I
> > > > ever do want to run a full GC, the information ought to be the
> > > > same, no?
> > > The expert will have to know that skipping that phase will have
> > > the effect of *not* freeing space on the device, so…
> > 
> > I believe the word "expert" implies that the expert knows that,
> 
> Apologies for being elliptic.  My point here, as has been discussed
> earlier in this thread, is that we can’t just skip that phase or we’d
> simply leave files around without actually deleting them.
> 
> Thus, a command-line switch to skip the phase doesn’t seem valuable
> to me because it’d let users run the GC in a way that doesn’t
> actually collect garbage.
> 
> I hope this is clearer!
As noted before, I don't always run GC to free up X amount of space. 
Even if I did, link deletion is greedy and frees up whatever it can. 
So the initial suggestion to only look at what might have been freed in
this gc already makes sense.  However, it was ruled complicated because
the GC is implemented in C++.  

My personal motivation to just skip the phase entirely comes from the
hypothesis that the store is in a sane state even if the links are not
deleted.  Particularly, if I `guix gc broken-item' and `guix build
broken-item', even without deleting links, the broken-item should now
be fixed.  This has practical advantages over `guix build --repair': if
the last `guix package' or `guix system' failed mid-way, any user, not
just root, can simply `guix gc' the broken items.

Now, I understand that as a default, you never want to skip this phase,
because it doesn't actually free up disk space.  But if you have a slow
disk with large space, do you really need to free all that much space,
or would it be fine to delay freeing it until a later date?

Cheers
M July 23, 2022, 10:24 a.m. UTC | #25
On 23-07-2022 01:07, Ludovic Courtès wrote:

> Apologies for being elliptic.  My point here, as has been discussed
> earlier in this thread, is that we can’t just skip that phase or we’d
> simply leave files around without actually deleting them.
>
> Thus, a command-line switch to skip the phase doesn’t seem valuable to
> me because it’d let users run the GC in a way that doesn’t actually
> collect garbage.

It's definitively possible to skip the phase, AFAICT -- there was some 
code doing exactly that, and for some use cases the limitations (i.e., 
very limited amount of space actually being freed) were found to be 
acceptable, for the user isn't trying to free space in the first place 
(doing that would be a nice side-effect, but not what the user was 
trying to accomplish), and other people aren't impacted by the 
limitations as it's an off-by-default switch.

As noted before, sometimes the point isn't to free space, but only to 
collect _some_ (not all, just some, i.e., the store item, but the 
individual files in the store item don't matter) garbage. For some users 
and use cases, not freeing space is not a problem, as mentioned in the 
previous mail:

> This is important for, say, testing substitution code efficiently (or 
> SWH code as mentioned previously, etc).
>
> There, the lack of freeing space is not a concern.  This appears, 
> after reading debbugs, to be already mentioned at 
> https://debbugs.gnu.org/cgi/bugreport.cgi?bug=51427#20.
For these users, skipping that phase (or another solution, whatever 
works), is quite valuable and not a problem.

(Remember, people modifying the substitution code or such are users too.)

That said, there were some approaches mentioned that do skip the phase, 
but in a manner such that space is still freed (but only the relevant 
space, not things from the whole store, so performance wouldn't be 
horrible).

> Thus, a command-line switch to skip the phase doesn’t seem valuable to
> me because it’d let users run the GC in a way that doesn’t actually
> collect garbage.
If the user runs "guix gc" with an off-by-default switch that isn't 
recommended for general usage, whose description makes it look like some 
arcane thing (if you didn't know the phase already, how would you know 
what ‘don't delete unused links’ means?), while they actually just 
wanted to free space, then that's their problem; not Guix, I think.  
Furthermore, if they somehow do that by mistake, then can just do a 
regular "guix gc" afterwards -- it's a quite recoverable mistake. As 
such, I don't see a problem.

Also, it _does_ collect garbage -- it collects the /gnu/store/... item, 
it just doesn't collect _all_ the garbage (it doesn't collect the 
individual files in the store item or the things in /gnu/store/.links).

If you mean it doesn't fit under "guix gc" because it doesn't free much 
space and hence doesn't fit in the concept of "gc'ing", I suppose we 
could make a new command "guix $bikeshed" that's like "guix gc" except 
sometimes it doesn't free much space, though I don't see the point when 
we already have "guix gc" where it's easy to just add a flag. 
Alternatively, we could just inflate the concept of "GC" a little such 
that it becomes more useful for some people without making it worse for 
others instead of defining a new command.

Summarised: gc'ing is not limited to freeing $N MiB, there are other 
valid reasons to gc too as mentioned previously (make some slots empty 
in the weak hash table that is /gnu/store), why are attempts to 
implement some huge optimisations for the latter rejected when they 
don't impact the former at all?

Or summarised another way: we aren't trying to remove the GC, rather 
previously the GC mostly only supported running a full cycle, with this 
patch the GC also has a more incremental mode of operation.

Greetings,
Maxime.
Felix Lechner June 12, 2023, 8:55 p.m. UTC | #26
Hi,

> Liliana, is your hard disk old or low-end?

Some of my equipment is very old (2011). I think Liliana's issue is
elsewhere. Maybe it's a problem with the DMA setup or with a shared
interrupt. (Dmesg might help.) Liliana, have you tried removing that
rebellious graphics card?

A good place to start when looking at hard drive speeds is the read
performance with

  hdparm -Tt /dev/sdX

I would further look at the tuning parameters of the file system. I
would also try a different internal data cable. (It would be the
connector that has issues.)

For the Guix REPL exercise looking at the links in the store, I
locally got the following results on a mirrored array with two disks
(ext4 on SATA 6 Gb/s, 7200 rpm):

  ;; 70.719509s real time, 14.560439s run time.  3.645179s spent in GC.

On another piece of equipment with three mirrored disks (ext4 on SAS 6
Gb/s, 7200 rpm) I saw:

  ;; 56.528064s real time, 20.906853s run time.  5.087733s spent in GC.

Arrays with multiple drives are sometimes faster because they can read
in parallel.

Kind regards
Felix
Liliana Marie Prikler June 13, 2023, 4:26 a.m. UTC | #27
Hi,

Am Montag, dem 12.06.2023 um 13:55 -0700 schrieb Felix Lechner:
> Hi,
> 
> > Liliana, is your hard disk old or low-end?
> 
> Some of my equipment is very old (2011). I think Liliana's issue is
> elsewhere. Maybe it's a problem with the DMA setup or with a shared
> interrupt. (Dmesg might help.) Liliana, have you tried removing that
> rebellious graphics card?
I have removed the entire machine, or more specifically, I have given
them away to a family member, who now uses it as an office machine and
thus has lighter loads.  As far as I'm aware, the graphics card is
still acting up from kernel to kernel – last time we had one where it
would freeze up after ~5 minutes of "work".

> A good place to start when looking at hard drive speeds is the read
> performance with
> 
>   hdparm -Tt /dev/sdX
> 
> I would further look at the tuning parameters of the file system. I
> would also try a different internal data cable. (It would be the
> connector that has issues.)
My /gnu/store now lives on an SSD hopefully fast enough to not
needlessly slow down garbage collections.  My /home still lives on the
old HDD, but since the only paths to the store in there are symbolic
links, I doubt it makes a difference.


Cheers
diff mbox series

Patch

diff --git a/nix/libstore/gc.cc b/nix/libstore/gc.cc
index e1d0765154..7d872d8cc1 100644
--- a/nix/libstore/gc.cc
+++ b/nix/libstore/gc.cc
@@ -771,7 +771,7 @@  void LocalStore::collectGarbage(const GCOptions & options, GCResults & results)
     deleteGarbage(state, state.trashDir);
 
     /* Clean up the links directory. */
-    if (options.action == GCOptions::gcDeleteDead || options.action == GCOptions::gcDeleteSpecific) {
+    if (options.action == GCOptions::gcDeleteDead) {
         printMsg(lvlError, format("deleting unused links..."));
         removeUnusedLinks(state);
     }