
[bug#51346,0/1,core-updates-frozen] Rework swap device to add dependencies and flags

Message ID 87fsssdqg2.fsf@jpoiret.xyz

Message

Josselin Poiret Oct. 23, 2021, 9:46 a.m. UTC
Hi,

This patchset adds new record types swap-partition and swap-file, to be used in the swap-devices field of operating-system. These support dependencies on mapped-device and file-system objects respectively, as well as swapon flags. I pulled those from GNU libc, and in the manual I refer to 'man 2 swapon' for the description of these flags. Support for the old style is kept for now, but I added deprecation warnings.

This works well on my laptop, whereas my swap file used to never be swapon on boot because it wasn't available yet (on BTRFS on LUKS). I don't have a swap partition lying around though so testers welcome!

I hope this can make it in time for the core-updates-frozen merge. I also plan to add swap file hibernation support eventually, where the file offsets are automatically determined by guix (or we could even write our own suspend/resume script in guile, see https://www.kernel.org/doc/html/latest/power/userland-swsusp.html).

Josselin Poiret (1):
  gnu: system: Add support for swap dependencies and flags

 doc/guix.texi               |  98 +++++++++++++++++++---------
 gnu/build/file-systems.scm  |  25 ++++++-
 gnu/services/base.scm       | 126 ++++++++++++++++++++++++++----------
 gnu/system.scm              |   4 +-
 gnu/system/file-systems.scm |  34 +++++++++-
 guix/build/syscalls.scm     |  12 ++++
 6 files changed, 230 insertions(+), 69 deletions(-)

Comments

Tobias Geerinckx-Rice Oct. 24, 2021, 2:05 a.m. UTC | #1
Josselin,

Josselin Poiret via Guix-patches via writes:
> This patchset adds new record types swap-partition and 
> swap-file, to be used in the swap-devices field of 
> operating-system.

Thank you so much for this.

Do you happen to know anything about how the Hurd handles swap?

> in the manual I refer to 'man 2 swapon' for the description of 
> these flags.

I think we should document the basics ourselves.  We can still 
refer to the man page if you think it's needed.  WDYT?

Pity that there's no (libc) Info node to which we can link.

> This works well on my laptop, whereas my swap file used to never 
> be swapon on boot because it wasn't available yet (on BTRFS on 
> LUKS). I don't have a swap partition lying around though so 
> testers welcome!

Also boots fine with my plain swap partition:

  (swap-devices (list (swap-partition
                       (device hibernation-device))))

Not having to explicitly manage HIBERNATION-DEVICE, as you suggest 
below, sounds nice too :-)

> I hope this can make it in time for the core-updates-frozen 
> merge.

As noted on IRC, I don't see a reason to involve core-updates at 
all.

We should take the time to define solid interfaces but, once done, 
this can go straight to master.

> I also plan to add swap file hibernation support eventually, 
> where the file offsets are automatically determined by guix (or 
> we could even write our own suspend/resume script in guile, see 
> https://www.kernel.org/doc/html/latest/power/userland-swsusp.html).

Okay.  As also implied on IRC… I have a very low opinion of 
uswsusp.  It's brittle, gimmicky, and introduces many ways for 
bugs to hide and boots to break.  We'll have to carefully track 
incompatible format changes and kernel/initrd generations.

If it is added, we shouldn't involve early userspace in cases 
where it's not strictly needed.

But that's for later :-)

> +++ b/doc/guix.texi
> +* Swap Space::                  Adding swap space.

You're following existing precedent here, but I just read the same 
thing twice.

I suggest ‘Swap Space:: Adding virtual memory to free up precious 
RAM.’.

> +@cindex swap devices
> +A list of @code{<swap-partition>} or @code{<swap-file>} objects
> +(@pxref{Swap Space}), to be used for ``swap space'' (@pxref{Memory
> +Concepts,,, libc, The GNU C Library Reference Manual}).

At the risk of leaving this very stubby, I think the (libc) ref 
should be moved to the Swap Space node, which readers might visit 
directly without reading the above.

> +@node Swap Space
> +@section Swap Space
> +@cindex swap space

…so, here.

I'm missing a short intro sentence that mentions what swap is for, 
and that it comes in 2 common forms.

The libc explanation is quite technical, doesn't actually define 
‘swap space’ except by implication, and immediately rambles on 
about zeroes that don't even exist.  As a new user, I think I'd 
feel lost.

> +@deftp {Data Type} swap-partition
> +Objects of this type represent swap partitions. They contain the following
> +members:

(What are ‘swap partitions’?  Maybe explain the pros/cons of both 
in each @deftp intro.  Mostly a reminder to myself, but if you 
want to write more docs: be my guest.)

Always double-space after full stops in prose.

> +@item @code{flags} (default: @code{'()})
> +A list of flags. The supported flags are @code{'delayed} and
> +@code{('priority n)}, see @command{man 2 swapon} in the kernel man pages
> +(@code{man-pages} guix package) for more information.

'delayed?  To?  When?

I'm unenthusiastic about this interface.

On the one hand, exposing this tiny and ossified list of 2.5 
‘flags’ (what even is that priority… thing…) this way feels like 
exposing users to an ugly C implementation detail for no benefit: 
why not

  (swap-partition
    (priority 5) ; or #f distinct from 0
    (discard? #t)
    …)

instead?

On the other hand: perhaps other kernels expose different flags 
and this model might make sense.  I'm not convinced, but I'm 
willing to be.

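To make the comparison concrete, here is a minimal sketch of the record-based alternative. The record type `<swap-space>`, its field names and the helper `swap-space->flags` are all hypothetical and do not come from the patch; the sketch only shows that dedicated fields could be lowered to the same kind of flag list the patch currently exposes, with `#f` kept distinct from a priority of 0.

```scheme
(use-modules (srfi srfi-9))

;; Hypothetical record: 'priority' is #f (unset) or an integer,
;; 'discard?' is a boolean.  None of these names come from the patch.
(define-record-type <swap-space>
  (make-swap-space target priority discard?)
  swap-space?
  (target   swap-space-target)
  (priority swap-space-priority)
  (discard? swap-space-discard?))

(define (swap-space->flags space)
  "Lower the fields of SPACE to a flag list such as ((priority 5) discard)."
  (append (let ((priority (swap-space-priority space)))
            ;; 0 is a true value in Scheme, so only #f means "unset".
            (if priority
                (list (list 'priority priority))
                '()))
          (if (swap-space-discard? space)
              '(discard)
              '())))

(display (swap-space->flags (make-swap-space "/dev/sda2" 5 #t)))
(newline)
;; prints ((priority 5) discard)
```

Whether such a lowering step is worth having depends on whether other kernels expose different flags, which is the open question raised above.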
> +A string, specifying the file path of the swap file to use.

s/file path/name/

> +@item @code{fs}

s/fs/file-system/

As a rule, avoid such pointless abbreviation.  GNU's not unix, 
thankfully.

That said, why does this field exist at all?  The example given 
here:

> +@item (swap-file (path "/swapfile") (fs root-fs))
> +Use the file @file{/swapfile} as swap space, which is present on the
> +@var{root-fs} filesystem.

…rather side-steps the question of how this is supposed to work, 
or in which situation it makes sense.  I feel like it's papering 
over a bug.

> +(define (swap-flags->bit-mask flags)

So I made the mistake of looking at how util-linux does this.

Firstly, it silently clamps (> priority max) to MAX.  I think it 
makes sense to follow that behaviour, but print a warning. 
Ignoring (< priority 0), with a warning, is fine.

Secondly, and this is just weird, ‘man 2 swapon’ explicitly 
documents:

  (prio << SWAP_FLAG_PRIO_SHIFT) & SWAP_FLAG_PRIO_MASK

so naturally util-linux's swapon.c explicitly does this:

  (prio & SWAP_FLAG_PRIO_MASK) << SWAP_FLAG_PRIO_SHIFT

What?  Surely this ancient code can't work just by sheer luck… 
I'll ask.

I see no advantage in ignoring SWAP_FLAG_PRIO_SHIFT, only risks. 
Let's not.

Here's how I'd write it:

--8<---------------cut here---------------start------------->8---
(define (swap-flags->bit-mask flags)
  "Return the number suitable for the 'flags' argument of 'swapon' that
corresponds to the symbols listed in FLAGS."
  (let loop ((flags flags))
    (match flags
      ((('priority p) rest ...)
       (if (< p 0)
           ;; Negative priorities are ignored, with a warning.
           (begin (warning
                   (G_ "Ignoring swap priority ~a as it is less than 0.~%")
                   p)
                  (loop rest))
           ;; Clamp to the largest priority that fits in the mask.
           (let* ((max (ash SWAP_FLAG_PRIO_MASK (- SWAP_FLAG_PRIO_SHIFT)))
                  (pri (if (> p max)
                           (begin (warning
                                   (G_ "Limiting swap priority ~a to ~a.~%")
                                   p max)
                                  max)
                           p)))
             (logior SWAP_FLAG_PREFER
                     (ash pri SWAP_FLAG_PRIO_SHIFT)
                     (loop rest)))))
      (('discard rest ...)
       (logior SWAP_FLAG_DISCARD (loop rest)))
      (()
       0))))
--8<---------------cut here---------------end--------------->8---

It should also handle invalid input by printing the offending 
symbol instead of a generic match error, but I'm about to board my 
train, and will call it a night here.

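A minimal sketch of such input validation, assuming the Linux header values are copied in by hand so that it runs on its own. The name `swap-flags->bit-mask*` is hypothetical, and the warnings from the version above are left out for brevity; the only addition is a final clause that reports the offending flag.

```scheme
(use-modules (ice-9 match))

;; Values from the Linux headers, repeated here so the sketch is standalone.
(define SWAP_FLAG_PREFER     #x8000)
(define SWAP_FLAG_PRIO_MASK  #x7fff)
(define SWAP_FLAG_PRIO_SHIFT 0)
(define SWAP_FLAG_DISCARD    #x10000)

(define (swap-flags->bit-mask* flags)
  "Return the bit mask for FLAGS, raising an error that names the first
flag that is not recognized."
  (match flags
    (() 0)
    ((('priority (? integer? p)) rest ...)
     (if (< p 0)
         (swap-flags->bit-mask* rest)          ;ignore negative priorities
         (let ((limit (ash SWAP_FLAG_PRIO_MASK (- SWAP_FLAG_PRIO_SHIFT))))
           (logior SWAP_FLAG_PREFER
                   (ash (min p limit) SWAP_FLAG_PRIO_SHIFT)
                   (swap-flags->bit-mask* rest)))))
    (('discard rest ...)
     (logior SWAP_FLAG_DISCARD (swap-flags->bit-mask* rest)))
    ;; Anything else is reported explicitly instead of failing in 'match'.
    ((invalid rest ...)
     (error "invalid swap flag:" invalid))))

(display (number->string (swap-flags->bit-mask* '((priority 5) discard)) 16))
(newline)
;; prints 18005
```

With this clause, a call such as `(swap-flags->bit-mask* '(delayed))` raises an error that mentions `delayed` rather than a generic match failure.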
Kind regards,

T G-R
Josselin Poiret Oct. 25, 2021, 2:17 p.m. UTC | #2
Hi,

First and foremost, thank you for your review!

On 10/24/21 2:05 AM, Tobias Geerinckx-Rice wrote:
> Do you happen to know anything about how the Hurd handles swap?

I'm investigating this currently, and got a Hurd VM up and running ("guix system image -t hurd-raw" is very nice). We talked about this on IRC, and decided that it would be easier to simply use the Hurd's swapon binary (from the hurd package itself) rather than communicate directly with the default_pager server, since that would involve writing a lot of Hurd RPC code. In the long run, talking to the server directly would be the way to go, just like for Linux (where we use (guix build syscalls) instead of util-linux's swapon), but getting initial support in faster would be great.

>> in the manual I refer to 'man 2 swapon' for the description of these flags.
>
> I think we should document the basics ourselves.  We can still refer to the man page if you think it's needed.  WDYT?
>
> Pity that there's no (libc) Info node to which we can link.

Now that I'm looking at them, they really are not that complicated, so I'll write something to describe them.

>> +++ b/doc/guix.texi
>> +* Swap Space::                  Adding swap space.
>
> You're following existing precedent here, but I just read the same thing twice.
>
> I suggest ‘Swap Space:: Adding virtual memory to free up precious RAM.’.

Noted! Maybe something along the lines of "Backing RAM with disk space." instead, as swap space isn't really memory?

>> +@cindex swap devices
>> +A list of @code{<swap-partition>} or @code{<swap-file>} objects
>> +(@pxref{Swap Space}), to be used for ``swap space'' (@pxref{Memory
>> +Concepts,,, libc, The GNU C Library Reference Manual}).
>
> At the risk of leaving this very stubby, I think the (libc) ref should be moved to the Swap Space node, which readers might visit directly without reading the above.
>
>> +@node Swap Space
>> +@section Swap Space
>> +@cindex swap space
>
> …so, here.
>
> I'm missing a short intro sentence that mentions what swap is for, and that it comes in 2 common forms.
>
> The libc explanation is quite technical, doesn't actually define ‘swap space’ except by implication, and immediately rambles on about zeroes that don't even exist.  As a new user, I think I'd feel lost.

Will add.

>> +@deftp {Data Type} swap-partition
>> +Objects of this type represent swap partitions. They contain the following
>> +members:
>
> (What are ‘swap partitions’?  Maybe explain the pros/cons of both in each @deftp intro.  Mostly a reminder to myself, but if you want to write more docs: be my guest.)
>
> Always double-space after full stops in prose.
>
>> +@item @code{flags} (default: @code{'()})
>> +A list of flags. The supported flags are @code{'delayed} and
>> +@code{('priority n)}, see @command{man 2 swapon} in the kernel man pages
>> +(@code{man-pages} guix package) for more information.
>
> 'delayed?  To?  When?
>
> I'm unenthusiastic about this interface.
>
> On the one hand, exposing this tiny and ossified list of 2.5 ‘flags’ (what even is that priority… thing…) this way feels like exposing users to an ugly C implementation detail for no benefit: why not
>
>  (swap-partition
>    (priority 5) ; or #f distinct from 0
>    (discard? #t)
>    …)
>
> instead?
>
> On the other hand: perhaps other kernels expose different flags and this model might make sense.  I'm not convinced, but I'm willing to be.

Welp, looks like you gave me the perfect excuse: from what I gather, Hurd does not have any flags for its swap space (see hurd/default_pager.defs, line 86). The other part of the reason is that I did not want to have even more "swap-partition-priority"/"swap-file-priority" duplicate accessors (more on swap-file/swap-partition below).

>> +A string, specifying the file path of the swap file to use.
>
> s/file path/name/
>
>> +@item @code{fs}
>
> s/fs/file-system/
>
> As a rule, avoid such pointless abbreviation.  GNU's not unix, thankfully.
>
> That said, why does this field exist at all?  The example given here:
>
>> +@item (swap-file (path "/swapfile") (fs root-fs))
>> +Use the file @file{/swapfile} as swap space, which is present on the
>> +@var{root-fs} filesystem.
>
> …rather side-steps the question of how this is supposed to work, or in which situation it makes sense.  I feel like it's papering over a bug.

Abbreviation aside, the example is bad: even though, in theory, the presence of the swap file depends on the root file system being mounted, root-fs is always mounted and should be ignored (there's no file-system-/ service either). A better example would be my personal configuration:

(swap-devices (list
 (swap-file
  (path "/btrfs/swapfile")
  (fs (car (filter (lambda (x)
                     (equal? (file-system-mount-point x)
                             "/btrfs"))
                   file-systems))))))

You can see here that /btrfs/swapfile wouldn't be accessible if the file system given in the fs field weren't mounted first.

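The lookup in that configuration can be sketched on its own. The snippet below is only an illustration: it uses a stand-in `<file-system>` record instead of the real Guix one, made-up device names, and a hypothetical helper `file-system-at`. It uses `find` from SRFI-1 in place of `(car (filter …))`.

```scheme
(use-modules (srfi srfi-1) (srfi srfi-9))

;; Stand-in for Guix's <file-system> record so the sketch runs on its own;
;; only the mount point matters here.  The device names are made up.
(define-record-type <file-system>
  (make-file-system device mount-point)
  file-system?
  (device      file-system-device)
  (mount-point file-system-mount-point))

(define file-systems
  (list (make-file-system "/dev/mapper/root" "/")
        (make-file-system "/dev/mapper/root" "/btrfs")))

(define (file-system-at mount-point file-systems)
  "Return the first element of FILE-SYSTEMS mounted at MOUNT-POINT, or #f."
  (find (lambda (fs)
          (string=? (file-system-mount-point fs) mount-point))
        file-systems))

(display (file-system-mount-point (file-system-at "/btrfs" file-systems)))
(newline)
;; prints /btrfs
```

One difference worth noting: `find` returns `#f` when nothing matches, whereas `(car (filter …))` raises an error on the empty list, so a caller would still need to handle the missing case.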


>> +(define (swap-flags->bit-mask flags)
>
> So I made the mistake of looking at how util-linux does this.
>
> Firstly, it silently clamps (> priority max) to MAX.  I think it makes sense to follow that behaviour, but print a warning. Ignoring (< priority 0), with a warning, is fine.
>
> Secondly, and this is just weird, ‘man 2 swapon’ explicitly documents:
>
>  (prio << SWAP_FLAG_PRIO_SHIFT) & SWAP_FLAG_PRIO_MASK
>
> so naturally util-linux's swapon.c explicitly does this:
>
>  (prio & SWAP_FLAG_PRIO_MASK) << SWAP_FLAG_PRIO_SHIFT
>
> What?  Surely this ancient code can't work just by sheer luck… I'll ask.

I mean, SWAP_FLAG_PRIO_SHIFT has been equal to 0 since at least the initial Linux git commit 17 years ago, so it might actually be sheer luck.

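A small worked example of why a shift of 0 hides the difference between the two orderings. The mask value is the one from the Linux headers; the non-zero shift of 4 is purely hypothetical (a real header change would presumably move the mask as well), and the two function names are made up for this sketch.

```scheme
;; Mask value from the Linux headers; the shift is a parameter so that a
;; hypothetical non-zero value can be tried.
(define SWAP_FLAG_PRIO_MASK #x7fff)

(define (man-page-order prio shift)
  ;; (prio << SWAP_FLAG_PRIO_SHIFT) & SWAP_FLAG_PRIO_MASK
  (logand (ash prio shift) SWAP_FLAG_PRIO_MASK))

(define (util-linux-order prio shift)
  ;; (prio & SWAP_FLAG_PRIO_MASK) << SWAP_FLAG_PRIO_SHIFT
  (ash (logand prio SWAP_FLAG_PRIO_MASK) shift))

;; With the real shift of 0, both orders reduce to (prio & mask).
(display (= (man-page-order 5 0) (util-linux-order 5 0)))
(newline)
;; prints #t

;; With a hypothetical shift of 4 and the same mask, they can differ.
(display (list (man-page-order #x7fff 4) (util-linux-order #x7fff 4)))
(newline)
;; prints (32752 524272)
```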


> I see no advantage in ignoring SWAP_FLAG_PRIO_SHIFT, only risks. Let's not.

How I saw it: if SWAP_FLAG_PRIO_SHIFT ever changes from 0, and our code actually cared about its value, we'd still have to notice the change and update the value on our side (it would be different if we were parsing the Linux headers instead). But I guess that's mostly laziness on my part, and since you wrote a nice replacement, I'll add that.

> It should also handle invalid input by printing the offending symbol instead of a generic match error, but I'm about to board my train, and will call it a night here.

Will do!

On 10/24/21 1:58 PM, Tobias Geerinckx-Rice wrote:
> Oh no,
>
> he's back.  With another annoying question: why don't we drop the whole swap-partition/swap-file dichotomy?  The distinction is artificial insofar as Linux doesn't make one.
>
> Which end is supposed to explode if you
>
>  (swap-partition (device "/home/nckx/swap"))
>  (swap-file (name "/dev/sda2"))
>
> ?
>
> What real-world drawback(s) do you see to
>
>  (swap (space "/home/nckx/swap"))
>  (swap (space "/dev/sda2"))
>  (swap (space (uuid "ab-c-d-e-fgh")))
>  (swap (space (file-system-label "best-swaps")))
>
> naming aside?

The motivating thing for me was that they have to be treated differently for hibernation purposes: for a swap file, you shouldn't make its dependencies available (i.e. mount the file system it's on) but rather determine its offset inside the block device. For a swap partition, there's no such thing: you have to make the device itself available to the kernel. We could have a swapfile? flag instead, but I'm not convinced by either approach yet. For the current patch though, nothing is going to explode there (although for a swap-file you do have to specify the file system it is on, but that's just my record definition forcing you to).

> Josselin Poiret via Guix-patches via writes:
>> +(define (swap-partition->service-name spartition)
>
> Nitpick: ->shepherd-service-name just for similarity to <file-system>s.
>
> Aside, when I try to apply your third manual example, I get:
>
>  guix system: error: service 'swap-/dev/sda2' requires
>  'file-system-/', which is not provided by any service

I forgot that the root file system is treated differently from the others, so this example is borked. I'll add something akin to my config instead. On a tangential note, nothing stops us from renaming root-file-system to file-system-/ so that such dependencies at least don't give an error, right? (They're not logically wrong, just a bit useless.)

Thanks again for your review!

A revised patchset (yes, this time with multiple commits) will be coming soon, once I figure out the proper way to work with block devices on Hurd.

Best,
Josselin Poiret