diff mbox series

[bug#37305,V2] Allow booting from a Btrfs subvolume.

Message ID 87lfp6b5cs.fsf_-_@gmail.com
State Accepted
Headers show
Series [bug#37305,V2] Allow booting from a Btrfs subvolume. | expand

Checks

Context Check Description
cbaines/applying patch fail View Laminar job

Commit Message

Maxim Cournoyer Feb. 13, 2020, 8:27 p.m. UTC
I've fixed a few problems I've found with more extensive testing.

Attached is the version 2 of the patch series.

Comments

Ludovic Courtès Feb. 14, 2020, 5:22 p.m. UTC | #1
Hi Maxim!

Great to see Btrfs support improving; many people will love that!

Maxim Cournoyer <maxim.cournoyer@gmail.com> skribis:

> From 3640bea548826e1c1ec9b766da1fdfe4791d7452 Mon Sep 17 00:00:00 2001
> From: Maxim Cournoyer <maxim.cournoyer@gmail.com>
> Date: Sun, 17 Nov 2019 06:01:00 +0900
> Subject: [PATCH 1/9] gnu: tests: Reduce the time required to run the system
>  tests.
>
> When setting the GUIX_DEV_HACKS environment variable, the Guix package used
> inside the instrumented VMs recycles the binaries already found in the Guix
> checkout of the developer instead of rebuilding Guix from scratch.  This
> brings the time required for this component from 20+ minutes down to 2-3
> minutes on an X200 machine.
>
> * gnu/packages/package-management.scm (current-guix/pre-built): New procedure.
> * build-aux/run-system-tests.scm (tests-for-channel-instance): Use it, when
> GUIX_DEV_HACKS is defined.

I understand the need, but I’d really like to avoid that; it’s too
fragile IMO.

But I have good news!  First, commit
887fd835a7c90f720d36a211478012547feaead0 really improved things by
avoiding the full ‘guix’ package rebuild (and we’re only talking about
the installation tests; other tests are just fine.)  Second, there are
improvements to Guile that will appear in 3.0.1/2.2.7 that make
compilation of big files roughly twice as fast.

There’s still room for improvement, but I’d rather work in those
directions.  WDYT?

> From 97d8a635eba34c7cf0708e99bf77ef9bad1344bf Mon Sep 17 00:00:00 2001
> From: Maxim Cournoyer <maxim.cournoyer@gmail.com>
> Date: Tue, 11 Feb 2020 12:57:29 -0500
> Subject: [PATCH 2/9] gnu: linux-boot: Ensure volatile root is mounted
>  read-only.
>
> * gnu/build/linux-boot.scm (mount-root-file-system): Ensure MS_RDONLY is
> present among the root file system flags when VOLATILE-ROOT? is #t.

(You can drop the “gnu:” prefix.)

OK!

> From e8d6642d3597207657842c9ca4849f8660d06638 Mon Sep 17 00:00:00 2001
> From: Maxim Cournoyer <maxim.cournoyer@gmail.com>
> Date: Tue, 11 Feb 2020 23:56:45 -0500
> Subject: [PATCH 3/9] file-systems: Add a 'file-system-device->string'
>  procedure.
>
> * gnu/system/file-systems.scm (file-system-device->string): New procedure.
> * gnu/system.scm (bootable-kernel-arguments): Use it.
> * gnu/system/vm.scm (operating-system-uuid): Likewise.
> * guix/scripts/system.scm (display-system-generation): Likewise.

OK!

> From 4f6e3955957beb5287e9d5a5d33b74725836e1ac Mon Sep 17 00:00:00 2001
> From: Maxim Cournoyer <maxim.cournoyer@gmail.com>
> Date: Tue, 11 Feb 2020 14:00:06 -0500
> Subject: [PATCH 4/9] gnu: linux-boot: Refactor boot-system.
>
> The --root option can now be omitted, and inferred from the root file system
> declaration instead.
>
> * gnu/build/linux-boot.scm (boot-system): Remove nested definitions for
> root-fs-type, root-fs-flags and root-fs-options, and bind those inside the
> let* instead.  Make "--root" take precendence over the device field string
> representation of the root file system.
> * doc/guix.texi (Initial RAM Disk): Document that "--root" can be left
> unspecified.

[...]

>  @item --root=@var{root}
> -Mount @var{root} as the root file system.  @var{root} can be a
> -device name like @code{/dev/sda1}, a file system label, or a file system
> -UUID.
> +Mount @var{root} as the root file system.  @var{root} can be a device
> +name like @code{/dev/sda1}, a file system label, or a file system UUID.
> +When unspecified, the device name from the root file system of the
> +operating system declaration is used.

Oh!  Does it always work?  That makes me wonder why we’ve been carrying
‘--root’ and I’m not sure if I’m forgetting a good reason to do it that
way.

>        (mount-essential-file-systems)
>        (let* ((args    (linux-command-line))
>               (to-load (find-long-option "--load" args))
> -             (root    (find-long-option "--root" args)))
> +             (root-fs (find root-mount-point? mounts))
> +             (root-fs-type (or (and=> root-fs file-system-type)
> +                               "ext4"))
> +             (root-device (and=> root-fs file-system-device))
> +             (root-device-str (and=> root-device file-system-device->string))
> +             ;; --root takes precedence over the 'device' field of the root
> +             ;; <file-system> record.
> +             (root (or (find-long-option "--root" args)
> +                       root-device-str))
> +             (root-fs-flags (mount-flags->bit-mask
> +                             (or (and=> root-fs file-system-flags)
> +                                 '())))
> +             (root-fs-options (if root-fs
> +                                  (file-system-options root-fs)
> +                                  '()))
> +             (root-options (if (null? root-fs-options)
> +                               #f
> +                               (file-system-options->str
> +                                root-fs-options))))

Since ‘file-system-device->string’ is lossy, I think we should do it the
other way around: convert the ‘--root’ string to a <file-system>.
Perhaps that bit can be moved to a separate procedure, like
‘root-string->file-system’.

Also, the (or … "ext4") bit doesn’t sound great, but perhaps it’ll be
unnecessary if we do with something as outlined above?

> From af61745d8b686755a5d9deb9e21c9eac624fb43e Mon Sep 17 00:00:00 2001
> From: Maxim Cournoyer <maxim.cournoyer@gmail.com>
> Date: Wed, 25 Sep 2019 22:43:41 +0900
> Subject: [PATCH 5/9] file-systems: Represent the file system options as an
>  alist.
>
> This allows accessing the parameter values easily, without having to parse a
> string.
>
> * gnu/system/file-systems.scm (<file-system>): Update the default value of the
> OPTIONS field, doc.
> (%file-system-options): Field accessor renamed from `file-system-options'.
> (file-system-options, file-system-options->string): New procedures.
> * gnu/build/file-systems.scm (mount-file-system): Adapt.
> * gnu/services/base.scm (file-system->fstab-entry): Likewise.
> * tests/file-systems.scm: New tests.
> * doc/guix.texi (File Systems): Document the modified default value of the
> 'file-system-options' field.

The main issue I see with this change is that mount(2) takes raw strings
for the options.  There’s a convention to have those strings look like
“KEY1=VALUE1,KEY2=VALUE2”, but it’s just a convention.

As a rule of thumb, I’d rather have our interface be as close as
possible to the actual mount(2) interface, which means taking strings.

Now, we can surely add helper procedures to parse options that follow
the above conventions.

WDYT?

> +  ;; Support the deprecated options format (a string).
> +  (define (options-string->options-list str)
> +    (let ((option-list (string-split str #\,)))
> +      (map (lambda (param)
> +             (if (string-contains param "=")
> +                 (apply cons (string-split param #\=))
> +                 param))
> +           option-list)))

I think we’d want to split only on the first ‘=’ sign, meaning we should
use ‘string-index’ etc. instead of ‘string-split’.

> From 67135c925b07f2e077b4cd852e07178691a10164 Mon Sep 17 00:00:00 2001
> From: Maxim Cournoyer <maxim.cournoyer@gmail.com>
> Date: Tue, 11 Feb 2020 14:14:36 -0500
> Subject: [PATCH 6/9] gnu: linux-boot: Honor the "--root-options" kernel
>  argument.
>
> * gnu/build/linux-boot.scm (boot-system): Parse the "--root-options" kernel
> argument, and use it when calling `mount-root-file-system'.  Update doc.
> * doc/guix.texi (Initial RAM Disk): Document the use of the "--root-options"
> argument.

Hmm do we really need this extra option?  :-)

(Also, in hindsight, I think it was a mistake to call them
‘--something’.  Following the common naming convention, we should rather
call these options ‘gnu.something’.)

> From cb060af5ea56427e1fd63ced5f9c9edd3ae61f76 Mon Sep 17 00:00:00 2001
> From: Maxim Cournoyer <maxim.cournoyer@gmail.com>
> Date: Tue, 11 Feb 2020 14:27:19 -0500
> Subject: [PATCH 7/9] gnu: linux-boot: Filter out file system independent
>  options.
>
> This fixes an issue where options such as "defaults", which are understood by
> the command line program "mount", are not understood by the system call of the
> same name, which is used in the initial RAM disk.
>
> * gnu/system/file-systems.scm (%file-system-independent-mount-options): New variable.
> (file-system-independent-mount-option?): New predicate.
> * gnu/build/linux-boot.scm (boot-system): Use the above predicate to filter
> out system independent mount options.

[...]

> +(define %file-system-independent-mount-options
> +  ;; Taken from 'man 8 mount'.
> +  '("async" "atime" "auto" "noatime" "noauto" "context" "defaults" "dev" "nodev"
> +    "diratime" "nodiratime" "dirsync" "exec" "noexec" "group" "iversion"
> +    "noiversion" "mand" "nomand" "_netdev" "nofail" "relatime" "norelatime"
> +    "strictatime" "nostrictatime" "lazytime" "nolazytime" "suid" "nosuid"
> +    "silent" "loud" "owner" "remount" "ro" "rw" "sync" "user" "nouser" "users"))

I’d rather avoid it.  In general, as much as possible, I think we should
pass options to the kernel without trying to be “smart”.  It’s often the
safe strategy.

WDYT?

> From 6cf2ece21683e98544f8f46675aef58d5a7231fd Mon Sep 17 00:00:00 2001
> From: Maxim Cournoyer <maxim.cournoyer@gmail.com>
> Date: Sun, 14 Jul 2019 20:50:23 +0900
> Subject: [PATCH 8/9] bootloader: grub: Allow booting from a Btrfs subvolume.
>
> * gnu/bootloader/grub.scm (grub-configuration-file) [btrfs-subvolume-path]:
> New parameter.  When it is defined, prepend its value to the kernel and
> initrd file paths.
> * gnu/bootloader/depthcharge.scm (depthcharge-configuration-file): Adapt.
> * gnu/bootloader/extlinux.scm (extlinux-configuration-file): Likewise.
> * gnu/system/file-systems.scm (btrfs-subvolume?)
> (btrfs-store-subvolume-path): New procedures.
> * gnu/system.scm (operating-system-bootcfg): Specify the Btrfs subvolume path
> of the GNU store to the `operating-system-bootcfg' procedure, using the new
> BTRFS-SUBVOLUME-PATH argument.
> * doc/guix.texi (File Systems): Add a Btrfs subsection to document the use of
> subvolumes.  Document the new `properties' field of the `<file-system>'
> record.
> * gnu/tests/install.scm: Add test "btrfs-root-on-subvolume-os".

Neat!

>  (define* (grub-configuration-file config entries
>                                    #:key
>                                    (system (%current-system))
> -                                  (old-entries '()))
> +                                  (old-entries '())
> +                                  btrfs-subvolume-path)
>    "Return the GRUB configuration file corresponding to CONFIG, a
>  <bootloader-configuration> object, and where the store is available at
>  STORE-FS, a <file-system> object.  OLD-ENTRIES is taken to be a list of menu
> -entries corresponding to old generations of the system."
> +entries corresponding to old generations of the system.  BTRFS-SUBVOLUME-PATH
> +may be used to specify on which subvolume a Btrfs root file system resides."

(Nitpick: s/path/file name/ :-))

It’s a bit problematic that (1) GRUB needs explicit Btrfs support, and
(2) other bootloaders will silently ignore the option, due to
#:allow-other-keys.

I don’t have a better idea, but it’d be great if Btrfs support could be
made bootloader-independent, and if it could be somewhat
not-too-btrfs-specific, if that is possible at all.

Thoughts?

> +  (properties       file-system-properties        ; list of name-value pairs
> +                    (default '()))
>    (location         file-system-location
>                      (default (current-source-location))
>                      (innate)))
> @@ -582,4 +589,48 @@ system has the given TYPE."
>      (or (string-prefix-ci? "x-" option-name)
>          (member option-name %file-system-independent-mount-options))))
>  
> +(define (btrfs-subvolume? fs)
> +  "Predicate to check if FS, a file-system object, is a Btrfs subvolume."
> +  (and-let* ((btrfs-file-system? (string= "btrfs" (file-system-type fs)))
> +             (option-keys (map (match-lambda
> +                                 ((key . value) key)
> +                                 (key key))
> +                               (file-system-options fs))))
> +    (find (cut string-prefix? "subvol" <>) option-keys)))

I wonder if we can avoid special support in the <file-system> API for
Btrfs.

> +              (error "The store is on a Btrfs subvolume, but the \
> +subvolume name is unknown.
> +Hint: Define the \"btrfs-subvolume-path\" file system property or
> +use the \"subvol\" Btrfs file system option."))))

Rather use ‘raise’ with ‘&message’ and ‘&fix-hint’ conditions.

Pheeew, that was a long patch series!

Perhaps we can split the continuation of this thread in sizable chunks!

Thanks for working on this,
Ludo’.
Maxim Cournoyer Feb. 16, 2020, 5:36 a.m. UTC | #2
Hello!

Thanks for the prompt feedback.

Ludovic Courtès <ludo@gnu.org> writes:

> Hi Maxim!
>
> Great to see Btrfs support improving; many people will love that!
>
> Maxim Cournoyer <maxim.cournoyer@gmail.com> skribis:
>
>> From 3640bea548826e1c1ec9b766da1fdfe4791d7452 Mon Sep 17 00:00:00 2001
>> From: Maxim Cournoyer <maxim.cournoyer@gmail.com>
>> Date: Sun, 17 Nov 2019 06:01:00 +0900
>> Subject: [PATCH 1/9] gnu: tests: Reduce the time required to run the system
>>  tests.
>>
>> When setting the GUIX_DEV_HACKS environment variable, the Guix package used
>> inside the instrumented VMs recycles the binaries already found in the Guix
>> checkout of the developer instead of rebuilding Guix from scratch.  This
>> brings the time required for this component from 20+ minutes down to 2-3
>> minutes on an X200 machine.
>>
>> * gnu/packages/package-management.scm (current-guix/pre-built): New procedure.
>> * build-aux/run-system-tests.scm (tests-for-channel-instance): Use it, when
>> GUIX_DEV_HACKS is defined.
>
> I understand the need, but I’d really like to avoid that; it’s too
> fragile IMO.
>
> But I have good news!  First, commit
> 887fd835a7c90f720d36a211478012547feaead0 really improved things by
> avoiding the full ‘guix’ package rebuild (and we’re only talking about
> the installation tests; other tests are just fine.)  Second, there are
> improvements to Guile that will appear in 3.0.1/2.2.7 that make
> compilation of big files roughly twice as fast.
>
> There’s still room for improvement, but I’d rather work in those
> directions.  WDYT?

With a little bit more love (inheriting from the Guix package from the
inferior captured at build time), I don't see the hack being any less
fragile than the Guix checkout compiled and ran with ./pre-inst-env.

There's no arguing that it *is* a hack, but:

1) Being labeled as such (GUIX_DEV_HACKS)
2) Being undocumented
3) Only being enabled explicitly (through that GUIX_DEV_HACKS
environment variable)
4) Can be reverted easily in the future when the default, clean implementation is
fast enough to obsolete it.

To me means its targeted to developers who understand the nature of the
hack and is provided without any warranty.

And while it's exciting that Guile's compilation time is going to be
improved, no compiler's going to be as efficient as "no compilation"
;-).

>> From 97d8a635eba34c7cf0708e99bf77ef9bad1344bf Mon Sep 17 00:00:00 2001
>> From: Maxim Cournoyer <maxim.cournoyer@gmail.com>
>> Date: Tue, 11 Feb 2020 12:57:29 -0500
>> Subject: [PATCH 2/9] gnu: linux-boot: Ensure volatile root is mounted
>>  read-only.
>>
>> * gnu/build/linux-boot.scm (mount-root-file-system): Ensure MS_RDONLY is
>> present among the root file system flags when VOLATILE-ROOT? is #t.
>
> (You can drop the “gnu:” prefix.)

Done.

I never know before looking at past logs (and then sometimes it's a
mixed bag).  Is there any mechanical process for selecting the right
commit prefix? :-)

> OK!
>
>> From e8d6642d3597207657842c9ca4849f8660d06638 Mon Sep 17 00:00:00 2001
>> From: Maxim Cournoyer <maxim.cournoyer@gmail.com>
>> Date: Tue, 11 Feb 2020 23:56:45 -0500
>> Subject: [PATCH 3/9] file-systems: Add a 'file-system-device->string'
>>  procedure.
>>
>> * gnu/system/file-systems.scm (file-system-device->string): New procedure.
>> * gnu/system.scm (bootable-kernel-arguments): Use it.
>> * gnu/system/vm.scm (operating-system-uuid): Likewise.
>> * guix/scripts/system.scm (display-system-generation): Likewise.
>
> OK!
>
>> From 4f6e3955957beb5287e9d5a5d33b74725836e1ac Mon Sep 17 00:00:00 2001
>> From: Maxim Cournoyer <maxim.cournoyer@gmail.com>
>> Date: Tue, 11 Feb 2020 14:00:06 -0500
>> Subject: [PATCH 4/9] gnu: linux-boot: Refactor boot-system.
>>
>> The --root option can now be omitted, and inferred from the root file system
>> declaration instead.
>>
>> * gnu/build/linux-boot.scm (boot-system): Remove nested definitions for
>> root-fs-type, root-fs-flags and root-fs-options, and bind those inside the
>> let* instead.  Make "--root" take precendence over the device field string
>> representation of the root file system.
>> * doc/guix.texi (Initial RAM Disk): Document that "--root" can be left
>> unspecified.
>
> [...]
>
>>  @item --root=@var{root}
>> -Mount @var{root} as the root file system.  @var{root} can be a
>> -device name like @code{/dev/sda1}, a file system label, or a file system
>> -UUID.
>> +Mount @var{root} as the root file system.  @var{root} can be a device
>> +name like @code{/dev/sda1}, a file system label, or a file system UUID.
>> +When unspecified, the device name from the root file system of the
>> +operating system declaration is used.
>
> Oh!  Does it always work?  That makes me wonder why we’ve been carrying
> ‘--root’ and I’m not sure if I’m forgetting a good reason to do it that
> way.

If the documentation is accurate, it should :-), given that --root gets
written as a string to the GRUB configuration file, and that the doc
says it's possible to give it as a device name, label or UUID.

About why providing options such as --root or --root-options in the
first place; I pondered about this as well, especially after making the
file systems from operating system able to be mounted with all their
(file system independent -- more on that later) options.  A reason I
came up with was that it allows to experiment at the GRUB command line
and change the root device, or perhaps the root options.  One use case
would be debugging the right options to pass to a file system driver in
case of a mistake in the operating system declaration.

>
>>        (mount-essential-file-systems)
>>        (let* ((args    (linux-command-line))
>>               (to-load (find-long-option "--load" args))
>> -             (root    (find-long-option "--root" args)))
>> +             (root-fs (find root-mount-point? mounts))
>> +             (root-fs-type (or (and=> root-fs file-system-type)
>> +                               "ext4"))
>> +             (root-device (and=> root-fs file-system-device))
>> +             (root-device-str (and=> root-device file-system-device->string))
>> +             ;; --root takes precedence over the 'device' field of the root
>> +             ;; <file-system> record.
>> +             (root (or (find-long-option "--root" args)
>> +                       root-device-str))
>> +             (root-fs-flags (mount-flags->bit-mask
>> +                             (or (and=> root-fs file-system-flags)
>> +                                 '())))
>> +             (root-fs-options (if root-fs
>> +                                  (file-system-options root-fs)
>> +                                  '()))
>> +             (root-options (if (null? root-fs-options)
>> +                               #f
>> +                               (file-system-options->str
>> +                                root-fs-options))))
>
> Since ‘file-system-device->string’ is lossy, I think we should do it the
> other way around: convert the ‘--root’ string to a <file-system>.
> Perhaps that bit can be moved to a separate procedure, like
> ‘root-string->file-system’.

Done!

> Also, the (or … "ext4") bit doesn’t sound great, but perhaps it’ll be
> unnecessary if we do with something as outlined above?

That bit was already there; it exists because we allow an operating
system definition to be made without any root file system (for use with
the 'guix system container' or 'guix system vm' commands IIRC).  What
could be simpler would be to always require such information to be
embedded in the operating system declaration: when an interactive user
command would need to modify the root file system, it could create a new
operating-system object, inheriting from the original one.

>> From af61745d8b686755a5d9deb9e21c9eac624fb43e Mon Sep 17 00:00:00 2001
>> From: Maxim Cournoyer <maxim.cournoyer@gmail.com>
>> Date: Wed, 25 Sep 2019 22:43:41 +0900
>> Subject: [PATCH 5/9] file-systems: Represent the file system options as an
>>  alist.
>>
>> This allows accessing the parameter values easily, without having to parse a
>> string.
>>
>> * gnu/system/file-systems.scm (<file-system>): Update the default value of the
>> OPTIONS field, doc.
>> (%file-system-options): Field accessor renamed from `file-system-options'.
>> (file-system-options, file-system-options->string): New procedures.
>> * gnu/build/file-systems.scm (mount-file-system): Adapt.
>> * gnu/services/base.scm (file-system->fstab-entry): Likewise.
>> * tests/file-systems.scm: New tests.
>> * doc/guix.texi (File Systems): Document the modified default value of the
>> 'file-system-options' field.
>
> The main issue I see with this change is that mount(2) takes raw strings
> for the options.  There’s a convention to have those strings look like
> “KEY1=VALUE1,KEY2=VALUE2”, but it’s just a convention.
>
> As a rule of thumb, I’d rather have our interface be as close as
> possible to the actual mount(2) interface, which means taking strings.
>
> Now, we can surely add helper procedures to parse options that follow
> the above conventions.
>
> WDYT?

To me, it's an implementation detail that I'd rather abstract away (or
make optional, like in this patch).  Just like we provide a higher level
configuration for services instead of requiring the user to input the
configuration in the native format of the tool (or allowing for both).
The idea for this format was taken from a discussion here:
http://issues.guix.info/issue/33517#3.

Are we really targeting mount(2)?  The commit
9d3053819dfd834a1c29a03427c41d8524b8a7d5 (which you co-authored :-))
mentions 'man 8 mount' for the file system options. This should be
stressed in the manual, as someone attempting to pass file system
independent options (perhaps taking their options line from a previous
/etc/fstab config), could prevent the system from booting.

I mistakenly was under the impression that mount(8) was targeted, since
the main interaction I observed between a file system object and my
system was the /etc/fstab file getting produced.

Doesn't targeting mount(2) mean that it becomes impossible to give all
the options someone would possibly want to produce their /etc/fstab
file?  Suppose you want the 'ro' option for one of your partition in
/etc/fstab; that wouldn't fly currently, unless I'm missing something.
It seems to me we should target mount(8), and internally translate the
file system independent mount(8) options into mount(2) flags or
vice-versa but targeting the higher level and going down makes more
sense to me.

>> +  ;; Support the deprecated options format (a string).
>> +  (define (options-string->options-list str)
>> +    (let ((option-list (string-split str #\,)))
>> +      (map (lambda (param)
>> +             (if (string-contains param "=")
>> +                 (apply cons (string-split param #\=))
>> +                 param))
>> +           option-list)))
>
> I think we’d want to split only on the first ‘=’ sign, meaning we should
> use ‘string-index’ etc. instead of ‘string-split’.

Good catch!

>> From 67135c925b07f2e077b4cd852e07178691a10164 Mon Sep 17 00:00:00 2001
>> From: Maxim Cournoyer <maxim.cournoyer@gmail.com>
>> Date: Tue, 11 Feb 2020 14:14:36 -0500
>> Subject: [PATCH 6/9] gnu: linux-boot: Honor the "--root-options" kernel
>>  argument.
>>
>> * gnu/build/linux-boot.scm (boot-system): Parse the "--root-options" kernel
>> argument, and use it when calling `mount-root-file-system'.  Update doc.
>> * doc/guix.texi (Initial RAM Disk): Document the use of the "--root-options"
>> argument.
>
> Hmm do we really need this extra option?  :-)

It is not strictly needed but allows the user to experiment/troubleshoot
with the init RAM disk from GRUB as discussed earlier for --root.  Do
you think it has enough value to be kept?

> (Also, in hindsight, I think it was a mistake to call them
> ‘--something’.  Following the common naming convention, we should rather
> call these options ‘gnu.something’.)

Is this convention detailed somewhere?  I haven't found it in 'Standards'.

>> From cb060af5ea56427e1fd63ced5f9c9edd3ae61f76 Mon Sep 17 00:00:00 2001
>> From: Maxim Cournoyer <maxim.cournoyer@gmail.com>
>> Date: Tue, 11 Feb 2020 14:27:19 -0500
>> Subject: [PATCH 7/9] gnu: linux-boot: Filter out file system independent
>>  options.
>>
>> This fixes an issue where options such as "defaults", which are understood by
>> the command line program "mount", are not understood by the system call of the
>> same name, which is used in the initial RAM disk.
>>
>> * gnu/system/file-systems.scm (%file-system-independent-mount-options): New variable.
>> (file-system-independent-mount-option?): New predicate.
>> * gnu/build/linux-boot.scm (boot-system): Use the above predicate to filter
>> out system independent mount options.
>
> [...]
>
>> +(define %file-system-independent-mount-options
>> +  ;; Taken from 'man 8 mount'.
>> +  '("async" "atime" "auto" "noatime" "noauto" "context" "defaults" "dev" "nodev"
>> +    "diratime" "nodiratime" "dirsync" "exec" "noexec" "group" "iversion"
>> +    "noiversion" "mand" "nomand" "_netdev" "nofail" "relatime" "norelatime"
>> +    "strictatime" "nostrictatime" "lazytime" "nolazytime" "suid" "nosuid"
>> +    "silent" "loud" "owner" "remount" "ro" "rw" "sync" "user" "nouser" "users"))
>
> I’d rather avoid it.  In general, as much as possible, I think we should
> pass options to the kernel without trying to be “smart”.  It’s often the
> safe strategy.

Let's revisit this topic after we've sorted the mount(8) vs mount(2)
problem above.

I'll address the rest in a part 2.

Thank you for your time and patience!

Maxim
Ludovic Courtès Feb. 16, 2020, 11:11 a.m. UTC | #3
Hi Maxim,

Maxim Cournoyer <maxim.cournoyer@gmail.com> skribis:

> Ludovic Courtès <ludo@gnu.org> writes:
>
>> Hi Maxim!
>>
>> Great to see Btrfs support improving; many people will love that!
>>
>> Maxim Cournoyer <maxim.cournoyer@gmail.com> skribis:
>>
>>> From 3640bea548826e1c1ec9b766da1fdfe4791d7452 Mon Sep 17 00:00:00 2001
>>> From: Maxim Cournoyer <maxim.cournoyer@gmail.com>
>>> Date: Sun, 17 Nov 2019 06:01:00 +0900
>>> Subject: [PATCH 1/9] gnu: tests: Reduce the time required to run the system
>>>  tests.
>>>
>>> When setting the GUIX_DEV_HACKS environment variable, the Guix package used
>>> inside the instrumented VMs recycles the binaries already found in the Guix
>>> checkout of the developer instead of rebuilding Guix from scratch.  This
>>> brings the time required for this component from 20+ minutes down to 2-3
>>> minutes on an X200 machine.
>>>
>>> * gnu/packages/package-management.scm (current-guix/pre-built): New procedure.
>>> * build-aux/run-system-tests.scm (tests-for-channel-instance): Use it, when
>>> GUIX_DEV_HACKS is defined.
>>
>> I understand the need, but I’d really like to avoid that; it’s too
>> fragile IMO.
>>
>> But I have good news!  First, commit
>> 887fd835a7c90f720d36a211478012547feaead0 really improved things by
>> avoiding the full ‘guix’ package rebuild (and we’re only talking about
>> the installation tests; other tests are just fine.)  Second, there are
>> improvements to Guile that will appear in 3.0.1/2.2.7 that make
>> compilation of big files roughly twice as fast.
>>
>> There’s still room for improvement, but I’d rather work in those
>> directions.  WDYT?
>
> With a little bit more love (inheriting from the Guix package from the
> inferior captured at build time), I don't see the hack being any less
> fragile than the Guix checkout compiled and ran with ./pre-inst-env.

It’s grabbing files from the working tree, with heuristics to minimize
the changes in incorporating files that shouldn’t be there; I think it’s
fragile.  :-)

It also breaks that idea that things compiled by Guix are well under
control.

> There's no arguing that it *is* a hack, but:
>
> 1) Being labeled as such (GUIX_DEV_HACKS)
> 2) Being undocumented
> 3) Only being enabled explicitly (through that GUIX_DEV_HACKS
> environment variable)
> 4) Can be reverted easily in the future when the default, clean implementation is
> fast enough to obsolete it.
>
> To me means its targeted to developers who understand the nature of the
> hack and is provided without any warranty.
>
> And while it's exciting that Guile's compilation time is going to be
> improved, no compiler's going to be as efficient as "no compilation"
> ;-).

True!  But still, we’d have to maintain that in the meantime and deal
with any “surprising” effects it has for those using it.

FWIW, when testing (gnu installer tests), I only have to rebuild
“guix-system-tests.drv”, which is fast.  And in fact, I can also not
rebuild anything by doing:

  (define operating-system-with-current-guix identity)

because in this particular case, I know it’s not necessary to have the
current ‘guix’ package.

I suppose this would also be an option in your case, when testing the
Btrfs changes, no?

Perhaps we should make this case more easily accessible through an
environment variable?

Thanks,
Ludo’.
Ludovic Courtès Feb. 24, 2020, 4:02 p.m. UTC | #4
Hi Maxim,

Resuming review of this series…  Sorry for the delay!

Maxim Cournoyer <maxim.cournoyer@gmail.com> skribis:

>>> From 97d8a635eba34c7cf0708e99bf77ef9bad1344bf Mon Sep 17 00:00:00 2001
>>> From: Maxim Cournoyer <maxim.cournoyer@gmail.com>
>>> Date: Tue, 11 Feb 2020 12:57:29 -0500
>>> Subject: [PATCH 2/9] gnu: linux-boot: Ensure volatile root is mounted
>>>  read-only.
>>>
>>> * gnu/build/linux-boot.scm (mount-root-file-system): Ensure MS_RDONLY is
>>> present among the root file system flags when VOLATILE-ROOT? is #t.
>>
>> (You can drop the “gnu:” prefix.)
>
> Done.
>
> I never know before looking at past logs (and then sometimes it's a
> mixed bag).  Is there any mechanical process for selecting the right
> commit prefix? :-)

“gnu:” is for changes to (gnu packages).  The idea is that the prefix
should reflect what subsystem the commit is modifying.  But yeah,
looking at ‘git log’ can be inspiring.  :-)

>>>  @item --root=@var{root}
>>> -Mount @var{root} as the root file system.  @var{root} can be a
>>> -device name like @code{/dev/sda1}, a file system label, or a file system
>>> -UUID.
>>> +Mount @var{root} as the root file system.  @var{root} can be a device
>>> +name like @code{/dev/sda1}, a file system label, or a file system UUID.
>>> +When unspecified, the device name from the root file system of the
>>> +operating system declaration is used.
>>
>> Oh!  Does it always work?  That makes me wonder why we’ve been carrying
>> ‘--root’ and I’m not sure if I’m forgetting a good reason to do it that
>> way.
>
> If the documentation is accurate, it should :-), given that --root gets
> written as a string to the GRUB configuration file, and that the doc
> says it's possible to give it as a device name, label or UUID.

Yes, ‘--root’ can resolve labels and UUIDs; my question was more about
why we have it in the first place.

> About why providing options such as --root or --root-options in the
> first place; I pondered about this as well, especially after making the
> file systems from operating system able to be mounted with all their
> (file system independent -- more on that later) options.  A reason I
> came up with was that it allows to experiment at the GRUB command line
> and change the root device, or perhaps the root options.  One use case
> would be debugging the right options to pass to a file system driver in
> case of a mistake in the operating system declaration.

Yes, that makes sense.  It’s certainly useful to have ‘--root’ at least
as an option.

>> The main issue I see with this change is that mount(2) takes raw strings
>> for the options.  There’s a convention to have those strings look like
>> “KEY1=VALUE1,KEY2=VALUE2”, but it’s just a convention.
>>
>> As a rule of thumb, I’d rather have our interface be as close as
>> possible to the actual mount(2) interface, which means taking strings.
>>
>> Now, we can surely add helper procedures to parse options that follow
>> the above conventions.
>>
>> WDYT?
>
> To me, it's an implementation detail that I'd rather abstract away (or
> make optional, like in this patch).  Just like we provide a higher level
> configuration for services instead of requiring the user to input the
> configuration in the native format of the tool (or allowing for both).
> The idea for this format was taken from a discussion here:
> http://issues.guix.info/issue/33517#3.
>
> Are we really targeting mount(2)?  The commit
> 9d3053819dfd834a1c29a03427c41d8524b8a7d5 (which you co-authored :-))
> mentions 'man 8 mount' for the file system options.

Right, mount(8) documents file system options that can be passed to
mount(2).

What does it mean to target mount(8) vs. mount(2)?  To me, mount(8) is a
CLI to mount(2) that provides additional features to make the CLI more
convenient: the “defaults” option, a way to pass mount(2) flags as
options (like “ro”, “remount”, “bind”), /etc/fstab handling, etc.

Guix System handles /etc/fstab differently and “defaults” makes little
sense in our API (one can just use leave the default value of the
‘options’ field.)

I think mount(8) is actually a good illustration of what not to do.  It
ends up mixing things that are separate in the mount(2) API, and that
doesn’t improve clarity and future-proof-ness (what if a file system has
a “bind” option, etc.).

But again, I think the helper procedures that you propose to move back
and forth between the string and the alist representations are very
welcome.  I just wouldn’t hard-code that directly in our API.

WDYT?

>>> From 67135c925b07f2e077b4cd852e07178691a10164 Mon Sep 17 00:00:00 2001
>>> From: Maxim Cournoyer <maxim.cournoyer@gmail.com>
>>> Date: Tue, 11 Feb 2020 14:14:36 -0500
>>> Subject: [PATCH 6/9] gnu: linux-boot: Honor the "--root-options" kernel
>>>  argument.
>>>
>>> * gnu/build/linux-boot.scm (boot-system): Parse the "--root-options" kernel
>>> argument, and use it when calling `mount-root-file-system'.  Update doc.
>>> * doc/guix.texi (Initial RAM Disk): Document the use of the "--root-options"
>>> argument.
>>
>> Hmm do we really need this extra option?  :-)
>
> It is not strictly needed but allows the user to experiment/troubleshoot
> with the init RAM disk from GRUB as discussed earlier for --root.  Do
> you think it has enough value to be kept?

I’d rather avoid it for now.  Less code is better.  :-)

>> (Also, in hindsight, I think it was a mistake to call them
>> ‘--something’.  Following the common naming convention, we should rather
>> call these options ‘gnu.something’.)
>
> Is this convention detailed somewhere?  I haven't found it in 'Standards'.

It’s a convention of the Linux kernel, I don’t know if it’s documented.

That’s it!

Ludo’.
Maxim Cournoyer March 3, 2020, 5 a.m. UTC | #5
Hello Ludovic!

Resuming review; I hope I'll be able to bring this series to completion
this time.

[...]

> “gnu:” is for changes to (gnu packages).  The idea is that the prefix
> should reflect what subsystem the commit is modifying.  But yeah,
> looking at ‘git log’ can be inspiring.  :-)

OK.  Makes sense.  Thanks for taking the time to explain!

[...]

>> About why providing options such as --root or --root-options in the
>> first place; I pondered about this as well, especially after making the
>> file systems from operating system able to be mounted with all their
>> (file system independent -- more on that later) options.  A reason I
>> came up with was that it allows to experiment at the GRUB command line
>> and change the root device, or perhaps the root options.  One use case
>> would be debugging the right options to pass to a file system driver in
>> case of a mistake in the operating system declaration.
>
> Yes, that makes sense.  It’s certainly useful to have ‘--root’ at least
> as an option.
>
>>> The main issue I see with this change is that mount(2) takes raw strings
>>> for the options.  There’s a convention to have those strings look like
>>> “KEY1=VALUE1,KEY2=VALUE2”, but it’s just a convention.
>>>
>>> As a rule of thumb, I’d rather have our interface be as close as
>>> possible to the actual mount(2) interface, which means taking strings.
>>>
>>> Now, we can surely add helper procedures to parse options that follow
>>> the above conventions.
>>>
>>> WDYT?

I don't feel too strongly about it, so I will adapt to your preference.

[...]

>> Are we really targeting mount(2)?  The commit
>> 9d3053819dfd834a1c29a03427c41d8524b8a7d5 (which you co-authored :-))
>> mentions 'man 8 mount' for the file system options.
>
> Right, mount(8) documents file system options that can be passed to
> mount(2).
>
> What does it mean to target mount(8) vs. mount(2)?  To me, mount(8) is a
> CLI to mount(2) that provides additional features to make the CLI more
> convenient: the “defaults” option, a way to pass mount(2) flags as
> options (like “ro”, “remount”, “bind”), /etc/fstab handling, etc.
>
> Guix System handles /etc/fstab differently and “defaults” makes little
> sense in our API (one can just use leave the default value of the
> ‘options’ field.)
>
> I think mount(8) is actually a good illustration of what not to do.  It
> ends up mixing things that are separate in the mount(2) API, and that
> doesn’t improve clarity and future-proof-ness (what if a file system has
> a “bind” option, etc.).

I think I agree.  It may be cleaner to use the API of mount(2), while
mapping the low level stuff (binary flags?) to saner symbols, like it's
already done.

> But again, I think the helper procedures that you propose to move back
> and forth between the string and the alist representations are very
> welcome.  I just wouldn’t hard-code that directly in our API.

Okay.  I'll keep them around in (gnu system file-systems); they're handy
when we need to check the presence of an option in the option string.

> WDYT?
>
>>>> From 67135c925b07f2e077b4cd852e07178691a10164 Mon Sep 17 00:00:00 2001
>>>> From: Maxim Cournoyer <maxim.cournoyer@gmail.com>
>>>> Date: Tue, 11 Feb 2020 14:14:36 -0500
>>>> Subject: [PATCH 6/9] gnu: linux-boot: Honor the "--root-options" kernel
>>>>  argument.
>>>>
>>>> * gnu/build/linux-boot.scm (boot-system): Parse the "--root-options" kernel
>>>> argument, and use it when calling `mount-root-file-system'.  Update doc.
>>>> * doc/guix.texi (Initial RAM Disk): Document the use of the "--root-options"
>>>> argument.
>>>
>>> Hmm do we really need this extra option?  :-)
>>
>> It is not strictly needed but allows the user to experiment/troubleshoot
>> with the init RAM disk from GRUB as discussed earlier for --root.  Do
>> you think it has enough value to be kept?
>
> I’d rather avoid it for now.  Less code is better.  :-)

Done!

>>> (Also, in hindsight, I think it was a mistake to call them
>>> ‘--something’.  Following the common naming convention, we should rather
>>> call these options ‘gnu.something’.)
>>
>> Is this convention detailed somewhere?  I haven't found it in 'Standards'.
>
> It’s a convention of the Linux kernel, I don’t know if it’s documented.

Well, with Hurd support coming, perhaps the standards used for a
particular kernel are not too relevant? :-)

One last thing that I'll try to understand before considering this
finished: I think adding a special properties field to the file system
record might not be necessary, as it seems that the subvol option can
take an absolute path as well as a relative path (I thought it only
accepted subvolume *names*.)

Will send the 3rd revision soon.

Thank you!

Maxim
diff mbox series

Patch

From 6cf2ece21683e98544f8f46675aef58d5a7231fd Mon Sep 17 00:00:00 2001
From: Maxim Cournoyer <maxim.cournoyer@gmail.com>
Date: Sun, 14 Jul 2019 20:50:23 +0900
Subject: [PATCH 8/9] bootloader: grub: Allow booting from a Btrfs subvolume.

* gnu/bootloader/grub.scm (grub-configuration-file) [btrfs-subvolume-path]:
New parameter.  When it is defined, prepend its value to the kernel and
initrd file paths.
* gnu/bootloader/depthcharge.scm (depthcharge-configuration-file): Adapt.
* gnu/bootloader/extlinux.scm (extlinux-configuration-file): Likewise.
* gnu/system/file-systems.scm (btrfs-subvolume?)
(btrfs-store-subvolume-path): New procedures.
* gnu/system.scm (operating-system-bootcfg): Specify the Btrfs subvolume path
of the GNU store to the `operating-system-bootcfg' procedure, using the new
BTRFS-SUBVOLUME-PATH argument.
* doc/guix.texi (File Systems): Add a Btrfs subsection to document the use of
subvolumes.  Document the new `properties' field of the `<file-system>'
record.
* gnu/tests/install.scm: Add test "btrfs-root-on-subvolume-os".
---
 doc/guix.texi                  | 114 +++++++++++++++++++++++++++++++++
 gnu/bootloader/depthcharge.scm |   3 +-
 gnu/bootloader/extlinux.scm    |   3 +-
 gnu/bootloader/grub.scm        |  42 +++++++-----
 gnu/system.scm                 |   9 ++-
 gnu/system/file-systems.scm    |  51 +++++++++++++++
 gnu/tests/install.scm          |  87 +++++++++++++++++++++++++
 7 files changed, 290 insertions(+), 19 deletions(-)

diff --git a/doc/guix.texi b/doc/guix.texi
index d6bfbd7b55..f0956f965a 100644
--- a/doc/guix.texi
+++ b/doc/guix.texi
@@ -11442,6 +11442,13 @@  a dependency of @file{/sys/fs/cgroup/cpu} and
 
 Another example is a file system that depends on a mapped device, for
 example for an encrypted partition (@pxref{Mapped Devices}).
+
+@item @code{properties} (default: @code{'()})
+This is a list of key-value pairs that can be used to specify properties
+not captured by other fields.  For example, the top level path of a
+Btrfs subvolume within its Btrfs pool can be specified using the
+@code{btrfs-subvolume-path} property (@pxref{Btrfs file system}).
+
 @end table
 @end deftp
 
@@ -11491,6 +11498,113 @@  and unmount user-space FUSE file systems.  This requires the
 @code{fuse.ko} kernel module to be loaded.
 @end defvr
 
+@node Btrfs file system
+@subsection Btrfs file system
+
+The Btrfs has special features, such as subvolumes, that merit being
+explained in more details.  The following section attempts to cover
+basic as well as complex uses of a Btrfs file system with the Guix
+System.
+
+In its simplest usage, a Btrfs file system can be described, for
+example, by:
+
+@lisp
+(file-system
+  (mount-point "/home")
+  (type "btrfs")
+  (device (file-system-label "my-home")))
+@end lisp
+
+The example below is more complex, as it makes use of a Btrfs
+subvolume, named @code{rootfs}.  The parent Btrfs file system is labeled
+@code{my-btrfs-pool}, and is located on an encrypted device (hence the
+dependency on @code{mapped-devices}):
+
+@example
+(file-system
+  (device (file-system-label "my-btrfs-pool"))
+  (mount-point "/")
+  (type "btrfs")
+  (options '("defaults" ("subvol" . "rootfs"))
+  (dependencies mapped-devices))
+@end example
+
+Some bootloaders, for example GRUB, only mount a Btrfs partition at its
+top level during the early boot, and rely on their configuration to
+refer to the correct subvolume path within that top level.  The
+bootloaders operating in this way typically produce their configuration
+on a running system where the Btrfs partitions are already mounted and
+where the subvolume information is readily available.  As an example,
+@command{grub-mkconfig}, the configuration generator command shipped
+with GRUB, reads @file{/proc/self/mountinfo} to determine the top-level
+path of a subvolume.
+
+The Guix System produces a bootloader configuration using the operating
+system configuration as its sole input; it is therefore necessary to
+extract the subvolume name on which @file{/gnu/store} lives (if any)
+from that operating system configuration.  To better illustrate,
+consider a subvolume named 'rootfs' which contains the root file system
+data.  In such situation, the GRUB bootloader would only see the top
+level of the root Btrfs partition, e.g.:
+
+@example
+/                   (top level)
+├── rootfs          (subvolume directory)
+    ├── gnu         (normal directory)
+        ├── store   (normal directory)
+[...]
+@end example
+
+Thus, the subvolume name must be prepended to the @file{/gnu/store} path
+of the kernel and initrd binaries in the GRUB configuration in order for
+those to be found.
+
+The next example shows a nested hierarchy of subvolumes and
+directories:
+
+@example
+/                   (top level)
+├── rootfs          (subvolume)
+    ├── gnu         (normal directory)
+        ├── store   (subvolume)
+[...]
+@end example
+
+This scenario would work without mounting the 'store' subvolume.
+Mounting 'rootfs' is sufficient, since the subvolume name matches its
+intended mount point in the file system hierarchy.
+
+Finally, a more contrived example of nested subvolumes:
+
+@example
+/                           (top level)
+├── root-snapshots          (subvolume)
+    ├── root-current        (subvolume)
+        ├── guix-store      (subvolume)
+[...]
+@end example
+
+Here, the 'guix-store' module name doesn't match its intended mount
+point, so it is necessary to mount it.  The layout cannot simply be
+described by the <file-system> record, so it is required to specify the
+exact path at which the subvolume exists within the top level of its
+parent file system.  This can be achieved by attaching a
+@code{btrfs-subvolume-path} property to the corresponding file system
+record:
+
+@lisp
+(file-system
+  ...
+  (properties '((btrfs-subvolume-path
+                 . "/root-snapshots/root-current/guix-store"))))
+@end lisp
+
+The default behavior of Guix is to assume that a subvolume exists
+directly at the root of the top volume hierarchy.  When this is not the
+case, the above property must be used for the system to boot correctly
+when using a GRUB based bootloader.
+
 @node Mapped Devices
 @section Mapped Devices
 
diff --git a/gnu/bootloader/depthcharge.scm b/gnu/bootloader/depthcharge.scm
index 58cc3f3932..0a50374bd9 100644
--- a/gnu/bootloader/depthcharge.scm
+++ b/gnu/bootloader/depthcharge.scm
@@ -82,7 +82,8 @@ 
 (define* (depthcharge-configuration-file config entries
                                          #:key
                                          (system (%current-system))
-                                         (old-entries '()))
+                                         (old-entries '())
+                                         #:allow-other-keys)
   (match entries
     ((entry)
      (let ((kernel (menu-entry-linux entry))
diff --git a/gnu/bootloader/extlinux.scm b/gnu/bootloader/extlinux.scm
index 5b4dd84965..6b5ff298e7 100644
--- a/gnu/bootloader/extlinux.scm
+++ b/gnu/bootloader/extlinux.scm
@@ -28,7 +28,8 @@ 
 (define* (extlinux-configuration-file config entries
                                       #:key
                                       (system (%current-system))
-                                      (old-entries '()))
+                                      (old-entries '())
+                                      #:allow-other-keys)
   "Return the U-Boot configuration file corresponding to CONFIG, a
 <u-boot-configuration> object, and where the store is available at STORE-FS, a
 <file-system> object.  OLD-ENTRIES is taken to be a list of menu entries
diff --git a/gnu/bootloader/grub.scm b/gnu/bootloader/grub.scm
index b99f5fa4f4..c9794c35c2 100644
--- a/gnu/bootloader/grub.scm
+++ b/gnu/bootloader/grub.scm
@@ -4,6 +4,7 @@ 
 ;;; Copyright © 2017 Leo Famulari <leo@famulari.name>
 ;;; Copyright © 2017 Mathieu Othacehe <m.othacehe@gmail.com>
 ;;; Copyright © 2019 Jan (janneke) Nieuwenhuizen <janneke@gnu.org>
+;;; Copyright © 2020 Maxim Cournoyer <maxim.cournoyer@gmail.com>
 ;;;
 ;;; This file is part of GNU Guix.
 ;;;
@@ -327,35 +328,46 @@  code."
 (define* (grub-configuration-file config entries
                                   #:key
                                   (system (%current-system))
-                                  (old-entries '()))
+                                  (old-entries '())
+                                  btrfs-subvolume-path)
   "Return the GRUB configuration file corresponding to CONFIG, a
 <bootloader-configuration> object, and where the store is available at
 STORE-FS, a <file-system> object.  OLD-ENTRIES is taken to be a list of menu
-entries corresponding to old generations of the system."
+entries corresponding to old generations of the system.  BTRFS-SUBVOLUME-PATH
+may be used to specify on which subvolume a Btrfs root file system resides."
   (define all-entries
     (append entries (bootloader-configuration-menu-entries config)))
   (define (menu-entry->gexp entry)
-    (let ((device (menu-entry-device entry))
-          (device-mount-point (menu-entry-device-mount-point entry))
-          (label (menu-entry-label entry))
-          (kernel (menu-entry-linux entry))
-          (arguments (menu-entry-linux-arguments entry))
-          (initrd (menu-entry-initrd entry)))
+    (let* ((device (menu-entry-device entry))
+           (device-mount-point (menu-entry-device-mount-point entry))
+           (label (menu-entry-label entry))
+           (arguments (menu-entry-linux-arguments entry))
+           (kernel* (strip-mount-point
+                     device-mount-point (menu-entry-linux entry)))
+           (initrd* (strip-mount-point
+                     device-mount-point (menu-entry-initrd entry)))
+           (kernel (if btrfs-subvolume-path
+                       #~(string-append #$btrfs-subvolume-path #$kernel*)
+                       kernel*))
+           (initrd (if btrfs-subvolume-path
+                       #~(string-append #$btrfs-subvolume-path #$initrd*)
+                       initrd*)))
       ;; Here DEVICE is the store and DEVICE-MOUNT-POINT is its mount point.
       ;; Use the right file names for KERNEL and INITRD in case
       ;; DEVICE-MOUNT-POINT is not "/", meaning that the store is on a
       ;; separate partition.
-      (let ((kernel  (strip-mount-point device-mount-point kernel))
-            (initrd  (strip-mount-point device-mount-point initrd)))
-        #~(format port "menuentry ~s {
+
+      ;; When BTRFS-SUBVOLUME-PATH is defined, prepend it the kernel and
+      ;; initrd paths, to allow booting from a Btrfs subvolume.
+      #~(format port "menuentry ~s {
   ~a
   linux ~a ~a
   initrd ~a
 }~%"
-                  #$label
-                  #$(grub-root-search device kernel)
-                  #$kernel (string-join (list #$@arguments))
-                  #$initrd))))
+                #$label
+                #$(grub-root-search device kernel)
+                #$kernel (string-join (list #$@arguments))
+                #$initrd)))
   (define sugar
     (eye-candy config
                (menu-entry-device (first all-entries))
diff --git a/gnu/system.scm b/gnu/system.scm
index 2e6d03272d..ebc8bf1db8 100644
--- a/gnu/system.scm
+++ b/gnu/system.scm
@@ -5,6 +5,7 @@ 
 ;;; Copyright © 2016 Chris Marusich <cmmarusich@gmail.com>
 ;;; Copyright © 2017 Mathieu Othacehe <m.othacehe@gmail.com>
 ;;; Copyright © 2019 Meiyo Peng <meiyo.peng@gmail.com>
+;;; Copyright © 2019 Maxim Cournoyer <maxim.cournoyer@gmail.com>
 ;;;
 ;;; This file is part of GNU Guix.
 ;;;
@@ -992,19 +993,23 @@  entry."
 (define* (operating-system-bootcfg os #:optional (old-entries '()))
   "Return the bootloader configuration file for OS.  Use OLD-ENTRIES,
 a list of <menu-entry>, to populate the \"old entries\" menu."
-  (let* ((root-fs         (operating-system-root-file-system os))
+  (let* ((file-systems    (operating-system-file-systems os))
+         (root-fs         (operating-system-root-file-system os))
          (root-device     (file-system-device root-fs))
          (params          (operating-system-boot-parameters
                            os root-device
                            #:system-kernel-arguments? #t))
          (entry           (boot-parameters->menu-entry params))
          (bootloader-conf (operating-system-bootloader os)))
+
     (define generate-config-file
       (bootloader-configuration-file-generator
        (bootloader-configuration-bootloader bootloader-conf)))
 
     (generate-config-file bootloader-conf (list entry)
-                          #:old-entries old-entries)))
+                          #:old-entries old-entries
+                          #:btrfs-subvolume-path (btrfs-store-subvolume-path
+                                                  file-systems))))
 
 (define* (operating-system-boot-parameters os root-device
                                            #:key system-kernel-arguments?)
diff --git a/gnu/system/file-systems.scm b/gnu/system/file-systems.scm
index 2c3c159d04..daef1c9d72 100644
--- a/gnu/system/file-systems.scm
+++ b/gnu/system/file-systems.scm
@@ -21,7 +21,9 @@ 
   #:use-module (ice-9 match)
   #:use-module (rnrs bytevectors)
   #:use-module (srfi srfi-1)
+  #:use-module (srfi srfi-2)
   #:use-module (srfi srfi-9)
+  #:use-module (srfi srfi-26)
   #:use-module (srfi srfi-9 gnu)
   #:use-module (guix records)
   #:use-module (gnu system uuid)
@@ -44,9 +46,12 @@ 
             file-system-create-mount-point?
             file-system-dependencies
             file-system-location
+            file-system-properties
 
             file-system-type-predicate
             file-system-independent-mount-option?
+            btrfs-subvolume?
+            btrfs-store-subvolume-path
 
             file-system-label
             file-system-label?
@@ -112,6 +117,8 @@ 
                        (default #f))
   (dependencies     file-system-dependencies      ; list of <file-system>
                     (default '()))                ; or <mapped-device>
+  (properties       file-system-properties        ; list of name-value pairs
+                    (default '()))
   (location         file-system-location
                     (default (current-source-location))
                     (innate)))
@@ -582,4 +589,48 @@  system has the given TYPE."
     (or (string-prefix-ci? "x-" option-name)
         (member option-name %file-system-independent-mount-options))))
 
+(define (btrfs-subvolume? fs)
+  "Predicate to check if FS, a file-system object, is a Btrfs subvolume."
+  (and-let* ((btrfs-file-system? (string= "btrfs" (file-system-type fs)))
+             (option-keys (map (match-lambda
+                                 ((key . value) key)
+                                 (key key))
+                               (file-system-options fs))))
+    (find (cut string-prefix? "subvol" <>) option-keys)))
+
+(define (btrfs-store-subvolume-path file-systems)
+  "Return the subvolume path within the Btrfs top level onto which the store
+is located.  When the BTRFS-SUBVOLUME-PATH file system property is not set, it
+is assumed that the store subvolume path is a located at the root of the top
+level of the file system."
+
+  (define (find-mount-point-fs mount-point file-systems)
+    (find (lambda (fs)
+            (string= mount-point (file-system-mount-point fs)))
+          file-systems))
+
+  ;; Find a subvolume mounted at either /gnu/store, /gnu, or /.
+  (let loop ((mount-point (%store-prefix)))
+    (let ((mount-point-fs (find-mount-point-fs mount-point file-systems)))
+      (cond
+       ((string-null? mount-point)
+        #f)                             ;store is not on a Btrfs subvolume
+       ((and=> mount-point-fs btrfs-subvolume?)
+        (let* ((fs-options (file-system-options mount-point-fs))
+               (subvolid (assoc-ref fs-options "subvolid"))
+               (subvol (assoc-ref fs-options "subvol")))
+          (or (assoc-ref (file-system-properties mount-point-fs)
+                         "btrfs-subvolume-path")
+              (and=> subvol (cut string-append "/" <>))
+              (error "The store is on a Btrfs subvolume, but the \
+subvolume name is unknown.
+Hint: Define the \"btrfs-subvolume-path\" file system property or
+use the \"subvol\" Btrfs file system option."))))
+       (else
+        (loop
+         (cond ((string-suffix? "/" mount-point)
+                (string-drop-right mount-point 1))
+               ((string-take mount-point
+                             (1+ (string-index-right mount-point #\/)))))))))))
+
 ;;; file-systems.scm ends here
diff --git a/gnu/tests/install.scm b/gnu/tests/install.scm
index d475bda2c7..b32130c2f3 100644
--- a/gnu/tests/install.scm
+++ b/gnu/tests/install.scm
@@ -44,6 +44,7 @@ 
             %test-raid-root-os
             %test-encrypted-root-os
             %test-btrfs-root-os
+            %test-btrfs-root-on-subvolume-os
             %test-jfs-root-os))
 
 ;;; Commentary:
@@ -811,6 +812,92 @@  build (current-guix) and then store a couple of full system images.")
                          (command (qemu-command/writable-image image)))
       (run-basic-test %btrfs-root-os command "btrfs-root-os")))))
 
+
+;;;
+;;; Btrfs root file system on a subvolume.
+;;;
+
+(define-os-with-source (%btrfs-root-on-subvolume-os
+                        %btrfs-root-on-subvolume-os-source)
+  ;; The OS we want to install.
+  (use-modules (gnu) (gnu tests) (srfi srfi-1))
+
+  (operating-system
+    (host-name "hurd")
+    (timezone "America/Montreal")
+    (locale "en_US.UTF-8")
+    (bootloader (bootloader-configuration
+                 (bootloader grub-bootloader)
+                 (target "/dev/vdb")))
+    (kernel-arguments '("console=ttyS0"))
+    (file-systems (cons* (file-system
+                           (device (file-system-label "btrfs-pool"))
+                           (mount-point "/")
+                           (options '(("subvol" . "rootfs")
+                                      ("compress" . "zstd")))
+                           (type "btrfs"))
+                         (file-system
+                           (device (file-system-label "btrfs-pool"))
+                           (mount-point "/home")
+                           (options '(("subvol" . "homefs")
+                                      ("compress" . "lzo")))
+                           (type "btrfs"))
+                         %base-file-systems))
+    (users (cons (user-account
+                  (name "charlie")
+                  (group "users")
+                  (supplementary-groups '("wheel" "audio" "video")))
+                 %base-user-accounts))
+    (services (cons (service marionette-service-type
+                             (marionette-configuration
+                              (imported-modules '((gnu services herd)
+                                                  (guix combinators)))))
+                    %base-services))))
+
+(define %btrfs-root-on-subvolume-installation-script
+  ;; Shell script of a simple installation.
+  "\
+. /etc/profile
+set -e -x
+guix --version
+
+export GUIX_BUILD_OPTIONS=--no-grafts
+ls -l /run/current-system/gc-roots
+parted --script /dev/vdb mklabel gpt \\
+  mkpart primary ext2 1M 3M \\
+  mkpart primary ext2 3M 2G \\
+  set 1 boot on \\
+  set 1 bios_grub on
+mkfs.btrfs -L btrfs-pool /dev/vdb2
+mount /dev/vdb2 /mnt
+btrfs subvolume create /mnt/rootfs
+btrfs subvolume create /mnt/homefs
+herd start cow-store /mnt/rootfs
+mkdir /mnt/rootfs/etc
+cp /etc/target-config.scm /mnt/rootfs/etc/config.scm
+guix system build /mnt/rootfs/etc/config.scm
+guix system init /mnt/rootfs/etc/config.scm /mnt/rootfs --no-substitutes
+sync
+reboot\n")
+
+(define %test-btrfs-root-on-subvolume-os
+  (system-test
+   (name "btrfs-root-on-subvolume-os")
+   (description
+    "Test basic functionality of an OS installed like one would do by hand.
+This test is expensive in terms of CPU and storage usage since we need to
+build (current-guix) and then store a couple of full system images.")
+   (value
+    (mlet* %store-monad
+        ((image
+          (run-install %btrfs-root-on-subvolume-os
+                       %btrfs-root-on-subvolume-os-source
+                       #:script
+                       %btrfs-root-on-subvolume-installation-script))
+         (command (qemu-command/writable-image image)))
+      (run-basic-test %btrfs-root-on-subvolume-os command
+                      "btrfs-root-on-subvolume-os")))))
+
 
 ;;;
 ;;; JFS root file system.
-- 
2.23.0