[bug#76501,0/1] Fix deployment of smaller Hetzner instances

Message ID cover.1740312673.git.roman@burningswell.com
Series: Fix deployment of smaller Hetzner instances

Message

Roman Scherer Feb. 23, 2025, 12:14 p.m. UTC
  Hello Guix,

Fabio Natali reached out to tell me that there is an "out of disk space" issue
when deploying smaller instances with the hetzner-environment-type.

I thought I had tried it on smaller instances. Either I did not, or I did try
it back in the day and Guix has grown larger in the meantime.

Looking closer, I discovered that the size of the root partition of the rescue
system depends on the instance type, and is a lot smaller on the smaller ones.

On a cax11 instance, booted into the rescue system from which a minimal Guix
system is installed, "df -h" shows:

Filesystem                    Size  Used Avail Use% Mounted on
udev                          1.9G     0  1.9G   0% /dev
[2a01:4ff:ff00::b007:1]:/nfs  1.2T  999G  151G  87% /root/.oldroot/nfs
overlay                       1.9G  1.9G     0 100% /
tmpfs                         1.9G     0  1.9G   0% /dev/shm
tmpfs                         768M  864K  767M   1% /run
tmpfs                         5.0M     0  5.0M   0% /run/lock
/dev/sda1                      38G   44K   36G   1% /mnt
/dev/sda15                    244M  152K  244M   1% /mnt/boot/efi
tmpfs                         384M     0  384M   0% /run/user/0

The 1.9G of / on the rescue system does not seem to be enough to host Guix
installed as a foreign distro on the rescue system, plus the packages needed
to install the new Guix bootstrap system.
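
To make the numbers above easier to check from a script, here is a minimal
sketch that reads the "Available" column from POSIX-format df output. The
function names avail_from_df and avail_kib are hypothetical helpers invented
for this illustration; they are not part of the patch.

```shell
#!/bin/sh
# Sketch only: report how much space is left on a file system.
# avail_from_df and avail_kib are hypothetical names for illustration.

# Read `df -P` style output on stdin and print the "Available" column
# (4th field) of the first data row.
avail_from_df() {
    awk 'NR == 2 { print $4 }'
}

# Print the available space, in KiB, of the file system holding path $1.
avail_kib() {
    df -Pk "$1" | avail_from_df
}
```

On the rescue system, "avail_kib /" and "avail_kib /mnt" would then report the
free space of the small overlay root and of the target disk, respectively.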

To fix this I came up with the following solution:

- before installing Guix on the rescue system, I make sure that /gnu/store has
  enough space.

- this is done by bind mounting /mnt/tmp/gnu/store (here /mnt is the root of
  the new Guix system having more disk space) to /gnu/store.

- then Guix is installed with apt-get on the rescue system using the store
  that points into the tmp directory of the new Guix system.

- A minimal Guix system is installed onto /mnt and booted, and from there the
  final operating system config is applied. When the minimal Guix system
  boots, /tmp/gnu/store is gone and no longer used.
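
The first three steps could be sketched roughly as the shell commands below,
run as root on the rescue system. This is only an illustration of the idea
under the paths named above, not the code from the patch. The run helper, the
prepare_store function and the DRY_RUN variable are hypothetical and exist
only so that the commands can be printed without being executed.

```shell
#!/bin/sh
# Sketch only: back /gnu/store on the rescue system with the larger
# disk mounted at /mnt, then install Guix with apt-get.
# run, prepare_store and DRY_RUN are hypothetical, for illustration.

# Default to dry-run so that nothing is changed by accident.
DRY_RUN="${DRY_RUN:-1}"

# Print the command in dry-run mode, execute it otherwise.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

prepare_store() {
    # Directory on the target disk that temporarily holds the store.
    run mkdir -p /mnt/tmp/gnu/store
    # Mount point on the small root file system of the rescue system.
    run mkdir -p /gnu/store
    # Make /gnu/store refer to the directory on the target disk.
    run mount --bind /mnt/tmp/gnu/store /gnu/store
    # Install Guix as a foreign distro package on the rescue system.
    run apt-get install -y guix
}

prepare_store
```

With DRY_RUN=1 (the default in this sketch) the script only prints the four
commands; with DRY_RUN=0 it would execute them.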

This seems to work. I have tried it and Fabio also reported that it works for
him.

What do you think of this strategy? Is there a better one?

I attached a patch for this and also changed the instance types that are used
in the tests to smaller ones, so this is covered in the future. Could you
please review it?

Fabio also asked me why I chose medium-sized instances as the default instead
of the smallest. My thinking was that people trying this for the first time
should have a good experience and not have to deal with an under-powered
instance. I would leave it that way. If you think we should default to smaller
ones, please let me know.

Unfortunately, the above solution did not work with the smallest CPX11
instance, with 2 vCPUs, 2 GB of RAM and 40 GB of disk space.

There, the root partition of the rescue system has only 970M:

root@rescue /usr/lib # df -h
Filesystem                    Size  Used Avail Use% Mounted on
udev                          961M     0  961M   0% /dev
[2a01:4ff:ff00::b007:1]:/nfs  1.2T  999G  151G  87% /root/.oldroot/nfs
overlay                       970M  821M  150M  85% /
tmpfs                         970M     0  970M   0% /dev/shm
tmpfs                         388M  668K  388M   1% /run
tmpfs                         5.0M     0  5.0M   0% /run/lock
/dev/sda1                      38G  1.3G   34G   4% /mnt
/dev/sda15                    241M  142K  241M   1% /mnt/boot/efi
tmpfs                         194M     0  194M   0% /run/user/0

Installing Guix via apt-get works, but installing the minimal bootstrap Guix
system then fails with:

...
downloading from https://ci.guix.gnu.org/nar/lzip/lclbcq0jds63zal1p55g6v0mwz90s44y-guile-git-0.5.2 ...
downloading from https://ci.guix.gnu.org/nar/gzip/g2ajyl8xk9aarxrgjbng2hkj3qm2v0z2-tar-1.34 ...
downloading from https://ci.guix.gnu.org/nar/gzip/v06gnr579r0jmr36aha3wkbd1y27ccg7-disarchive-0.4.0 ...
downloading from https://ci.guix.gnu.org/nar/lzip/9nvx97hr8kkr26gzwni2fblfn0yq0xjw-guix-1.4.0rc2 ...

error (ignored): aborting transaction: cannot rollback - no transaction is active
guix system: error: committing transaction: database or disk is full

Not sure what to do about that. I added a note to the manual that CPX11
instances are not supported at the moment.

Thanks, Roman.

Roman Scherer (1):
  gnu: machine: hetzner: Fix deployment on smaller instances.

 doc/guix.texi             | 4 +++-
 gnu/machine/hetzner.scm   | 9 ++++++++-
 tests/machine/hetzner.scm | 4 ++--
 3 files changed, 13 insertions(+), 4 deletions(-)


base-commit: 5f4c785fc3caa0fd960ebcf9c1ea6ab396b96f25
--
2.48.1