diff mbox series

[bug#49107,core-updates] gnu: fontconfig: Use (locally) deterministic caching

Message ID d2647ed721b9f12b1267427409455e385ffaa12d.1624063731.git.iskarian@mgsn.dev
State Accepted
Headers show
Series [bug#49107,core-updates] gnu: fontconfig: Use (locally) deterministic caching | expand

Checks

Context Check Description
cbaines/applying patch fail View Laminar job
cbaines/issue success View issue

Commit Message

Sarah Morgensen June 19, 2021, 12:52 a.m. UTC
Make fontconfig use directory contents rather than modification time to
determine cache validity (by pretending that mtime is broken).

* gnu/packages/patches/fontconfig-cache-ignore-mtime.patch: New file.
* gnu/local.mk (dist_patch_DATA): Register it.
* gnu/packages/fontutils.scm (fontconfig)[source]: Use it.
[arguments]: Unset SOURCE_DATE_EPOCH for tests.
---
 gnu/local.mk                                      |  1 +
 gnu/packages/fontutils.scm                        |  7 ++++++-
 .../patches/fontconfig-cache-ignore-mtime.patch   | 15 +++++++++++++++
 3 files changed, 22 insertions(+), 1 deletion(-)
 create mode 100644 gnu/packages/patches/fontconfig-cache-ignore-mtime.patch

Hello Guix,

This patch attempts to make fontconfig's caching work seamlessly on Guix,
instead of requiring users to manually run `fc-cache -f` after installing or
removing fonts. This addresses <https://issues.guix.gnu.org/18640>.

Fontconfig usually uses a directory's mtime as its checksum. However, when
fontconfig detects a "broken mtime" filesystem, it will generate a directory
checksum from the directory listing contents. This is slightly slower, as it has
to stat all the files in all font directories. Unconditionally enabling this
mode should get us more regular behavior.

I am not confident this method is fully deterministic; particular filesystem
capabilities may still be an implicit imput in the checksums. This should only
matter when distributing pre-generated caches.

Fontconfig does honor SOURCE_DATE_EPOCH, but without a reliable mtime,
fontconfig has no way of knowing when to update such a cache. SOURCE_DATE_EPOCH
is now disabled for the tests because they assume working cache invalidation.

I have tested this patch on x86-64.


base-commit: bcdc13454c4afab37b650d4bbfa95e539060619f

Comments

Ludovic Courtès July 7, 2021, 9:16 p.m. UTC | #1
Hi,

Sarah Morgensen <iskarian@mgsn.dev> skribis:

> Make fontconfig use directory contents rather than modification time to
> determine cache validity (by pretending that mtime is broken).
>
> * gnu/packages/patches/fontconfig-cache-ignore-mtime.patch: New file.
> * gnu/local.mk (dist_patch_DATA): Register it.
> * gnu/packages/fontutils.scm (fontconfig)[source]: Use it.
> [arguments]: Unset SOURCE_DATE_EPOCH for tests.
> ---
>  gnu/local.mk                                      |  1 +
>  gnu/packages/fontutils.scm                        |  7 ++++++-
>  .../patches/fontconfig-cache-ignore-mtime.patch   | 15 +++++++++++++++
>  3 files changed, 22 insertions(+), 1 deletion(-)
>  create mode 100644 gnu/packages/patches/fontconfig-cache-ignore-mtime.patch
>
> Hello Guix,
>
> This patch attempts to make fontconfig's caching work seamlessly on Guix,
> instead of requiring users to manually run `fc-cache -f` after installing or
> removing fonts. This addresses <https://issues.guix.gnu.org/18640>.
>
> Fontconfig usually uses a directory's mtime as its checksum. However, when
> fontconfig detects a "broken mtime" filesystem, it will generate a directory
> checksum from the directory listing contents. This is slightly slower, as it has
> to stat all the files in all font directories. Unconditionally enabling this
> mode should get us more regular behavior.

Nice, sounds like an improvement!

Does Fontconfig stats all these files every time an application starts?
Did you compare ‘strace -c some app’ with and without this change, to
get an idea of what it costs?

> I am not confident this method is fully deterministic; particular filesystem
> capabilities may still be an implicit imput in the checksums. This should only
> matter when distributing pre-generated caches.
>
> Fontconfig does honor SOURCE_DATE_EPOCH, but without a reliable mtime,
> fontconfig has no way of knowing when to update such a cache. SOURCE_DATE_EPOCH
> is now disabled for the tests because they assume working cache invalidation.

So tests fail is we leave SOURCE_DATE_EPOCH, right?

Thanks for addressing this longstanding issue!  (And apologies for the
delay…  Now’s a good time to get semi-high-level changes like this one
in ‘core-updates’.)

Ludo’.
Sarah Morgensen July 8, 2021, 1:17 a.m. UTC | #2
Hello,

Thanks for taking a look at this.

Ludovic Courtès <ludo@gnu.org> writes:

>> Fontconfig usually uses a directory's mtime as its checksum. However, when
>> fontconfig detects a "broken mtime" filesystem, it will generate a directory
>> checksum from the directory listing contents. This is slightly slower, as it has
>> to stat all the files in all font directories. Unconditionally enabling this
>> mode should get us more regular behavior.
>
> Nice, sounds like an improvement!
>
> Does Fontconfig stats all these files every time an application starts?
> Did you compare ‘strace -c some app’ with and without this change, to
> get an idea of what it costs?

I *believe* it does so whenever an application calls FcFontList or such,
which is usually on startup. I haven't done tracing with an application
that actually uses FcFontList, but I just now ran a naive test, along
the lines of:

  $ fc-cache -rf
  $ strace -c fc-list

The old fc-list:

--8<---------------cut here---------------start------------->8---
% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ----------------
 28.50    0.000226           0       228       142 openat
 22.19    0.000176           1       136        26 access
 11.10    0.000088           0       133        70 stat
  5.93    0.000047           0        86           close
  5.55    0.000044           2        22           fstatfs
  5.30    0.000042           0        54        37 readlink
  4.54    0.000036           4         8           munmap
  3.78    0.000030           0        70           mmap
  2.65    0.000021           1        20           write
  2.52    0.000020           0       104           read
  2.52    0.000020           0        40           fstat
  1.89    0.000015           1         8           fadvise64
  1.77    0.000014           1        10           getpid
  0.88    0.000007           0        10           brk
  0.63    0.000005           0        35           getrandom
  0.25    0.000002           2         1         1 ioctl
  0.00    0.000000           0        15           mprotect
  0.00    0.000000           0         2           rt_sigaction
  0.00    0.000000           0         1           rt_sigprocmask
  0.00    0.000000           0         1           execve
  0.00    0.000000           0         1           arch_prctl
  0.00    0.000000           0         1           futex
  0.00    0.000000           0         4           getdents64
  0.00    0.000000           0         1           set_tid_address
  0.00    0.000000           0         1           set_robust_list
  0.00    0.000000           0         1           prlimit64
------ ----------- ----------- --------- --------- ----------------
100.00    0.000793           0       993       276 total
--8<---------------cut here---------------end--------------->8---

And the patched fc-list:

--8<---------------cut here---------------start------------->8---
% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ----------------
 17.19    0.000125           0       126        11 access
 14.44    0.000105           0       126        52 openat
 12.24    0.000089           1        55           mmap
 10.32    0.000075           0        99           read
  8.67    0.000063           4        15           mprotect
  8.25    0.000060           0        94         6 newfstatat
  5.50    0.000040           0        74           close
  5.23    0.000038           0        58        40 readlink
  4.68    0.000034          34         1           set_tid_address
  3.30    0.000024           0        38           getrandom
  3.16    0.000023           1        13           pread64
  1.65    0.000012           0        26           getdents64
  1.51    0.000011           1         9           brk
  1.10    0.000008           4         2           rt_sigaction
  0.83    0.000006           0         9           munmap
  0.55    0.000004           4         1           rt_sigprocmask
  0.55    0.000004           4         1           prlimit64
  0.41    0.000003           3         1           arch_prctl
  0.41    0.000003           3         1           set_robust_list
  0.00    0.000000           0        20           write
  0.00    0.000000           0         1         1 ioctl
  0.00    0.000000           0         1           execve
  0.00    0.000000           0         1           sysinfo
  0.00    0.000000           0        11           fstatfs
  0.00    0.000000           0         8           fadvise64
------ ----------- ----------- --------- --------- ----------------
100.00    0.000727           0       791       110 total
--8<---------------cut here---------------end--------------->8---

Now that is unexpected! There are actually less stats and opens. I'm
stumped! My profile has about 20 font packages (about 450 actual files)
installed. For reference, for both versions, `fc-cache -rf` yields about
3700 stats; and `fc-cache` yeilds about 300...

I suppose I should be more careful about speaking from theory!

>
>> I am not confident this method is fully deterministic; particular filesystem
>> capabilities may still be an implicit imput in the checksums. This should only
>> matter when distributing pre-generated caches.
>>
>> Fontconfig does honor SOURCE_DATE_EPOCH, but without a reliable mtime,
>> fontconfig has no way of knowing when to update such a cache. SOURCE_DATE_EPOCH
>> is now disabled for the tests because they assume working cache invalidation.
>
> So tests fail is we leave SOURCE_DATE_EPOCH, right?

Correct.

> Thanks for addressing this longstanding issue!  (And apologies for the
> delay…  Now’s a good time to get semi-high-level changes like this one
> in ‘core-updates’.)

I still wish we could address it "properly" by generating a cache at
profile generation time... but it looks like fontconfig embeds the full
paths of fonts in the cache (including username, since fonts are under
~/.guix-profile), so I doubt such a cache would work. I plan to try it
eventually anyway (or perhaps someone else will), but in the meantime,
this looks like the 80% solution.

--
Sarah
Ludovic Courtès July 12, 2021, 8:30 a.m. UTC | #3
Hi,

Sarah Morgensen <iskarian@mgsn.dev> skribis:

> Ludovic Courtès <ludo@gnu.org> writes:

[...]

>> Does Fontconfig stats all these files every time an application starts?
>> Did you compare ‘strace -c some app’ with and without this change, to
>> get an idea of what it costs?
>
> I *believe* it does so whenever an application calls FcFontList or such,
> which is usually on startup. I haven't done tracing with an application
> that actually uses FcFontList, but I just now ran a naive test, along
> the lines of:
>
>   $ fc-cache -rf
>   $ strace -c fc-list
>
> The old fc-list:
>
> % time     seconds  usecs/call     calls    errors syscall
> ------ ----------- ----------- --------- --------- ----------------
>  28.50    0.000226           0       228       142 openat
>  22.19    0.000176           1       136        26 access
>  11.10    0.000088           0       133        70 stat
>   5.93    0.000047           0        86           close
>   5.55    0.000044           2        22           fstatfs
>   5.30    0.000042           0        54        37 readlink
>   4.54    0.000036           4         8           munmap
>   3.78    0.000030           0        70           mmap
>   2.65    0.000021           1        20           write
>   2.52    0.000020           0       104           read
>   2.52    0.000020           0        40           fstat
>   1.89    0.000015           1         8           fadvise64
>   1.77    0.000014           1        10           getpid
>   0.88    0.000007           0        10           brk
>   0.63    0.000005           0        35           getrandom
>   0.25    0.000002           2         1         1 ioctl
>   0.00    0.000000           0        15           mprotect
>   0.00    0.000000           0         2           rt_sigaction
>   0.00    0.000000           0         1           rt_sigprocmask
>   0.00    0.000000           0         1           execve
>   0.00    0.000000           0         1           arch_prctl
>   0.00    0.000000           0         1           futex
>   0.00    0.000000           0         4           getdents64
>   0.00    0.000000           0         1           set_tid_address
>   0.00    0.000000           0         1           set_robust_list
>   0.00    0.000000           0         1           prlimit64
> ------ ----------- ----------- --------- --------- ----------------
> 100.00    0.000793           0       993       276 total
>
>
> And the patched fc-list:
>
> % time     seconds  usecs/call     calls    errors syscall
> ------ ----------- ----------- --------- --------- ----------------
>  17.19    0.000125           0       126        11 access
>  14.44    0.000105           0       126        52 openat
>  12.24    0.000089           1        55           mmap
>  10.32    0.000075           0        99           read
>   8.67    0.000063           4        15           mprotect
>   8.25    0.000060           0        94         6 newfstatat
>   5.50    0.000040           0        74           close
>   5.23    0.000038           0        58        40 readlink
>   4.68    0.000034          34         1           set_tid_address
>   3.30    0.000024           0        38           getrandom
>   3.16    0.000023           1        13           pread64
>   1.65    0.000012           0        26           getdents64
>   1.51    0.000011           1         9           brk
>   1.10    0.000008           4         2           rt_sigaction
>   0.83    0.000006           0         9           munmap
>   0.55    0.000004           4         1           rt_sigprocmask
>   0.55    0.000004           4         1           prlimit64
>   0.41    0.000003           3         1           arch_prctl
>   0.41    0.000003           3         1           set_robust_list
>   0.00    0.000000           0        20           write
>   0.00    0.000000           0         1         1 ioctl
>   0.00    0.000000           0         1           execve
>   0.00    0.000000           0         1           sysinfo
>   0.00    0.000000           0        11           fstatfs
>   0.00    0.000000           0         8           fadvise64
> ------ ----------- ----------- --------- --------- ----------------
> 100.00    0.000727           0       791       110 total
>
> Now that is unexpected! There are actually less stats and opens. I'm
> stumped! My profile has about 20 font packages (about 450 actual files)
> installed. For reference, for both versions, `fc-cache -rf` yields about
> 3700 stats; and `fc-cache` yeilds about 300...

Well, even better.  :-)

I went ahead and applied the patch.  ‘fontconfig-minimal’ builds fine;
‘fonconfig’ (with documentation) fails to build its PDF documentation,
but that’s not related to this change.

> I still wish we could address it "properly" by generating a cache at
> profile generation time... but it looks like fontconfig embeds the full
> paths of fonts in the cache (including username, since fonts are under
> ~/.guix-profile), so I doubt such a cache would work. I plan to try it
> eventually anyway (or perhaps someone else will), but in the meantime,
> this looks like the 80% solution.

Yeah.  It could be that each font package could contain its own cache,
and the profile hook would just assemble all these caches (provided the
file format makes it possible without too much of a headache).

Thanks!

Ludo’.
diff mbox series

Patch

diff --git a/gnu/local.mk b/gnu/local.mk
index c57e587e84..1d26c9e7c7 100644
--- a/gnu/local.mk
+++ b/gnu/local.mk
@@ -1012,6 +1012,7 @@  dist_patch_DATA =						\
   %D%/packages/patches/fifo-map-remove-catch.hpp.patch		\
   %D%/packages/patches/findutils-localstatedir.patch		\
   %D%/packages/patches/flann-cmake-3.11.patch			\
+  %D%/packages/patches/fontconfig-cache-ignore-mtime.patch	\
   %D%/packages/patches/foobillard++-pkg-config.patch		\
   %D%/packages/patches/foomatic-filters-CVE-2015-8327.patch	\
   %D%/packages/patches/foomatic-filters-CVE-2015-8560.patch	\
diff --git a/gnu/packages/fontutils.scm b/gnu/packages/fontutils.scm
index dbce5beba8..0ee51a792e 100644
--- a/gnu/packages/fontutils.scm
+++ b/gnu/packages/fontutils.scm
@@ -14,6 +14,7 @@ 
 ;;; Copyright © 2020 Nicolas Goaziou <mail@nicolasgoaziou.fr>
 ;;; Copyright © 2021 Maxim Cournoyer <maxim.cournoyer@gmail.com>
 ;;; Copyright © 2020, 2021 Nicolas Goaziou <mail@nicolasgoaziou.fr>
+;;; Copyright © 2021 Sarah Morgensen <iskarian@mgsn.dev>
 ;;;
 ;;; This file is part of GNU Guix.
 ;;;
@@ -331,7 +332,8 @@  Font Format (WOFF).")
                   "https://www.freedesktop.org/software/"
                   "fontconfig/release/fontconfig-" version ".tar.xz"))
             (sha256 (base32
-                     "1850q4k80yxma5g3yxkvyv8i5a3xqzswwml8gjy3jmywx8qqd5pa"))))
+                     "1850q4k80yxma5g3yxkvyv8i5a3xqzswwml8gjy3jmywx8qqd5pa"))
+             (patches (search-patches "fontconfig-cache-ignore-mtime.patch"))))
      (build-system gnu-build-system)
      ;; In Requires or Requires.private of fontconfig.pc.
      (propagated-inputs `(("expat" ,expat)
@@ -362,6 +364,9 @@  Font Format (WOFF).")
         (modify-phases %standard-phases
           (add-before 'check 'skip-problematic-tests
             (lambda _
+              ;; SOURCE_DATE_EPOCH doesn't make sense when ignoring mtime
+              (unsetenv "SOURCE_DATE_EPOCH")
+
               (substitute* "test/run-test.sh"
                 ;; The crbug1004254 test attempts to fetch fonts from the
                 ;; network.
diff --git a/gnu/packages/patches/fontconfig-cache-ignore-mtime.patch b/gnu/packages/patches/fontconfig-cache-ignore-mtime.patch
new file mode 100644
index 0000000000..b6e942ee10
--- /dev/null
+++ b/gnu/packages/patches/fontconfig-cache-ignore-mtime.patch
@@ -0,0 +1,15 @@ 
+Pretend that stat's mtime is broken, so that the fontconfig cache does not
+depend upon modification time to determine if a cache is stale.
+
+diff --git a/src/fcstat.c b/src/fcstat.c
+index 5a2bd7c..d603a96 100644
+--- a/src/fcstat.c
++++ b/src/fcstat.c
+@@ -431,6 +431,7 @@ FcIsFsMmapSafe (int fd)
+ FcBool
+ FcIsFsMtimeBroken (const FcChar8 *dir)
+ {
++    return FcTrue;
+     int fd = FcOpen ((const char *) dir, O_RDONLY);
+ 
+     if (fd != -1)