diff mbox series

[bug#41658] fixes / improvements for (guix store database)

Message ID 87o8pu5cjx.fsf@cune.org
State Accepted
Headers show
Series [bug#41658] fixes / improvements for (guix store database) | expand

Checks

Context Check Description
cbaines/applying patch fail View Laminar job

Commit Message

Caleb Ristvedt June 8, 2020, 5:52 a.m. UTC
Ludovic Courtès <ludo@gnu.org> writes:

> Hi,
>
> Thanks for the thorough investigation and for the patches!
>
> Caleb Ristvedt <caleb.ristvedt@cune.org> skribis:
>
>> From cce653c590be1506e15044e445aa9805370ac759 Mon Sep 17 00:00:00 2001
>> From: Caleb Ristvedt <caleb.ristvedt@cune.org>
>> Date: Mon, 1 Jun 2020 18:50:07 -0500
>> Subject: [PATCH 1/4] database: work around guile-sqlite3 bug preventing
>>  statement reset
>> MIME-Version: 1.0
>> Content-Type: text/plain; charset=UTF-8
>> Content-Transfer-Encoding: 8bit
>>
>> guile-sqlite3 provides statement caching, making it unnecessary for sqlite to
>> keep re-preparing statements that are frequently used.  Unfortunately it
>> doesn't quite emulate the semantics of sqlite_finalize properly, because it
>> doesn't cause a commit if the statement being finalized is the last "active"
>> statement.  We work around this by wrapping sqlite-finalize with our own
>> version that ensures sqlite-reset is called, which does The Right Thing™.
>>
>> * guix/store/database.scm (sqlite-finalize): new procedure that shadows the
>>   sqlite-finalize from (sqlite3).
>
> Nice.  It would be great if you could report it upstream (Danny and/or
> myself can then patch it directly in guile-sqlite3 and push out a
> release) and refer to the issue from here.

Reported as https://notabug.org/guile-sqlite3/guile-sqlite3/issues/12,
with corresponding pull request
https://notabug.org/guile-sqlite3/guile-sqlite3/pulls/13.

> We can have this patch locally in the meantime, unless it would break
> once the new guile-sqlite3 is out.  WDYT?

I've attached an updated patch series that both includes the
guile-sqlite3 fix as a patch to the guile-sqlite3 package and adopts the
workaround for situations where older guile-sqlite3's must be used (for
example, when building guix from scratch on foreign distros that haven't
incorporated the fix yet). The only changes are the addition of the
now-second patch and fixing up some spacing in the comment in the first
patch.

>> From 7d34c27c33aed3e8a49b9796a62a8c19d352e653 Mon Sep 17 00:00:00 2001
>> From: Caleb Ristvedt <caleb.ristvedt@cune.org>
>> Date: Mon, 1 Jun 2020 21:43:14 -0500
>> Subject: [PATCH 3/4] database: ensure update-or-insert is run within a
>>  transaction
>>
>> update-or-insert can break if an insert occurs between when it decides whether
>> to update or insert and when it actually performs that operation.  Putting the
>> check and the update/insert operation in the same transaction ensures that the
>> update/insert will only succeed if no other write has occurred in the middle.
>>
>> * guix/store/database.scm (call-with-savepoint): new procedure.
>>   (update-or-insert): use call-with-savepoint to ensure the read and the
>>   insert/update occur within the same transaction.
>
> That’s a bit beyond my understanding, but I think you can also push this
> one.  :-)

Basically, it's like combining the body of two separate compare-and-swap
loops into a single compare-and-swap loop. This ensures that the view is
consistent (since if it isn't, the "compare" will fail and we'll
retry). It addresses a problem that doesn't exist in practice yet, since
update-or-insert is only called from within a call-with-transaction
currently. But if someone ever wanted to call it from outside of a
call-with-transaction, this would ensure that it still worked correctly.

> Make sure “make check TESTS=tests/store-database.scm” is still happy.

Works on my system.

Related question: does berlin export /var/guix over NFS as per
http://hpc.guix.info/blog/2017/11/installing-guix-on-a-cluster? If so,
that could interact poorly with our use of WAL mode:

"All processes using a database must be on the same host computer; WAL
does not work over a network filesystem." -
https://sqlite.org/wal.html.

- reepca

Comments

Ludovic Courtès June 9, 2020, 8:42 a.m. UTC | #1
Hi,

Caleb Ristvedt <caleb.ristvedt@cune.org> skribis:


[...]

>> Nice.  It would be great if you could report it upstream (Danny and/or
>> myself can then patch it directly in guile-sqlite3 and push out a
>> release) and refer to the issue from here.
>
> Reported as https://notabug.org/guile-sqlite3/guile-sqlite3/issues/12,
> with corresponding pull request
> https://notabug.org/guile-sqlite3/guile-sqlite3/pulls/13.

Awesome, thank you.

>> We can have this patch locally in the meantime, unless it would break
>> once the new guile-sqlite3 is out.  WDYT?
>
> I've attached an updated patch series that both includes the
> guile-sqlite3 fix as a patch to the guile-sqlite3 package and adopts the
> workaround for situations where older guile-sqlite3's must be used (for
> example, when building guix from scratch on foreign distros that haven't
> incorporated the fix yet). The only changes are the addition of the
> now-second patch and fixing up some spacing in the comment in the first
> patch.

OK.

>>> * guix/store/database.scm (call-with-savepoint): new procedure.
>>>   (update-or-insert): use call-with-savepoint to ensure the read and the
>>>   insert/update occur within the same transaction.
>>
>> That’s a bit beyond my understanding, but I think you can also push this
>> one.  :-)
>
> Basically, it's like combining the body of two separate compare-and-swap
> loops into a single compare-and-swap loop. This ensures that the view is
> consistent (since if it isn't, the "compare" will fail and we'll
> retry). It addresses a problem that doesn't exist in practice yet, since
> update-or-insert is only called from within a call-with-transaction
> currently. But if someone ever wanted to call it from outside of a
> call-with-transaction, this would ensure that it still worked correctly.

Makes sense, thanks for explaining.

> Related question: does berlin export /var/guix over NFS as per
> http://hpc.guix.info/blog/2017/11/installing-guix-on-a-cluster? If so,
> that could interact poorly with our use of WAL mode:

No, it doesn’t.  (Also, in the setup described above, there’s only one
guix-daemon instance and it accesses the database via the local file
system.)

> From 614213c80a7ea15f7aab9502e6c33206ac089d05 Mon Sep 17 00:00:00 2001
> From: Caleb Ristvedt <caleb.ristvedt@cune.org>
> Date: Mon, 1 Jun 2020 18:50:07 -0500
> Subject: [PATCH 1/5] database: work around guile-sqlite3 bug preventing
>  statement reset
> MIME-Version: 1.0
> Content-Type: text/plain; charset=UTF-8
> Content-Transfer-Encoding: 8bit
>
> guile-sqlite3 provides statement caching, making it unnecessary for sqlite to
> keep re-preparing statements that are frequently used.  Unfortunately it
> doesn't quite emulate the semantics of sqlite_finalize properly, because it
> doesn't cause a commit if the statement being finalized is the last "active"
> statement (see https://notabug.org/guile-sqlite3/guile-sqlite3/issues/12).  We
> work around this by wrapping sqlite-finalize with our own version that ensures
> sqlite-reset is called, which does The Right Thing™.
>
> * guix/store/database.scm (sqlite-finalize): new procedure that shadows the
>   sqlite-finalize from (sqlite3).

[...]

> +(define (sqlite-finalize stmt)
> +  ;; As of guile-sqlite3 0.1.0, cached statements aren't reset when
> +  ;; sqlite-finalize is invoked on them (see
> +  ;; https://notabug.org/guile-sqlite3/guile-sqlite3/issues/12).  This can
> +  ;; cause problems with automatically-started transactions:

I think it’s enough to link to the upstream issue, which has the problem
well documented.

> From e3cf7be4491f465d3041933596d3caad1ea64e83 Mon Sep 17 00:00:00 2001
> From: Caleb Ristvedt <caleb.ristvedt@cune.org>
> Date: Sun, 7 Jun 2020 22:30:41 -0500
> Subject: [PATCH 2/5] gnu: guile-sqlite3: add patch to fix sqlite-finalize bug
>
> Adds patch that fixes
> https://notabug.org/guile-sqlite3/guile-sqlite3/issues/12.  This can be
> discarded once the patch is integrated into the next guile-sqlite3 release.
> Note that the patch is identical to the pull request at
> https://notabug.org/guile-sqlite3/guile-sqlite3/pulls/13.
>
> * gnu/packages/patches/guile-sqlite3-reset-on-sqlite-finalize.patch: new
>   patch.
> * gnu/packages/guile.scm (guile-sqlite3): use it.
> * gnu/local.mk (dist_patch_DATA): add it.

I’d skip it: we have a workaround and the release may be out soon.

Danny, thoughts on getting a new release out?

The rest is still fine with me, thank you!

Ludo’.
diff mbox series

Patch

From af87a556dab110ed7a6ef2fa1584778cc60be682 Mon Sep 17 00:00:00 2001
From: Caleb Ristvedt <caleb.ristvedt@cune.org>
Date: Mon, 1 Jun 2020 22:15:21 -0500
Subject: [PATCH 5/5] database: separate transaction-handling and
 retry-handling.

Previously call-with-transaction would both retry when SQLITE_BUSY errors were
thrown and do what its name suggested (start and rollback/commit a
transaction).  This changes it to do only what its name implies, which
simplifies its implementation.  Retrying is provided by the new
call-with-SQLITE_BUSY-retrying procedure.

* guix/store/database.scm (call-with-transaction): no longer restarts, new
  #:restartable? argument controls whether "begin" or "begin immediate" is
  used.
  (call-with-SQLITE_BUSY-retrying, call-with-retrying-transaction,
  call-with-retrying-savepoint): new procedures.
  (register-items): use call-with-retrying-transaction to preserve old
  behavior.

* .dir-locals.el (call-with-retrying-transaction,
  call-with-retrying-savepoint): add indentation information.
---
 .dir-locals.el          |  2 ++
 guix/store/database.scm | 69 +++++++++++++++++++++++++++++------------
 2 files changed, 51 insertions(+), 20 deletions(-)

diff --git a/.dir-locals.el b/.dir-locals.el
index d9c81b2a48..b88ec7a795 100644
--- a/.dir-locals.el
+++ b/.dir-locals.el
@@ -90,7 +90,9 @@ 
    (eval . (put 'with-database 'scheme-indent-function 2))
    (eval . (put 'call-with-transaction 'scheme-indent-function 2))
    (eval . (put 'with-statement 'scheme-indent-function 3))
+   (eval . (put 'call-with-retrying-transaction 'scheme-indent-function 2))
    (eval . (put 'call-with-savepoint 'scheme-indent-function 1))
+   (eval . (put 'call-with-retrying-savepoint 'scheme-indent-function 1))
 
    (eval . (put 'call-with-container 'scheme-indent-function 1))
    (eval . (put 'container-excursion 'scheme-indent-function 1))
diff --git a/guix/store/database.scm b/guix/store/database.scm
index 14aaeef176..4921e9f0e2 100644
--- a/guix/store/database.scm
+++ b/guix/store/database.scm
@@ -99,27 +99,44 @@  create it and initialize it as a new database."
 ;; XXX: missing in guile-sqlite3@0.1.0
 (define SQLITE_BUSY 5)
 
-(define (call-with-transaction db proc)
-  "Start a transaction with DB (make as many attempts as necessary) and run
-PROC.  If PROC exits abnormally, abort the transaction, otherwise commit the
-transaction after it finishes."
+(define (call-with-SQLITE_BUSY-retrying thunk)
+  "Call THUNK, retrying as long as it exits abnormally due to SQLITE_BUSY
+errors."
   (catch 'sqlite-error
+    thunk
+    (lambda (key who code errmsg)
+      (if (= code SQLITE_BUSY)
+          (call-with-SQLITE_BUSY-retrying thunk)
+          (throw key who code errmsg)))))
+
+
+
+(define* (call-with-transaction db proc #:key restartable?)
+  "Start a transaction with DB and run PROC.  If PROC exits abnormally, abort
+the transaction, otherwise commit the transaction after it finishes.
+RESTARTABLE? may be set to a non-#f value when it is safe to run PROC multiple
+times.  This may reduce contention for the database somewhat."
+  (define (exec sql)
+    (with-statement db sql stmt
+      (sqlite-fold cons '() stmt)))
+  ;; We might use begin immediate here so that if we need to retry, we figure
+  ;; that out immediately rather than because some SQLITE_BUSY exception gets
+  ;; thrown partway through PROC - in which case the part already executed
+  ;; (which may contain side-effects!) might have to be executed again for
+  ;; every retry.
+  (exec (if restartable? "begin;" "begin immediate;"))
+  (catch #t
     (lambda ()
-      ;; We use begin immediate here so that if we need to retry, we
-      ;; figure that out immediately rather than because some SQLITE_BUSY
-      ;; exception gets thrown partway through PROC - in which case the
-      ;; part already executed (which may contain side-effects!) would be
-      ;; executed again for every retry.
-      (sqlite-exec db "begin immediate;")
-      (let ((result (proc)))
-        (sqlite-exec db "commit;")
-        result))
-    (lambda (key who error description)
-      (if (= error SQLITE_BUSY)
-          (call-with-transaction db proc)
-          (begin
-            (sqlite-exec db "rollback;")
-            (throw 'sqlite-error who error description))))))
+      (let-values ((result (proc)))
+        (exec "commit;")
+        (apply values result)))
+    (lambda args
+      ;; The roll back may or may not have occurred automatically when the
+      ;; error was generated. If it has occurred, this does nothing but signal
+      ;; an error. If it hasn't occurred, this needs to be done.
+      (false-if-exception (exec "rollback;"))
+      (apply throw args))))
+
 (define* (call-with-savepoint db proc
                               #:optional (savepoint-name "SomeSavepoint"))
   "Call PROC after creating a savepoint named SAVEPOINT-NAME.  If PROC exits
@@ -141,6 +158,18 @@  prior to returning."
     (lambda ()
       (exec (string-append "RELEASE " savepoint-name ";")))))
 
+(define* (call-with-retrying-transaction db proc #:key restartable?)
+  (call-with-SQLITE_BUSY-retrying
+   (lambda ()
+     (call-with-transaction db proc #:restartable? restartable?))))
+
+(define* (call-with-retrying-savepoint db proc
+                                       #:optional (savepoint-name
+                                                   "SomeSavepoint"))
+  (call-with-SQLITE_BUSY-retrying
+   (lambda ()
+     (call-with-savepoint db proc savepoint-name))))
+
 (define %default-database-file
   ;; Default location of the store database.
   (string-append %store-database-directory "/db.sqlite"))
@@ -433,7 +462,7 @@  Write a progress report to LOG-PORT."
   (mkdir-p db-dir)
   (parameterize ((sql-schema schema))
     (with-database (string-append db-dir "/db.sqlite") db
-      (call-with-transaction db
+      (call-with-retrying-transaction db
           (lambda ()
             (let* ((prefix   (format #f "registering ~a items" (length items)))
                    (progress (progress-reporter/bar (length items)
-- 
2.26.2