diff mbox series

[bug#41658] fixes / improvements for (guix store database)

Message ID 87ftbernat.fsf@cune.org
State Accepted
Headers show
Series [bug#41658] fixes / improvements for (guix store database) | expand

Checks

Context Check Description
cbaines/applying patch fail View Laminar job

Commit Message

Caleb Ristvedt June 2, 2020, 6:31 a.m. UTC
After some pondering about why the database might be locked so
frequently, this is what I've managed to come up with. The first patch
is the most likely to actually help with that, and the others mostly
involve improving robustness.

Ideally we'd come up with a test to quantify how much these kinds of
changes affect contention over the database. For now, though, all that I
can think of is seeing how this affects the systems that have had issues
with that.

- reepca

Comments

Ludovic Courtès June 4, 2020, 4:40 p.m. UTC | #1
Hi,

Thanks for the thorough investigation and for the patches!

Caleb Ristvedt <caleb.ristvedt@cune.org> skribis:

> From cce653c590be1506e15044e445aa9805370ac759 Mon Sep 17 00:00:00 2001
> From: Caleb Ristvedt <caleb.ristvedt@cune.org>
> Date: Mon, 1 Jun 2020 18:50:07 -0500
> Subject: [PATCH 1/4] database: work around guile-sqlite3 bug preventing
>  statement reset
> MIME-Version: 1.0
> Content-Type: text/plain; charset=UTF-8
> Content-Transfer-Encoding: 8bit
>
> guile-sqlite3 provides statement caching, making it unnecessary for sqlite to
> keep re-preparing statements that are frequently used.  Unfortunately it
> doesn't quite emulate the semantics of sqlite_finalize properly, because it
> doesn't cause a commit if the statement being finalized is the last "active"
> statement.  We work around this by wrapping sqlite-finalize with our own
> version that ensures sqlite-reset is called, which does The Right Thing™.
>
> * guix/store/database.scm (sqlite-finalize): new procedure that shadows the
>   sqlite-finalize from (sqlite3).

Nice.  It would be great if you could report it upstream (Danny and/or
myself can then patch it directly in guile-sqlite3 and push out a
release) and refer to the issue from here.

We can have this patch locally in the meantime, unless it would break
once the new guile-sqlite3 is out.  WDYT?

> From ee24ab21122b1c75a7d67d7062550e15e54ab62f Mon Sep 17 00:00:00 2001
> From: Caleb Ristvedt <caleb.ristvedt@cune.org>
> Date: Mon, 1 Jun 2020 19:21:43 -0500
> Subject: [PATCH 2/4] database: rewrite query procedures in terms of
>  with-statement.
>
> Most of our queries would fail to finalize their statements properly if sqlite
> returned an error during their execution.  This resolves that, and also makes
> them somewhat more concise as a side-effect.
>
> This also makes some small changes to improve certain queries where behavior
> was strange or overly verbose.
>
> * guix/store/database.scm (call-with-statement): new procedure.
>   (with-statement): new macro.
>   (last-insert-row-id, path-id, update-or-insert, add-references): rewrite to
>   use with-statement.
>   (update-or-insert): factor last-insert-row-id out of the end of both
>   branches.
>   (add-references): remove pointless last-insert-row-id call.
>
> * .dir-locals.el (with-statement): add indenting information.

LGTM!

> From 7d34c27c33aed3e8a49b9796a62a8c19d352e653 Mon Sep 17 00:00:00 2001
> From: Caleb Ristvedt <caleb.ristvedt@cune.org>
> Date: Mon, 1 Jun 2020 21:43:14 -0500
> Subject: [PATCH 3/4] database: ensure update-or-insert is run within a
>  transaction
>
> update-or-insert can break if an insert occurs between when it decides whether
> to update or insert and when it actually performs that operation.  Putting the
> check and the update/insert operation in the same transaction ensures that the
> update/insert will only succeed if no other write has occurred in the middle.
>
> * guix/store/database.scm (call-with-savepoint): new procedure.
>   (update-or-insert): use call-with-savepoint to ensure the read and the
>   insert/update occur within the same transaction.

That’s a bit beyond my understanding, but I think you can also push this
one.  :-)

Make sure “make check TESTS=tests/store-database.scm” is still happy.

Thanks a lot!

Ludo’.
Danny Milosavljevic June 4, 2020, 5 p.m. UTC | #2
Hi Ludo,
Hi Caleb,

On Thu, 04 Jun 2020 18:40:35 +0200
Ludovic Courtès <ludo@gnu.org> wrote:

> Nice.  It would be great if you could report it upstream (Danny and/or
> myself can then patch it directly in guile-sqlite3 and push out a
> release) and refer to the issue from here.

I agree.  It's easy to change sqlite-finalize in guile-sqlite3 to
call sqlite-reset, basically just adapt

(define sqlite-finalize
  (let ((f (pointer->procedure
            int
            (dynamic-func "sqlite3_finalize" libsqlite3)
            (list '*))))
    (lambda (stmt)
      ;; Note: When STMT is cached, this is a no-op.  This ensures caching
      ;; actually works while still separating concerns: users can turn
      ;; caching on and off without having to change the rest of their code.
      (when (and (stmt-live? stmt)
                 (not (stmt-cached? stmt)))
        (let ((p (stmt-pointer stmt)))
          (sqlite-remove-statement! (stmt->db stmt) stmt)
          (set-stmt-live?! stmt #f)
          (f p))))))

so that it calls sqlite-reset in the "when"'s new "else" branch there.

(we could also always call sqlite3_reset on sqlite-finalize anyway, it wouldn't
hurt but it wouldn't help either)

I agree that sqlite-finalize should model sqlite's finalization behavior as
much as possible.

Also, the comment about this being a no-op is not true then anymore.

We should definitely also pick up Caleb's comment upstream:

+  ;; Cached statements aren't reset when sqlite-finalize is invoked on
+  ;; them. This can cause problems with automatically-started transactions:
+  ;;
+  ;; "An implicit transaction (a transaction that is started automatically,
+  ;; not a transaction started by BEGIN) is committed automatically when the
+  ;; last active statement finishes. A statement finishes when its last cursor
+  ;; closes, which is guaranteed to happen when the prepared statement is
+  ;; reset or finalized. Some statements might "finish" for the purpose of
+  ;; transaction control prior to being reset or finalized, but there is no
+  ;; guarantee of this."
+  ;;
+  ;; Thus, it's possible for an implicitly-started transaction to hang around
+  ;; until sqlite-reset is called when the cached statement is next
+  ;; used. Because the transaction is committed automatically only when the
+  ;; *last active statement* finishes, the implicitly-started transaction may
+  ;; later be upgraded to a write transaction (!) and this non-reset statement
+  ;; will still be keeping the transaction from committing until it is next
+  ;; used or the database connection is closed. This has the potential to make
+  ;; (exclusive) write access to the database necessary for much longer than
+  ;; it should be.
+  ;;
+  ;; (see https://www.sqlite.org/lang_transaction.html)

@Caleb:

Could you file an issue at https://notabug.org/guile-sqlite3/guile-sqlite3/issues
and pull request so this is auditable?
Ludovic Courtès June 5, 2020, 4:19 p.m. UTC | #3
Hi Danny!

Danny Milosavljevic <dannym@scratchpost.org> skribis:

> I agree.  It's easy to change sqlite-finalize in guile-sqlite3 to
> call sqlite-reset, basically just adapt

[...]

> @Caleb:
>
> Could you file an issue at https://notabug.org/guile-sqlite3/guile-sqlite3/issues
> and pull request so this is auditable?

Agreed.

Danny, once this is merged upstream, could you tag a new release?  There
are a few other useful improvements in there.

Thanks,
Ludo’.
diff mbox series

Patch

From e30271728dfb23324c981d226c752b17689c9eef Mon Sep 17 00:00:00 2001
From: Caleb Ristvedt <caleb.ristvedt@cune.org>
Date: Mon, 1 Jun 2020 22:15:21 -0500
Subject: [PATCH 4/4] database: separate transaction-handling and
 retry-handling.

Previously call-with-transaction would both retry when SQLITE_BUSY errors were
thrown and do what its name suggested (start and rollback/commit a
transaction).  This changes it to do only what its name implies, which
simplifies its implementation.  Retrying is provided by the new
call-with-SQLITE_BUSY-retrying procedure.

* guix/store/database.scm (call-with-transaction): no longer restarts, new
  #:restartable? argument controls whether "begin" or "begin immediate" is
  used.
  (call-with-SQLITE_BUSY-retrying, call-with-retrying-transaction,
  call-with-retrying-savepoint): new procedures.
  (register-items): use call-with-retrying-transaction to preserve old
  behavior.

* .dir-locals.el (call-with-retrying-transaction,
  call-with-retrying-savepoint): add indentation information.
---
 .dir-locals.el          |  2 ++
 guix/store/database.scm | 69 +++++++++++++++++++++++++++++------------
 2 files changed, 51 insertions(+), 20 deletions(-)

diff --git a/.dir-locals.el b/.dir-locals.el
index ef25cb100a..e9dccd0511 100644
--- a/.dir-locals.el
+++ b/.dir-locals.el
@@ -90,7 +90,9 @@ 
    (eval . (put 'with-database 'scheme-indent-function 2))
    (eval . (put 'call-with-transaction 'scheme-indent-function 2))
    (eval . (put 'with-statement 'scheme-indent-function 3))
+   (eval . (put 'call-with-retrying-transaction 'scheme-indent-function 2))
    (eval . (put 'call-with-savepoint 'scheme-indent-function 1))
+   (eval . (put 'call-with-retrying-savepoint 'scheme-indent-function 1))
 
    (eval . (put 'call-with-container 'scheme-indent-function 1))
    (eval . (put 'container-excursion 'scheme-indent-function 1))
diff --git a/guix/store/database.scm b/guix/store/database.scm
index 3955c48b1f..2a78379dac 100644
--- a/guix/store/database.scm
+++ b/guix/store/database.scm
@@ -99,27 +99,44 @@  create it and initialize it as a new database."
 ;; XXX: missing in guile-sqlite3@0.1.0
 (define SQLITE_BUSY 5)
 
-(define (call-with-transaction db proc)
-  "Start a transaction with DB (make as many attempts as necessary) and run
-PROC.  If PROC exits abnormally, abort the transaction, otherwise commit the
-transaction after it finishes."
+(define (call-with-SQLITE_BUSY-retrying thunk)
+  "Call THUNK, retrying as long as it exits abnormally due to SQLITE_BUSY
+errors."
   (catch 'sqlite-error
+    thunk
+    (lambda (key who code errmsg)
+      (if (= code SQLITE_BUSY)
+          (call-with-SQLITE_BUSY-retrying thunk)
+          (throw key who code errmsg)))))
+
+
+
+(define* (call-with-transaction db proc #:key restartable?)
+  "Start a transaction with DB and run PROC.  If PROC exits abnormally, abort
+the transaction, otherwise commit the transaction after it finishes.
+RESTARTABLE? may be set to a non-#f value when it is safe to run PROC multiple
+times.  This may reduce contention for the database somewhat."
+  (define (exec sql)
+    (with-statement db sql stmt
+      (sqlite-fold cons '() stmt)))
+  ;; We might use begin immediate here so that if we need to retry, we figure
+  ;; that out immediately rather than because some SQLITE_BUSY exception gets
+  ;; thrown partway through PROC - in which case the part already executed
+  ;; (which may contain side-effects!) might have to be executed again for
+  ;; every retry.
+  (exec (if restartable? "begin;" "begin immediate;"))
+  (catch #t
     (lambda ()
-      ;; We use begin immediate here so that if we need to retry, we
-      ;; figure that out immediately rather than because some SQLITE_BUSY
-      ;; exception gets thrown partway through PROC - in which case the
-      ;; part already executed (which may contain side-effects!) would be
-      ;; executed again for every retry.
-      (sqlite-exec db "begin immediate;")
-      (let ((result (proc)))
-        (sqlite-exec db "commit;")
-        result))
-    (lambda (key who error description)
-      (if (= error SQLITE_BUSY)
-          (call-with-transaction db proc)
-          (begin
-            (sqlite-exec db "rollback;")
-            (throw 'sqlite-error who error description))))))
+      (let-values ((result (proc)))
+        (exec "commit;")
+        (apply values result)))
+    (lambda args
+      ;; The roll back may or may not have occurred automatically when the
+      ;; error was generated. If it has occurred, this does nothing but signal
+      ;; an error. If it hasn't occurred, this needs to be done.
+      (false-if-exception (exec "rollback;"))
+      (apply throw args))))
+
 (define* (call-with-savepoint db proc
                               #:optional (savepoint-name "SomeSavepoint"))
   "Call PROC after creating a savepoint named SAVEPOINT-NAME.  If PROC exits
@@ -141,6 +158,18 @@  prior to returning."
     (lambda ()
       (exec (string-append "RELEASE " savepoint-name ";")))))
 
+(define* (call-with-retrying-transaction db proc #:key restartable?)
+  (call-with-SQLITE_BUSY-retrying
+   (lambda ()
+     (call-with-transaction db proc #:restartable? restartable?))))
+
+(define* (call-with-retrying-savepoint db proc
+                                       #:optional (savepoint-name
+                                                   "SomeSavepoint"))
+  (call-with-SQLITE_BUSY-retrying
+   (lambda ()
+     (call-with-savepoint db proc savepoint-name))))
+
 (define %default-database-file
   ;; Default location of the store database.
   (string-append %store-database-directory "/db.sqlite"))
@@ -431,7 +460,7 @@  Write a progress report to LOG-PORT."
   (mkdir-p db-dir)
   (parameterize ((sql-schema schema))
     (with-database (string-append db-dir "/db.sqlite") db
-      (call-with-transaction db
+      (call-with-retrying-transaction db
           (lambda ()
             (let* ((prefix   (format #f "registering ~a items" (length items)))
                    (progress (progress-reporter/bar (length items)
-- 
2.26.2