diff mbox series

[bug#61214,guix-artwork,v5] website: posts: Add Dissecting Guix, Part 2: The Store Monad.

Message ID 20230214073314.2651-1-paren@disroot.org
State New
Headers show
Series [bug#61214,guix-artwork,v5] website: posts: Add Dissecting Guix, Part 2: The Store Monad. | expand

Commit Message

\( Feb. 14, 2023, 7:33 a.m. UTC
* website/posts/dissecting-guix-2-store-monad.md: New blog post.
---
Oops, forgot to ``git commit -a''... :/  See v4 for the changelog.

 .../posts/dissecting-guix-2-store-monad.md    | 567 ++++++++++++++++++
 1 file changed, 567 insertions(+)
 create mode 100644 website/posts/dissecting-guix-2-store-monad.md


base-commit: fe113595b6f7d8a1e1a0b814521f02783f9209c3

Comments

Simon Tournier Feb. 14, 2023, 6:01 p.m. UTC | #1
Hi,

On mar., 14 févr. 2023 at 07:33, "\( via Guix-patches" via <guix-patches@gnu.org> wrote:

> +# Yes, No, Maybe So
> +
> +Let's instead implement another M of functional programming, _`maybe`_ values,
> +representing a value that may or may not exist.  For instance, there could be a
> +procedure that attempts to pop a stack, returing the result if there is one, or
                                            --^
                                          Typo

s/returing/returning


Cheers,
simon
\( Feb. 14, 2023, 7:26 p.m. UTC | #2
On Tue Feb 14, 2023 at 6:01 PM GMT, Simon Tournier wrote:
>                                             --^
>                                           Typo
>
> s/returing/returning

aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa

    -- (
diff mbox series

Patch

diff --git a/website/posts/dissecting-guix-2-store-monad.md b/website/posts/dissecting-guix-2-store-monad.md
new file mode 100644
index 0000000..13b9cbb
--- /dev/null
+++ b/website/posts/dissecting-guix-2-store-monad.md
@@ -0,0 +1,567 @@ 
+title: Dissecting Guix, Part 2: The Store Monad
+date: TBC
+author: (
+tags: Dissecting Guix, Functional package management, Programming interfaces, Scheme API
+---
+Hello again!
+
+In [the last post](https://guix.gnu.org/en/blog/2023/dissecting-guix-part-1-derivations/),
+we briefly mentioned the `with-store` and `run-with-store` macros.  Today, we'll
+be looking at those in further detail, along with the related monad library and
+the [`%store-monad`](https://guix.gnu.org/manual/devel/en/html_node/The-Store-Monad.html)!
+
+Typically, we use monads to chain operations together, and the `%store-monad` is
+no different; it's used to combine operations that work on the Guix store (for
+instance, creating derivations, building derivations, or adding data files to
+the store).
+
+However, monads are a little hard to explain, and from a distance, they seem to
+be quite incomprehensible.  So, I want you to erase them from your mind for now.
+We'll come back to them later.  And be aware that if you can't seem to get your
+head around them, it's okay; you can understand most of the architecture of Guix
+without understanding monads.
+
+# Yes, No, Maybe So
+
+Let's instead implement another M of functional programming, _`maybe`_ values,
+representing a value that may or may not exist.  For instance, there could be a
+procedure that attempts to pop a stack, returing the result if there is one, or
+`nothing` if the stack has no elements.
+
+`maybe` is a very common feature of statically-typed functional languages, and
+you'll see it all over the place in Haskell and OCaml code. However, Guile is
+dynamically typed, so we usually use ad-hoc `#f` values as the "null value"
+instead of a proper "nothing" or "none".
+
+Just for fun, though, we'll implement a proper `maybe` in Guile.  Fire up that
+REPL once again, and let's import a bunch of modules that we'll need:
+
+```scheme
+(use-modules (ice-9 match)
+             (srfi srfi-9))
+```
+
+We'll implement `maybe` as a record with two fields, `is?` and `value`.  If the
+value contains something, `is?` will be `#t` and `value` will contain the thing
+in question, and if it's empty, `is?`'ll be `#f`.
+
+```scheme
+(define-record-type <maybe>
+  (make-maybe is? value)
+  maybe?
+  (is? maybe-is?)
+  (value maybe-value))
+```
+
+Now we'll define constructors for the two possible states:
+
+```scheme
+(define (something value)
+  (make-maybe #t value))
+
+(define (nothing)
+  (make-maybe #f #f)) ;the value here doesn't matter; we'll just use #f
+```
+
+And make some silly functions that return optional values:
+
+```scheme
+(define (remove-a str)
+  (if (eq? (string-ref str 0) #\a)
+      (something (substring str 1))
+      (nothing)))
+
+(define (remove-b str)
+  (if (eq? (string-ref str 0) #\b)
+      (something (substring str 1))
+      (nothing)))
+      
+(remove-a "ahh")
+⇒ #<<maybe> is?: #t value: "hh">
+
+(remove-a "ooh")
+⇒ #<<maybe> is?: #f value: #f>
+
+(remove-b "bad")
+⇒ #<<maybe> is?: #t value: "ad">
+```
+
+But what if we want to compose the results of these functions?
+
+# Keeping Your Composure
+
+As you might have guessed, this is not fun.  Cosplaying as a compiler backend
+typically isn't.
+
+```scheme
+(let ((t1 (remove-a "abcd")))
+  (if (maybe-is? t1)
+      (remove-b (maybe-value t1))
+      (nothing)))
+⇒ #<<maybe> is?: #t value: "cd">
+
+(let ((t1 (remove-a "bbcd")))
+  (if (maybe-is? t1)
+      (remove-b (maybe-value t1))
+      (nothing)))
+⇒ #<<maybe> is?: #f value: #f>
+```
+
+I can almost hear the heckling.  Even worse, composing three:
+
+```scheme
+(let* ((t1 (remove-a "abad"))
+       (t2 (if (maybe-is? t1)
+               (remove-b (maybe-value t1))
+               (nothing))))
+  (if (maybe-is? t2)
+      (remove-a (maybe-value t2))
+      (nothing)))
+⇒ #<<maybe> is?: #t value: "d">
+```
+
+So, how do we go about making this more bearable?  Well, one way could be to
+make `remove-a` and `remove-b` accept `maybe`s:
+
+```scheme
+(define (remove-a ?str)
+  (match ?str
+    (($ <maybe> #t str)
+     (if (eq? (string-ref str 0) #\a)
+         (something (substring str 1))
+         (nothing)))
+    (_ (nothing))))
+
+(define (remove-b ?str)
+  (match ?str
+    (($ <maybe> #t str)
+     (if (eq? (string-ref str 0) #\b)
+         (something (substring str 1))
+         (nothing)))
+    (_ (nothing))))
+```
+
+Not at all pretty, but it works!
+
+```scheme
+(remove-b (remove-a (something "abc")))
+⇒ #<<maybe> is?: #t value: "c">
+```
+
+Still, our procedures now require quite a bit of boilerplate.  Might there be a
+better way?
+
+# The Ties That `>>=` Us
+
+First of all, we'll revert to our original definitions of `remove-a` and
+`remove-b`, that is to say, the ones that take a regular value and return a
+`maybe`.
+
+```scheme
+(define (remove-a str)
+  (if (eq? (string-ref str 0) #\a)
+      (something (substring str 1))
+      (nothing)))
+
+(define (remove-b str)
+  (if (eq? (string-ref str 0) #\b)
+      (something (substring str 1))
+      (nothing)))
+```
+
+What if tried introducing higher-order procedures (procedures that accept other
+procedures as arguments) into the equation?  Because we're functional
+programmers and we have an unhealthy obsession with that sort of thing.
+
+```scheme
+(define (maybe-chain maybe proc)
+  (if (maybe-is? maybe)
+      (proc (maybe-value maybe))
+      (nothing)))
+
+(maybe-chain (something "abc")
+             remove-a)
+⇒ #<<maybe> is?: #t value: "bc">
+
+(maybe-chain (nothing)
+             remove-a)
+⇒ #<<maybe> is?: #f value: #f>
+```
+
+It lives!  To make it easier to compose procedures like this, we'll define a
+macro that allows us to perform any number of sequenced operations with only one
+composition form:
+
+```scheme
+(define-syntax maybe-chain*
+  (syntax-rules ()
+    ((_ maybe proc)
+     (maybe-chain maybe proc))
+    ((_ maybe proc rest ...)
+     (maybe-chain* (maybe-chain maybe proc)
+                   rest ...))))
+
+(maybe-chain* (something "abad")
+              remove-a
+              remove-b
+              remove-a)
+⇒ #<<maybe> is?: #t value: "d">
+```
+
+Congratulations, you've just implemented the `bind` operation, commonly written
+as `>>=`, for our `maybe` type.  And it turns out that a monad is just any
+container-like value for which `>>=` (along with another procedure called
+`return`, which wraps a given value in the simplest possible form of a monad)
+has been implemented.
+
+A more formal definition would be that a monad is a mathematical object composed
+of three parts: a type, a `bind` function, and a `return` function.  So, how do
+monads relate to Guix?
+
+# New Wheel, Old Wheel
+
+Now that we've reinvented the wheel, we'd better learn to use the original
+wheel.  Guix provides a generic, high-level monads library, along with the two
+generic monads `%identity-monad` and `%state-monad`, and the Guix-specific
+`%store-monad`.  Since `maybe` is not one of them, let's integrate our version
+into the Guix monad system!
+
+First we'll import the module that provides the aforementioned library:
+
+```scheme
+(use-modules (guix monads))
+```
+
+To define a monad's behaviour in Guix, we simply use the `define-monad` macro,
+and provide two procedures: `bind`, and `return`.
+
+```scheme
+(define-monad %maybe-monad
+  (bind maybe-chain)
+  (return something))
+```
+
+`bind` is just the procedure that we use to compose monadic procedure calls
+together, and `return` is the procedure that wraps values in the most basic form
+of the monad.  A properly implemented `bind` and `return` must follow the
+so-called _monad laws_:
+
+1. `(bind (return x) proc)` must be equivalent to `(proc x)`.
+2. `(bind monad return)` must be equivalent to just `monad`.
+3. `(bind (bind monad proc-1) proc-2)` must be equivalent to
+   `(bind monad (lambda (x) (bind (proc-1 x) proc-2)))`.
+
+Let's verify that our `maybe-chain` and `something` procedures adhere to the
+monad laws:
+
+```scheme
+(define (mlaws-proc-1 x)
+  (something (+ x 1)))
+
+(define (mlaws-proc-2 x)
+  (something (+ x 2)))
+  
+;; First law: the left identity.
+(equal? (maybe-chain (something 0)
+                     mlaws-proc-1)
+        (mlaws-proc-1 0))
+⇒ #t
+ 
+;; Second law: the right identity.
+(equal? (maybe-chain (something 0)
+                     something)
+        (something 0))
+⇒ #t
+
+;; Third law: associativity.
+(equal? (maybe-chain (maybe-chain (something 0)
+                                  mlaws-proc-1)
+                     mlaws-proc-2)
+        (maybe-chain (something 0)
+                     (lambda (x)
+                       (maybe-chain (mlaws-proc-1 x)
+                                    mlaws-proc-2))))
+⇒ #t
+```
+
+Now that we know they're valid, we can use the `with-monad` macro to tell Guix
+to use these specific implementations of `bind` and `return`, and the `>>=`
+macro to thread monads through procedure calls!
+
+```scheme
+(with-monad %maybe-monad
+  (>>= (something "aabbc")
+       remove-a
+       remove-a
+       remove-b
+       remove-b))
+⇒ #<<maybe> is?: #t value: "c">
+```
+
+We can also now use `return`:
+
+```scheme
+(with-monad %maybe-monad
+  (return 32))
+⇒ #<<maybe> is?: #t value: 32>
+```
+
+But Guix provides many higher-level interfaces than `>>=` and `return`, as we
+will see.  There's `mbegin`, which evaluates monadic expressions without binding
+them to symbols, returning the last one.  This, however, isn't particularly
+useful with our `%maybe-monad`, as it's only really usable if the monadic
+operations within have side effects, just like the non-monadic `begin`.
+
+There's also `mlet` and `mlet*`, which _do_ bind the results of monadic
+expressions to symbols, and are essentially equivalent to a chain of
+`(>>= MEXPR (lambda (BINDING) ...))`:
+
+```scheme
+;; This is equivalent...
+(mlet* %maybe-monad ((str -> "abad") ;non-monadic binding uses the -> symbol
+                     (str1 (remove-a str))
+                     (str2 (remove-b str)))
+  (remove-a str))
+⇒ #<<maybe> is?: #t value: "d">
+
+;; ...to this:
+(with-monad %maybe-monad
+  (>>= (return "abad")
+       (lambda (str)
+         (remove-a str))
+       (lambda (str1)
+         (remove-b str))
+       (lambda (str2)
+         (remove-a str))))
+```
+
+Various abstractions over these two exist too, such as `mwhen` (a `when` plus an
+`mbegin`), `munless` (an `unless` plus an `mbegin`), and `mparameterize`
+(dynamically-scoped value rebinding, like `parameterize`, in a monadic context).
+`lift` takes a procedure and a monad and creates a new procedure that returns
+a monadic value.
+
+There are also interfaces for manipulating lists wrapped in monads; `listm`
+creates such a list, `sequence` turns a list of monads into a list wrapped in a
+monad, and the `anym`, `mapm`, and `foldm` procedures are like their non-monadic
+equivalents, except that they return lists wrapped in monads.
+
+This is all well and good, you may be thinking, but why does Guix need a monad
+library, anyway?  The answer is technically that it doesn't.  But building on
+the monad API makes a lot of things much easier, and to learn why, we're going
+to look at one of Guix's built-in monads.
+
+# In a State
+
+Guix implements a monad called `%state-monad`, and it works with single-argument
+procedures returning two values.  Behold:
+
+```scheme
+(with-monad %state-monad
+  (return 33))
+⇒ #<procedure 21dc9a0 at <unknown port>:1106:22 (state)>
+```
+
+The `run-with-state` value turns this procedure into an actually useful value,
+or, rather, two values:
+
+```scheme
+(run-with-state (with-monad %state-monad (return 33))
+  (list "foo" "bar" "baz"))
+⇒ 33
+⇒ ("foo" "bar" "baz")
+```
+
+What can this actually do for us, though? Well, it gets interesting if we do
+some `>>=`ing:
+
+```scheme
+(define state-seq
+  (mlet* %state-monad ((number (return 33)))
+    (state-push number)))
+result
+⇒ #<procedure 7fcb6f466960 at <unknown port>:1484:24 (state)>
+
+(run-with-state state-seq (list 32))
+⇒ (32)
+⇒ (33 32)
+
+(run-with-state state-seq (list 30 99))
+⇒ (30 99)
+⇒ (33 30 99)
+```
+
+What is `state-push`?  It's a monadic procedure for `%state-monad` that takes
+whatever's currently in the first value (the primary value) and pushes it onto
+the second value (the state value), which is assumed to be a list, returning the
+old state value as the primary value and the new list as the state value.
+
+So, when we do `(run-with-state result (list 32))`, we're passing `(list 32)` as
+the initial state value, and then the `>>=` form passes that and `33` to
+`state-push`.  What `%state-monad` allows us to do is thread together some
+procedures that require some kind of state, while essentially pretending the
+state value is stored globally, like you might do in, say, C, and then retrieve
+both the final state and the result at the end!
+
+If you're a bit confused, don't worry.  We'll write some of our own
+`%state-monad`-based monadic procedures and hopefully all will become clear.
+Consider, for instance, the
+[Fibonacci sequence](https://en.wikipedia.org/wiki/Fibonacci_number), in which
+each value is computed by adding the previous two.  We could use the
+`%state-monad` to compute Fibonacci numbers by storing the previous number as
+the primary value and the number before that as the state value:
+
+```scheme
+(define (fibonacci-thing value)
+  (lambda (state)
+    (values (+ value state)
+            value)))
+```
+
+Now we can feed our Fibonacci-generating procedure the first value using
+`run-with-state` and the second using `return`:
+
+```scheme
+(run-with-state
+    (mlet* %state-monad ((starting (return 1))
+                         (n1 (fibonacci-thing starting))
+                         (n2 (fibonacci-thing n1)))
+      (fibonacci-thing n2))
+  0)
+⇒ 3
+⇒ 2
+
+(run-with-state
+    (mlet* %state-monad ((starting (return 1))
+                         (n1 (fibonacci-thing starting))
+                         (n2 (fibonacci-thing n1))
+                         (n3 (fibonacci-thing n2))
+                         (n4 (fibonacci-thing n3))
+                         (n5 (fibonacci-thing n4)))
+      (fibonacci-thing n5))
+  0)
+⇒ 13
+⇒ 8
+```
+
+This is all very nifty, and possibly useful in general, but what does this have
+to do with Guix?  Well, many Guix store-based operations are meant to be used
+in concert with yet another monad, called the `%store-monad`.  But if we look at
+`(guix store)`, where `%store-monad` is defined...
+
+```scheme
+(define-alias %store-monad %state-monad)
+(define-alias store-return state-return)
+(define-alias store-bind state-bind)
+```
+
+It was all a shallow façade!  All the "store monad" is is a special case of the
+state monad, where a value representing the store is passed as the state value.
+
+# Lies, Damned Lies, and Abstractions
+
+We mentioned that, technically, we didn't need monads for Guix.  Indeed, many
+(now deprecated) procedures take a store value as the argument, such as
+`build-expression->derivation`.  However, monads are far more elegant and
+simplify store code by quite a bit.
+
+`build-expression->derivation`, being deprecated, should never of course be
+used.  For one thing, it uses the "quoted build expression" style, rather than
+G-expressions (we'll discuss gexps another time).  The best way to create a
+derivation from some basic build code is to use the new-fangled
+`gexp->derivation` procedure:
+
+```scheme
+(use-modules (guix gexp)
+             (gnu packages irc))
+
+(define symlink-irssi
+  (gexp->derivation "link-to-irssi"
+    #~(symlink #$(file-append irssi "/bin/irssi") #$output)))
+⇒ #<procedure 7fddcc7b81e0 at guix/gexp.scm:1180:2 (state)>
+```
+
+You don't have to understand the `#~(...)` form yet, only everything surrounding
+it.  We can see that this `gexp->derivation` returns a procedure taking the
+initial state (store), just like our `%state-monad` procedures did, and like we
+used `run-with-state` to pass the initial state to a `%state-monad` monadic
+value, we use our old friend `run-with-store` when we have a `%store-monad`
+monadic value!
+
+```scheme
+(define symlink-irssi-drv
+  (with-store store
+    (run-with-store store
+      symlink-irssi)))
+⇒ #<derivation /gnu/store/q7kwwl4z6psifnv4di1p1kpvlx06fmyq-link-to-irssi.drv => /gnu/store/6a94niigx4ii0ldjdy33wx9anhifr25x-link-to-irssi 7fddb7ef52d0>
+```
+
+Let's just check this derivation is as expected by reading the code from the
+builder script.
+
+```scheme
+(define symlink-irssi-builder
+  (list-ref (derivation-builder-arguments symlink-irssi-drv) 1))
+
+(call-with-input-file symlink-irssi-builder
+  (lambda (port)
+    (read port)))
+    
+⇒ (symlink
+   "/gnu/store/hrlmypx1lrdjlxpkqy88bfrzg5p0bn6d-irssi-1.4.3/bin/irssi"
+   ((@ (guile) getenv) "out"))
+```
+
+And indeed, it symlinks the `irssi` binary to the output path.  Some other,
+higher-level, monadic procedures include `interned-file`, which copies a file
+from outside the store into it, and `text-file`, which copies some text into it.
+Generally, these procedures aren't used, as there are higher-level procedures
+that perform similar functions (which we will discuss later), but for the sake
+of this blog post, here's an example:
+
+```scheme
+(with-store store
+  (run-with-store store
+    (text-file "unmatched-paren"
+      "( <paren@disroot.org>")))
+⇒ "/gnu/store/v6smacxvdk4yvaa3s3wmd54lixn1dp3y-unmatched-paren"
+```
+
+# Conclusion
+
+What have we learned about monads?  The key points we can take away are:
+
+1. Monads are a way of composing together procedures and values that are wrapped
+   in containers that give them extra context, like `maybe` values.
+2. Guix provides a high-level monad library that compensates for Guile's lack of
+   static typing or an interface-like system.
+3. The `(guix monads)` module provides the state monad, which allows you to
+   thread state through procedures, allowing you to essentially pretend it's a
+   global variable that's modified by each procedure.
+4. Guix uses the store monad frequently to thread a store connection through
+   procedures that need it.
+5. The store monad is really just the state monad in disguise, where the state
+   value is used to thread the store object through monadic procedures.
+
+If you've read this post in its entirety but still don't yet quite get it, don't
+worry.  Try to modify and tinker about with the examples, and ask any questions
+on the IRC channel `#guix:libera.chat` and mailing list at `help-guix@gnu.org`,
+and hopefully it will all click eventually!
+
+#### About GNU Guix
+
+[GNU Guix](https://guix.gnu.org) is a transactional package manager and
+an advanced distribution of the GNU system that [respects user
+freedom](https://www.gnu.org/distros/free-system-distribution-guidelines.html).
+Guix can be used on top of any system running the Hurd or the Linux
+kernel, or it can be used as a standalone operating system distribution
+for i686, x86_64, ARMv7, AArch64 and POWER9 machines.
+
+In addition to standard package management features, Guix supports
+transactional upgrades and roll-backs, unprivileged package management,
+per-user profiles, and garbage collection.  When used as a standalone
+GNU/Linux distribution, Guix offers a declarative, stateless approach to
+operating system configuration management.  Guix is highly customizable
+and hackable through [Guile](https://www.gnu.org/software/guile)
+programming interfaces and extensions to the
+[Scheme](http://schemers.org) language.