[bug#61214,guix-artwork,v5] website: posts: Add Dissecting Guix, Part 2: The Store Monad.

Message ID	20230214073314.2651-1-paren@disroot.org
State	New
Headers	show Return-Path: <guix-patches-bounces+patchwork=mira.cbaines.net@gnu.org> Subject: [bug#61214] [PATCH guix-artwork v5] website: posts: Add Dissecting Guix, Part 2: The Store Monad. References: <20230201172821.3072-1-paren@disroot.org> In-Reply-To: <20230201172821.3072-1-paren@disroot.org> Resent-From: "(" <paren@disroot.org> Original-Sender: "Debbugs-submit" <debbugs-submit-bounces@debbugs.gnu.org> Resent-CC: guix-patches@gnu.org Resent-Date: Tue, 14 Feb 2023 07:34:02 +0000 Resent-Message-ID: <handler.61214.B61214.167636000219731@debbugs.gnu.org> Resent-Sender: help-debbugs@gnu.org To: 61214@debbugs.gnu.org Cc: "\(" <paren@disroot.org>, ludo@gnu.org, mail@cbaines.net, zimoun.toutoune@gmail.com Date: Tue, 14 Feb 2023 07:33:14 +0000 Message-Id: <20230214073314.2651-1-paren@disroot.org> MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Precedence: list Reply-to: "\(" <paren@disroot.org> From: "\( via Guix-patches" via <guix-patches@gnu.org> Errors-To: guix-patches-bounces+patchwork=mira.cbaines.net@gnu.org Sender: guix-patches-bounces+patchwork=mira.cbaines.net@gnu.org X-getmail-retrieved-from-mailbox: Patches
Series	[bug#61214,guix-artwork,v5] website: posts: Add Dissecting Guix, Part 2: The Store Monad. \| expand [bug#61214,guix-artwork,v5] website: posts: Add Dissecting Guix, Part 2: The Store Monad.

diff --git a/website/posts/dissecting-guix-2-store-monad.md b/website/posts/dissecting-guix-2-store-monad.md new file mode 100644 index 0000000..13b9cbb --- /dev/null +++ b/website/posts/dissecting-guix-2-store-monad.md @@ -0,0 +1,567 @@ +title: Dissecting Guix, Part 2: The Store Monad +date: TBC +author: ( +tags: Dissecting Guix, Functional package management, Programming interfaces, Scheme API +--- +Hello again! + +In [the last post](https://guix.gnu.org/en/blog/2023/dissecting-guix-part-1-derivations/), +we briefly mentioned the `with-store` and `run-with-store` macros. Today, we'll +be looking at those in further detail, along with the related monad library and +the [`%store-monad`](https://guix.gnu.org/manual/devel/en/html_node/The-Store-Monad.html)! + +Typically, we use monads to chain operations together, and the `%store-monad` is +no different; it's used to combine operations that work on the Guix store (for +instance, creating derivations, building derivations, or adding data files to +the store). + +However, monads are a little hard to explain, and from a distance, they seem to +be quite incomprehensible. So, I want you to erase them from your mind for now. +We'll come back to them later. And be aware that if you can't seem to get your +head around them, it's okay; you can understand most of the architecture of Guix +without understanding monads. + +# Yes, No, Maybe So + +Let's instead implement another M of functional programming, _`maybe`_ values, +representing a value that may or may not exist. For instance, there could be a +procedure that attempts to pop a stack, returing the result if there is one, or +`nothing` if the stack has no elements. + +`maybe` is a very common feature of statically-typed functional languages, and +you'll see it all over the place in Haskell and OCaml code. However, Guile is +dynamically typed, so we usually use ad-hoc `#f` values as the "null value" +instead of a proper "nothing" or "none". + +Just for fun, though, we'll implement a proper `maybe` in Guile. Fire up that +REPL once again, and let's import a bunch of modules that we'll need: + +```scheme +(use-modules (ice-9 match) + (srfi srfi-9)) +``` + +We'll implement `maybe` as a record with two fields, `is?` and `value`. If the +value contains something, `is?` will be `#t` and `value` will contain the thing +in question, and if it's empty, `is?`'ll be `#f`. + +```scheme +(define-record-type <maybe> + (make-maybe is? value) + maybe? + (is? maybe-is?) + (value maybe-value)) +``` + +Now we'll define constructors for the two possible states: + +```scheme +(define (something value) + (make-maybe #t value)) + +(define (nothing) + (make-maybe #f #f)) ;the value here doesn't matter; we'll just use #f +``` + +And make some silly functions that return optional values: + +```scheme +(define (remove-a str) + (if (eq? (string-ref str 0) #\a) + (something (substring str 1)) + (nothing))) + +(define (remove-b str) + (if (eq? (string-ref str 0) #\b) + (something (substring str 1)) + (nothing))) + +(remove-a "ahh") +⇒ #<<maybe> is?: #t value: "hh"> + +(remove-a "ooh") +⇒ #<<maybe> is?: #f value: #f> + +(remove-b "bad") +⇒ #<<maybe> is?: #t value: "ad"> +``` + +But what if we want to compose the results of these functions? + +# Keeping Your Composure + +As you might have guessed, this is not fun. Cosplaying as a compiler backend +typically isn't. + +```scheme +(let ((t1 (remove-a "abcd"))) + (if (maybe-is? t1) + (remove-b (maybe-value t1)) + (nothing))) +⇒ #<<maybe> is?: #t value: "cd"> + +(let ((t1 (remove-a "bbcd"))) + (if (maybe-is? t1) + (remove-b (maybe-value t1)) + (nothing))) +⇒ #<<maybe> is?: #f value: #f> +``` + +I can almost hear the heckling. Even worse, composing three: + +```scheme +(let* ((t1 (remove-a "abad")) + (t2 (if (maybe-is? t1) + (remove-b (maybe-value t1)) + (nothing)))) + (if (maybe-is? t2) + (remove-a (maybe-value t2)) + (nothing))) +⇒ #<<maybe> is?: #t value: "d"> +``` + +So, how do we go about making this more bearable? Well, one way could be to +make `remove-a` and `remove-b` accept `maybe`s: + +```scheme +(define (remove-a ?str) + (match ?str + (($ <maybe> #t str) + (if (eq? (string-ref str 0) #\a) + (something (substring str 1)) + (nothing))) + (_ (nothing)))) + +(define (remove-b ?str) + (match ?str + (($ <maybe> #t str) + (if (eq? (string-ref str 0) #\b) + (something (substring str 1)) + (nothing))) + (_ (nothing)))) +``` + +Not at all pretty, but it works! + +```scheme +(remove-b (remove-a (something "abc"))) +⇒ #<<maybe> is?: #t value: "c"> +``` + +Still, our procedures now require quite a bit of boilerplate. Might there be a +better way? + +# The Ties That `>>=` Us + +First of all, we'll revert to our original definitions of `remove-a` and +`remove-b`, that is to say, the ones that take a regular value and return a +`maybe`. + +```scheme +(define (remove-a str) + (if (eq? (string-ref str 0) #\a) + (something (substring str 1)) + (nothing))) + +(define (remove-b str) + (if (eq? (string-ref str 0) #\b) + (something (substring str 1)) + (nothing))) +``` + +What if tried introducing higher-order procedures (procedures that accept other +procedures as arguments) into the equation? Because we're functional +programmers and we have an unhealthy obsession with that sort of thing. + +```scheme +(define (maybe-chain maybe proc) + (if (maybe-is? maybe) + (proc (maybe-value maybe)) + (nothing))) + +(maybe-chain (something "abc") + remove-a) +⇒ #<<maybe> is?: #t value: "bc"> + +(maybe-chain (nothing) + remove-a) +⇒ #<<maybe> is?: #f value: #f> +``` + +It lives! To make it easier to compose procedures like this, we'll define a +macro that allows us to perform any number of sequenced operations with only one +composition form: + +```scheme +(define-syntax maybe-chain* + (syntax-rules () + ((_ maybe proc) + (maybe-chain maybe proc)) + ((_ maybe proc rest ...) + (maybe-chain* (maybe-chain maybe proc) + rest ...)))) + +(maybe-chain* (something "abad") + remove-a + remove-b + remove-a) +⇒ #<<maybe> is?: #t value: "d"> +``` + +Congratulations, you've just implemented the `bind` operation, commonly written +as `>>=`, for our `maybe` type. And it turns out that a monad is just any +container-like value for which `>>=` (along with another procedure called +`return`, which wraps a given value in the simplest possible form of a monad) +has been implemented. + +A more formal definition would be that a monad is a mathematical object composed +of three parts: a type, a `bind` function, and a `return` function. So, how do +monads relate to Guix? + +# New Wheel, Old Wheel + +Now that we've reinvented the wheel, we'd better learn to use the original +wheel. Guix provides a generic, high-level monads library, along with the two +generic monads `%identity-monad` and `%state-monad`, and the Guix-specific +`%store-monad`. Since `maybe` is not one of them, let's integrate our version +into the Guix monad system! + +First we'll import the module that provides the aforementioned library: + +```scheme +(use-modules (guix monads)) +``` + +To define a monad's behaviour in Guix, we simply use the `define-monad` macro, +and provide two procedures: `bind`, and `return`. + +```scheme +(define-monad %maybe-monad + (bind maybe-chain) + (return something)) +``` + +`bind` is just the procedure that we use to compose monadic procedure calls +together, and `return` is the procedure that wraps values in the most basic form +of the monad. A properly implemented `bind` and `return` must follow the +so-called _monad laws_: + +1. `(bind (return x) proc)` must be equivalent to `(proc x)`. +2. `(bind monad return)` must be equivalent to just `monad`. +3. `(bind (bind monad proc-1) proc-2)` must be equivalent to + `(bind monad (lambda (x) (bind (proc-1 x) proc-2)))`. + +Let's verify that our `maybe-chain` and `something` procedures adhere to the +monad laws: + +```scheme +(define (mlaws-proc-1 x) + (something (+ x 1))) + +(define (mlaws-proc-2 x) + (something (+ x 2))) + +;; First law: the left identity. +(equal? (maybe-chain (something 0) + mlaws-proc-1) + (mlaws-proc-1 0)) +⇒ #t + +;; Second law: the right identity. +(equal? (maybe-chain (something 0) + something) + (something 0)) +⇒ #t + +;; Third law: associativity. +(equal? (maybe-chain (maybe-chain (something 0) + mlaws-proc-1) + mlaws-proc-2) + (maybe-chain (something 0) + (lambda (x) + (maybe-chain (mlaws-proc-1 x) + mlaws-proc-2)))) +⇒ #t +``` + +Now that we know they're valid, we can use the `with-monad` macro to tell Guix +to use these specific implementations of `bind` and `return`, and the `>>=` +macro to thread monads through procedure calls! + +```scheme +(with-monad %maybe-monad + (>>= (something "aabbc") + remove-a + remove-a + remove-b + remove-b)) +⇒ #<<maybe> is?: #t value: "c"> +``` + +We can also now use `return`: + +```scheme +(with-monad %maybe-monad + (return 32)) +⇒ #<<maybe> is?: #t value: 32> +``` + +But Guix provides many higher-level interfaces than `>>=` and `return`, as we +will see. There's `mbegin`, which evaluates monadic expressions without binding +them to symbols, returning the last one. This, however, isn't particularly +useful with our `%maybe-monad`, as it's only really usable if the monadic +operations within have side effects, just like the non-monadic `begin`. + +There's also `mlet` and `mlet*`, which _do_ bind the results of monadic +expressions to symbols, and are essentially equivalent to a chain of +`(>>= MEXPR (lambda (BINDING) ...))`: + +```scheme +;; This is equivalent... +(mlet* %maybe-monad ((str -> "abad") ;non-monadic binding uses the -> symbol + (str1 (remove-a str)) + (str2 (remove-b str))) + (remove-a str)) +⇒ #<<maybe> is?: #t value: "d"> + +;; ...to this: +(with-monad %maybe-monad + (>>= (return "abad") + (lambda (str) + (remove-a str)) + (lambda (str1) + (remove-b str)) + (lambda (str2) + (remove-a str)))) +``` + +Various abstractions over these two exist too, such as `mwhen` (a `when` plus an +`mbegin`), `munless` (an `unless` plus an `mbegin`), and `mparameterize` +(dynamically-scoped value rebinding, like `parameterize`, in a monadic context). +`lift` takes a procedure and a monad and creates a new procedure that returns +a monadic value. + +There are also interfaces for manipulating lists wrapped in monads; `listm` +creates such a list, `sequence` turns a list of monads into a list wrapped in a +monad, and the `anym`, `mapm`, and `foldm` procedures are like their non-monadic +equivalents, except that they return lists wrapped in monads. + +This is all well and good, you may be thinking, but why does Guix need a monad +library, anyway? The answer is technically that it doesn't. But building on +the monad API makes a lot of things much easier, and to learn why, we're going +to look at one of Guix's built-in monads. + +# In a State + +Guix implements a monad called `%state-monad`, and it works with single-argument +procedures returning two values. Behold: + +```scheme +(with-monad %state-monad + (return 33)) +⇒ #<procedure 21dc9a0 at <unknown port>:1106:22 (state)> +``` + +The `run-with-state` value turns this procedure into an actually useful value, +or, rather, two values: + +```scheme +(run-with-state (with-monad %state-monad (return 33)) + (list "foo" "bar" "baz")) +⇒ 33 +⇒ ("foo" "bar" "baz") +``` + +What can this actually do for us, though? Well, it gets interesting if we do +some `>>=`ing: + +```scheme +(define state-seq + (mlet* %state-monad ((number (return 33))) + (state-push number))) +result +⇒ #<procedure 7fcb6f466960 at <unknown port>:1484:24 (state)> + +(run-with-state state-seq (list 32)) +⇒ (32) +⇒ (33 32) + +(run-with-state state-seq (list 30 99)) +⇒ (30 99) +⇒ (33 30 99) +``` + +What is `state-push`? It's a monadic procedure for `%state-monad` that takes +whatever's currently in the first value (the primary value) and pushes it onto +the second value (the state value), which is assumed to be a list, returning the +old state value as the primary value and the new list as the state value. + +So, when we do `(run-with-state result (list 32))`, we're passing `(list 32)` as +the initial state value, and then the `>>=` form passes that and `33` to +`state-push`. What `%state-monad` allows us to do is thread together some +procedures that require some kind of state, while essentially pretending the +state value is stored globally, like you might do in, say, C, and then retrieve +both the final state and the result at the end! + +If you're a bit confused, don't worry. We'll write some of our own +`%state-monad`-based monadic procedures and hopefully all will become clear. +Consider, for instance, the +[Fibonacci sequence](https://en.wikipedia.org/wiki/Fibonacci_number), in which +each value is computed by adding the previous two. We could use the +`%state-monad` to compute Fibonacci numbers by storing the previous number as +the primary value and the number before that as the state value: + +```scheme +(define (fibonacci-thing value) + (lambda (state) + (values (+ value state) + value))) +``` + +Now we can feed our Fibonacci-generating procedure the first value using +`run-with-state` and the second using `return`: + +```scheme +(run-with-state + (mlet* %state-monad ((starting (return 1)) + (n1 (fibonacci-thing starting)) + (n2 (fibonacci-thing n1))) + (fibonacci-thing n2)) + 0) +⇒ 3 +⇒ 2 + +(run-with-state + (mlet* %state-monad ((starting (return 1)) + (n1 (fibonacci-thing starting)) + (n2 (fibonacci-thing n1)) + (n3 (fibonacci-thing n2)) + (n4 (fibonacci-thing n3)) + (n5 (fibonacci-thing n4))) + (fibonacci-thing n5)) + 0) +⇒ 13 +⇒ 8 +``` + +This is all very nifty, and possibly useful in general, but what does this have +to do with Guix? Well, many Guix store-based operations are meant to be used +in concert with yet another monad, called the `%store-monad`. But if we look at +`(guix store)`, where `%store-monad` is defined... + +```scheme +(define-alias %store-monad %state-monad) +(define-alias store-return state-return) +(define-alias store-bind state-bind) +``` + +It was all a shallow façade! All the "store monad" is is a special case of the +state monad, where a value representing the store is passed as the state value. + +# Lies, Damned Lies, and Abstractions + +We mentioned that, technically, we didn't need monads for Guix. Indeed, many +(now deprecated) procedures take a store value as the argument, such as +`build-expression->derivation`. However, monads are far more elegant and +simplify store code by quite a bit. + +`build-expression->derivation`, being deprecated, should never of course be +used. For one thing, it uses the "quoted build expression" style, rather than +G-expressions (we'll discuss gexps another time). The best way to create a +derivation from some basic build code is to use the new-fangled +`gexp->derivation` procedure: + +```scheme +(use-modules (guix gexp) + (gnu packages irc)) + +(define symlink-irssi + (gexp->derivation "link-to-irssi" + #~(symlink #$(file-append irssi "/bin/irssi") #$output))) +⇒ #<procedure 7fddcc7b81e0 at guix/gexp.scm:1180:2 (state)> +``` + +You don't have to understand the `#~(...)` form yet, only everything surrounding +it. We can see that this `gexp->derivation` returns a procedure taking the +initial state (store), just like our `%state-monad` procedures did, and like we +used `run-with-state` to pass the initial state to a `%state-monad` monadic +value, we use our old friend `run-with-store` when we have a `%store-monad` +monadic value! + +```scheme +(define symlink-irssi-drv + (with-store store + (run-with-store store + symlink-irssi))) +⇒ #<derivation /gnu/store/q7kwwl4z6psifnv4di1p1kpvlx06fmyq-link-to-irssi.drv => /gnu/store/6a94niigx4ii0ldjdy33wx9anhifr25x-link-to-irssi 7fddb7ef52d0> +``` + +Let's just check this derivation is as expected by reading the code from the +builder script. + +```scheme +(define symlink-irssi-builder + (list-ref (derivation-builder-arguments symlink-irssi-drv) 1)) + +(call-with-input-file symlink-irssi-builder + (lambda (port) + (read port))) + +⇒ (symlink + "/gnu/store/hrlmypx1lrdjlxpkqy88bfrzg5p0bn6d-irssi-1.4.3/bin/irssi" + ((@ (guile) getenv) "out")) +``` + +And indeed, it symlinks the `irssi` binary to the output path. Some other, +higher-level, monadic procedures include `interned-file`, which copies a file +from outside the store into it, and `text-file`, which copies some text into it. +Generally, these procedures aren't used, as there are higher-level procedures +that perform similar functions (which we will discuss later), but for the sake +of this blog post, here's an example: + +```scheme +(with-store store + (run-with-store store + (text-file "unmatched-paren" + "( <paren@disroot.org>"))) +⇒ "/gnu/store/v6smacxvdk4yvaa3s3wmd54lixn1dp3y-unmatched-paren" +``` + +# Conclusion + +What have we learned about monads? The key points we can take away are: + +1. Monads are a way of composing together procedures and values that are wrapped + in containers that give them extra context, like `maybe` values. +2. Guix provides a high-level monad library that compensates for Guile's lack of + static typing or an interface-like system. +3. The `(guix monads)` module provides the state monad, which allows you to + thread state through procedures, allowing you to essentially pretend it's a + global variable that's modified by each procedure. +4. Guix uses the store monad frequently to thread a store connection through + procedures that need it. +5. The store monad is really just the state monad in disguise, where the state + value is used to thread the store object through monadic procedures. + +If you've read this post in its entirety but still don't yet quite get it, don't +worry. Try to modify and tinker about with the examples, and ask any questions +on the IRC channel `#guix:libera.chat` and mailing list at `help-guix@gnu.org`, +and hopefully it will all click eventually! + +#### About GNU Guix + +[GNU Guix](https://guix.gnu.org) is a transactional package manager and +an advanced distribution of the GNU system that [respects user +freedom](https://www.gnu.org/distros/free-system-distribution-guidelines.html). +Guix can be used on top of any system running the Hurd or the Linux +kernel, or it can be used as a standalone operating system distribution +for i686, x86_64, ARMv7, AArch64 and POWER9 machines. + +In addition to standard package management features, Guix supports +transactional upgrades and roll-backs, unprivileged package management, +per-user profiles, and garbage collection. When used as a standalone +GNU/Linux distribution, Guix offers a declarative, stateless approach to +operating system configuration management. Guix is highly customizable +and hackable through [Guile](https://www.gnu.org/software/guile) +programming interfaces and extensions to the +[Scheme](http://schemers.org) language.

[bug#61214,guix-artwork,v5] website: posts: Add Dissecting Guix, Part 2: The Store Monad.

Commit Message

Comments

Patch