diff mbox series

[bug#68271,1/3] guix: utils: Add delete-duplicates/sort.

Message ID ce0156e7528f6a6a54f0469774b41604e7872a22.1704488002.git.mail@cbaines.net
State New
Headers show
Series Make some deduplicating speedups. | expand

Commit Message

Christopher Baines Jan. 5, 2024, 8:53 p.m. UTC
Similar to delete-duplicates, but sorts the list first and then compares the
pairs in the list. This is faster than delete-duplicates for larger lists.

* guix/utils.scm (delete-duplicates): New procedure.

Change-Id: Ibc2897cdb7c76be0d0e099bc47fee005a88bea2e
---
 guix/utils.scm | 28 ++++++++++++++++++++++++++++
 1 file changed, 28 insertions(+)


base-commit: deeb7d1f53d7ddfa977b3eadd760312bbd0a2509

Comments

Ludovic Courtès Jan. 12, 2024, 1:53 p.m. UTC | #1
Christopher Baines <mail@cbaines.net> skribis:

> Similar to delete-duplicates, but sorts the list first and then compares the
> pairs in the list. This is faster than delete-duplicates for larger lists.

We’re dealing with small lists here though (derivation inputs).  Did you
try to time the benefit?

There’s a complexity/benefit tradeoff and I wonder if it cuts it.

> +(define* (delete-duplicates/sort unsorted-lst less #:optional (equal? eq?))
> +  (if (null? unsorted-lst)
> +      unsorted-lst

This ‘if’ is unnecessary.

> +      (let ((sorted-lst (sort unsorted-lst
> +                              ;; Sort in the reverse order
> +                              (lambda (a b) (eq? #f (less a b))))))

Just: (negate less).

> +        (let loop ((lst (cdr sorted-lst))
> +                   (last-element (car sorted-lst))
> +                   (result (list (car sorted-lst))))
> +          (if (null? lst)
> +              result
> +              (let ((current-element (car lst)))

Please follow the convention of using ‘match’ rather than car caddr
caddddar (info "(guix) Data Types and Pattern Matching"):

  (match lst
    (()
     result)
    ((head . tail)
     …))

Ludo’.
diff mbox series

Patch

diff --git a/guix/utils.scm b/guix/utils.scm
index e4e9d922e7..c1c967ee6c 100644
--- a/guix/utils.scm
+++ b/guix/utils.scm
@@ -146,6 +146,8 @@  (define-module (guix utils)
             edit-expression
             delete-expression
 
+            delete-duplicates/sort
+
             filtered-port
             decompressed-port
             call-with-decompressed-port
@@ -502,6 +504,32 @@  (define (delete-expression source-properties)
   "Delete the expression specified by SOURCE-PROPERTIES."
   (edit-expression source-properties (const "") #:include-trailing-newline? #t))
 
+
+;;;
+;;; Lists.
+;;;
+
+(define* (delete-duplicates/sort unsorted-lst less #:optional (equal? eq?))
+  (if (null? unsorted-lst)
+      unsorted-lst
+      (let ((sorted-lst (sort unsorted-lst
+                              ;; Sort in the reverse order
+                              (lambda (a b) (eq? #f (less a b))))))
+        (let loop ((lst (cdr sorted-lst))
+                   (last-element (car sorted-lst))
+                   (result (list (car sorted-lst))))
+          (if (null? lst)
+              result
+              (let ((current-element (car lst)))
+                (if (equal? current-element last-element)
+                    (loop (cdr lst)
+                          last-element
+                          result)
+                    (loop (cdr lst)
+                          current-element
+                          (cons current-element
+                                result)))))))))
+
 
 ;;;
 ;;; Keyword arguments.