[bug#60410,1/7] xapian: Index several terms as boolean and without positions.

Message ID 20221229202400.28565-1-arunisaac@systemreboot.net
State New
Series mumi: Boolean prefixes in xapian indexing and others

Commit Message

Arun Isaac Dec. 29, 2022, 8:23 p.m. UTC
* mumi/xapian.scm (index-files): Index bug number, submitter, authors,
owner, severity, tags, status, file and msgids as boolean terms. Index
bug number, severity, tags, status, file and msgids without position
information.
---
 mumi/xapian.scm | 65 ++++++++++++++++++++++++++++++++++++++-----------
 1 file changed, 51 insertions(+), 14 deletions(-)
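
As a rough illustration of what the two keyword arguments used in the patch change, here is a toy model in plain Guile. It is not Xapian and not guile-xapian: `toy-index-text` and its alist "document" are hypothetical names made up for this sketch. It only mimics the bookkeeping described in the commit message: a within-document frequency (wdf) per term, and an optional list of positions.

```scheme
(use-modules (srfi srfi-1))

;; Hypothetical toy model, NOT the guile-xapian API.  A "document" is an
;; alist mapping each term to (WDF . POSITIONS).
(define* (toy-index-text doc text
                         #:key (prefix "") (wdf-increment 1) (positions? #t))
  (let loop ((words (string-tokenize text)) ; split TEXT on whitespace
             (pos 1)
             (doc doc))
    (if (null? words)
        doc
        (let* ((term (string-append prefix (string-downcase (car words))))
               ;; Existing entry for TERM, or wdf 0 with no positions.
               (old (or (assoc-ref doc term) '(0)))
               (new (cons (+ (car old) wdf-increment)
                          (if positions?
                              (append (cdr old) (list pos))
                              (cdr old)))))
          (loop (cdr words)
                (+ pos 1)
                (cons (cons term new)
                      (alist-delete term doc string=?)))))))

;; A boolean-style field: the term exists, but carries no weight and no
;; positions.
(define status-doc
  (toy-index-text '() "open"
                  #:prefix "XSTATUS" #:wdf-increment 0 #:positions? #f))

;; A free-text field: the wdf is counted and positions are kept.
(define subject-doc
  (toy-index-text '() "guix pull guix" #:prefix "S"))
```

In this model `status-doc` holds the term `XSTATUSopen` with a wdf of 0 and an empty position list, while `subject-doc` records `Sguix` twice, at positions 1 and 3. That mirrors the intent stated above: filter-only fields need only the term's presence, whereas fields that support phrase or NEAR searches need positions.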

Comments

Ricardo Wurmus Dec. 31, 2022, 6:09 p.m. UTC | #1
Hi Arun,

thank you for your patches!  I applied them all and then ran

    ./pre-inst-env scripts/mumi fetch

but got this error:

    worker error: (keyword-argument-error #f Unrecognized keyword () (#:positions?))

> +             ;; searching separate fields as in subject:foo, from:bar,
> +             ;; etc. We do not keep track of the within document
> +             ;; frequencies of terms that will be used for boolean
> +             ;; filtering. We do not generate position information for
> +             ;; fields that will not need phrase searching or NEAR
> +             ;; searches.
> +             (index-text! term-generator
> +                          bugid
> +                          #:prefix "B"
> +                          #:wdf-increment 0
> +                          #:positions? #f)

I made sure to update to guile-xapian 0.2.1, the latest commit, as far
as I can tell.
Arun Isaac Dec. 31, 2022, 11:02 p.m. UTC | #2
Hi Ricardo,

>     worker error: (keyword-argument-error #f Unrecognized keyword ()
>     (#:positions?))

Oops! It looks like I have been working with some unpublished
guile-xapian code. I have pushed those guile-xapian commits, released
guile-xapian 0.3.0 and updated the Guix guile-xapian package. Hopefully,
it should work now. Could you try again?

Thanks,
Arun
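
For readers following along, the error quoted above is the one Guile raises when a procedure defined with `define*` receives a keyword it does not declare. The sketch below reproduces that in plain Guile. `old-index-text!` and `try-call` are hypothetical stand-ins written for this illustration; the first only assumes that the older `index-text!` accepted `#:prefix` and `#:wdf-increment` but not `#:positions?`, which is what the error message suggests.

```scheme
;; Hypothetical stand-in for an `index-text!' that predates #:positions?.
;; It is NOT the guile-xapian implementation.
(define* (old-index-text! text #:key (prefix "") (wdf-increment 1))
  (list prefix text wdf-increment))

;; Call PROC with ARGS.  Return 'ok on success, or the error key consed
;; onto the thrown arguments if Guile rejects a keyword.
(define (try-call proc . args)
  (catch 'keyword-argument-error
    (lambda ()
      (apply proc args)
      'ok)
    (lambda (key . rest)
      (cons key rest))))

;; The keywords the stand-in declares are accepted.
(define accepted
  (try-call old-index-text! "60410" #:prefix "B" #:wdf-increment 0))

;; The call from the patch adds #:positions?, which the stand-in does not
;; declare, so Guile throws `keyword-argument-error'.
(define rejected
  (try-call old-index-text! "60410"
            #:prefix "B" #:wdf-increment 0 #:positions? #f))
```

Here `accepted` is `ok`, while `rejected` is a list whose first element is the symbol `keyword-argument-error`. This matches the thread: the fix was to upgrade to a guile-xapian release whose `index-text!` knows about `#:positions?`.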
Ricardo Wurmus Jan. 1, 2023, 12:14 p.m. UTC | #3
Hi Arun,

>>     worker error: (keyword-argument-error #f Unrecognized keyword ()
>>     (#:positions?))
>
> Oops! It looks like I have been working with some unpublished
> guile-xapian code. I have pushed those guile-xapian commits, released
> guile-xapian 0.3.0 and updated the Guix guile-xapian package. Hopefully,
> it should work now. Could you try again?

Thank you, this works!
I applied the changes.

Patch

diff --git a/mumi/xapian.scm b/mumi/xapian.scm
index 68169e8..06a54cd 100644
--- a/mumi/xapian.scm
+++ b/mumi/xapian.scm
@@ -1,6 +1,6 @@ 
 ;;; mumi -- Mediocre, uh, mail interface
 ;;; Copyright © 2020, 2022 Ricardo Wurmus <rekado@elephly.net>
-;;; Copyright © 2020 Arun Isaac <arunisaac@systemreboot.net>
+;;; Copyright © 2020, 2022 Arun Isaac <arunisaac@systemreboot.net>
 ;;;
 ;;; This program is free software: you can redistribute it and/or
 ;;; modify it under the terms of the GNU Affero General Public License
@@ -119,20 +119,57 @@  messages and index their contents in the Xapian database at DBPATH."
                   (term-generator (make-term-generator #:stem (make-stem "en")
                                                        #:document doc)))
              ;; Index fields with a suitable prefix. This allows for
-             ;; searching separate fields as in subject:foo,
-             ;; from:bar, etc.
-             (index-text! term-generator bugid #:prefix "B")
-             (index-text! term-generator submitter #:prefix "A")
-             (index-text! term-generator authors #:prefix "XA")
+             ;; searching separate fields as in subject:foo, from:bar,
+             ;; etc. We do not keep track of the within document
+             ;; frequencies of terms that will be used for boolean
+             ;; filtering. We do not generate position information for
+             ;; fields that will not need phrase searching or NEAR
+             ;; searches.
+             (index-text! term-generator
+                          bugid
+                          #:prefix "B"
+                          #:wdf-increment 0
+                          #:positions? #f)
+             (index-text! term-generator
+                          submitter
+                          #:prefix "A"
+                          #:wdf-increment 0)
+             (index-text! term-generator
+                          authors
+                          #:prefix "XA"
+                          #:wdf-increment 0)
              (index-text! term-generator subjects #:prefix "S")
-             (index-text! term-generator (or (bug-owner bug) "") #:prefix "XO")
-             (index-text! term-generator (or (bug-severity bug) "normal") #:prefix "XS")
-             (index-text! term-generator (or (bug-tags bug) "") #:prefix "XT")
-             (index-text! term-generator (cond
-                                          ((bug-done bug) "done")
-                                          (else "open")) #:prefix "XSTATUS")
-             (index-text! term-generator file #:prefix "F")
-             (index-text! term-generator msgids #:prefix "XU")
+             (index-text! term-generator
+                          (or (bug-owner bug) "")
+                          #:prefix "XO"
+                          #:wdf-increment 0)
+             (index-text! term-generator
+                          (or (bug-severity bug) "normal")
+                          #:prefix "XS"
+                          #:wdf-increment 0
+                          #:positions? #f)
+             (index-text! term-generator
+                          (or (bug-tags bug) "")
+                          #:prefix "XT"
+                          #:wdf-increment 0
+                          #:positions? #f)
+             (index-text! term-generator
+                          (cond
+                           ((bug-done bug) "done")
+                           (else "open"))
+                          #:prefix "XSTATUS"
+                          #:wdf-increment 0
+                          #:positions? #f)
+             (index-text! term-generator
+                          file
+                          #:prefix "F"
+                          #:wdf-increment 0
+                          #:positions? #f)
+             (index-text! term-generator
+                          msgids
+                          #:prefix "XU"
+                          #:wdf-increment 0
+                          #:positions? #f)
 
              ;; Index subject and body without prefixes for general
              ;; search.