From patchwork Thu Dec 29 20:23:55 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Arun Isaac
X-Patchwork-Id: 45691
Return-Path:
X-Original-To: patchwork@mira.cbaines.net
Delivered-To: patchwork@mira.cbaines.net
Received: by mira.cbaines.net (Postfix, from userid 113)
 id 14B9F27BBF0; Thu, 29 Dec 2022 20:25:28 +0000 (GMT)
X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on mira.cbaines.net
X-Spam-Level:
X-Spam-Status: No, score=-3.7 required=5.0 tests=BAYES_00,DKIM_INVALID,
 DKIM_SIGNED,MAILING_LIST_MULTI,RCVD_IN_MSPIKE_H2,SPF_HELO_PASS,
 URIBL_BLOCKED autolearn=ham autolearn_force=no version=3.4.6
Received: from lists.gnu.org (lists.gnu.org [209.51.188.17])
 by mira.cbaines.net (Postfix) with ESMTPS id 3DCFD27BBE9
 for ; Thu, 29 Dec 2022 20:25:27 +0000 (GMT)
Received: from localhost ([::1] helo=lists1p.gnu.org)
 by lists.gnu.org with esmtp (Exim 4.90_1)
 (envelope-from )
 id 1pAzSf-00075i-AM; Thu, 29 Dec 2022 15:25:09 -0500
Received: from eggs.gnu.org ([2001:470:142:3::10])
 by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256)
 (Exim 4.90_1)
 (envelope-from )
 id 1pAzSY-00072m-ON for guix-patches@gnu.org; Thu, 29 Dec 2022 15:25:03 -0500
Received: from debbugs.gnu.org ([209.51.188.43])
 by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128)
 (Exim 4.90_1)
 (envelope-from )
 id 1pAzSY-0005Xk-FS for guix-patches@gnu.org; Thu, 29 Dec 2022 15:25:02 -0500
Received: from Debian-debbugs by debbugs.gnu.org with local (Exim 4.84_2)
 (envelope-from )
 id 1pAzSY-0007Re-BF for guix-patches@gnu.org; Thu, 29 Dec 2022 15:25:02 -0500
X-Loop: help-debbugs@gnu.org
Subject: [bug#60410] [PATCH 2/7] xapian: Declare some prefixes as boolean.
Resent-From: Arun Isaac
Original-Sender: "Debbugs-submit"
Resent-CC: guix-patches@gnu.org
Resent-Date: Thu, 29 Dec 2022 20:25:02 +0000
Resent-Message-ID:
Resent-Sender: help-debbugs@gnu.org
X-GNU-PR-Message: followup 60410
X-GNU-PR-Package: guix-patches
X-GNU-PR-Keywords: patch
To: 60410@debbugs.gnu.org, Ricardo Wurmus
Cc: Arun Isaac
Received: via spool by 60410-submit@debbugs.gnu.org id=B60410.167234545528505
 (code B ref 60410); Thu, 29 Dec 2022 20:25:02 +0000
Received: (at 60410) by debbugs.gnu.org; 29 Dec 2022 20:24:15 +0000
Received: from localhost ([127.0.0.1]:32996 helo=debbugs.gnu.org)
 by debbugs.gnu.org with esmtp (Exim 4.84_2)
 (envelope-from )
 id 1pAzRn-0007Pg-Fg for submit@debbugs.gnu.org; Thu, 29 Dec 2022 15:24:15 -0500
Received: from mugam.systemreboot.net ([139.59.75.54]:60018)
 by debbugs.gnu.org with esmtp (Exim 4.84_2)
 (envelope-from )
 id 1pAzRh-0007Oz-Mm for 60410@debbugs.gnu.org; Thu, 29 Dec 2022 15:24:10 -0500
DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed;
 d=systemreboot.net; s=default; h=Content-Transfer-Encoding:MIME-Version:
 References:In-Reply-To:Message-Id:Date:Subject:Cc:To:From:Sender:Reply-To:
 Content-Type:Content-ID:Content-Description:Resent-Date:Resent-From:
 Resent-Sender:Resent-To:Resent-Cc:Resent-Message-ID:List-Id:List-Help:
 List-Unsubscribe:List-Subscribe:List-Post:List-Owner:List-Archive;
 bh=vw43Ev6NNC1FNSe67JTbpPLmDxSes1fNIbwm3ZE5HWk=; b=mUH0RGQ3mKQiogLGEQwQwsfZkm
 i7q3aZc1LS0Fdr0VQlRnIt8VqZZI0t+t1ik2yBJGUMIrrtXq0Co9rzawW+nZN2B/S4UEe4e2Pgckp
 wL2WDv2Wk93atz79hns9eCG7DYEE4o2fHWifkcmDNYHoFtuhlbmiHWcTnt8MzlhNakF96kYTJGKrT
 J3ePeT+9CCdTa91WYGUuzEAOYEHcwC5dyydtj6hVknT3qZE1h5DfG7C2rSOIBBeS19QW32Ysm52QN
 nc7pZ/vH68oBBrHw5tXGqoF3uS+VMpFTamZ6sVtWaKVjELs8ZeUGD9zxhgrL+dsl1aJJCy6RosGI7
 ICwByMaw==;
Received: from [192.168.2.1] (port=38338 helo=localhost.localdomain)
 by systemreboot.net with esmtpsa (TLS1.3) tls
 TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.96)
 (envelope-from )
 id 1pAzRe-000oIe-2r; Fri, 30 Dec 2022 01:54:07 +0530
From: Arun Isaac
Date: Thu, 29 Dec 2022 20:23:55 +0000
Message-Id: <20221229202400.28565-2-arunisaac@systemreboot.net>
X-Mailer: git-send-email 2.38.1
In-Reply-To: <20221229201809.27997-1-arunisaac@systemreboot.net>
References: <20221229201809.27997-1-arunisaac@systemreboot.net>
MIME-Version: 1.0
X-BeenThere: debbugs-submit@debbugs.gnu.org
X-Mailman-Version: 2.1.18
Precedence: list
X-BeenThere: guix-patches@gnu.org
List-Id:
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
Errors-To: guix-patches-bounces+patchwork=mira.cbaines.net@gnu.org
Sender: guix-patches-bounces+patchwork=mira.cbaines.net@gnu.org
X-getmail-retrieved-from-mailbox: Patches

Some prefixes will only ever be used to filter the rest of the query and
not for matching approximately using relevance weighting schemes. Such
prefixes should be indexed as boolean prefixes.

* mumi/xapian.scm (parse-query*): Support boolean prefixes.
(search): Declare author, msgid, owner, severity, status, submitter
and tag as boolean prefixes.
---
 mumi/xapian.scm | 22 +++++++++++++---------
 1 file changed, 13 insertions(+), 9 deletions(-)

diff --git a/mumi/xapian.scm b/mumi/xapian.scm
index 06a54cd..7bf84d3 100644
--- a/mumi/xapian.scm
+++ b/mumi/xapian.scm
@@ -249,7 +249,7 @@ messages and index their contents in the Xapian database at DBPATH."
                       (invalid (pk invalid "")))
                     token))
 
-(define* (parse-query* querystring #:key stemmer stemming-strategy (prefixes '()))
+(define* (parse-query* querystring #:key stemmer stemming-strategy (prefixes '()) (boolean-prefixes '()))
   (let ((queryparser (new-QueryParser))
         (date-range-processor (new-DateRangeProcessor 0 "date:" 0))
         (mdate-range-processor (new-DateRangeProcessor 1 "mdate:" 0)))
@@ -261,6 +261,10 @@ messages and index their contents in the Xapian database at DBPATH."
                 ((field . prefix)
                  (QueryParser-add-prefix queryparser field prefix)))
               prefixes)
+    (for-each (match-lambda
+                ((field . prefix)
+                 (QueryParser-add-boolean-prefix queryparser field prefix)))
+              boolean-prefixes)
     (QueryParser-add-rangeprocessor queryparser date-range-processor)
     (QueryParser-add-rangeprocessor queryparser mdate-range-processor)
     (let ((query (QueryParser-parse-query queryparser querystring
@@ -324,14 +328,14 @@ intact."
          ;; prefixes for field search.
          (query (parse-query* querystring*
                               #:stemmer (make-stem "en")
-                              #:prefixes '(("submitter" . "A")
-                                           ("author" . "XA")
-                                           ("subject" . "S")
-                                           ("owner" . "XO")
-                                           ("severity" . "XS")
-                                           ("tag" . "XT")
-                                           ("status" . "XSTATUS")
-                                           ("msgid" . "XU"))))
+                              #:prefixes '(("subject" . "S"))
+                              #:boolean-prefixes '(("author" . "XA")
+                                                   ("msgid" . "XU")
+                                                   ("owner" . "XO")
+                                                   ("severity" . "XS")
+                                                   ("status" . "XSTATUS")
+                                                   ("submitter" . "A")
+                                                   ("tag" . "XT"))))
          (enq (enquire db query)))
     ;; Collapse on mergedwith value
     (Enquire-set-collapse-key enq 2 1)
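
For readers unfamiliar with the distinction drawn in the commit message, here
is a toy model in plain Guile. It does not use the guile-xapian bindings and is
not how Xapian's QueryParser is implemented; classify-token and split-query are
hypothetical helpers invented for this illustration. The two prefix tables are
copied from the patch. The sketch only shows which parts of a query string
would act as pure filters and which would remain free-text terms subject to
relevance weighting.

```scheme
(use-modules (srfi srfi-1)    ; partition
             (srfi srfi-11))  ; let-values

;; Prefix tables as declared in `search' by this patch.
(define prefixes '(("subject" . "S")))
(define boolean-prefixes
  '(("author" . "XA") ("msgid" . "XU") ("owner" . "XO")
    ("severity" . "XS") ("status" . "XSTATUS")
    ("submitter" . "A") ("tag" . "XT")))

;; Hypothetical helper: tag one whitespace-separated TOKEN as either a
;; boolean filter term or a ranked (relevance-weighted) term.
(define (classify-token token)
  (let ((idx (string-index token #\:)))
    (if (not idx)
        (cons 'ranked token)
        (let ((field (substring token 0 idx))
              (value (substring token (+ idx 1))))
          (cond
           ((assoc-ref boolean-prefixes field)
            => (lambda (prefix)
                 (cons 'filter (string-append prefix value))))
           ((assoc-ref prefixes field)
            => (lambda (prefix)
                 (cons 'ranked (string-append prefix value))))
           ;; Unknown field: leave the token untouched in this toy model.
           (else (cons 'ranked token)))))))

;; Hypothetical helper: return two values, the filter terms and the
;; ranked terms found in QUERYSTRING.
(define (split-query querystring)
  (let-values (((filters ranked)
                (partition (lambda (pair) (eq? (car pair) 'filter))
                           (map classify-token
                                (string-tokenize querystring)))))
    (values (map cdr filters) (map cdr ranked))))

(let-values (((filters ranked)
              (split-query "severity:serious tag:patch memory leak")))
  (format #t "filters: ~s~%ranked: ~s~%" filters ranked))
;; prints:
;; filters: ("XSserious" "XTpatch")
;; ranked: ("memory" "leak")
```

In this model the severity and tag terms only narrow the result set, while
"memory" and "leak" are the only terms left to influence ranking. The real
QueryParser additionally handles stemming, operators and range processors,
none of which are modelled here.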