From patchwork Thu Feb 27 20:41:46 2020
X-Patchwork-Submitter: Arun Isaac
X-Patchwork-Id: 20458
From: Arun Isaac
Date: Fri, 28 Feb 2020 02:11:46 +0530
Subject: [bug#39258] [PATCH 0/4] Xapian for Guix package search
To: 39258@debbugs.gnu.org
Cc: Arun Isaac, mail@ambrevar.xyz, ludo@gnu.org, zimon.toutoune@gmail.com
Message-Id: <20200227204150.30985-1-arunisaac@systemreboot.net>
X-Mailer: git-send-email 2.23.0

Hi,

I have finally got xapian working for package search. Some comments
follow.

* Speed improvement

Despite search-package-index in gnu/packages.scm taking only around
1.5ms, I see an overall speedup in `guix search` of only a factor of 2
-- from around 2s to around 1s. I wonder what else in `guix search` is
taking up so much time.

* Currently indexing only the package descriptions

In this patchset, I have only indexed the package descriptions. In the
next version of this patchset, I will index all other terms as specified
in %package-metrics of guix/ui.scm.

* Should I add guile-xapian as a propagated input to guix in
  gnu/packages/package-management.scm?

* Drop regexp search support

In this patchset, I have retained the older regexp search support. But,
I think we should drop it and only have xapian search. In cases where
the search index is not authoritative, we can build an in-memory xapian
search index on the fly and use it to search. This will slow down the
search, but will ensure our search results are consistent and do not
depend on the authoritativeness of the search index.

* Commit messages

Except for patch 1, I am not sure what prefixes (build-self, gnu, etc.)
to use in the first line of the commit message. Some advice there would
be helpful.

Regards,
Arun.

Arun Isaac (4):
  gnu: Add guile-xapian.
  build-self: Add guile-xapian to Guix dependencies.
  gnu: Generate xapian package search index.
  gnu: Use xapian index for package search.

 build-aux/build-self.scm   | 11 ++++++++
 gnu/packages.scm           | 44 ++++++++++++++++++++++++++++-
 gnu/packages/guile-xyz.scm | 50 ++++++++++++++++++++++++++++++++-
 guix/channels.scm          | 34 ++++++++++++++++++++++-
 guix/scripts/package.scm   | 57 ++++++++++++++++++++++----------------
 guix/self.scm              |  7 ++++-
 6 files changed, 175 insertions(+), 28 deletions(-)
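
The fallback proposed under "Drop regexp search support" (use the on-disk index when it is authoritative, otherwise build an in-memory index on the fly) could look roughly like the sketch below. This is only an illustration under loud assumptions: it is written in Python with a toy inverted index rather than Xapian or guile-xapian, and every name in it (`tokenize`, `build_index`, `search`, the `authoritative` flag, the sample package data) is hypothetical and does not come from the patchset.

```python
# Toy stand-in for the proposed fallback; NOT Xapian and NOT Guix code.
# All names and the sample data are hypothetical.
import re
from collections import defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphanumeric terms."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(packages):
    """Build an in-memory inverted index: term -> set of package names."""
    index = defaultdict(set)
    for name, description in packages.items():
        for term in tokenize(description):
            index[term].add(name)
    return index


def search(packages, query, cached_index=None, authoritative=False):
    """Return sorted names of packages whose description has every query term.

    The cached index is trusted only when the caller marks it authoritative;
    otherwise a fresh index is built from the live package set, which is
    slower but always consistent with that set.
    """
    if cached_index is not None and authoritative:
        index = cached_index
    else:
        index = build_index(packages)
    terms = tokenize(query)
    if not terms:
        return []
    # A package matches only if it appears under every query term.
    matches = set.intersection(*(set(index.get(term, ())) for term in terms))
    return sorted(matches)


# Sample data: the cached index was built before guile-xapian was added.
packages = {
    "guile-xapian": "Guile bindings for the Xapian search engine library",
    "hello": "Hello, GNU world: an example GNU package",
}
stale = build_index({"hello": packages["hello"]})

fresh_result = search(packages, "xapian search", cached_index=stale)
stale_result = search(packages, "xapian search", cached_index=stale,
                      authoritative=True)
```

With the sample data, the non-authoritative path rebuilds the index and finds guile-xapian, while wrongly trusting the stale index misses it. That is the consistency argument for always going through one search path and paying the rebuild cost only when the index cannot be trusted.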