From patchwork Wed Oct 12 11:24:08 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
X-Patchwork-Submitter: Lars-Dominik Braun
X-Patchwork-Id: 43348
Subject: [bug#58136] [PATCH] ui: Improve sort order when searching package names.
To: zimoun
Cc: ludo@gnu.org, 58136@debbugs.gnu.org
Date: Wed, 12 Oct 2022 13:24:08 +0200
From: Lars-Dominik Braun
References: <86wn9na82p.fsf@gmail.com>
In-Reply-To: <86wn9na82p.fsf@gmail.com>

Hi simon,

> In addition to your proposal which LGTM, maybe we could also use the
> ’upstream-name’ properties.  Most of the time, the Guix name matches the
> upstream name, but sometimes not.
> Although, it would not fix the issue
> for ggplot2 since there is no upstream-name for this package. :-)

I agree that using the upstream-name would be a good idea.

> 2. set the “namespace” weight to 1 (or 2 if you prefer)
>
> Otherwise, for example, generic name as CSV could artificially bump
> the relevance and hide relevant packages.  For instance, compare
>
>     guix search csv

The issue here is that we don’t know what the user is searching for. If we
add more weight to the package name then usually libraries (rust-csv,
ghc-csv, …) win. IMO a search for “csv” should return tools to manipulate
CSV files like csvkit, csvdiff, xlsx2csv, … Just like “json” should yield
tools like jq, json.sh and possibly others which I cannot find right
now. But maybe I’m searching for a C library that parses CSV instead. And
then what…?

As for ggplot2, the particular issue seems to be that scores are added
for each match and the descriptions of some of our packages contain
“ggplot2” a lot. So I tried using MAX instead of +, which works, but
results in little variation in scores and thus a weird sort order
(descending by name). It does not feel like an improvement either.

Cheers,
Lars

diff --git a/guix/packages.scm b/guix/packages.scm
index 94e464cd01..9934501cdb 100644
--- a/guix/packages.scm
+++ b/guix/packages.scm
@@ -86,6 +86,7 @@ (define-module (guix packages)
             this-package
             package-name
             package-upstream-name
+            package-upstream-name*
             package-version
             package-full-name
             package-source
@@ -657,6 +658,38 @@ (define (package-upstream-name package)
   (or (assq-ref (package-properties package) 'upstream-name)
       (package-name package)))
 
+(define (package-upstream-name* package)
+  "Return the upstream name of PACKAGE, which could be different from the name
+it has in Guix."
+  (let ((namespaces (list "cl-"
+                          "ecl-"
+                          "emacs-"
+                          "ghc-"
+                          "go-"
+                          "guile-"
+                          "java-"
+                          "julia-"
+                          "lua-"
+                          "minetest-"
+                          "node-"
+                          "ocaml-"
+                          "perl-"
+                          "python-"
+                          "r-"
+                          "ruby-"
+                          "rust-"
+                          "sbcl-"
+                          "texlive-"))
+        (name (package-name package)))
+    (or (assq-ref (package-properties package) 'upstream-name)
+        (let loop ((prefixes namespaces))
+          (match prefixes
+            ('() name)
+            ((prefix rest ...)
+             (if (string-prefix? prefix name)
+                 (substring name (string-length prefix))
+                 (loop (cdr prefixes)))))))))
+
 (define (hidden-package p)
   "Return a \"hidden\" version of P--i.e., one that 'fold-packages' and thus,
 user interfaces, ignores."
diff --git a/guix/ui.scm b/guix/ui.scm
index dad2b853ac..da16a50f9f 100644
--- a/guix/ui.scm
+++ b/guix/ui.scm
@@ -1623,10 +1623,23 @@ (define (relevance obj regexps metrics)
   (define (score regexp str)
     (fold-matches regexp str 0
                   (lambda (m score)
-                    (+ score
-                       (if (string=? (match:substring m) str)
-                           5              ;exact match
-                           1)))))
+                    (let* ((start (- (match:start m) 1))
+                           (end (match:end m))
+                           (left (if (>= start 0) (string-ref str start) #f))
+                           (right (if (< end (string-length str)) (string-ref str end) #f))
+                           (delimiter-classes '(Cc Cf Pd Pe Pf Pi Po Ps Sk Zs Zl Zp))
+                           (delim-left (or (member (and=> left char-general-category) delimiter-classes) (eq? left #f)))
+                           (delim-right (or (member (and=> right char-general-category) delimiter-classes) (eq? right #f))))
+                      (max score
+                        (cond
+                          ;; regexp is a full match for str.
+                          ((and (eq? left #f) (eq? right #f)) 4)
+                          ;; regexp matches a single word in str.
+                          ((and delim-left delim-right) 3)
+                          ;; regexp matches the beginning or end of a word in str.
+                          ((or delim-left delim-right) 2)
+                          ;; Everything else.
+                          (#t 1)))))))
 
   (define (regexp->score regexp)
     (let ((score-regexp (lambda (str) (score regexp str))))
@@ -1635,10 +1648,11 @@ (define (regexp->score regexp)
                 ((field . weight)
                  (match (field obj)
                    (#f relevance)
+                   ('() relevance)
                    ((? string? str)
-                    (+ relevance (* (score-regexp str) weight)))
+                    (max relevance (* (score-regexp str) weight)))
                    ((lst ...)
-                    (+ relevance (* weight (apply + (map score-regexp lst)))))))))
+                    (max relevance (* weight (apply max (map score-regexp lst)))))))))
             0 metrics)))
 
   (let loop ((regexps regexps)
@@ -1655,7 +1669,8 @@ (define (regexp->score regexp)
 (define %package-metrics
   ;; Metrics used to compute the "relevance score" of a package against a set
   ;; of regexps.
-  `((,package-name . 4)
+  `((,package-name . 5)
+    (,package-upstream-name* . 1)
 
     ;; Match against uncommon outputs.
     (,(lambda (package)
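
The switch from + to MAX discussed in the message can be tried in isolation. The following is only a minimal sketch, not part of the patch: `score/sum`, `score/max`, `rx` and `description` are made-up names, the description string is invented, and both scoring procedures are simplified to count every match as 1 rather than reproducing the word-boundary logic of `score` in guix/ui.scm.

```scheme
(use-modules (ice-9 regex))

;; Hypothetical helpers for illustration only; they are not defined in Guix.

(define (score/sum regexp str)
  ;; Summing strategy: every match adds 1, so a description that repeats
  ;; the search term keeps gaining relevance.
  (fold-matches regexp str 0
                (lambda (m score) (+ score 1))))

(define (score/max regexp str)
  ;; Maximum strategy (simplified): only the best single match counts,
  ;; so repeated mentions do not accumulate.
  (fold-matches regexp str 0
                (lambda (m score) (max score 1))))

(define rx (make-regexp "ggplot2" regexp/icase))

;; Invented description that mentions the search term three times.
(define description
  "Extends ggplot2 with themes for ggplot2 plots; requires ggplot2.")

(display (score/sum rx description)) (newline)   ; prints 3
(display (score/max rx description)) (newline)   ; prints 1
```

With summing, the repeated mentions triple the score; with the maximum, the same text scores no higher than a single mention would. That flattening is consistent with the small variation in scores reported in the message.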