From patchwork Sun Sep 8 00:09:24 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Nicolas Graves
X-Patchwork-Id: 67748
Subject: [bug#73115] [PATCH] gnu: Add python-sentence-transformers.
From: Nicolas Graves
To: 73115@debbugs.gnu.org
Cc: ngraves@ngraves.fr
Date: Sun, 8 Sep 2024 02:09:24 +0200
Message-ID: <20240908000927.29091-1-ngraves@ngraves.fr>
X-Mailer: git-send-email 2.45.2

* gnu/packages/machine-learning.scm (python-sentence-transformers): New variable.

Change-Id: Iedab56f6c2bdde12e654ba67695cd996122bdb0b
---
 gnu/packages/machine-learning.scm | 54 +++++++++++++++++++++++++++++++
 1 file changed, 54 insertions(+)

diff --git a/gnu/packages/machine-learning.scm b/gnu/packages/machine-learning.scm
index 42842d7d61..b2da07e8f0 100644
--- a/gnu/packages/machine-learning.scm
+++ b/gnu/packages/machine-learning.scm
@@ -1239,6 +1239,60 @@ (define-public python-sentencepiece
 unsupervised text tokenizer.")
     (license license:asl2.0)))
 
+(define-public python-sentence-transformers
+  (package
+    (name "python-sentence-transformers")
+    (version "3.0.1")
+    (source
+     (origin
+       (method url-fetch)
+       (uri (pypi-uri "sentence_transformers" version))
+       (sha256
+        (base32 "1xmzbyrlp6wa7adf42n67c544db17nz95b10ri603lf4gi9jqgca"))))
+    (build-system pyproject-build-system)
+    (arguments
+     (list
+      #:test-flags `(list
+                     ;; Missing fixture / train or test data.
+                     ;; Requires internet access.
+                     "--ignore=tests/test_sentence_transformer.py"
+                     "--ignore=tests/test_train_stsb.py"
+                     "--ignore=tests/test_compute_embeddings.py"
+                     "--ignore=tests/test_cross_encoder.py"
+                     "--ignore=tests/test_model_card_data.py"
+                     "--ignore=tests/test_multi_process.py"
+                     "--ignore=tests/test_pretrained_stsb.py"
+                     "-k" ,(string-append
+                            "not test_LabelAccuracyEvaluator"
+                            " and not test_ParaphraseMiningEvaluator"
+                            " and not test_cmnrl_same_grad"
+                            " and not test_paraphrase_mining"
+                            " and not test_simple_encode"))))
+    (propagated-inputs (list python-huggingface-hub
+                             python-numpy
+                             python-pillow
+                             python-scikit-learn
+                             python-scipy
+                             python-pytorch
+                             python-tqdm
+                             python-transformers))
+    (native-inputs (list python-pytest))
+    (home-page "https://www.SBERT.net")
+    (synopsis "Multilingual text embeddings")
+    (description "This framework provides an easy method to compute dense
+vector representations for sentences, paragraphs, and images.  The models are
+based on transformer networks like BERT / RoBERTa / XLM-RoBERTa and achieve
+state-of-the-art performance in various tasks.  Text is embedded in vector
+space such that similar text are closer and can efficiently be found using
+cosine similarity.
+
+This package provides easy access to pretrained models for more than 100
+languages, fine-tuned for various use-cases.
+
+Further, this framework allows an easy fine-tuning of custom embeddings
+models, to achieve maximal performance on your specific task.")
+    (license license:asl2.0)))
+
 (define-public python-spacy-legacy
   (package
     (name "python-spacy-legacy")