From patchwork Fri Dec 23 22:20:24 2022
X-Patchwork-Id: 45553
From: Denis 'GNUtoo' Carikli
Date: Fri, 23 Dec 2022 23:20:24 +0100
Subject: [bug#60288] [PATCH v1 2/2] gnu: Add wikipedia_en_all_maxi
To: 60288@debbugs.gnu.org
Message-Id: <20221223222024.13805-2-GNUtoo@cyberdimension.org>
In-Reply-To: <20221223222024.13805-1-GNUtoo@cyberdimension.org>
References: <20221223222024.13805-1-GNUtoo@cyberdimension.org>

* gnu/packages/zim-files.scm (wikipedia_en_all_maxi): New variable.
---
 gnu/local.mk               |  1 +
 gnu/packages/zim-files.scm | 86 ++++++++++++++++++++++++++++++++++++++
 2 files changed, 87 insertions(+)
 create mode 100644 gnu/packages/zim-files.scm

diff --git a/gnu/local.mk b/gnu/local.mk
index 5b8944f568..8957554fc2 100644
--- a/gnu/local.mk
+++ b/gnu/local.mk
@@ -643,6 +643,7 @@ GNU_SYSTEM_MODULES = \
   %D%/packages/xfce.scm \
   %D%/packages/zig.scm \
   %D%/packages/zile.scm \
+  %D%/packages/zim-files.scm \
   %D%/packages/zwave.scm \
   \
   %D%/services.scm \
diff --git a/gnu/packages/zim-files.scm b/gnu/packages/zim-files.scm
new file mode 100644
index 0000000000..49b7accb52
--- /dev/null
+++ b/gnu/packages/zim-files.scm
@@ -0,0 +1,86 @@
+;;; GNU Guix --- Functional package management for GNU
+;;; Copyright © 2022 Denis 'GNUtoo' Carikli
+;;;
+;;; This file is part of GNU Guix.
+;;;
+;;; GNU Guix is free software; you can redistribute it and/or modify it
+;;; under the terms of the GNU General Public License as published by
+;;; the Free Software Foundation; either version 3 of the License, or (at
+;;; your option) any later version.
+;;;
+;;; GNU Guix is distributed in the hope that it will be useful, but
+;;; WITHOUT ANY WARRANTY; without even the implied warranty of
+;;; MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+;;; GNU General Public License for more details.
+;;;
+;;; You should have received a copy of the GNU General Public License
+;;; along with GNU Guix.  If not, see <http://www.gnu.org/licenses/>.
+
+(define-module (gnu packages zim-files)
+  #:use-module (gnu packages)
+  #:use-module (guix build-system copy)
+  #:use-module (guix download)
+  #:use-module (guix gexp)
+  #:use-module (guix utils)
+  #:use-module ((guix licenses) #:prefix license:)
+  #:use-module (guix packages))
+
+;;; Commentary:
+;;;
+;;; Many Guix contributors tend to update packages this way: they
+;;; update only the package version, then launch a build that fails
+;;; just to make Guix report the right base32 hash.  They then update
+;;; the base32 hash and launch the build again.
+;;;
+;;; However, some ZIM files are quite big.  At the time of writing,
+;;; wikipedia_en_all_maxi_2022-05.zim is about 89 GiB.
+;;;
+;;; So this approach is time-consuming: on the second build, Guix
+;;; downloads the same file again from scratch.
+;;;
+;;; The solution is to download the SHA-256 checksum file (simply
+;;; append .sha256 to the URL of the ZIM file).  It contains a line
+;;; like this:
+;;; f12163513307893c87fd75009b1d61677bae675627eaadf4cb0fa63953eea021  wikipedia_en_all_maxi_2022-05.zim
+;;;
+;;; You can then use this hash to compute the base32 with nix-hash:
+;;; $ nix-hash --type sha256 --to-base32 \
+;;;     f12163513307893c87fd75009b1d61677bae675627eaadf4cb0fa63953eea021
+;;; 08d0xr9kk9hgrgsavsi7arkswyv7c4frn03mzn3kr2876d8n68gi
+
+(define-public wikipedia-en-all-maxi
+  (package
+    (name "wikipedia-en-all-maxi")
+    (version "2022-05")
+    (source (origin
+              (method url-fetch)
+              (uri (string-append
+                    "https://mirror.download.kiwix.org/zim/wikipedia/"
+                    (string-replace-substring name "-" "_")
+                    "_" version ".zim"))
+              (sha256
+               (base32
+                "08d0xr9kk9hgrgsavsi7arkswyv7c4frn03mzn3kr2876d8n68gi"))))
+    (build-system copy-build-system)
+    (arguments
+     (list
+      ;; We are not (yet) generating the ZIM file, so it doesn't make sense to
+      ;; build substitutes.
+      #:substitutable? #f
+      ;; If we use kiwix-serve, the path of the ZIM file needs to be passed to
+      ;; it.  If the filename had a version in it, we'd need to update the
+      ;; path manually each time the package is updated.  So we install the
+      ;; file under a name that matches the package name instead.
+      #:install-plan #~'((#$(string-append
+                             (string-replace-substring name "-" "_")
+                             "_" version ".zim")
+                          #$(string-append "share/" name ".zim")))))
+    (synopsis
+     "Complete English Wikipedia packed in a ZIM file, for offline use with
+Kiwix")
+    (description
+     "Wikipedia is a free encyclopedia.  This is the English version.  It
+contains all the articles and all the media (images, etc.) present in the
+articles, at a scaled-down resolution.")
+    (home-page "https://en.wikipedia.org/wiki/Main_Page")
+    (license license:cc-by-sa3.0)))
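
Note on the hash conversion described in the commentary: when nix-hash is not installed, the hexadecimal SHA-256 from the .sha256 file can also be converted with a few lines of code. The following is only a minimal sketch and not part of the patch. The function name `nix_base32` is hypothetical, and the alphabet and bit ordering are assumptions based on how Nix's base32 encoding is understood to work. Fed the digest quoted in the commentary, it should reproduce the base32 string used in the package definition.

```python
"""Convert a hex SHA-256 digest to the base32 form used in Guix hashes."""

# Alphabet assumed from Nix's base32 encoding: digits and lowercase
# letters, without "e", "o", "t" and "u".
NIX_BASE32_ALPHABET = "0123456789abcdfghijklmnpqrsvwxyz"


def nix_base32(digest: bytes) -> str:
    """Encode DIGEST the way `nix-hash --to-base32` is assumed to."""
    # Number of 5-bit groups needed to cover every bit of the digest.
    length = (len(digest) * 8 - 1) // 5 + 1
    chars = []
    # Byte 0 holds the least significant bits; groups are emitted
    # starting from the most significant end.
    for n in range(length - 1, -1, -1):
        i, j = divmod(n * 5, 8)
        value = digest[i] >> j
        if i + 1 < len(digest):
            # A 5-bit group may straddle two consecutive bytes.
            value |= digest[i + 1] << (8 - j)
        chars.append(NIX_BASE32_ALPHABET[value & 0x1F])
    return "".join(chars)


if __name__ == "__main__":
    # Digest taken from the commentary in zim-files.scm.
    hex_digest = (
        "f12163513307893c87fd75009b1d61677bae675627eaadf4cb0fa63953eea021"
    )
    print(nix_base32(bytes.fromhex(hex_digest)))
```

Comparing the printed value with the `base32` field of the package is a cheap sanity check before starting the 89 GiB download.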