From patchwork Mon Oct 19 14:02:36 2020
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
X-Patchwork-Submitter: Efraim Flashner
X-Patchwork-Id: 24692
Subject: [bug#44075] [PATCH] gnu: Add make-glibc-locales-collection.
X-GNU-PR-Message: followup 44075
X-GNU-PR-Package: guix-patches
X-GNU-PR-Keywords: patch
To: Miguel Ángel Arruga Vivas
Cc: 44075@debbugs.gnu.org
Date: Mon, 19 Oct 2020 17:02:36 +0300
From: Efraim Flashner
Message-ID: <20201019140236.GF9117@E5400>
References: <20201019064739.4736-1-efraim@flashner.co.il> <87tuuqnym5.fsf@gmail.com>
In-Reply-To: <87tuuqnym5.fsf@gmail.com>
X-PGP-Key-ID: 0x41AAE7DCCA3D8351
X-PGP-Key: https://flashner.co.il/~efraim/efraim_flashner.asc
X-PGP-Fingerprint: A28B F40C 3E55 1372 662D 14F7 41AA E7DC CA3D 8351

On Mon, Oct 19, 2020 at 03:17:54PM +0200, Miguel Ángel Arruga Vivas wrote:
> Hi Efraim,
>
> I've been taking a look into your
> patch.  One issue are the comments
> about utf8 and UTF-8, as the issue is already explained in
> make-glibc[-utf8]-locales.

Thanks for taking a look. For utf8 vs UTF-8 there are a couple of
comments in the code:

    The above phase does not install locales with names using the
    "normalized codeset."  Thus, create symlinks like:
      en_US.utf8 -> en_US.UTF-8

and also:

    For backward compatibility with Guix <= 0.8.3, add "xx_YY.UTF-8".

When I check on one of my Debian boxes:

$ localectl list-locales
C.UTF-8
en_US.utf8

I've learned that 'C.UTF-8' isn't from upstream; Debian has a patch for
it. Having written this patch a week or so ago and returning to it now, I
can't tell you which spelling is the correct one. I think it's best to
offer both so there's no confusion about which is correct. That works for
the logic that accepts either spelling in the list of locales, but I'm
concerned that skipping the symlink to the other one could cause
problems.

>
> Other point is:
>
> Efraim Flashner writes:
> > +(define* (make-glibc-locales-collection
> > +          glibc
> > +          #:optional (locales
> > +                      '(list "en_US.utf8" "en_US.ISO-8859-1")))
> > (... Removed for clarity ...)
> > +                 ,locales)
>
> I would have used list there like (list ,@locales) or '(,@locales), this
> looks a bit odd to my eyes at least.  I'd expect this kind of calling code:
>
> (let ((locales '("de_CH.utf8" ... "de_DE.utf8"))
>       (my-glibc ...))
>   (make-glibc-locales-collection my-glibc locales))
>
> Enforcing an extra quotation for no real reason on the calling site, as
> strings are self-evaluating objects, and the use of the symbol list,
> whose meaning depends on other context of execution, doesn't seem
> necessary.  Even worse, my example would raise an error as "de_CH.utf8"
> is not a procedure.

My Scheme-fu isn't terribly strong; sometimes I just hack at it until the
code does what I want, and this is one of those times. I've changed it so
that locales takes a plain list of strings rather than a quoted form that
evaluates to a list.

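To make the difference concrete, here is a minimal sketch in plain Guile.
The two procedures are hypothetical stand-ins for the builder template,
not code from the patch; they only show what each unquote form generates.

```scheme
;; Hypothetical helpers, for illustration only (not part of the patch).

;; First version of the patch: the value of LOCALES is inserted as-is,
;; so the caller has to pass the form '(list "a" "b").
(define (builder/unquote locales)
  `(for-each display ,locales))

;; Revised version: LOCALES is an ordinary list of strings, and the
;; template wraps the spliced elements in a call to `list'.
(define (builder/splice locales)
  `(for-each display (list ,@locales)))

(builder/unquote '("de_CH.utf8" "de_DE.utf8"))
;; => (for-each display ("de_CH.utf8" "de_DE.utf8"))
;;    Evaluating this fails: "de_CH.utf8" ends up in operator position.

(builder/splice '("de_CH.utf8" "de_DE.utf8"))
;; => (for-each display (list "de_CH.utf8" "de_DE.utf8"))
```

With the splicing form, callers can pass a normal list and the generated
builder code still sees a proper (list ...) expression.
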
> What do you think about replacing make-glibc-utf8-locales with a call of
> the new function (using that code) ensuring that the generated
> derivation stays the same for that case (i.e. it's optimized for the
> UTF-8 case)?

This is what I originally wanted to do, but there's a glibc-locales
buried in the bootstrap path, so it's not so easy to just swap it out. I
can make the change in core-updates. I'll play around with it and see if
I can come up with the same derivation using a different function, but
I'm not expecting it to turn out identical.

>
> Happy hacking!
> Miguel
>

diff --git a/gnu/packages/base.scm b/gnu/packages/base.scm
index c83775d8ee..4ea31c2ab6 100644
--- a/gnu/packages/base.scm
+++ b/gnu/packages/base.scm
@@ -62,7 +62,8 @@
   #:use-module (srfi srfi-1)
   #:use-module (srfi srfi-26)
   #:export (glibc
-            libiconv-if-needed))
+            libiconv-if-needed
+            make-glibc-locales-collection))
 
 ;;; Commentary:
 ;;;
@@ -1106,6 +1107,69 @@ to the @code{share/locale} sub-directory of this package.")
                                 ,(version-major+minor
                                   (package-version glibc)))))))))))
 
+(define* (make-glibc-locales-collection
+          glibc
+          #:optional (locales
+                      (list "en_US.utf8" "en_US.ISO-8859-1")))
+                      ;; This list for testing
+                      ;(list "el_GR.UTF-8" "en_US.utf8" "he_IL.ISO-8859-8" "ja_JP.EUC-JP" "zh_CN.GB18030" "zh_CN.GBK" "hy_AM.ARMSCII-8")))
+  (package
+    (name "glibc-locales-collection")
+    (version (package-version glibc))
+    (source #f)
+    (build-system trivial-build-system)
+    (arguments
+     `(#:modules ((guix build utils))
+       #:builder
+       (begin
+         (use-modules (guix build utils))
+
+         (let* ((libc (assoc-ref %build-inputs "glibc"))
+                (gzip (assoc-ref %build-inputs "gzip"))
+                (out (assoc-ref %outputs "out"))
+                (localedir (string-append out "/lib/locale/"
+                                          ,(version-major+minor version))))
+           ;; 'localedef' needs 'gzip'.
+           (setenv "PATH" (string-append libc "/bin:" gzip "/bin"))
+
+           (mkdir-p localedir)
+           (for-each
+            (lambda (locale)
+              (let* ((contains-dot?
+                      (string-index locale #\.))
+                     (encoding-type (substring locale (1+ contains-dot?)))
+                     (raw-locale (substring locale 0 contains-dot?))
+                     (utf8? (or (number? (string-contains locale ".utf8"))
+                                (number? (string-contains locale ".UTF-8"))))
+                     (file (if utf8?
+                               (string-append localedir "/" raw-locale ".utf8")
+                               (if (string-contains locale ".ISO")
+                                   (string-append localedir "/" raw-locale)
+                                   (string-append localedir "/" locale)))))
+
+                (invoke "localedef" "--no-archive"
+                        "--prefix" localedir
+                        "-i" raw-locale
+                        "-f" (if (equal? "utf8" encoding-type)
+                                 "UTF-8"
+                                 encoding-type)
+                        file)
+
+                ;; Is it utf8 or UTF-8? NO ONE KNOWS!
+                (when utf8?
+                  (symlink (string-append raw-locale ".utf8")
+                           (string-append localedir "/"
+                                          raw-locale ".UTF-8")))))
+            (list ,@locales))
+           #t))))
+    (native-inputs `(("glibc" ,glibc)
+                     ("gzip" ,gzip)))
+    (synopsis "Customizable collection of locales")
+    (description
+     "This package provides a custom collection of locales useful for
+providing exactly the locales requested when size matters.")
+    (home-page (package-home-page glibc))
+    (license (package-license glibc))))
+
 (define-public (make-glibc-utf8-locales glibc)
   (package
     (name "glibc-utf8-locales")
@@ -1161,6 +1225,13 @@ test environments.")
 (define-public glibc-utf8-locales
   (make-glibc-utf8-locales glibc))
 
+(define-public en_us-glibc-locales
+  (package
+    (inherit (make-glibc-locales-collection
+              glibc
+              (list "en_US.utf8" "en_US.ISO-8859-1")))
+    (name "en-us-glibc-locales")))
+
 ;; Packages provided to ease use of binaries linked against the previous libc.
 (define-public glibc-locales-2.29
   (package (inherit (make-glibc-locales glibc-2.29))