From patchwork Wed Feb 10 07:52:00 2021
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
X-Patchwork-Submitter: Andy Tai
X-Patchwork-Id: 26984
X-GNU-PR-Message: followup 46376
X-GNU-PR-Package: guix-patches
X-GNU-PR-Keywords: patch
Subject: [bug#46376] [PATCH] gnu: tesseract-ocr: update to 4.1.1
From: Andy Tai
Date: Tue, 9 Feb 2021 23:52:00 -0800
To: Jelle Licht
Cc: 46376@debbugs.gnu.org
References: <86a6sep7h0.fsf@posteo.net> <867dnhpi85.fsf@posteo.net> <86sg6450cl.fsf@posteo.net>
In-Reply-To: <86sg6450cl.fsf@posteo.net>

Updated patch: the tests now build in parallel. The build order has to
be set explicitly so that the training target is built first.

I also added some other optional dependencies, and built in a GuixSD VM
to ensure there is no dependency on non-Guix tools from the host.

The test run is disabled for now.

On Tue, Feb 9, 2021 at 2:43 PM Jelle Licht wrote:
>
> Hi Andy,
>
> Andy Tai writes:
>
> > Hi, I updated the patch to only build in serial, with "-j 1"
> >
> > and with this, everything, including tests, builds successfully.
>
> No such luck, for me at least. Are you certain you got it to build on
> your end? Could you try with `--check`?
>
> I've had to work out the following things:
>
> - Patched out "" and "" to
> refer to "baseapi.h" and "helpers.h" in "unittest/pagesegmode_test.cc".
>
> - Make sure the check phase takes place after running "make training" in
> a phase.
>
> I still ended up with several failing tests, courtesy of it running
> unsupported instructions on my cpu (educated guess: avx etc).
> Nothing comes easy, I guess.
>
> Thanks,
> - Jelle

From d3287d74861824dc3a7e2b65e11cd33fa0a11b39 Mon Sep 17 00:00:00 2001
From: Andy Tai
Date: Tue, 9 Feb 2021 23:44:06 -0800
Subject: [PATCH] gnu: tesseract-ocr: Update to 4.1.1.

* gnu/packages/ocr.scm (tesseract-ocr): Update to 4.1.1.
---
 gnu/packages/ocr.scm | 53 +++++++++++++++++++++++++++++++++++++++-----
 1 file changed, 47 insertions(+), 6 deletions(-)

diff --git a/gnu/packages/ocr.scm b/gnu/packages/ocr.scm
index dc4930918a..1ea7a94085 100644
--- a/gnu/packages/ocr.scm
+++ b/gnu/packages/ocr.scm
@@ -3,6 +3,7 @@
 ;;; Copyright © 2016, 2020 Efraim Flashner
 ;;; Copyright © 2019 Tobias Geerinckx-Rice
 ;;; Copyright © 2019 Alex Vong
+;;; Copyright © 2021 Andy Tai
 ;;;
 ;;; This file is part of GNU Guix.
 ;;;
@@ -26,8 +27,17 @@
   #:use-module (guix git-download)
   #:use-module (guix build-system gnu)
   #:use-module (guix build-system python)
+  #:use-module (gnu packages)
+  #:use-module (gnu packages autotools)
+  #:use-module (gnu packages backup)
+  #:use-module (gnu packages check)
   #:use-module (gnu packages compression)
+  #:use-module (gnu packages curl)
+  #:use-module (gnu packages gtk)
+  #:use-module (gnu packages icu4c)
+  #:use-module (gnu packages pkg-config)
   #:use-module (gnu packages python)
+  #:use-module (gnu packages xml)
   #:use-module (gnu packages image))
 
 (define-public ocrad
@@ -52,25 +62,55 @@ it produces text in 8-bit or UTF-8 formats.")
     (license license:gpl3+)))
 
 (define-public tesseract-ocr
+  ;; some useful commits beyond last official stable release in release branch
+  (let ((commit "97079fa353557af6df86fd20b5d2e0dff5d8d5df")
+        (revision "1"))
   (package
     (name "tesseract-ocr")
-    (version "3.04.01")
+    (version (git-version "4.1.1" revision commit))
     (source
      (origin
        (method git-fetch)
        (uri (git-reference
              (url "https://github.com/tesseract-ocr/tesseract")
-             (commit version)))
+             (commit commit)
+             ;; source git repo with submodules; ensure they are fetched
+             (recursive? #t)))
        (file-name (git-file-name name version))
        (sha256
-        (base32 "0h1x4z1h86n2gwknd0wck6gykkp99bmm02lg4a47a698g4az6ybv"))))
+        (base32 "0axwla82fpzp86lc553wp3hk0fz5dylw4as0jbf4hkqcyajlbzp4"))))
     (build-system gnu-build-system)
     (inputs
-     `(("leptonica" ,leptonica)))
+     `(("cairo" ,cairo)
+       ("icu" ,icu4c)
+       ("leptonica" ,leptonica)
+       ("pango" ,pango)))
+    (native-inputs
+     `(("autoconf" ,autoconf)
+       ("autoconf-archive" ,autoconf-archive)
+       ("automake" ,automake)
+       ("googletest" ,googletest)
+       ("libarchive" ,libarchive)
+       ("libcurl" ,curl)
+       ("libtool" ,libtool)
+       ("libtiff" ,libtiff)
+       ("pkg-config" ,pkg-config)
+       ("python" ,python-wrapper)
+       ("xsltproc" ,libxslt)))
     (arguments
      '(#:configure-flags
        (let ((leptonica (assoc-ref %build-inputs "leptonica")))
-         (list (string-append "LIBLEPT_HEADERSDIR=" leptonica "/include")))))
+         (list (string-append "LIBLEPT_HEADERSDIR=" leptonica "/include")))
+       #:phases
+       (modify-phases %standard-phases
+         (delete 'check)
+         (add-before 'build 'build-training
+           (lambda _
+             (invoke "make" "training")))
+         (add-after 'install 'install-training
+           (lambda _
+             (invoke "make" "training-install")
+             #t)))))
     (home-page "https://github.com/tesseract-ocr/tesseract")
     (synopsis "Optical character recognition engine")
     (description
@@ -79,7 +119,7 @@ high accuracy. It supports many languages, output text formatting, hOCR
 positional information and page layout analysis. Several image formats are
 supported through the Leptonica library. It can also detect whether text is
 monospaced or proportional.")
-    (license license:asl2.0)))
+    (license license:asl2.0))))
 
 (define-public zinnia
   (let* ((commit "581faa8f6f15e4a7b21964be3a5ec36265c80e5b")
@@ -151,3 +191,4 @@ that allows us to create any hand-written recognition systems with low-cost.")
              #t)))))
     (inputs
      `(("zinnia" ,zinnia)))))
+
-- 
2.29.2