[bug#58140,2/6] gnu: Add kaldi-for-vosk.
Commit Message
* gnu/packages/machine-learning.scm (kaldi-for-vosk): New variable.
---
gnu/packages/machine-learning.scm | 98 +++++++++++++++++++++++++++++++
1 file changed, 98 insertions(+)
Comments
>+ (add-after 'unpack 'chdir
>+ (lambda _ (chdir "src") #t))
Trailing #t haven't been required since a long time.
On 28-09-2022 13:57, Nicolas Graves via Guix-patches via wrote:
> + (when (not (or (string-prefix? "x86_64" system)
> + (string-prefix? "i686" system)))
> + (substitute* "makefiles/linux_openblas.mk"
> + (("-msse -msse2") "")))
You are testing for the system it is being compiled on ('system'), not
the system it is being compiled for ('target'). For cross-compilation,
usually the latter is required. You may find
#$@(if (target-x86?) #~((substitute* ...) #~())
useful, it automatically handles cross-compilation, both 32-bit and
64-bit, a hypothetical i786 it it ever starts existing, i586 if we ever
lower the requirements ...
> + (string-append "OPENBLASROOT=\"" #$openblas "\""))
#$PACKAGE does not compose with --with-input. I recommend doing
something like
(dirname (dirname (search-input-file inputs "lib/libblas.so")))
instead. Likewise for other uses of #$PACKAGE.
> + (setenv "OPENFST_VER" #$(package-version openfst))
I recommend referring to 'openfst' with 'this-package-input', that way
when --with-input or --with-latest is used to change it to something
with a different version, the version should still be detetcxed properly.
> + (substitute* "./Makefile"
> + (("USE_SHARED\\?=0")
> + "USE_SHARED?=1")
IIUC, ?=0 means that 0 is the default, you can override it by setting
#:make-flags.
> dialects - English,
I think one of the special dashes (en dashes, em dashes, figure dash?
Don't know which one) would be appropriate here.
> +(define-public python-nerd-dictation
> + (let* ((commit "53ab129a5ee0f8b5df284e8cf2229219b732c59e")
> + (revision "0"))
> + (package
> + (name "python-nerd-dictation")
> + (version (git-version "0" revision commit))
Going by <https://github.com/ideasman42/nerd-dictation>, 'python-' is
not part of its name, you can drop the prefix AFAICT.
> + (synopsis "Offline speech-to-text for desktop Linux")
If it's Linux only, you can use the 'supported-systems' field for that,
see (gnu packages linux) for examples.
> + #$(file-append bash-minimal "/bin/bash")
You can use 'search-input-file' to avoid depending on input labels.
> + (let* ((out (assoc-ref %outputs "out"))
If you are using G-exps, you can replace (assoc-ref %outputs "out") with
its G-exp equivalent #$output.
> +(define openfst-for-vosk
> + (package
> + (inherit openfst)
> + (version "1.8.0")
> + (source
> + (origin
> + (method url-fetch)
> + (uri (string-append "http://www.openfst.org/twiki/pub/FST/"
> + "FstDownload/openfst-" version ".tar.gz"))
> + (sha256
> + (base32 "0h2lfhhihg63b804hrcljnkggijbjmp84i5g8q735wb09y9z2c4p"))))
Why select an older version? Would keeping the original (and more
up-to-date) version work? To avoid a name conflict between the openfst
(which would be inconvenient for "guix show", "guix install", "guix
shell"), you can override the 'name' field.
Greetings,
Maxime.
On Wed Sep 28, 2022 at 10:05 PM BST, Maxime Devos wrote:
> > + (string-append "OPENBLASROOT=\"" #$openblas "\""))
>
> #$PACKAGE does not compose with --with-input. I recommend doing
> something like
> (dirname (dirname (search-input-file inputs "lib/libblas.so")))
> instead. Likewise for other uses of #$PACKAGE.
Why not ``#$(this-package-input openblas)''?
-- (
> Trailing #t haven't been required since a long time.
A big part of the code, and in particular old forms, come from the code
of the current kaldi package. Should I also change the same code chunks
for kaldi in an additional patch ?
> If it's Linux only, you can use the 'supported-systems' field for that,
> see (gnu packages linux) for examples.
I don't really know that. Ydotool probably only work on Linux, since
they rely on linux keycodes. I don't know for X. Maybe someone should
test. Should I suppose it only supports Linux by default?
> Why select an older version? Would keeping the original (and more
> up-to-date) version work? To avoid a name conflict between the openfst
> (which would be inconvenient for "guix show", "guix install", "guix
> shell"), you can override the 'name' field.
No, it doesn't work and that's the reason why I used this version. It
might however work with the version that's present for kaldi (1.7.3
IIRC), I can test that. But the flags aren't the same, so we probably
should do another package anyway.
I didn't change the name, but I also haven't exported the variable
(define instead of define-public). I expected the package to not be
available through "guix search" or "guix install". Is that OK?
I've done a V3, testing it before sending.
--
Best regards,
Nicolas Graves
On 29-09-2022 10:20, Nicolas Graves wrote:
>
>> Trailing #t haven't been required since a long time.
> A big part of the code, and in particular old forms, come from the code
> of the current kaldi package. Should I also change the same code chunks
> for kaldi in an additional patch ?
That would be nice, but not required I'd say.
>> If it's Linux only, you can use the 'supported-systems' field for that,
>> see (gnu packages linux) for examples.
> I don't really know that. Ydotool probably only work on Linux, since
> they rely on linux keycodes. I don't know for X. Maybe someone should
> test. Should I suppose it only supports Linux by default?
I think that usually 'if it works on Linux, it probably can work on
similar-ish systems as well’ is a reasonable assumption, but perhaps
with the keycodes, it isn't.
However, if the problem is in 'ydotool', you can mention that in the
supported-systems of 'ydotool', 'supported-systems' has a kind of
implicit transitivity going by the use of
package-transitive-supported-systems in (guix ui).
>> Why select an older version? Would keeping the original (and more
>> up-to-date) version work? To avoid a name conflict between the openfst
>> (which would be inconvenient for "guix show", "guix install", "guix
>> shell"), you can override the 'name' field.
>
> No, it doesn't work and that's the reason why I used this version.
In that case, I recommend adding a comment to the definition, to avoid
the risk of someone 'helpfully' updating the package anyway, and an
upstream report, such that upstream can address the compatibility
problems with the new version.
> It
> might however work with the version that's present for kaldi (1.7.3
> IIRC), I can test that. But the flags aren't the same, so we probably
> should do another package anyway.
>
> I didn't change the name, but I also haven't exported the variable
> (define instead of define-public). I expected the package to not be
> available through "guix search" or "guix install". Is that OK?
I suppose it is OK, though personally I think it might be a bit
confusing, e.g. to people using "guix shell -D ..." ending up with a
package version in their environment that they can't find with "guix
search".
Greetings,
Maxime.
@@ -1596,6 +1596,104 @@ (define-public kaldi
written in C++.")
(license license:asl2.0))))
+(define kaldi-for-vosk
+ (let* ((commit "6417ac1dece94783e80dfbac0148604685d27579")
+ (revision "0")
+ (openfst openfst-for-vosk))
+ (package
+ (inherit kaldi)
+ (name "kaldi")
+ (version (git-version "0" revision commit))
+ (source
+ (origin
+ (method git-fetch)
+ (uri (git-reference
+ (url "https://github.com/alphacep/kaldi")
+ (commit commit)))
+ (file-name (git-file-name name version))
+ (sha256
+ (base32 "04xw2dpfvpla8skpk08azmgr9k97cd8hn83lj4l85q165gbzql4s"))))
+ (inputs
+ (list alsa-lib
+ ;; `(,gfortran "lib") ;; replaced by lapack
+ lapack
+ glib
+ gstreamer
+ jack-1
+ openblas
+ openfst
+ portaudio
+ python))
+ (arguments
+ (list
+ #:test-target "test"
+ #:make-flags ''("online2" "lm" "rnnlm")
+ #:phases
+ #~(modify-phases %standard-phases
+ (add-after 'unpack 'chdir
+ (lambda _ (chdir "src") #t))
+ (replace 'configure
+ (lambda* (#:key build system inputs outputs #:allow-other-keys)
+ (when (not (or (string-prefix? "x86_64" system)
+ (string-prefix? "i686" system)))
+ (substitute* "makefiles/linux_openblas.mk"
+ (("-msse -msse2") "")))
+ (substitute* "makefiles/default_rules.mk"
+ (("/bin/bash") (which "bash")))
+ (substitute* "Makefile"
+ (("ext_depend: check_portaudio")
+ "ext_depend:"))
+ (substitute* '("online/Makefile"
+ "onlinebin/Makefile"
+ "gst-plugin/Makefile")
+ (("../../tools/portaudio/install")
+ (assoc-ref inputs "portaudio")))
+ (substitute* "matrix/Makefile" ;temporary test bypass
+ (("matrix-lib-test sparse-matrix-test") ""))
+
+ ;; This `configure' script doesn't support variables passed as
+ ;; arguments, nor does it support "prefix".
+ (let ((out (assoc-ref outputs "out")))
+ (substitute* "configure"
+ (("check_for_slow_expf;") "")
+ ;; This affects the RPATH and also serves as the installation
+ ;; directory.
+ (("KALDILIBDIR=`pwd`/lib")
+ (string-append "KALDILIBDIR=" out "/lib"))
+ (("OPENBLASROOT=\\\"\\$\\(rel2abs ..\\/tools\\/OpenBLAS\\/install\\)\\\"")
+ (string-append "OPENBLASROOT=\"" #$openblas "\""))
+ (("-L\\$OPENBLASLIBDIR -l:libopenblas.a -l:libblas.a -l:liblapack.a -l:libf2c.a")
+ (string-append "-L$OPENBLASLIBDIR -lopenblas "
+ "-L" #$lapack "/lib -lblas -llapack")))
+ (mkdir-p out) ; must exist
+ (setenv "CONFIG_SHELL" (which "bash"))
+ (setenv "OPENFST_VER" #$(package-version openfst))
+ (invoke "./configure"
+ "--use-cuda=no"
+ "--mathlib=OPENBLAS_CLAPACK"
+ "--shared"
+ (string-append "--fst-root=" #$openfst)))))
+ (add-after 'configure 'optimize-build
+ (lambda _ (substitute* "kaldi.mk" ((" -O1") " -O3"))))
+ (replace 'install
+ (lambda* (#:key outputs #:allow-other-keys)
+ (let* ((out (assoc-ref outputs "out"))
+ (inc (string-append out "/include"))
+ (lib (string-append out "/lib")))
+ (mkdir-p lib)
+ ;; The build phase installed symlinks to the actual
+ ;; libraries. Install the actual targets.
+ (for-each (lambda (file)
+ (let ((target (readlink file)))
+ (delete-file file)
+ (install-file target lib)))
+ (find-files lib "\\.so"))
+ ;; Install headers
+ (for-each (lambda (file)
+ (let ((target-dir (string-append inc "/" (dirname file))))
+ (install-file file target-dir)))
+ (find-files "." "\\.h")))))))))))
+
(define-public gst-kaldi-nnet2-online
(let ((commit "cb227ef43b66a9835c14eb0ad39e08ee03c210ad")
(revision "2"))