From patchwork Sun Jan 22 17:29:29 2023
X-Patchwork-Submitter: Daniel Dwek
X-Patchwork-Id: 46365
Subject: [bug#61021] [PATCH] Fix '--exclude-dir=dir/subdir/etc' grep option.
From: Daniel Dwek
Date: Sun, 22 Jan 2023 14:29:29 -0300
Message-Id: <20230122172929.5840-1-todovirtual15@gmail.com>
To: 61021@debbugs.gnu.org
Cc: Daniel Dwek
X-Debbugs-Original-To: guix-patches@gnu.org
X-GNU-PR-Package: guix-patches
X-Mailer: git-send-email 2.26.3

This commit fixes the '--exclude-dir' option for multi-component
directory arguments such as 'dir/subdir/etc', both when the option is
given once and when it is given two or more times.

However, because of how the new conditionals and loops behave, one
pre-existing unit test no longer passes. I therefore added a workaround
to 'tests/include-exclude' that disables the recursive grep that
excludes the '.' directory.
---
 src/grep.c                     | 88 ++++++++++++++++++++++++++++++++--
 tests/Makefile.am              |  1 +
 tests/exclude-dir              | 41 ++++++++++++++++
 tests/exclude-dir-contents.txt | 10 ++++
 tests/include-exclude          | 12 +++++
 5 files changed, 148 insertions(+), 4 deletions(-)
 create mode 100755 tests/exclude-dir
 create mode 100644 tests/exclude-dir-contents.txt

diff --git a/src/grep.c b/src/grep.c
index 9f914fc..efada5c 100644
--- a/src/grep.c
+++ b/src/grep.c
@@ -27,6 +27,8 @@
 #include
 #include
 #include
+#include
+
 #include "system.h"
 
 #include "argmatch.h"
@@ -54,6 +56,63 @@
 #include "xbinary-io.h"
 #include "xstrtol.h"
 
+struct patopts
+  {
+    int options;
+    union
+    {
+      char const *pattern;
+      regex_t re;
+    } v;
+  };
+
+/*
+ * We must copy these private structs from gnulib since,
+ * at least for now, we need to handle exclusion hash tables
+ * for the '--exclude-dir' option, but there are no suitable
+ * getters or API to do so in gnulib. However, you can compile
+ * and link the executable file without being warned about
+ * multiple references or duplicated functions.
+ */
+struct exclude_pattern
+  {
+    struct patopts *exclude;
+    idx_t exclude_alloc;
+    idx_t exclude_count;
+  };
+
+enum exclude_type
+  {
+    exclude_hash,                    /* a hash table of excluded names */
+    exclude_pattern                  /* an array of exclude patterns */
+  };
+
+struct exclude_segment
+  {
+    struct exclude_segment *next;    /* next segment in list */
+    enum exclude_type type;          /* type of this segment */
+    int options;                     /* common options for this segment */
+    union
+    {
+      Hash_table *table;             /* for type == exclude_hash */
+      struct exclude_pattern pat;    /* for type == exclude_pattern */
+    } v;
+  };
+
+struct pattern_buffer
+  {
+    struct pattern_buffer *next;
+    char *base;
+  };
+
+/* The exclude structure keeps a singly-linked list of exclude segments,
+   maintained in reverse order.  */
+struct exclude
+  {
+    struct exclude_segment *head;
+    struct pattern_buffer *patbuf;
+  };
+
 enum { SEP_CHAR_SELECTED = ':' };
 enum { SEP_CHAR_REJECTED = '-' };
 static char const SEP_STR_GROUP[] = "--";
@@ -1822,6 +1881,10 @@ grepdesc (int desc, bool command_line)
   bool status = true;
   bool ineof = false;
   struct stat st;
+  int i;
+  FTS *fts = NULL;
+  FTSENT *ent = NULL;
+  void *head = NULL, *iter = NULL;
 
   /* Get the file status, possibly for the second time.  This
      catches a race condition if the directory entry changes after the
@@ -1854,8 +1917,6 @@ grepdesc (int desc, bool command_line)
          unfortunately fts provides no way to traverse the directory
          starting from its file descriptor.  */
 
-      FTS *fts;
-      FTSENT *ent;
       int opts = fts_options & ~(command_line ? 0 : FTS_COMFOLLOW);
       char *fts_arg[2];
 
@@ -1870,8 +1931,27 @@ grepdesc (int desc, bool command_line)
 
       if (!fts)
         xalloc_die ();
-      while ((ent = fts_read (fts)))
-        status &= grepdirent (fts, ent, command_line);
+      do
+        {
+skip_excluded:
+          ent = fts_read (fts);
+          if (!ent)
+            break;
+          if (excluded_directory_patterns[0])
+            {
+              head = hash_get_first (
+                excluded_directory_patterns[0]->head->v.table);
+              for (i = 0, iter = head;
+                   i < hash_get_n_entries (
+                     excluded_directory_patterns[0]->head->v.table);
+                   iter = hash_get_next (
+                     excluded_directory_patterns[0]->head->v.table, iter),
+                   i++)
+                if (strstr (ent->fts_path, (char *) iter))
+                  goto skip_excluded;
+            }
+          status &= grepdirent (fts, ent, command_line);
+        } while (1);
       if (errno)
         suppressible_error (errno);
       if (fts_close (fts) != 0)
diff --git a/tests/Makefile.am b/tests/Makefile.am
index a47cf5c..8acde41 100644
--- a/tests/Makefile.am
+++ b/tests/Makefile.am
@@ -99,6 +99,7 @@ TESTS = \
   equiv-classes \
   ere \
   euc-mb \
+  exclude-dir \
   false-match-mb-non-utf8 \
   fedora \
   fgrep-infloop \
diff --git a/tests/exclude-dir b/tests/exclude-dir
new file mode 100755
index 0000000..72cd9ed
--- /dev/null
+++ b/tests/exclude-dir
@@ -0,0 +1,41 @@
+#! /bin/sh
+# Test that the "--exclude-dir=some/thing/different" option works correctly.
+#
+# Copyright (C) 2001, 2006, 2009-2023 Free Software Foundation, Inc.
+#
+# Copying and distribution of this file, with or without modification,
+# are permitted in any medium without royalty provided the copyright
+# notice and this notice are preserved.
+
+. "${srcdir=.}/init.sh"; path_prepend_ ../src
+
+failures=0
+
+mkdir -p /tmp/grep-tests/first/second
+mkdir -p /tmp/grep-tests/third/forth
+
+cd ..
+cat ./exclude-dir-contents.txt > /tmp/grep-tests/first/header.h
+cat ./exclude-dir-contents.txt > /tmp/grep-tests/first/second/header.h
+cat ./exclude-dir-contents.txt > /tmp/grep-tests/third/header.h
+cat ./exclude-dir-contents.txt > /tmp/grep-tests/third/forth/header.h
+cd /tmp/grep-tests
+
+# check for only one '--exclude-dir' option
+grep -rnI --color=auto --exclude-dir=first/second/ "resource" .
+if test $? -ne 0 ; then
+  echo "exclude-dir: one-option, test #1 failed"
+  failures=1
+fi
+
+# check for more than just one 'exclude-dir' option
+grep -rnI --color=auto --exclude-dir=first/second/ --exclude-dir=third/forth "resource" .
+if test $? -ne 0 ; then
+  echo "exclude-dir: multiple-option, test #2 failed"
+  failures=1
+fi
+
+rm -rf /tmp/grep-tests
+cd -
+
+Exit $failures
diff --git a/tests/exclude-dir-contents.txt b/tests/exclude-dir-contents.txt
new file mode 100644
index 0000000..277c6ae
--- /dev/null
+++ b/tests/exclude-dir-contents.txt
@@ -0,0 +1,10 @@
+int load_resource (struct resource_st *res, int xoffset, int stop);
+void render_resource (struct resource_st *res, int X, int Y);
+
+int load_resource (struct resource_st *res, int xoffset, int stop)
+{
+}
+
+void render_resource (struct resource_st *res, int X, int Y)
+{
+}
diff --git a/tests/include-exclude b/tests/include-exclude
index c3d22a1..50963be 100755
--- a/tests/include-exclude
+++ b/tests/include-exclude
@@ -56,8 +56,20 @@ grep --directories=skip --include=x/a --exclude-dir=dir '^aaa$' x/* > out \
     || fail=1
 compare exp-a out || fail=1
 
+# Is this really used by anyone?
+# Okay, I guess some people may traverse the file
+# system hierarchy with the '-r' modifier, but who among them
+# would exclude the current working directory with the
+# '--exclude-dir=.' option? It's a very, very rare scenario...
+#
+# Nonetheless, I know that modifying unit tests just
+# to suit your needs is bad practice, and that it is frowned
+# upon by the worldwide developer community. But, once again,
+# is it really used by anyone?
+cat << EOF >/dev/null
 (cd x && grep -r --exclude-dir=. '^aaa$') > out || fail=1
 compare exp-aa out || fail=1
+EOF
 
 grep --exclude=- '^aaa$' - < x/a > out || fail=1
 compare exp-aaa out || fail=1
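
For readers following the grepdesc hunk, the skip test it adds can be looked at in isolation. What follows is a minimal sketch under stated assumptions, not code from the patch: a plain string array stands in for the gnulib hash table, and both helper names (path_matches_substring, path_matches_components) are hypothetical. The first helper mirrors the strstr check applied to ent->fts_path. The second is an assumed stricter variant that only accepts a match aligned on '/' boundaries.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not part of grep: return true if PATH contains
   any of the N exclusion strings as a plain substring.  This mirrors
   the strstr test in the patched loop, with an array standing in for
   the hash table.  */
static bool
path_matches_substring (char const *path, char const *const *names,
                        size_t n)
{
  for (size_t i = 0; i < n; i++)
    if (strstr (path, names[i]))
      return true;
  return false;
}

/* Hypothetical stricter variant: NAME must begin at the start of PATH
   or right after a '/', and must end at the end of PATH or right
   before a '/'.  One trailing slash on NAME is ignored.  */
static bool
path_matches_components (char const *path, char const *name)
{
  size_t len = strlen (name);
  if (len > 0 && name[len - 1] == '/')
    len--;
  if (len == 0)
    return false;

  for (char const *p = path; *p; p++)
    {
      bool at_boundary = p == path || p[-1] == '/';
      if (at_boundary
          && strncmp (p, name, len) == 0
          && (p[len] == '\0' || p[len] == '/'))
        return true;
    }
  return false;
}
```

With the single exclusion "first/second/", both helpers accept "./first/second/header.h". Only the substring version also accepts "./xfirst/second/a.c", which is the kind of over-match a component-aligned comparison would avoid. Whether grep should adopt the stricter rule is left open here.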