[bug#54846] gnu: linux: Escape the values of string-type kconfig options

Message ID 20220411022410.5606-1-autumnalantlers@gmail.com
State New

Commit Message

antlers April 11, 2022, 2:24 a.m. UTC
  * gnu/packages/linux.scm (config->string): add escape-string

Handles characters within the set
(char-set-intersection char-set:ascii char-set:printing), removing
those which are known to be unsupported.
---
 gnu/packages/linux.scm | 20 +++++++++++++++++++-
 1 file changed, 19 insertions(+), 1 deletion(-)
  

Comments

antlers April 11, 2022, 5:08 a.m. UTC | #1
Oh lord, hold off on this one: File paths in CONFIG_SYSTEM_*_KEYS
options are parsed by Make before their file is opened, but
CONFIG_LOCALVERSION strings aren't, nor are (I imagine?)
CONFIG_INITRAMFS_SOURCE's despite also being file paths, so taking
responsibility for escaping means handling several options
individually :/
  
Ludovic Courtès April 12, 2022, 9:39 p.m. UTC | #2
Hi,

antlers <autumnalantlers@gmail.com> skribis:

>  * gnu/packages/linux.scm (config->string): add escape-string
>
> Handles characters within the set
> (char-set-intersection char-set:ascii char-set:printing), removing
> those which are known to be unsupported.

[...]

>  (define (config->string options)
> +  (define (escape-string str)
> +    "Returns STR with the escapes necessary to be read as a string-type
> +    option's value. Handles characters within the set (char-set-intersection
> +    char-set:ascii char-set:printing), removing those which are known to be
> +    unsupported."

Nitpick: You can turn the docstring into a comment since the docstring
wouldn’t be accessible anyway.

> +    (fold (match-lambda* (((match? fmt) str)
> +			  (transform-string str match?
> +					    (cut format #f fmt <>))))

Please avoid tabs.

‘transform-string’ is from (texinfo string-utils), which is not imported
here.  IMO, we’d rather avoid depending on this module since it’s really
designed for the Texinfo machinery.

> +	  str
> +	  `((#\# "") ; No known way to escape # characters.
> +	    (#\$ "$~a")
> +	    ("\"\\'`" "\\~a")
> +	    (";:()#" "\\\\~a")
> +	    ("|" "\\\\\\~a")
> +	    ;; No support for tabs, newlines, etc.
> +	    (,(char-set->string (ucs-range->char-set 9 14)) ""))))

I wonder if this should be implemented in terms of ‘string-fold’
instead:

  (string-concatenate-reverse
    (string-fold (lambda (chr result)
                   (match chr
                     (#\# (cons "" result))
                     ;; …
                     (_ (cons (string chr) result))))
                 '()
                 str))

Thoughts?

Thanks,
Ludo’.
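
For illustration, the string-fold suggestion above could be filled in with the escape table from the patch. This is only a sketch, not the code that was submitted: the helper name control-char? is made up here, and the table is copied as-is from the patch, which later messages in this thread conclude is not correct for every option.

```scheme
(use-modules (ice-9 match))

;; Hypothetical helper: code points 9 through 13 (tab, newline, vertical
;; tab, form feed, carriage return), the same range the patch strips with
;; (ucs-range->char-set 9 14).
(define (control-char? chr)
  (<= 9 (char->integer chr) 13))

;; Single-pass variant of the patch's escape table.  Each character maps to
;; a replacement string; the pieces are consed in reverse order and joined
;; at the end.
(define (escape-string str)
  (string-concatenate-reverse
   (string-fold
    (lambda (chr result)
      (cons (match chr
              (#\# "")                                  ; no known escape
              (#\$ "$$")
              ((or #\" #\\ #\' #\`) (string #\\ chr))      ; one backslash
              ((or #\; #\: #\( #\)) (string #\\ #\\ chr))  ; two backslashes
              (#\| (string #\\ #\\ #\\ chr))               ; three backslashes
              ((? control-char?) "")                       ; unsupported
              (_ (string chr)))
            result))
    '()
    str)))
```

The patch applies its rules one after another, but no rule's output contains a character matched by a later rule, so handling each character exactly once should give the same result.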
  
Ludovic Courtès April 28, 2022, 12:16 p.m. UTC | #3
Hi antlers,

Did you have a chance to look into it?

TIA,
Ludo’.

  
antlers April 28, 2022, 8:18 p.m. UTC | #4
Yeah, sorry for the silence. There's been a lot going on, and being able to use strings in the first place is a comfortable baseline of functionality. I don't feel that one should implement implicit escaping of a field until confident that all the corner cases are handled, and I think there are some subtle warts left. I haven't had the time to iron those out, but I'll be glad to polish the details and can follow up within about a week, once I've addressed my remaining concerns about correctness. Thanks for bearing with me as I fumble my way through the conventions of the mailing list and formatting; nitpicks are what I'm here for c:

I think transform-string is a gem for the task: the inputs are flexible, and the author specifically cites better performance than string-fold in the (texinfo string-utils) source. But I appreciate your point and can happily specialize it in-line.

  
antlers May 8, 2022, 4:48 a.m. UTC | #5
Hi! Still a busy week, working retail will do that, but this
investigation has been on the back burner long enough that it's become
clear: the kernel provides no reliable heuristic for determining the
appropriate escapes for a given option (least of all for out-of-tree
modules). Some options are explicitly expanded in Makefiles, others
aren't but become parts of filenames which are, and every special
character seems to create a syntax error on a whole new layer :c

I'm short on time to write anything up tonight, but have explored the
'behavior of', 'supported escapes within', and 'unsupported characters
of' CONFIG_SYSTEM_*_KEYS, CONFIG_CMDLINE, CONFIG_LOCALVERSION, and
CONFIG_DEFAULT_HOSTNAME (complete with associated errors for
unsupported chars within the set I initially referenced), and would be
glad to follow up with a brief summary purely for posterity; but
there's simply no elegant or complete approach unless this were to be
addressed by upstream(s). It's clearly peeved me, and I'd take it
to those Makefiles, were it not for similar issues in out-of-tree
modules that I don't think I could address. It's been a lot of fun though.

I suppose we could factor out the escapes which are common to all
fields (which could be seen as eliminating /a/ layer, that of
Kconfig/conf.c itself), or address only specific common options (not a
serious suggestion!), but these each feel like inelegant solutions
which are more likely to introduce additional confusion when an option
doesn't behave correctly as transcribed out of a .config file, hence
my all-or-nothing mindset.

While I'm here, I once wrote a bash script which would set options via
the kernel's 'config' utility (not worth using over the existing
method of appending to .config, doesn't do any validation), and was
having issues with some configurations because not every option I
tried to set had its dependency clauses satisfied. This isn't an
issue when using menuconfig because options don't even appear until
their dependencies are satisfied, but scripts can't tell whether the
options they're setting are visible. At the time I just had it
double-check after a run of `make oldconfig` (when validation is
actually done) and emit warnings. I'm not confident that it caught
every issue by just using the 'config' utility or grepping .config,
but it was good enough.
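
The double-check described above might look roughly like the following in Guile. It is a sketch under assumptions, not the original bash script: parse-config and unapplied-options are names invented here, the .config contents are passed in as a string, and an option that is absent or marked "is not set" is treated as "n".

```scheme
(use-modules (ice-9 match)
             (srfi srfi-1))

;; Return an alist of (NAME . VALUE) strings, one per "NAME=VALUE" line.
;; Comment lines such as "# CONFIG_FOO is not set" are skipped.
(define (parse-config text)
  (filter-map (lambda (line)
                (let ((idx (string-index line #\=)))
                  (and idx
                       (not (string-prefix? "#" line))
                       (cons (substring line 0 idx)
                             (substring line (+ idx 1))))))
              (string-split text #\newline)))

;; Return the entries of REQUESTED, an alist of (NAME . VALUE) strings,
;; that did not end up with the requested value in TEXT, the contents of
;; .config after `make oldconfig` has run.
(define (unapplied-options requested text)
  (let ((actual (parse-config text)))
    (remove (match-lambda
              ((name . value)
               (equal? (or (assoc-ref actual name) "n") value)))
            requested)))
```

Every entry the procedure returns is a candidate for a warning; like the script, it only notices that a value did not stick and cannot say which dependency was missing.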

Now, I've got this crazy idea to compile conf.c as a shared library
and link against it at runtime via Guile's FFI, in order to a) learn
how that works(!), and b) use the kernel's own utilities to reconstruct
the options-graph (a 'menu' object) and emit correct warnings when
setting options that aren't visible, or even actively ensure
dependencies are satisfied. In brief, would this be Guix-y?
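
As a rough idea of the FFI mechanics involved, the sketch below binds the C library's strlen through Guile's (system foreign) module. It says nothing about conf.c itself: building that file as a shared library and choosing which of its symbols to bind are left open, and foreign-strlen is a name invented for this example.

```scheme
(use-modules (system foreign))

;; Look up strlen among the symbols already loaded into the process and
;; wrap it as a Scheme procedure taking one pointer, returning a size_t.
(define c-strlen
  (pointer->procedure size_t
                      (dynamic-func "strlen" (dynamic-link))
                      (list '*)))

;; Convert the Scheme string to a NUL-terminated C string before the call.
(define (foreign-strlen str)
  (c-strlen (string->pointer str)))
```

Binding a function from a library built out of conf.c would presumably follow the same pattern, with the library's path passed to dynamic-link.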
  
antlers May 17, 2022, 3:54 p.m. UTC | #6
[won't-fix]
  

Patch

diff --git a/gnu/packages/linux.scm b/gnu/packages/linux.scm
index b31fe0a580..60ae668fd9 100644
--- a/gnu/packages/linux.scm
+++ b/gnu/packages/linux.scm
@@ -761,6 +761,22 @@  (define %bpf-extra-linux-options
     ("CONFIG_IKHEADERS" . #t)))
 
 (define (config->string options)
+  (define (escape-string str)
+    "Returns STR with the escapes necessary to be read as a string-type
+    option's value. Handles characters within the set (char-set-intersection
+    char-set:ascii char-set:printing), removing those which are known to be
+    unsupported."
+    (fold (match-lambda* (((match? fmt) str)
+			  (transform-string str match?
+					    (cut format #f fmt <>))))
+	  str
+	  `((#\# "") ; No known way to escape # characters.
+	    (#\$ "$~a")
+	    ("\"\\'`" "\\~a")
+	    (";:()#" "\\\\~a")
+	    ("|" "\\\\\\~a")
+	    ;; No support for tabs, newlines, etc.
+	    (,(char-set->string (ucs-range->char-set 9 14)) ""))))
   (string-join (map (match-lambda
                       ((option . 'm)
                        (string-append option "=m"))
@@ -769,7 +785,9 @@  (define (config->string options)
                       ((option . #f)
                        (string-append option "=n"))
                       ((option . string)
-                       (string-append option "=\"" string "\"")))
+                       (string-append option "=\""
+				      (escape-string string)
+				      "\"")))
                     options)
                "\n"))