- scsh integration
  Affected: fr nawk filemtch glob rdelim re scsh-interfaces scsh-package

- Naming conventions: "re" vs. "regexp". Should I have "smart" versions
  of make-re-string, etc.?

- Remove all "reduce" forms from scsh, replace with foldl, foldr forms.
- Check FPS, network code

- The match function should let you state that the beginning of the string
  is not a real bos, and likewise for eos. Similarly for bol & eol.
  Execution flags:
  -- REG_NOTBOL -- beginning of string doesn't count as ^ match.
  -- REG_NOTEOL -- end of string doesn't count as $ match.

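  A rough illustration of what "not a real bos" means, sketched in Python
  rather than scsh. Python's re module has no REG_NOTBOL or REG_NOTEOL
  flag; the pos and endpos arguments are used here only as an analogy, and
  the patterns and text are made up for the example.

```python
import re

text = "ab"
bol_pat = re.compile(r"^b")

# Slicing makes index 1 look like a true beginning of string, so ^ matches.
sliced = bol_pat.search(text[1:])

# Searching from pos=1 keeps the original string, so ^ does not match
# at index 1. This is roughly the effect REG_NOTBOL asks for.
offset = bol_pat.search(text, 1)

# endpos behaves like truncation: $ still matches at endpos, so this
# analogy offers no counterpart to REG_NOTEOL.
truncated = re.compile(r"a$").search(text, 0, 1)

print(sliced is not None, offset is None, truncated is not None)
```

  The asymmetry is the point: a start offset can preserve "this is not
  the real beginning", while the end bound cannot express "this is not
  the real end" without an explicit flag.
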
- Hack awk, expect, chat, dir-match for new regexp system
  Current:
    (awk (test body ...)
         (:range test1 test2 body ...)
         (else body ...)
         (test => proc)
         (test ==> vars body ...))

    test ::=
      integer
      expression
      string

  New:
    (else body ...)
    (:range test1 test2 body ...)
    (after body ...)
    (test => proc)
    (test ==> vars body ...)
    (test body ...)

    test ::= integer | sre | (WHEN exp) | exp

-------------------------------------------------------------------------------
Must disallow, due to Posix' RE_CONTEXT_INVALID_OPS
    ...^*...
    *...    ...(*...    ...|*...
    |...    ...|    ...|$...    ...||...    ...(|...

That is:
1. Do simplification below to remove repeats of zero-length matches.
2. An empty elt of a choice renders as ().
3. ...|$... Hack it: If first char of a rendered choice elt is $, prefix
   with ().

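Rules 2 and 3 above can be sketched as follows. This is Python, not the
package's own renderer; render_choice is a hypothetical name, and the
elements are assumed to be already-rendered Posix strings.

```python
def render_choice(elts):
    """Join already-rendered choice elements with '|' (hypothetical helper).

    Rule 2: an empty element is rendered as "()".
    Rule 3: an element whose first char is "$" gets a "()" prefix.
    """
    rendered = []
    for elt in elts:
        if elt == "":
            rendered.append("()")        # avoids "|..." , "...|" and "...||..."
        elif elt.startswith("$"):
            rendered.append("()" + elt)  # avoids "...|$..."
        else:
            rendered.append(elt)
    return "|".join(rendered)


print(render_choice(["a", "", "b"]))
print(render_choice(["a", "$"]))
```

The sketch applies rule 3 to every element, including the first, because
that is how the rule is worded; only later elements strictly need it.
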
Simplify ^{0,n} -> ""
         ^{m,n} -> ^     (0<m<=n)
         ^{m,n} -> (in)  (m>n)
Similarly for bos/eos bol/eol bow/eow ""

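The three simplification cases can be written out as a small function.
This is a Python sketch with a hypothetical name, working on strings
rather than on the regexp ADT. It assumes m and n are integers, and it
takes "(in)" to denote a pattern that can never match, since the notes
do not define it.

```python
def simplify_anchor_repeat(anchor, m, n):
    """Simplify anchor{m,n} for a zero-length anchor such as "^".

    Hypothetical helper; returns the simplified form as a string.
    """
    if m > n:
        return "(in)"   # empty repeat range: assumed to mean "never matches"
    if m == 0:
        return ""       # zero repeats allowed, so the empty match suffices
    return anchor       # 0 < m <= n: one occurrence is equivalent


print(repr(simplify_anchor_repeat("^", 0, 3)))
print(repr(simplify_anchor_repeat("^", 2, 5)))
print(repr(simplify_anchor_repeat("^", 3, 1)))
```

The m > n test comes first so that an impossible range is never mistaken
for the m = 0 case.
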
Spencer says:
    A repetition operator (?, *, +, or bounds) cannot follow
    another repetition operator. A repetition operator cannot
    begin an expression or subexpression or follow `^' or `|'.

    `|' cannot appear first or last in a (sub)expression or
    after another `|', i.e. an operand of `|' cannot be an
    empty subexpression. An empty parenthesized subexpression,
    `()', is legal and matches an empty (sub)string. An
    empty string is not a legal RE.

Fix the printer and reader so control chars are printed as
\ddd; add syntax for control-char input.

-------------------------------------------------------------------------------
Less important:
- Support for searching vs. matching
- Case-scope hacking (needs s48 0.51 CODE-QUOTE)
- simp caching
- Better char-set->sre renderer
  First, bound the cset with tightest possible superset,
  then look for negations.

Possible interesting extensions:
- An ADT->DFA compiler
- A DFA->Scheme-code compiler
- An ADT interpreter
- A pattern notation for matching against s-expressions.
  This would be handy for specifying the grammar of Scheme macros,
  for example.
- Only allocate svec and evec if we match?