- scsh integration
  Affected: fr nawk filemtch glob rdelim re scsh-interfaces scsh-package

- Naming conventions. "re" vs. "regexp", should I have "smart" versions
  of make-re-string, etc.

- Remove all "reduce" forms from scsh, replace with foldl, foldr forms.
  - Check FPS, network code

- The match fun should allow you to state the beginning of string is not a
  real bos & likewise for eos. Similarly for bol & eol.
  execution flag:
    -- REG_NOTBOL -- beginning of string doesn't count as ^ match.
    -- REG_NOTEOL -- end       of string doesn't count as $ match.

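  For reference, a minimal C sketch of how these two execution flags behave
  in the POSIX regexec() interface. The helper name matches_with_flags is
  hypothetical and not part of scsh; it only shows the behaviour the match
  function would need to expose.

```c
#include <assert.h>
#include <regex.h>
#include <stddef.h>

/* Hypothetical helper: returns 1 if `pattern` (a POSIX extended RE) matches
   somewhere in `text` when regexec() is called with `eflags`, else 0. */
int matches_with_flags(const char *pattern, const char *text, int eflags)
{
    regex_t re;
    int rc = regcomp(&re, pattern, REG_EXTENDED | REG_NOSUB);
    assert(rc == 0);    /* this sketch only handles patterns that compile */
    rc = regexec(&re, text, 0, NULL, eflags);
    regfree(&re);
    return rc == 0;
}
```

  With eflags of 0, "^abc" matches "abc". With REG_NOTBOL the start of the
  string no longer counts as a ^ match, so the same call fails. REG_NOTEOL
  does the same for "abc$".
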
- Hack awk, expect, chat, dir-match for new regexp system
  Current:
  (awk (test body ...)
       (:range test1 test2 body ...)
       (else body ...)
       (test => proc)
       (test ==> vars body ...))

  test ::=
    integer
    expression
    string


  New:
  (else body ...)
  (:range test1 test2 body ...)
  (after body ...)
  (test => proc)
  (test ==> vars body ...)
  (test body ...)

  test ::= integer | sre | (WHEN exp) | exp

-------------------------------------------------------------------------------
Must disallow, due to POSIX's RE_CONTEXT_INVALID_OPS
    ...^*...
    *... ...(*... ...|*...
    |... ...| ...|$... ...||... ...(|...

    That is:
    1. Do simplification below to remove repeats of zero-length matches.
    2. An empty elt of a choice renders as ().
    3. ...|$... Hack it: If first char of a rendered choice elt is $, prefix
       with ().

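    Rules 2 and 3 can be sketched in C as follows. This assumes the choice
    elements have already been rendered to POSIX strings. The names
    render_choice and append are hypothetical, not taken from the scsh
    sources.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Append `s` to the NUL-terminated string in `out` (capacity `cap`).
   Returns 0 on success, -1 if the result would not fit. */
static int append(char *out, size_t cap, const char *s)
{
    size_t used = strlen(out), need = strlen(s);
    if (used + need + 1 > cap)
        return -1;
    memcpy(out + used, s, need + 1);
    return 0;
}

/* Join the n already-rendered choice elements with '|'.
   Rule 2: an empty element renders as "()".
   Rule 3: an element whose first char is '$' is prefixed with "()".
   Returns 0 on success, -1 if `out` is too small. */
int render_choice(const char *const elts[], size_t n, char *out, size_t cap)
{
    assert(out != NULL && cap > 0);
    out[0] = '\0';
    for (size_t i = 0; i < n; i++) {
        if (i > 0 && append(out, cap, "|") != 0)
            return -1;
        if (elts[i][0] == '\0' || elts[i][0] == '$') {
            if (append(out, cap, "()") != 0)
                return -1;
        }
        if (append(out, cap, elts[i]) != 0)
            return -1;
    }
    return 0;
}
```

    For the elements "foo", "" and "$bar" this produces foo|()|()$bar, which
    avoids both the empty operand of | and the |$ sequence listed above.
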
    Simplify ^{0,n} -> ""
             ^{m,n} -> ^     (0<m<=n)
             ^{m,n} -> (in)  (m>n)
             Similarly for bos/eos bol/eol bow/eow ""

    Spencer says:
       A repetition operator (?, *, +, or bounds) cannot follow
       another repetition operator. A repetition operator cannot
       begin an expression or subexpression or follow `^' or `|'.

       `|' cannot appear first or last in a (sub)expression or
       after another `|', i.e. an operand of `|' cannot be an
       empty subexpression. An empty parenthesized subexpression,
       `()', is legal and matches an empty (sub)string. An
       empty string is not a legal RE.


Fix the printer and reader so control chars are printed as
    \ddd; do syntax for control-char input

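A rough C sketch of the printer half. The note does not say whether ddd is
octal or decimal; octal is assumed here. The name escape_control_chars is
hypothetical.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <stdio.h>

/* Copy `s` into `out`, replacing each control character with a backslash
   followed by three octal digits (an assumption, see above).
   Returns 0 on success, -1 if `out` (capacity `cap`) is too small. */
int escape_control_chars(const char *s, char *out, size_t cap)
{
    size_t used = 0;
    assert(s != NULL && out != NULL && cap > 0);
    for (; *s != '\0'; s++) {
        unsigned char c = (unsigned char)*s;
        if (iscntrl(c)) {
            if (used + 4 >= cap)        /* 4 chars plus the final NUL */
                return -1;
            snprintf(out + used, cap - used, "\\%03o", (unsigned)c);
            used += 4;
        } else {
            if (used + 1 >= cap)
                return -1;
            out[used++] = (char)c;
        }
    }
    out[used] = '\0';
    return 0;
}
```

In the C locale a tab comes out as \011 and a newline as \012. A real
printer would also have to escape the backslash itself so the reader can
invert the output; that is left out of this sketch.
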
-------------------------------------------------------------------------------
Less important:
- Support for searching vs. matching
- Case-scope hacking (needs s48 0.51 CODE-QUOTE)
- simp caching
- Better char-set->sre renderer
  First, bound the cset with tightest possible superset,
  then look for negations.

Possible interesting extensions:
- An ADT->DFA compiler
- A DFA->Scheme-code compiler
- An ADT interpreter
- A pattern notation for matching against s-expressions.
  This would be handy for specifying the grammar of Scheme macros,
  for example.
- Only allocate svec and evec if we match?