58 lines
		
	
	
		
			2.9 KiB
		
	
	
	
		
			Plaintext
		
	
	
	
			
		
		
	
	
			58 lines
		
	
	
		
			2.9 KiB
		
	
	
	
		
			Plaintext
		
	
	
	
This is a revision of my well-known regular-expression package, regexp(3).
 | 
						|
It gives C programs the ability to use egrep-style regular expressions, and
 | 
						|
does it in a much cleaner fashion than the analogous routines in SysV.
 | 
						|
It is not, alas, fully POSIX.2-compliant; that is hard.  (I'm working on
 | 
						|
a full reimplementation that will do that.)
 | 
						|
 | 
						|
This version is the one which is examined and explained in one chapter of
 | 
						|
"Software Solutions in C" (Dale Schumacher, ed.; AP Professional 1994;
 | 
						|
ISBN 0-12-632360-7), plus a couple of insignificant updates, plus one
 | 
						|
significant bug fix (done 10 Nov 1995).
 | 
						|
 | 
						|
Although this package was inspired by the Bell V8 regexp(3), this
 | 
						|
implementation is *NOT* AT&T/Bell code, and is not derived from licensed
 | 
						|
software.  Even though U of T is a V8 licensee.  This software is based on
 | 
						|
a V8 manual page sent to me by Dennis Ritchie (the manual page enclosed
 | 
						|
here is a complete rewrite and hence is not covered by AT&T copyright).
 | 
						|
I admit to some familiarity with regular-expression implementations of
 | 
						|
the past, but the only one that this code traces any ancestry to is the
 | 
						|
one published in Kernighan & Plauger's "Software Tools" (from which
 | 
						|
this one draws ideas but not code).
 | 
						|
 | 
						|
Simplistically:  put this stuff into a source directory, inspect Makefile
 | 
						|
for compilation options that need changing to suit your local environment,
 | 
						|
and then do "make".  This compiles the regexp(3) functions, builds a
 | 
						|
library containing them, compiles a test program, and runs a large set of
 | 
						|
regression tests.  If there are no complaints, then put regexp.h into
 | 
						|
/usr/include, add regexp.o, regsub.o, and regerror.o into your C library
 | 
						|
(or put libre.a into /usr/lib), and install regexp.3 (perhaps with slight
 | 
						|
modifications) in your manual-pages directory. 
 | 
						|
 | 
						|
The files are:
 | 
						|
 | 
						|
COPYRIGHT	copyright notice
 | 
						|
README		this text
 | 
						|
Makefile	instructions to make everything
 | 
						|
regexp.3	manual page
 | 
						|
regexp.h	header file, for /usr/include
 | 
						|
regexp.c	source for regcomp() and regexec()
 | 
						|
regsub.c	source for regsub()
 | 
						|
regerror.c	source for default regerror()
 | 
						|
regmagic.h	internal header file
 | 
						|
try.c		source for test program
 | 
						|
timer.c		source for timing program
 | 
						|
tests		test list for try and timer
 | 
						|
 | 
						|
This implementation uses nondeterministic automata rather than the
 | 
						|
deterministic ones found in some other implementations, which makes it
 | 
						|
simpler, smaller, and faster at compiling regular expressions, but slower
 | 
						|
at executing them.  Many users have found the speed perfectly adequate,
 | 
						|
although replacing the insides of egrep with this code would be a mistake.
 | 
						|
 | 
						|
This stuff should be pretty portable, given an ANSI C compiler and
 | 
						|
appropriate option settings.  There are no "reserved" char values except for
 | 
						|
NUL, and no special significance is attached to the top bit of chars.
 | 
						|
The string(3) functions are used a fair bit, on the grounds that they are
 | 
						|
probably faster than coding the operations in line.  Some attempts at code
 | 
						|
tuning have been made, but this is invariably a bit machine-specific.
 |