From 32b0c4bea5c02eda1d22e862a52846284d8e050e Mon Sep 17 00:00:00 2001 From: olin-shivers Date: Fri, 1 Jun 2001 17:49:20 +0000 Subject: [PATCH] *** empty log message *** --- doc/scsh-manual/awk.tex | 1 + doc/scsh-manual/changes.tex | 297 ----------------------- doc/scsh-manual/decls.tex | 2 + doc/scsh-manual/intro.tex | 61 +++-- doc/scsh-manual/man.tex | 3 +- doc/scsh-manual/strings.tex | 452 ++++++++--------------------------- doc/scsh-manual/syscalls.tex | 69 ++++-- doc/scsh-manual/todo.tex | 11 - 8 files changed, 193 insertions(+), 703 deletions(-) delete mode 100644 doc/scsh-manual/changes.tex diff --git a/doc/scsh-manual/awk.tex b/doc/scsh-manual/awk.tex index 9b29176..e8ec358 100644 --- a/doc/scsh-manual/awk.tex +++ b/doc/scsh-manual/awk.tex @@ -76,6 +76,7 @@ characters. \subsection{Parsing fields} +\label{sec:field-splitter} \defun {field-splitter} {[field num-fields]} \proc \defunx {infix-splitter} {[delim num-fields handle-delim]} \proc diff --git a/doc/scsh-manual/changes.tex b/doc/scsh-manual/changes.tex deleted file mode 100644 index c44637e..0000000 --- a/doc/scsh-manual/changes.tex +++ /dev/null @@ -1,297 +0,0 @@ -%&latex -*- latex -*- - -\chapter{Changes from previous releases} -\label{sec:changes} - -\newcommand{\itam}[1]{\item {#1} \\} - -\section{Changes from the previous release} - -This section details changes that have been made in scsh since -the previous release. - -Scsh is now much more robust. -All known bugs have been fixed. -There have been many improvements and extensions made. -These new features and changes are listed below, in no particular order; -the relevant sections of the manual give the full details. - -Scsh now supports complete {\Posix}, including signal handlers. -Early autoreaping of child processes is now handled by a \ex{SIGCHLD} -signal handler, so children are reaped as early as possible with no -user intervention required. - -A functional static heap linker is included in this release. 
-It is ugly, limited in functionality, and extremely slow, but it works. -It can be used to build scsh binaries that start up instantly. - -The regular expression system has been sped up. -Regular-expression compilation is now provided, -and the \ex{awk} macro has been rewritten to pre-compile -regexps used in rules outside the loop. -It is still, however, slower than it should be. - -Execing programs should be faster in this release, since we now use the -\ex{CLOEXEC} status bit to get automatic closing of unrevealed -port file descriptors. - -{scm}'s floating point support was inadvertently omitted from the last -release. It has been reinstated. - -There is now a new command-line switch, \ex{-sfd \var{num}}, -which causes scsh to read its script from file descriptor \var{num}. - - -\section{Changes from the penultimate release} - -This section details changes that have been made in scsh since -the penultimate release. - -Scsh is now much more robust. -All known bugs have been fixed. -There have been many improvements and extensions made. -We have also made made some incompatible changes. - -The sections below briefly describe these new features and changes; -the relevant sections of the manual give the full details. - -%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% -\subsection{New features} -This release incorporates several new features into scsh. - -\begin{itemize} -\itam{Control of buffered I/O} -Scsh now allows you to control the buffering policy used for doing I/O -on a Scheme port. - -\itam{Here-strings} -Scsh now has a new lexical feature, \verb|#<<|, that provides -the ability to enter long, multi-line string constants in scsh programs. -Such a string is called a ``here string,'' by analogy to the common -shell ``here document'' \ex{<<} redirection. - -\itam{Delimited readers and read-line} -Scsh now has a powerful set of delimited readers. 
-These can be used to read input delimited by -a newline character (\ex{read-line}), -a blank line (\ex{read-paragraph}), -or the occurrence of any character in an arbitrary set (\ex{read-delimited}). - -While these procedures can be applied to any Scheme input port, -there is native-code support for performing delimited reads on -Unix input sources, so doing block input with these procedures should be -much faster than the equivalent character-at-a-time Scheme code. - -\itam{New system calls} -With the sole exception of signal handlers, scsh now has all of {\Posix}. -This release introduces -\begin{itemize} -\item \ex{select}, -\item full terminal device control, -\item support for pseudo-terminal ``pty'' devices, -\item file locking, -\item process timing, -\item \ex{set-file-times}, -\item \ex{seek} and \ex{tell}. -\end{itemize} - -Note that having \ex{select}, pseudo-terminals, and tty device control means -that it is now possible to implement interesting network protocols, such as -telnet servers and clients, directly in Scheme. - -\itam{New command-line switches} -There is a new set of command-line switches that make it possible -to write shell scripts using the {\scm} module system. -Scripts can use the new command-line switches to open dependent -modules and load dependent source code. -Scripts can also be written in the {\scm} module language, -which allows you to use it both as a standalone shell script, -and as a code module that can be loaded and used by other Scheme programs. - -\itam{Static heap linking} -There is a new facility that allows you to compile a heap image -to a \ex{.o} file that can be linked with the scsh virtual machine. -This produces a standalone executable binary, makes startup time -near-instantaneous, and greatly improves memory performance---the -initial heap image is placed in the process' text pages, -where it is shared by different scsh processes, and does not occupy -space in the run-time heap. 
- -\oops{The static heap linker was not documented and installed in time - for this release.} - - -\end{itemize} - -%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% -\subsection{Incompatible improvements} -Some features of scsh have been improved in ways that are -not backwards-compatible with previous releases. -These changes should not affect most code; -however, please note the changes and modify your code accordingly. - -\begin{itemize} -\itam{New process-object data-type returned by \ex{fork}} -Previous releases were prone to fill up the kernel's process table -if a program forked large numbers of processes and subsequently failed -to use \ex{wait} to reclaim the entries in the kernel's process table. -(This is a problem in standard C environments, as well.) - -Scsh 0.4 introduces a new mechanism for automatically managing subprocesses. -Processes are no longer represented by an integer process id, -which is impossible to garbage-collect, but by an -abstract process data type that encapsulates the process id. -All processes are represented using the new data structures; -see the relevant section of the manual for further details. - -\itam{Better stdio/current-port synchronisation} -The \ex{(begin \ldots)} process form now does a \ex{stdio->stdports} -call before executing its body. -This means that the Scheme code in the body ``sees'' any external -redirections. -For example, it means that if a \ex{begin} form in the middle of a pipeline -performs I/O on the current input and output ports, it will be communicating -with its upstream and downstream pipes. -\Eg, this code works as intended without the need for explicit synchronisation: -\begin{verbatim} -(run (| (gunzip) - ;; Kill line 1 and insert doubled-sided - ;; code at head of Postscript. - (begin (read-line) ; Eat first line. 
- (display "%!PS-Adobe-2.0\\n") - (display "statusdict /setduplexmode known ") - (display "{statusdict begin true ") - (display "setduplexmode end} if\n") - (exec-epf (cat))) - (lpr)) - (< paper.ps))\end{verbatim} -Arranging for the \ex{begin} process form to synchronise -the current I/O ports with stdio means that all process forms now -see their epf's redirections. - -\itam{\ex{file-match} more robust} -The \ex{file-match} procedure now catches any error condition -signalled by a match procedure, -and treats it as if the procedure had simply returned {\sharpf}, -\ie, match failure. -This means \ex{file-match} no longer gets blown out of the water by -trying to apply a function like \ex{file-directory?} to a dangling symlink, -and other related OS errors. - -\itam{Standard input now unbuffered} -Scsh's startup code now makes the initial current input port -(corresponding to file descriptor 0) unbuffered. -This keeps the shell from ``stealing'' input meant for subprocesses. -However, it does slow down character-at-a-time input processing. -If you are writing a program that is tolerant of buffered input, -and wish the efficiency gains, you can reset the buffering policy -yourself. - -\itam{``writeable'' now spelled ``writable''} -We inconsistently spelled \ex{file-writable?} and \ex{file-not-writable?} -in the manual and the implementation. -We have now standardised on the common spelling ``writable'' in both. -The older bindings still exist in release 0.4, but will go away in future -releases. - -\itam{\protect\ex{char-set-member?} replaced} -We have de-released the \ex{char-set-member?} procedure. -The scsh 0.3 version of this procedure took arguments -in the following order: - \codex{(char-set-member? \var{char} \var{char-set})} -This argument order is in accordance with standard mathematical useage -(\ie, $x \in S$), and also consistent with the R4RS -\ex{member}, \ex{memq} and \ex{memv} procedures. 
-It is, however, exactly opposite from the argument order -used by the \ex{char-set-member?} in MIT Scheme's character-set library. -If we left things as they were, we risked problems with code -ported over from MIT Scheme. -On the other hand, changing to conformance with MIT Scheme meant -inconsistency with common mathematical notation and other long-standing -Scheme procedures. -Either way was bound to introduce confusion. - -We've taken the approach of simply removing the \ex{char-set-member?} -procedure altogether, and replacing it with a new procedure: -\codex{(char-set-contains? \var{cset} \var{char})} -Note that the argument order is consistent with the name. - -\itam{\ex{file-attributes} now \ex{file-info}} -In keeping with the general convention in scsh of naming procedures -that retrieve information about system resources \ex{\ldots-info} -(\eg, \ex{tty-info}, \ex{user-info}, \ex{group-info}), -the \ex{file-attributes} procedure is now named \ex{file-info}. - -We continue to export a \ex{file-attributes} binding for the current -release, but it will go away in future releases. - -\itam{Renaming of I/O synchronisation procedures} -The \ex{(stdio->stdports \var{thunk})} procedure has been -renamed \ex{with-stdio-ports*}; -there is now a corresponding \ex{with-stdio-ports} special form. -The \ex{stdio->stdports} procedure is now a nullary procedure -that side-effects the current set of current I/O port bindings. - -\itam{New meta-arg line-two syntax} -Scsh now uses a simplified grammar for describing command-line -arguments read by the ``meta-arg'' switch from line two of a shell script. -If you were using this feature in previous releases, the three incompatible -changes of which to be aware are: -(1) tab is no longer allowed as an argument delimiter, -(2) a run of space characters is not equivalent to a single space, -(3) empty arguments are written a different way. 
-\end{itemize} - - -%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% -\subsection{Backwards-compatible improvements} - -Some existing features in scsh have been improved in ways that will -not effect existing code. - -\begin{itemize} -\itam{Improved error reporting} -Exception handlers that print out error messages and warnings now -print their messages on the error output port, -instead of the current output port. -Previous releases used the current output port, -a problem inherited from Scheme 48. - -Previous scsh releases flushed the Scheme 48 debugging tables when -creating the standard scsh heap image. -This trimmed the size of the heap image, but made error messages much -less comprehensible. -We now retain the debugging tables. -This bloats the heap image up by about 600kb. And worth it, too. - -(We also have some new techniques for eliminating the run-time memory -penalty imposed by these large heap images. -Scsh's new static-heap technology allows for this data to be linked -into the text pages of the vm's binary, where it will not be touched -by the GC or otherwise affect the memory system until it is referenced.) - -Finally, scsh now generates more informative error messages for syscall -errors. -For example, a file-open error previously told you what the error was -(\eg, ``Permission denied,'' or ``No such file or directory''), -but not which file you had tried to open. -We've improved this. - -\itam{Closing a port twice allowed} -Scsh used to generate an error if you attempted to close a port -that had already been closed. -This is now allowed. -The close procedure returns a boolean to indicate whether the port had -already been closed or not. - -\itam{Better time precision} -The \ex{time+ticks} procedure now returns sub-second precision on OS's -that support it. - -\itam{Nicer print-methods for basic data-types} -Scsh's standard record types now print more informatively. 
-For example, a process object includes the process id in its -printed representation: the process object for process id 2653 -prints as \verb|#{proc 2653}|. - -\end{itemize} diff --git a/doc/scsh-manual/decls.tex b/doc/scsh-manual/decls.tex index a391681..f927101 100644 --- a/doc/scsh-manual/decls.tex +++ b/doc/scsh-manual/decls.tex @@ -23,6 +23,8 @@ \def\maketildeactive{\catcode`\~=13} \def\~{\char`\~} +\newcommand{\evalsto}{\ensuremath{\Rightarrow}} + % One-line code examples %\newcommand{\codex}[1]% One line, centred. Tight spacing. % {$$\abovedisplayskip=.75ex plus 1ex minus .5ex% diff --git a/doc/scsh-manual/intro.tex b/doc/scsh-manual/intro.tex index 60a2414..622b16b 100644 --- a/doc/scsh-manual/intro.tex +++ b/doc/scsh-manual/intro.tex @@ -18,6 +18,30 @@ This manual gives a complete description of scsh. A general discussion of the design principles behind scsh can be found in a companion paper, ``A Scheme Shell.'' +%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% +\section{Copyright \& source-code license} +Scsh is open source. The complete sources come with the standard +distribution, which can be downloaded off the net. + +For years, scsh's underlying Scheme implementation, Scheme 48, did not have an +open-source copyright. However, around 1999/2000, the Scheme 48 authors +graciously retrofitted a BSD-style open-source copyright onto the system. +Swept up by the fervor, we tacked an ideologically hip license onto scsh +source, ourselves (BSD-style, as well). Not that we ever cared before what you +did with the system. + +As a result, the whole system is now open source, top-to-bottom. + +We note that the code is a rich source for other Scheme implementations +to mine. Not only the \emph{code}, but the \emph{APIs} are available +for implementors working on Scheme environments for systems programming. +These APIs represent years of work, and should provide a big head-start +on any related effort. 
(Just don't call it ``scsh,'' unless it's +\emph{exactly} compliant with the scsh interfaces.) + +Take all the code you like; we'll just write more. + + %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% \section{Obtaining scsh} Scsh is distributed via net publication. @@ -100,14 +124,11 @@ but the system as-released does not currently provide these features. In the current release, the system has some rough edges. It is quite slow to start up---loading the initial image into the -{\scm} virtual machine takes about a cpu second. +{\scm} virtual machine induces a noticeable delay. This can be fixed with the static heap linker provided with this release. -This manual is very, very rough. -At some point, we hope to polish it up, finish it off, and re-typeset it -using markup, so we can generate html, info nodes, and {\TeX} output from -the single source without having to deal with Texinfo. -But it's all there is, for now. +We welcome parties interested in porting the manual to a more portable +XML or SGML format; please contact us if you are interested in doing so. %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% \section{Naming conventions} @@ -375,11 +396,17 @@ All told, the \ex{define-record} form above defines the following procedures: (ship:size \var{ship}) & Retrieve the \var{size} field. \\ \hline (set-ship:x \var{ship} \var{new-x}) & Assign the \var{x} field. \\ -(set-ship:y \var{ship} \var{new-y}) & Assign the \var{x} field. \\ +(set-ship:y \var{ship} \var{new-y}) & Assign the \var{y} field. \\ (set-ship:size \var{ship} \var{new-size}) & Assign the \var{size} field. \\ \hline +(modify-ship:x \var{ship} \var{xfun}) & Modify \var{x} field with \var{xfun}. \\ +(modify-ship:y \var{ship} \var{yfun}) & Modify \var{y} field with \var{yfun}. \\ +(modify-ship:size \var{ship} \var{sizefun}) & Modify \var{size} field with \var{sizefun}. \\ +\hline (ship? \var{object}) & Type predicate. 
\\ \hline +(copy-ship \var{ship}) & Shallow-copy of the record. \\ +\hline \end{tabular} \end{center} % @@ -387,7 +414,9 @@ All told, the \ex{define-record} form above defines the following procedures: An implementation of \ex{define-record} is available as a macro for Scheme programmers to define their own record types; the syntax is accessed by opening the package \ex{defrec-package}, which -exports the single syntax form \ex{define-record}. +exports the single syntax form \ex{define-record}. +See the source code for the \ex{defrec-package} module +for further details of the macro. You must open this package to access the form. Scsh does not export a record-definition package by default as there are @@ -417,21 +446,9 @@ you could not read and internalise such a twisted account without bleeding from the nose and ears. However, you might keep in mind the following simple fact: of all the -standards, {\Posix}, as far as I have been able to determine, -is the least common denominator. +standards, {\Posix} is the least common denominator. So when this manual repeatedly refers to {\Posix}, the point is ``the thing we are describing should be portable just about anywhere.'' -Scsh sticks to {\Posix} when at all possible; it's major departure is +Scsh sticks to {\Posix} when at all possible; its major departure is symbolic links, which aren't in {\Posix} (see---it really \emph{is} a least common denominator). - -However, just because {\Posix} is the l.c.d. standard doesn't mean everyone -supports all of it. -The guerilla PC {\Unix} implementations that have been springing up on -the net (\eg, NetBSD, Linux, FreeBSD, and so forth) are only recently coming -into compliance with the standard---although they are getting there. 
-We have been able to implement scsh completely on all of these systems, -however---the single exception is NeXTSTEP, whose buggy {\Posix} libraries -restricts us to partial support (these lacunae are indicated where relevant -in the rest of the manual).\footnote{Feel like porting scsh from {\Posix} to -NeXT's BSD API? Send us your fixes; we'll fold them in.} diff --git a/doc/scsh-manual/man.tex b/doc/scsh-manual/man.tex index 4e358f4..89ff81b 100644 --- a/doc/scsh-manual/man.tex +++ b/doc/scsh-manual/man.tex @@ -1,4 +1,4 @@ -%&latex -*- latex -*- +% -*- latex -*- % This is the reference manual for the Scheme Shell. @@ -47,7 +47,6 @@ \include{awk} \include{miscprocs} \include{running} -\include{changes} \include{todo} \backmatter diff --git a/doc/scsh-manual/strings.tex b/doc/scsh-manual/strings.tex index a87c66f..983ea26 100644 --- a/doc/scsh-manual/strings.tex +++ b/doc/scsh-manual/strings.tex @@ -1,27 +1,57 @@ % -*- latex -*- \chapter{Strings and characters} -Scsh provides a set of procedures for processing strings and characters. -The procedures provided match regular expressions, search strings, -parse file-names, and manipulate sets of characters. - -Also see chapters \ref{chapt:sre}, \ref{chapt:rdelim} and \ref{chapt:fr-awk} -on regular-expressions, record I/O, field parsing, and the awk loop. -The procedures documented there allow you to search and pattern-match strings, -read character-delimited records from ports, -use regular expressions to split the records into fields -(for example, splitting a string at every occurrence of colon or white-space), -and loop over streams of these records in a convenient way. - -%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% -\section{String manipulation} -\label{sec:stringmanip} - Strings are the basic communication medium for {\Unix} processes, so a -shell language must have reasonable facilities for manipulating them. 
+Unix programming environment must have reasonable facilities for manipulating +them. +Scsh provides a powerful set of procedures for processing strings and +characters. +Besides the facilities described in this chapter, scsh also provides +\begin{itemize} +\itum{Regular expressions (chapter~\ref{chapt:sre})} + A complete regular-expression system. + +\itum{Field parsing, delimited record I/O and the awk loop + (chapter~\ref{chapt:fr-awk})} + These procedures let you read in chunks of text delimited by selected + characters, and + parse each record into fields based on regular expressions + (for example, splitting a string at every occurrence of colon or + white-space). + The \ex{awk} form allows you to loop over streams of these records + in a convenient way. + +\itum{The SRFI-13 string libraries} + This pair of libraries contains procedures that create, fold, iterate over, + search, compare, assemble, cut, hash, case-map, and otherwise manipulate + strings. + They are provided by the \ex{string-lib} and \ex{string-lib-internals} + packages, and are also available in the default \ex{scsh} package. + + More documentation on these procedures can be found at URLs + \begin{tightinset} + % The gratuitous mbox makes xdvi render the hyperlinks better. + \mbox{\url{http://srfi.schemers.org/srfi-13/srfi-13.html}}\\ + \url{http://srfi.schemers.org/srfi-13/srfi-13.txt} + \end{tightinset} + +\itum{The SRFI-14 character-set library} + This library provides a set-of-characters abstraction, which is frequently + useful when searching, parsing, filtering or otherwise operating on + strings and character data. The SRFI is provided by the \ex{char-set-lib} + package; its bindings are also available in the default \ex{scsh} package. + + More documentation on this library can be found at URLs + \begin{tightinset} + % The gratuitous mbox makes xdvi render the hyperlinks better. 
+ \mbox{\url{http://srfi.schemers.org/srfi-14/srfi-14.html}}\\ + \url{http://srfi.schemers.org/srfi-14/srfi-14.txt} + \end{tightinset} + +\end{itemize} %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% -\subsection{Manipulating file-names} +\section{Manipulating file names} \label{sec:filenames} These procedures do not access the file-system at all; they merely operate @@ -30,7 +60,7 @@ design. Perhaps a more sophisticated system would be better, something like the pathname abstractions of {\CommonLisp} or MIT Scheme. However, being {\Unix}-specific, we can be a little less general. -\subsubsection{Terminology} +\subsection{Terminology} These procedures carefully adhere to the {\Posix} standard for file-name resolution, which occasionally entails some slightly odd things. This section will describe these rules, and give some basic terminology. @@ -95,7 +125,7 @@ interpreted in file-name form, \ie, as root. -\subsubsection{Procedures} +\subsection{Procedures} \defun {file-name-directory?} {fname} \boolean \defunx {file-name-non-directory?} {fname} \boolean @@ -355,38 +385,7 @@ is also frequently useful for expanding file-names. %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% -\subsection{Other string manipulation facilities} - -\defun {index} {string char [start]} {{\fixnum} or false} -\defunx {rindex} {string char [start]} {{\fixnum} or false} -\begin{desc} - These procedures search through \var{string} looking for an occurrence - of character \var{char}. \ex{index} searches left-to-right; \ex{rindex} - searches right-to-left. - - \ex{index} returns the smallest index $i$ of \var{string} greater - than or equal to \var{start} such that $\var{string}[i] = \var{char}$. - The default for \var{start} is zero. If there is no such match, - \ex{index} returns false. - - \ex{rindex} returns the largest index $i$ of \var{string} less than - \var{start} such that $\var{string}[i] = \var{char}$. 
- The default for \var{start} is \ex{(string-length \var{string})}. - If there is no such match, \ex{rindex} returns false. -\end{desc} - -I should probably snarf all the MIT Scheme string functions, and stick them -in a package. {\Unix} programs need to mung character strings a lot. - -MIT string match commands: -\begin{tightcode} -[sub]string-match-{forward,backward}[-ci] -[sub]string-{prefix,suffix}[-ci]? -[sub]string-find-{next,previous}-char[-ci] -[sub]string-find-{next,previous}-char-in-set -[sub]string-replace[!] -\ldots\etc\end{tightcode} -These are not currently provided. +\section{Other string manipulation facilities} \begin{defundesc} {substitute-env-vars} {fname} \str Replace occurrences of environment variables with their values. @@ -412,315 +411,72 @@ These are not currently provided. \end{desc} %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% -\section{Character sets} -\label{sec:char-sets} +\section{Character predicates} -Scsh provides a \ex{char-set} type for expressing sets of characters. -These sets are used by some of the delimited-input procedures -(section~\ref{sec:field-reader}). -Scsh's character set package was adapted and extended from -Project Mac's MIT Scheme package. -Note that the character type used in the current implementation corresponds -to the ASCII character set---but you would be wise not to build this -assumption into your code if you can help it.\footnote{ - Actually, it's slightly uglier than that, albeit somewhat more - useful. The current character type corresponds to an eight-bit - superset of ASCII. The \ex{ascii->char} and \ex{char->ascii} - functions will preserve this eighth bit. However, none of the - the high 128 characters appear in any of the standard character - sets defined in section~\ref{sec:std-csets}, except for - \ex{char-set:full}. 
If someone would email the authors a listing - of the full Latin-1 definition, we'll be happy to upgrade these - sets' definitions to make them Latin-1 compliant.} - -\defun{char-set?}{x}\boolean -\begin{desc} -Is the object \var{x} a character set? -\end{desc} - -\defun{char-set=}{\vari{cs}1 \vari{cs}2\ldots}\boolean -\begin{desc} -Are the character sets equal? -\end{desc} - -\defun{char-set<=}{\vari{cs}1 \vari{cs}2\ldots}\boolean -\begin{desc} -Returns true if every character set \vari{cs}{i} is -a subset of character set \vari{cs}{i+1}. -\end{desc} - -\defun{char-set-fold}{kons knil cs}\object -\begin{desc} -This is the fundamental iterator for character sets. -Applies the function \var{kons} across the character set \var{cs} using -initial state value \var{knil}. -That is, if \var{cs} is the empty set, the procedure returns \var{knil}. -Otherwise, some element \var{c} of \var{cs} is chosen; let \var{cs'} be -the remaining, unchosen characters. -The procedure returns -\begin{tightcode} -(char-set-fold \var{kons} (\var{kons} \var{c} \var{knil}) \var{cs'})\end{tightcode} -For example, we could define \ex{char-set-members} (see below) -as -\begin{tightcode} -(lambda (cs) (char-set-fold cons '() cs))\end{tightcode} - -\remark{This procedure was formerly named \texttt{\indx{reduce-char-set}}. - The old binding is still provided, but is deprecated and will - probably vanish in a future release.} -\end{desc} - -\defun{char-set-for-each}{p cs}{\undefined} -\begin{desc} -Apply procedure \var{p} to each character in the character set \var{cs}. -Note that the order in which \var{p} is applied to the characters in the -set is not specified, and may even change from application to application. -\end{desc} - -%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% -\subsection{Creating character sets} - -\defun{char-set}{\vari{char}1\ldots}{char-set} -\begin{desc} -Return a character set containing the given characters. 
-\end{desc} - -\defun{chars->char-set}{chars}{char-set} -\begin{desc} -Return a character set containing the characters in the list \var{chars}. -\end{desc} - -\defun{string->char-set}{s}{char-set} -\begin{desc} -Return a character set containing the characters in the string \var{s}. -\end{desc} - -\defun{predicate->char-set}{pred}{char-set} -\begin{desc} -Returns a character set containing every character \var{c} such that -\ex{(\var{pred} \var{c})} returns true. -\end{desc} - -\defun{ascii-range->char-set}{lower upper}{char-set} -\begin{desc} -Returns a character set containing every character whose {\Ascii} -code lies in the half-open range $[\var{lower},\var{upper})$. -\end{desc} - -%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% -\subsection{Querying character sets} -\defun {char-set-members}{char-set}{character-list} -\begin{desc} -This procedure returns a list of the members of \var{char-set}. -\end{desc} - -\defunx{char-set-contains?}{char-set char}\boolean -\begin{desc} -This procedure tests \var{char} for membership in set \var{char-set}. -\remark{Previous releases of scsh called this procedure \ex{char-set-member?}, -reversing the order of the arguments. -This made sense, but was unfortunately the reverse order in which the -arguments appear in MIT Scheme. -A reasonable argument order was not backwards-compatible with MIT Scheme; -on the other hand, the MIT Scheme argument order was counter-intuitive -and at odds with common mathematical notation and the \ex{member} family -of R4RS procedures. - -We sought to escape the dilemma by shifting to a new name.} -\end{desc} - -\defun{char-set-size}{cs}\integer -\begin{desc} -Returns the number of elements in character set \var{cs}. -\end{desc} - -\defun{char-set-every?}{pred cs}\boolean -\defunx{char-set-any?}{pred cs}\object -\begin{desc} -The \ex{char-set-every?} procedure returns true if predicate \var{pred} -returns true of every character in the character set \var{cs}. 
- -Likewise, \ex{char-set-any?} applies \var{pred} to every character in -character set \var{cs}, and returns the first true value it finds. -If no character produces a true value, it returns false. - -The order in which these procedures sequence through the elements of -\var{cs} is not specified. -\end{desc} - - -%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% -\subsection{Character-set algebra} -\defun {char-set-invert}{char-set}{char-set} -\defunx{char-set-union}{\vari{char-set}1\ldots}{char-set} -\defunx{char-set-intersection}{\vari{char-set}1 \vari{char-set}2\ldots}{char-set} -\defunx{char-set-difference}{\vari{char-set}1 \vari{char-set}2\ldots}{char-set} -\begin{desc} -These procedures implement set complement, union, intersection, and difference -for character sets. -The union, intersection, and difference operations are n-ary, associating -to the left; the difference function requires at least one argument, while -union and intersection may be applied to zero arguments. -\end{desc} - -\defun {char-set-adjoin}{cs \vari{char}1\ldots}{char-set} -\defunx{char-set-delete}{cs \vari{char}1\ldots}{char-set} -\begin{desc} -Add/delete the \vari{char}i characters to/from character set \var{cs}. 
-\end{desc} - - -%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% -\subsection{Standard character sets} -\label{sec:std-csets} -Several character sets are predefined for convenience: - -\begin{center} -\newcommand{\entry}[1]{\ex{#1}\index{#1}} -\begin{tabular}{|ll|} -\hline -\entry{char-set:lower-case} & Lower-case alphabetic chars \\ -\entry{char-set:upper-case} & Upper-case alphabetic chars \\ -\entry{char-set:alphabetic} & Alphabetic chars \\ -\entry{char-set:numeric} & Decimal digits: 0--9 \\ -\entry{char-set:alphanumeric} & Alphabetic or numeric \\ -\entry{char-set:graphic} & Printing characters except space \\ -\entry{char-set:printing} & Printing characters including space \\ -\entry{char-set:whitespace} & Whitespace characters \\ -\entry{char-set:control} & Control characters \\ -\entry{char-set:punctuation} & Punctuation characters \\ -\entry{char-set:hex-digit} & A hexadecimal digit: 0--9, A--F, a--f \\ -\entry{char-set:blank} & Blank characters \\ -\entry{char-set:ascii} & A character in the ASCII set. \\ -\entry{char-set:empty} & Empty set \\ -\entry{char-set:full} & All characters \\ -\hline -\end{tabular} -\end{center} -The first eleven of these correspond to the character classes defined in -Posix. -Note that there may be characters in \ex{char-set:alphabetic} that are -neither upper or lower case---this might occur in implementations that -use a character type richer than ASCII, such as Unicode. -A ``graphic character'' is one that would put ink on your page. 
-While the exact composition of these sets may vary depending upon the -character type provided by the Scheme system upon which scsh is running, -here are the definitions for some of the sets in an ASCII character set: -\begin{center} -\newcommand{\entry}[1]{\ex{#1}\index{#1}} -\begin{tabular}{|ll|} -\hline -char-set:alphabetic & A--Z and a--z \\ -char-set:lower-case & a--z \\ -char-set:upper-case & A--Z \\ -char-set:graphic & Alphanumeric + punctuation \\ -char-set:whitespace & Space, newline, tab, page, - vertical tab, carriage return \\ -char-set:blank & Space and tab \\ -char-set:control & ASCII 0--31 and 127 \\ -char-set:punctuation & \verb|!"#$%&'()*+,-./:;<=>|\verb#?@[\]^_`{|}~# \\ -\hline -\end{tabular} -\end{center} - - -\defun {char-alphabetic?}\character\boolean +\defun {char-letter?}\character\boolean \defunx{char-lower-case?}\character\boolean \defunx{char-upper-case?}\character\boolean -\defunx{char-numeric? }\character\boolean -\defunx{char-alphanumeric?}\character\boolean +\defunx{char-title-case?}\character\boolean +\defunx{char-digit?}\character\boolean +\defunx{char-letter+digit?}\character\boolean \defunx{char-graphic?}\character\boolean \defunx{char-printing?}\character\boolean \defunx{char-whitespace?}\character\boolean \defunx{char-blank?}\character\boolean -\defunx{char-control?}\character\boolean +\defunx{char-iso-control?}\character\boolean \defunx{char-punctuation?}\character\boolean \defunx{char-hex-digit?}\character\boolean \defunx{char-ascii?}\character\boolean \begin{desc} -These predicates are defined in terms of the above character sets. +Each of these predicates tests for membership in one of the standard +character sets provided by the SRFI-14 character-set library. 
+Additionally, the following redundant bindings are provided for {R5RS} +compatibility: +\begin{inset} +\begin{tabular}{ll} + {R5RS} name & scsh definition \\ \hline + \ex{char-alphabetic?} & \ex{char-letter?} \\ + \ex{char-numeric?} & \ex{char-digit?} \\ + \ex{char-alphanumeric?} & \ex{char-letter+digit?} +\end{tabular} +\end{inset} \end{desc} + %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% -\subsection{Linear-update character-set operations} -These procedures have a hybrid pure-functional/side-effecting semantics: -they are allowed, but not required, to side-effect one of their parameters -in order to construct their result. -An implementation may legally implement these procedures as pure, -side-effect-free functions, or it may implement them using side effects, -depending upon the details of what is the most efficient or simple to -implement in terms of the underlying representation. +\section{Deprecated character-set procedures} +\label{sec:char-sets} -What this means is that clients of these procedures \emph{may not} rely -upon these procedures working by side effect. -For example, this is not guaranteed to work: -\begin{verbatim} -(let ((cs (char-set #\a #\b #\c))) - (char-set-adjoin! cs #\d) - cs) ; Could be either {a,b,c} or {a,b,c,d}. -\end{verbatim} -However, this is well-defined: -\begin{verbatim} -(let ((cs (char-set #\a #\b #\c))) - (char-set-adjoin! cs #\d)) ; {a,b,c,d} -\end{verbatim} -So clients of these procedures write in a functional style, but must -additionally be sure that, when the procedure is called, there are no -other live pointers to the potentially-modified character set (hence the term -``linear update''). +The SRFI-14 character-set library grew out of an earlier library developed +for scsh. +However, the SRFI standardisation process introduced incompatibilities with +the original scsh bindings.
+The current version of scsh provides the library + \ex{obsolete-char-set-lib}, which contains the old bindings found in +previous releases of scsh. +The following table lists the members of this library, along with +the equivalent SRFI-14 binding. This obsolete library is deprecated and +\emph{not} open by default in the standard \ex{scsh} environment; +new code should use the SRFI-14 bindings. +\begin{inset} +\begin{tabular}{ll} + Old \ex{obsolete-char-set-lib} & SRFI-14 \ex{char-set-lib} \\ \hline -There are two benefits to this convention: -\begin{itemize} -\item Implementations are free to provide the most efficient possible - implementation, either functional or side-effecting. -\item Programmers may nonetheless continue to assume that character sets - are purely functional data structures: they may be reliably shared - without needing to be copied, uniquified, and so forth. -\end{itemize} - -In practice, these procedures are most useful for efficiently constructing -character sets in a side-effecting manner, in some limited local context, -before passing the character set outside the local construction scope to be -used in a functional manner. - -Scsh provides no assistance in checking the linearity of the potentially -side-effected parameters passed to these functions --- there's no linear -type checker or run-time mechanism for detecting violations. - -\defun{char-set-copy}{cs}{char-set} -\begin{desc} -Returns a copy of the character set \var{cs}. -``Copy'' means that if either the input parameter or the -result value of this procedure is passed to one of the linear-update -procedures described below, the other character set is guaranteed -not to be altered. -(A system that provides pure-functional implementations of the rest of -the linear-operator suite could implement this procedure as the -identity function.)
-\end{desc} - -\defun{char-set-adjoin!}{cs \vari{char}1\ldots}{char-set} -\begin{desc} -Add the \vari{char}i characters to character set \var{cs}, and -return the result. -This procedure is allowed, but not required, to side-effect \var{cs}. -\end{desc} - -\defun{char-set-delete!}{cs \vari{char}1\ldots}{char-set} -\begin{desc} -Remove the \vari{char}i characters to character set \var{cs}, and -return the result. -This procedure is allowed, but not required, to side-effect \var{cs}. -\end{desc} - -\defun {char-set-invert!}{char-set}{char-set} -\defunx{char-set-union!}{\vari{char-set}1 \vari{char-set}2\ldots}{char-set} -\defunx{char-set-intersection!}{\vari{char-set}1 \vari{char-set}2\ldots}{char-set} -\defunx{char-set-difference!}{\vari{char-set}1 \vari{char-set}2\ldots}{char-set} -\begin{desc} -These procedures implement set complement, union, intersection, and difference -for character sets. -They are allowed, but not required, to side-effect their first parameter. -The union, intersection, and difference operations are n-ary, associating -to the left. -\end{desc} + \ex{char-set-members} & \ex{char-set->list} \\ + \ex{chars->char-set} & \ex{list->char-set} \\ + \ex{ascii-range->char-set} & \ex{ucs-range->char-set} (not exact) \\ + \ex{predicate->char-set} & \ex{char-set-filter} (not exact) \\ + \ex{char-set-every?} & \ex{char-set-every} \\ + \ex{char-set-any?} & \ex{char-set-any} \\ + \\ + \ex{char-set-invert} & \ex{char-set-complement} \\ + \ex{char-set-invert!} & \ex{char-set-complement!} \\ + \\ + \ex{char-set:alphabetic} & \ex{char-set:letter} \\ + \ex{char-set:numeric} & \ex{char-set:digit} \\ + \ex{char-set:alphanumeric} & \ex{char-set:letter+digit} \\ + \ex{char-set:control} & \ex{char-set:iso-control} +\end{tabular} +\end{inset} +Note also that the \ex{->char-set} procedure no longer handles a predicate +argument.
diff --git a/doc/scsh-manual/syscalls.tex b/doc/scsh-manual/syscalls.tex index cc25f93..64be7ad 100644 --- a/doc/scsh-manual/syscalls.tex +++ b/doc/scsh-manual/syscalls.tex @@ -963,11 +963,6 @@ Note that once a Scheme port is revealed in scsh, the runtime will not shift the port around with \ex{dup()} and \ex{close()}. This means the file-locking procedures can then be applied to the port's associated file descriptor. - -NeXTSTEP users should also note that even minimalist {\Posix} file locking -is not supported for NFS-mounted files in NeXTSTEP; NeXT claims they will -fix this in NS release 4. -We'd appreciate hearing from users when and if this happens. } {\Posix} allows the user to lock a region of a file with either @@ -1392,8 +1387,8 @@ Returns: Note that the rules of backslash for {\Scheme} strings and glob patterns work together to require four backslashes in a row to specify a - single literal backslash. Fortunately, this should be a rare - occurrence. + single literal backslash. Fortunately, it is very rare that a backslash + occurs in a Unix file name. A glob subpattern will not match against dot files unless the first character of the subpattern is a literal ``\ex{.}''. @@ -2623,9 +2618,6 @@ all of the complexity is optional, and defaulting all the optional arguments reduces the system to a simple interface. -\remark{This time package does not currently work with NeXTSTEP, as NeXTSTEP -does not provide a {\Posix}-compliant time library that will even link.} - \subsection{Terminology} ``UTC'' and ``UCT'' stand for ``universal coordinated time,'' which is the official name for what is colloquially referred to as ``Greenwich Mean @@ -2992,7 +2984,8 @@ If \var{val} is {\sharpf}, then any entry for \var{var} is deleted. an alist, \eg, \begin{code} (("TERM" . "vt100") - ("SHELL" . "/bin/csh") + ("SHELL" . "/usr/local/bin/scsh") + ("PATH" . "/sbin:/usr/sbin:/bin:/usr/bin") ("EDITOR" . 
"emacs") \ldots)\end{code} \end{desc} @@ -3005,6 +2998,21 @@ If \var{val} is {\sharpf}, then any entry for \var{var} is deleted. environment (\ie, converted to a null-terminated C vector of \ex{"\var{var}=\var{val}"} strings which is assigned to the global \ex{char **environ}). + +\begin{code} +;;; Note $PATH entry is converted +;;; to /sbin:/usr/sbin:/bin:/usr/bin. +(alist->env '(("TERM" . "vt100") + ("PATH" "/sbin" "/usr/sbin" "/bin" "/usr/bin") + ("SHELL" . "/usr/local/bin/scsh"))) +\end{code} + +Note that \ex{env->alist} and \ex{alist->env} are not exact +inverses---\ex{alist->env} will convert a list value into a single +colon-separated string, but \ex{env->alist} will not parse colon-separated +values into lists. (See the \ex{\$PATH} element in the examples given for +each procedure.) + \end{desc} The following three functions help the programmer manipulate alist @@ -3082,18 +3090,30 @@ Example: These four pieces of code all run the mailer with special \subsection{Path lists and colon lists} -Environment variables such as \ex{\$PATH} encode a list of strings -by separating the list elements with colon delimiters. -Once parsed into actual lists, these ordered lists can be manipulated -with the following two functions. +When environment variables such as \ex{\$PATH} need to encode a list of +strings (such as a list of directories to be searched), +the common Unix convention is to separate the list elements with +colon delimiters.\footnote{\ldots and hope the individual list elements +don't contain colons themselves.} To convert between the colon-separated string encoding and the -list-of-strings representation, see the \ex{field-reader} and -\ex{join-strings} functions in section~\ref{sec:field-reader}. -\remark{An earlier release of scsh provided the \ex{split-colon-list} - and \ex{string-list->colon-list} functions.
These have been - removed from scsh, and are replaced by the more general - parsers and unparsers of the field-reader module.} +list-of-strings representation, see the \ex{infix-splitter} function +(section~\ref{sec:field-splitter}) and the string library's +\ex{string-join} function. +For example, +\begin{code} +(define split (infix-splitter (rx ":"))) +(split "/sbin:/bin::/usr/bin") {\evalsto} + '("/sbin" "/bin" "" "/usr/bin") +(string-join '("/sbin" "/bin" "" "/usr/bin") ":") {\evalsto} + "/sbin:/bin::/usr/bin"\end{code} +The following two functions are useful for manipulating these ordered lists, +once they have been parsed from their colon-separated form. +%\remark{An earlier release of scsh provided the \ex{split-colon-list} +% and \ex{string-list->colon-list} functions. These have been +% removed from scsh, and are replaced by the more general +% parsers and unparsers of the field-reader module.} +% %\defun {split-colon-list} {string} {{\str} list} %\defunx {string-list->colon-list} {string-list} \str %\begin{desc} @@ -3146,15 +3166,18 @@ Scsh never uses \cd{$USER} at all. It computes \ex{(user-login-name)} from the system call \ex{(user-uid)}. \defvar {home-directory} \str -\defvarx {exec-path-list} {{\str} list} +\defvarx {exec-path-list} {{\str} list fluid} \begin{desc} Scsh accesses \cd{$HOME} at start-up time, and stores the value in the global variable \ex{home-directory}. It uses this value for \ex{\~} lookups and for returning to home on \ex{(chdir)}. Scsh accesses \cd{$PATH} at start-up time, colon-splits the path list, and - stores the value in the global variable \ex{exec-path-list}. This list is + stores the value in the fluid \ex{exec-path-list}. This list is used for \ex{exec-path} and \ex{exec-path/env} searches. + + To access, rebind or side-effect fluid cells, you must open + the \ex{fluids} package.
\end{desc} %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% diff --git a/doc/scsh-manual/todo.tex b/doc/scsh-manual/todo.tex index 285e13d..622e6fc 100644 --- a/doc/scsh-manual/todo.tex +++ b/doc/scsh-manual/todo.tex @@ -17,19 +17,8 @@ an elegant language; go wild. \item An X gui interface. (Needs threads.) \item A better C function/data-structure interface. This is not easy. \item More network protocols. Telnet and ftp would be the most important. -\item An ILU interface. -\item An RPC system, with ``tail-recursion.'' -\item Interfaces to relational db's. - This would be quite useful for Web servers. - An s-expression embedding of SQL would be a key design component - of such a system, along the lines of scsh's process notation or - \ex{awk} notation. \item Port Edwin, and emacs text editor written in MIT Scheme, to scsh. Combine it with scsh's OS interfaces to make a visual shell. -\item An \ex{expect} knock-off. -\item A \ex{make} replacement, using scsh's process notation in the build - rules. - \item Manual hacking. \begin{itemize} \item The {\LaTeX} hackery needs yet another serious pass. Most importantly,