*** empty log message ***
parent
85003bce1d
commit
32b0c4bea5
|
@ -76,6 +76,7 @@ characters.
|
||||||
|
|
||||||
|
|
||||||
\subsection{Parsing fields}
|
\subsection{Parsing fields}
|
||||||
|
\label{sec:field-splitter}
|
||||||
|
|
||||||
\defun {field-splitter} {[field num-fields]} \proc
|
\defun {field-splitter} {[field num-fields]} \proc
|
||||||
\defunx {infix-splitter} {[delim num-fields handle-delim]} \proc
|
\defunx {infix-splitter} {[delim num-fields handle-delim]} \proc
|
||||||
|
|
|
@ -1,297 +0,0 @@
|
||||||
%&latex -*- latex -*-
|
|
||||||
|
|
||||||
\chapter{Changes from previous releases}
|
|
||||||
\label{sec:changes}
|
|
||||||
|
|
||||||
\newcommand{\itam}[1]{\item {#1} \\}
|
|
||||||
|
|
||||||
\section{Changes from the previous release}
|
|
||||||
|
|
||||||
This section details changes that have been made in scsh since
|
|
||||||
the previous release.
|
|
||||||
|
|
||||||
Scsh is now much more robust.
|
|
||||||
All known bugs have been fixed.
|
|
||||||
There have been many improvements and extensions made.
|
|
||||||
These new features and changes are listed below, in no particular order;
|
|
||||||
the relevant sections of the manual give the full details.
|
|
||||||
|
|
||||||
Scsh now supports complete {\Posix}, including signal handlers.
|
|
||||||
Early autoreaping of child processes is now handled by a \ex{SIGCHLD}
|
|
||||||
signal handler, so children are reaped as early as possible with no
|
|
||||||
user intervention required.
|
|
||||||
|
|
||||||
A functional static heap linker is included in this release.
|
|
||||||
It is ugly, limited in functionality, and extremely slow, but it works.
|
|
||||||
It can be used to build scsh binaries that start up instantly.
|
|
||||||
|
|
||||||
The regular expression system has been sped up.
|
|
||||||
Regular-expression compilation is now provided,
|
|
||||||
and the \ex{awk} macro has been rewritten to pre-compile
|
|
||||||
regexps used in rules outside the loop.
|
|
||||||
It is still, however, slower than it should be.
|
|
||||||
|
|
||||||
Execing programs should be faster in this release, since we now use the
|
|
||||||
\ex{CLOEXEC} status bit to get automatic closing of unrevealed
|
|
||||||
port file descriptors.
|
|
||||||
|
|
||||||
{scm}'s floating point support was inadvertently omitted from the last
|
|
||||||
release. It has been reinstated.
|
|
||||||
|
|
||||||
There is now a new command-line switch, \ex{-sfd \var{num}},
|
|
||||||
which causes scsh to read its script from file descriptor \var{num}.
|
|
||||||
|
|
||||||
|
|
||||||
\section{Changes from the penultimate release}
|
|
||||||
|
|
||||||
This section details changes that have been made in scsh since
|
|
||||||
the penultimate release.
|
|
||||||
|
|
||||||
Scsh is now much more robust.
|
|
||||||
All known bugs have been fixed.
|
|
||||||
There have been many improvements and extensions made.
|
|
||||||
We have also made some incompatible changes.
|
|
||||||
|
|
||||||
The sections below briefly describe these new features and changes;
|
|
||||||
the relevant sections of the manual give the full details.
|
|
||||||
|
|
||||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
|
||||||
\subsection{New features}
|
|
||||||
This release incorporates several new features into scsh.
|
|
||||||
|
|
||||||
\begin{itemize}
|
|
||||||
\itam{Control of buffered I/O}
|
|
||||||
Scsh now allows you to control the buffering policy used for doing I/O
|
|
||||||
on a Scheme port.
|
|
||||||
|
|
||||||
\itam{Here-strings}
|
|
||||||
Scsh now has a new lexical feature, \verb|#<<|, that provides
|
|
||||||
the ability to enter long, multi-line string constants in scsh programs.
|
|
||||||
Such a string is called a ``here string,'' by analogy to the common
|
|
||||||
shell ``here document'' \ex{<<} redirection.
|
|
||||||
|
|
||||||
\itam{Delimited readers and read-line}
|
|
||||||
Scsh now has a powerful set of delimited readers.
|
|
||||||
These can be used to read input delimited by
|
|
||||||
a newline character (\ex{read-line}),
|
|
||||||
a blank line (\ex{read-paragraph}),
|
|
||||||
or the occurrence of any character in an arbitrary set (\ex{read-delimited}).
|
|
||||||
|
|
||||||
While these procedures can be applied to any Scheme input port,
|
|
||||||
there is native-code support for performing delimited reads on
|
|
||||||
Unix input sources, so doing block input with these procedures should be
|
|
||||||
much faster than the equivalent character-at-a-time Scheme code.
|
|
||||||
|
|
||||||
\itam{New system calls}
|
|
||||||
With the sole exception of signal handlers, scsh now has all of {\Posix}.
|
|
||||||
This release introduces
|
|
||||||
\begin{itemize}
|
|
||||||
\item \ex{select},
|
|
||||||
\item full terminal device control,
|
|
||||||
\item support for pseudo-terminal ``pty'' devices,
|
|
||||||
\item file locking,
|
|
||||||
\item process timing,
|
|
||||||
\item \ex{set-file-times},
|
|
||||||
\item \ex{seek} and \ex{tell}.
|
|
||||||
\end{itemize}
|
|
||||||
|
|
||||||
Note that having \ex{select}, pseudo-terminals, and tty device control means
|
|
||||||
that it is now possible to implement interesting network protocols, such as
|
|
||||||
telnet servers and clients, directly in Scheme.
|
|
||||||
|
|
||||||
\itam{New command-line switches}
|
|
||||||
There is a new set of command-line switches that make it possible
|
|
||||||
to write shell scripts using the {\scm} module system.
|
|
||||||
Scripts can use the new command-line switches to open dependent
|
|
||||||
modules and load dependent source code.
|
|
||||||
Scripts can also be written in the {\scm} module language,
|
|
||||||
which allows you to use it both as a standalone shell script,
|
|
||||||
and as a code module that can be loaded and used by other Scheme programs.
|
|
||||||
|
|
||||||
\itam{Static heap linking}
|
|
||||||
There is a new facility that allows you to compile a heap image
|
|
||||||
to a \ex{.o} file that can be linked with the scsh virtual machine.
|
|
||||||
This produces a standalone executable binary, makes startup time
|
|
||||||
near-instantaneous, and greatly improves memory performance---the
|
|
||||||
initial heap image is placed in the process' text pages,
|
|
||||||
where it is shared by different scsh processes, and does not occupy
|
|
||||||
space in the run-time heap.
|
|
||||||
|
|
||||||
\oops{The static heap linker was not documented and installed in time
|
|
||||||
for this release.}
|
|
||||||
|
|
||||||
|
|
||||||
\end{itemize}
|
|
||||||
|
|
||||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
|
||||||
\subsection{Incompatible improvements}
|
|
||||||
Some features of scsh have been improved in ways that are
|
|
||||||
not backwards-compatible with previous releases.
|
|
||||||
These changes should not affect most code;
|
|
||||||
however, please note the changes and modify your code accordingly.
|
|
||||||
|
|
||||||
\begin{itemize}
|
|
||||||
\itam{New process-object data-type returned by \ex{fork}}
|
|
||||||
Previous releases were prone to fill up the kernel's process table
|
|
||||||
if a program forked large numbers of processes and subsequently failed
|
|
||||||
to use \ex{wait} to reclaim the entries in the kernel's process table.
|
|
||||||
(This is a problem in standard C environments, as well.)
|
|
||||||
|
|
||||||
Scsh 0.4 introduces a new mechanism for automatically managing subprocesses.
|
|
||||||
Processes are no longer represented by an integer process id,
|
|
||||||
which is impossible to garbage-collect, but by an
|
|
||||||
abstract process data type that encapsulates the process id.
|
|
||||||
All processes are represented using the new data structures;
|
|
||||||
see the relevant section of the manual for further details.
|
|
||||||
|
|
||||||
\itam{Better stdio/current-port synchronisation}
|
|
||||||
The \ex{(begin \ldots)} process form now does a \ex{stdio->stdports}
|
|
||||||
call before executing its body.
|
|
||||||
This means that the Scheme code in the body ``sees'' any external
|
|
||||||
redirections.
|
|
||||||
For example, it means that if a \ex{begin} form in the middle of a pipeline
|
|
||||||
performs I/O on the current input and output ports, it will be communicating
|
|
||||||
with its upstream and downstream pipes.
|
|
||||||
\Eg, this code works as intended without the need for explicit synchronisation:
|
|
||||||
\begin{verbatim}
|
|
||||||
(run (| (gunzip)
|
|
||||||
;; Kill line 1 and insert double-sided
|
|
||||||
;; code at head of Postscript.
|
|
||||||
(begin (read-line) ; Eat first line.
|
|
||||||
(display "%!PS-Adobe-2.0\n")
|
|
||||||
(display "statusdict /setduplexmode known ")
|
|
||||||
(display "{statusdict begin true ")
|
|
||||||
(display "setduplexmode end} if\n")
|
|
||||||
(exec-epf (cat)))
|
|
||||||
(lpr))
|
|
||||||
(< paper.ps))\end{verbatim}
|
|
||||||
Arranging for the \ex{begin} process form to synchronise
|
|
||||||
the current I/O ports with stdio means that all process forms now
|
|
||||||
see their epf's redirections.
|
|
||||||
|
|
||||||
\itam{\ex{file-match} more robust}
|
|
||||||
The \ex{file-match} procedure now catches any error condition
|
|
||||||
signalled by a match procedure,
|
|
||||||
and treats it as if the procedure had simply returned {\sharpf},
|
|
||||||
\ie, match failure.
|
|
||||||
This means \ex{file-match} no longer gets blown out of the water by
|
|
||||||
trying to apply a function like \ex{file-directory?} to a dangling symlink,
|
|
||||||
and other related OS errors.
|
|
||||||
|
|
||||||
\itam{Standard input now unbuffered}
|
|
||||||
Scsh's startup code now makes the initial current input port
|
|
||||||
(corresponding to file descriptor 0) unbuffered.
|
|
||||||
This keeps the shell from ``stealing'' input meant for subprocesses.
|
|
||||||
However, it does slow down character-at-a-time input processing.
|
|
||||||
If you are writing a program that is tolerant of buffered input,
|
|
||||||
and wish the efficiency gains, you can reset the buffering policy
|
|
||||||
yourself.
|
|
||||||
|
|
||||||
\itam{``writeable'' now spelled ``writable''}
|
|
||||||
We inconsistently spelled \ex{file-writable?} and \ex{file-not-writable?}
|
|
||||||
in the manual and the implementation.
|
|
||||||
We have now standardised on the common spelling ``writable'' in both.
|
|
||||||
The older bindings still exist in release 0.4, but will go away in future
|
|
||||||
releases.
|
|
||||||
|
|
||||||
\itam{\protect\ex{char-set-member?} replaced}
|
|
||||||
We have de-released the \ex{char-set-member?} procedure.
|
|
||||||
The scsh 0.3 version of this procedure took arguments
|
|
||||||
in the following order:
|
|
||||||
\codex{(char-set-member? \var{char} \var{char-set})}
|
|
||||||
This argument order is in accordance with standard mathematical usage
|
|
||||||
(\ie, $x \in S$), and also consistent with the R4RS
|
|
||||||
\ex{member}, \ex{memq} and \ex{memv} procedures.
|
|
||||||
It is, however, exactly opposite from the argument order
|
|
||||||
used by the \ex{char-set-member?} in MIT Scheme's character-set library.
|
|
||||||
If we left things as they were, we risked problems with code
|
|
||||||
ported over from MIT Scheme.
|
|
||||||
On the other hand, changing to conformance with MIT Scheme meant
|
|
||||||
inconsistency with common mathematical notation and other long-standing
|
|
||||||
Scheme procedures.
|
|
||||||
Either way was bound to introduce confusion.
|
|
||||||
|
|
||||||
We've taken the approach of simply removing the \ex{char-set-member?}
|
|
||||||
procedure altogether, and replacing it with a new procedure:
|
|
||||||
\codex{(char-set-contains? \var{cset} \var{char})}
|
|
||||||
Note that the argument order is consistent with the name.
|
|
||||||
|
|
||||||
\itam{\ex{file-attributes} now \ex{file-info}}
|
|
||||||
In keeping with the general convention in scsh of naming procedures
|
|
||||||
that retrieve information about system resources \ex{\ldots-info}
|
|
||||||
(\eg, \ex{tty-info}, \ex{user-info}, \ex{group-info}),
|
|
||||||
the \ex{file-attributes} procedure is now named \ex{file-info}.
|
|
||||||
|
|
||||||
We continue to export a \ex{file-attributes} binding for the current
|
|
||||||
release, but it will go away in future releases.
|
|
||||||
|
|
||||||
\itam{Renaming of I/O synchronisation procedures}
|
|
||||||
The \ex{(stdio->stdports \var{thunk})} procedure has been
|
|
||||||
renamed \ex{with-stdio-ports*};
|
|
||||||
there is now a corresponding \ex{with-stdio-ports} special form.
|
|
||||||
The \ex{stdio->stdports} procedure is now a nullary procedure
|
|
||||||
that side-effects the current set of current I/O port bindings.
|
|
||||||
|
|
||||||
\itam{New meta-arg line-two syntax}
|
|
||||||
Scsh now uses a simplified grammar for describing command-line
|
|
||||||
arguments read by the ``meta-arg'' switch from line two of a shell script.
|
|
||||||
If you were using this feature in previous releases, the three incompatible
|
|
||||||
changes of which to be aware are:
|
|
||||||
(1) tab is no longer allowed as an argument delimiter,
|
|
||||||
(2) a run of space characters is not equivalent to a single space,
|
|
||||||
(3) empty arguments are written a different way.
|
|
||||||
\end{itemize}
|
|
||||||
|
|
||||||
|
|
||||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
|
||||||
\subsection{Backwards-compatible improvements}
|
|
||||||
|
|
||||||
Some existing features in scsh have been improved in ways that will
|
|
||||||
not affect existing code.
|
|
||||||
|
|
||||||
\begin{itemize}
|
|
||||||
\itam{Improved error reporting}
|
|
||||||
Exception handlers that print out error messages and warnings now
|
|
||||||
print their messages on the error output port,
|
|
||||||
instead of the current output port.
|
|
||||||
Previous releases used the current output port,
|
|
||||||
a problem inherited from Scheme 48.
|
|
||||||
|
|
||||||
Previous scsh releases flushed the Scheme 48 debugging tables when
|
|
||||||
creating the standard scsh heap image.
|
|
||||||
This trimmed the size of the heap image, but made error messages much
|
|
||||||
less comprehensible.
|
|
||||||
We now retain the debugging tables.
|
|
||||||
This bloats the heap image up by about 600kb. And worth it, too.
|
|
||||||
|
|
||||||
(We also have some new techniques for eliminating the run-time memory
|
|
||||||
penalty imposed by these large heap images.
|
|
||||||
Scsh's new static-heap technology allows for this data to be linked
|
|
||||||
into the text pages of the vm's binary, where it will not be touched
|
|
||||||
by the GC or otherwise affect the memory system until it is referenced.)
|
|
||||||
|
|
||||||
Finally, scsh now generates more informative error messages for syscall
|
|
||||||
errors.
|
|
||||||
For example, a file-open error previously told you what the error was
|
|
||||||
(\eg, ``Permission denied,'' or ``No such file or directory''),
|
|
||||||
but not which file you had tried to open.
|
|
||||||
We've improved this.
|
|
||||||
|
|
||||||
\itam{Closing a port twice allowed}
|
|
||||||
Scsh used to generate an error if you attempted to close a port
|
|
||||||
that had already been closed.
|
|
||||||
This is now allowed.
|
|
||||||
The close procedure returns a boolean to indicate whether the port had
|
|
||||||
already been closed or not.
|
|
||||||
|
|
||||||
\itam{Better time precision}
|
|
||||||
The \ex{time+ticks} procedure now returns sub-second precision on OS's
|
|
||||||
that support it.
|
|
||||||
|
|
||||||
\itam{Nicer print-methods for basic data-types}
|
|
||||||
Scsh's standard record types now print more informatively.
|
|
||||||
For example, a process object includes the process id in its
|
|
||||||
printed representation: the process object for process id 2653
|
|
||||||
prints as \verb|#{proc 2653}|.
|
|
||||||
|
|
||||||
\end{itemize}
|
|
|
@ -23,6 +23,8 @@
|
||||||
\def\maketildeactive{\catcode`\~=13}
|
\def\maketildeactive{\catcode`\~=13}
|
||||||
\def\~{\char`\~}
|
\def\~{\char`\~}
|
||||||
|
|
||||||
|
\newcommand{\evalsto}{\ensuremath{\Rightarrow}}
|
||||||
|
|
||||||
% One-line code examples
|
% One-line code examples
|
||||||
%\newcommand{\codex}[1]% One line, centred. Tight spacing.
|
%\newcommand{\codex}[1]% One line, centred. Tight spacing.
|
||||||
% {$$\abovedisplayskip=.75ex plus 1ex minus .5ex%
|
% {$$\abovedisplayskip=.75ex plus 1ex minus .5ex%
|
||||||
|
|
|
@ -18,6 +18,30 @@ This manual gives a complete description of scsh.
|
||||||
A general discussion of the design principles behind scsh can be found
|
A general discussion of the design principles behind scsh can be found
|
||||||
in a companion paper, ``A Scheme Shell.''
|
in a companion paper, ``A Scheme Shell.''
|
||||||
|
|
||||||
|
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||||
|
\section{Copyright \& source-code license}
|
||||||
|
Scsh is open source. The complete sources come with the standard
|
||||||
|
distribution, which can be downloaded off the net.
|
||||||
|
|
||||||
|
For years, scsh's underlying Scheme implementation, Scheme 48, did not have an
|
||||||
|
open-source copyright. However, around 1999/2000, the Scheme 48 authors
|
||||||
|
graciously retrofitted a BSD-style open-source copyright onto the system.
|
||||||
|
Swept up by the fervor, we tacked an ideologically hip license onto scsh
|
||||||
|
source, ourselves (BSD-style, as well). Not that we ever cared before what you
|
||||||
|
did with the system.
|
||||||
|
|
||||||
|
As a result, the whole system is now open source, top-to-bottom.
|
||||||
|
|
||||||
|
We note that the code is a rich source for other Scheme implementations
|
||||||
|
to mine. Not only the \emph{code}, but the \emph{APIs} are available
|
||||||
|
for implementors working on Scheme environments for systems programming.
|
||||||
|
These APIs represent years of work, and should provide a big head-start
|
||||||
|
on any related effort. (Just don't call it ``scsh,'' unless it's
|
||||||
|
\emph{exactly} compliant with the scsh interfaces.)
|
||||||
|
|
||||||
|
Take all the code you like; we'll just write more.
|
||||||
|
|
||||||
|
|
||||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||||
\section{Obtaining scsh}
|
\section{Obtaining scsh}
|
||||||
Scsh is distributed via net publication.
|
Scsh is distributed via net publication.
|
||||||
|
@ -100,14 +124,11 @@ but the system as-released does not currently provide these features.
|
||||||
|
|
||||||
In the current release, the system has some rough edges.
|
In the current release, the system has some rough edges.
|
||||||
It is quite slow to start up---loading the initial image into the
|
It is quite slow to start up---loading the initial image into the
|
||||||
{\scm} virtual machine takes about a cpu second.
|
{\scm} virtual machine induces a noticeable delay.
|
||||||
This can be fixed with the static heap linker provided with this release.
|
This can be fixed with the static heap linker provided with this release.
|
||||||
|
|
||||||
This manual is very, very rough.
|
We welcome parties interested in porting the manual to a more portable
|
||||||
At some point, we hope to polish it up, finish it off, and re-typeset it
|
XML or SGML format; please contact us if you are interested in doing so.
|
||||||
using markup, so we can generate html, info nodes, and {\TeX} output from
|
|
||||||
the single source without having to deal with Texinfo.
|
|
||||||
But it's all there is, for now.
|
|
||||||
|
|
||||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||||
\section{Naming conventions}
|
\section{Naming conventions}
|
||||||
|
@ -375,11 +396,17 @@ All told, the \ex{define-record} form above defines the following procedures:
|
||||||
(ship:size \var{ship}) & Retrieve the \var{size} field. \\
|
(ship:size \var{ship}) & Retrieve the \var{size} field. \\
|
||||||
\hline
|
\hline
|
||||||
(set-ship:x \var{ship} \var{new-x}) & Assign the \var{x} field. \\
|
(set-ship:x \var{ship} \var{new-x}) & Assign the \var{x} field. \\
|
||||||
(set-ship:y \var{ship} \var{new-y}) & Assign the \var{x} field. \\
|
(set-ship:y \var{ship} \var{new-y}) & Assign the \var{y} field. \\
|
||||||
(set-ship:size \var{ship} \var{new-size}) & Assign the \var{size} field. \\
|
(set-ship:size \var{ship} \var{new-size}) & Assign the \var{size} field. \\
|
||||||
\hline
|
\hline
|
||||||
|
(modify-ship:x \var{ship} \var{xfun}) & Modify \var{x} field with \var{xfun}. \\
|
||||||
|
(modify-ship:y \var{ship} \var{yfun}) & Modify \var{y} field with \var{yfun}. \\
|
||||||
|
(modify-ship:size \var{ship} \var{sizefun}) & Modify \var{size} field with \var{sizefun}. \\
|
||||||
|
\hline
|
||||||
(ship? \var{object}) & Type predicate. \\
|
(ship? \var{object}) & Type predicate. \\
|
||||||
\hline
|
\hline
|
||||||
|
(copy-ship \var{ship}) & Shallow-copy of the record. \\
|
||||||
|
\hline
|
||||||
\end{tabular}
|
\end{tabular}
|
||||||
\end{center}
|
\end{center}
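As a rough illustration of the procedures in the table, the following sketch
exercises the setters, modifiers, copier and predicate.
It assumes the \ex{ship} record defined above, and that its constructor is
called as \ex{(make-ship \var{x} \var{y})}.
Reading \ex{modify-ship:x} as replacing the field with the result of applying
\var{xfun} to the old value is likewise an assumption drawn from the wording
of the table.
\begin{verbatim}
;; Sketch only: assumes the ship record defined above and a
;; constructor that takes the x and y fields.
(define s (make-ship 10 20))

(set-ship:y s 25)                        ; assign the y field
(modify-ship:x s (lambda (x) (+ x 1)))   ; assumed: x := (xfun x)

(define s2 (copy-ship s))                ; shallow copy of the record
(set-ship:x s2 0)                        ; changes s2's x field, not s's

(ship? s2)                               ; type predicate
(ship:x s)                               ; retrieve the x field
\end{verbatim}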
|
||||||
%
|
%
|
||||||
|
@ -388,6 +415,8 @@ An implementation of \ex{define-record} is available as a macro for Scheme
|
||||||
programmers to define their own record types;
|
programmers to define their own record types;
|
||||||
the syntax is accessed by opening the package \ex{defrec-package}, which
|
the syntax is accessed by opening the package \ex{defrec-package}, which
|
||||||
exports the single syntax form \ex{define-record}.
|
exports the single syntax form \ex{define-record}.
|
||||||
|
See the source code for the \ex{defrec-package} module
|
||||||
|
for further details of the macro.
|
||||||
|
|
||||||
You must open this package to access the form.
|
You must open this package to access the form.
|
||||||
Scsh does not export a record-definition package by default as there are
|
Scsh does not export a record-definition package by default as there are
|
||||||
|
@ -417,21 +446,9 @@ you could not read and internalise such a twisted account without
|
||||||
bleeding from the nose and ears.
|
bleeding from the nose and ears.
|
||||||
|
|
||||||
However, you might keep in mind the following simple fact: of all the
|
However, you might keep in mind the following simple fact: of all the
|
||||||
standards, {\Posix}, as far as I have been able to determine,
|
standards, {\Posix} is the least common denominator.
|
||||||
is the least common denominator.
|
|
||||||
So when this manual repeatedly refers to {\Posix}, the point is ``the
|
So when this manual repeatedly refers to {\Posix}, the point is ``the
|
||||||
thing we are describing should be portable just about anywhere.''
|
thing we are describing should be portable just about anywhere.''
|
||||||
Scsh sticks to {\Posix} when at all possible; it's major departure is
|
Scsh sticks to {\Posix} when at all possible; its major departure is
|
||||||
symbolic links, which aren't in {\Posix} (see---it
|
symbolic links, which aren't in {\Posix} (see---it
|
||||||
really \emph{is} a least common denominator).
|
really \emph{is} a least common denominator).
|
||||||
|
|
||||||
However, just because {\Posix} is the l.c.d. standard doesn't mean everyone
|
|
||||||
supports all of it.
|
|
||||||
The guerilla PC {\Unix} implementations that have been springing up on
|
|
||||||
the net (\eg, NetBSD, Linux, FreeBSD, and so forth) are only recently coming
|
|
||||||
into compliance with the standard---although they are getting there.
|
|
||||||
We have been able to implement scsh completely on all of these systems,
|
|
||||||
however---the single exception is NeXTSTEP, whose buggy {\Posix} libraries
|
|
||||||
restrict us to partial support (these lacunae are indicated where relevant
|
|
||||||
in the rest of the manual).\footnote{Feel like porting scsh from {\Posix} to
|
|
||||||
NeXT's BSD API? Send us your fixes; we'll fold them in.}
|
|
||||||
|
|
|
@ -1,4 +1,4 @@
|
||||||
%&latex -*- latex -*-
|
% -*- latex -*-
|
||||||
|
|
||||||
% This is the reference manual for the Scheme Shell.
|
% This is the reference manual for the Scheme Shell.
|
||||||
|
|
||||||
|
@ -47,7 +47,6 @@
|
||||||
\include{awk}
|
\include{awk}
|
||||||
\include{miscprocs}
|
\include{miscprocs}
|
||||||
\include{running}
|
\include{running}
|
||||||
\include{changes}
|
|
||||||
\include{todo}
|
\include{todo}
|
||||||
|
|
||||||
\backmatter
|
\backmatter
|
||||||
|
|
|
@ -1,27 +1,57 @@
|
||||||
% -*- latex -*-
|
% -*- latex -*-
|
||||||
\chapter{Strings and characters}
|
\chapter{Strings and characters}
|
||||||
|
|
||||||
Scsh provides a set of procedures for processing strings and characters.
|
|
||||||
The procedures provided match regular expressions, search strings,
|
|
||||||
parse file-names, and manipulate sets of characters.
|
|
||||||
|
|
||||||
Also see chapters \ref{chapt:sre}, \ref{chapt:rdelim} and \ref{chapt:fr-awk}
|
|
||||||
on regular-expressions, record I/O, field parsing, and the awk loop.
|
|
||||||
The procedures documented there allow you to search and pattern-match strings,
|
|
||||||
read character-delimited records from ports,
|
|
||||||
use regular expressions to split the records into fields
|
|
||||||
(for example, splitting a string at every occurrence of colon or white-space),
|
|
||||||
and loop over streams of these records in a convenient way.
|
|
||||||
|
|
||||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
|
||||||
\section{String manipulation}
|
|
||||||
\label{sec:stringmanip}
|
|
||||||
|
|
||||||
Strings are the basic communication medium for {\Unix} processes, so a
|
Strings are the basic communication medium for {\Unix} processes, so a
|
||||||
shell language must have reasonable facilities for manipulating them.
|
Unix programming environment must have reasonable facilities for manipulating
|
||||||
|
them.
|
||||||
|
Scsh provides a powerful set of procedures for processing strings and
|
||||||
|
characters.
|
||||||
|
Besides the facilities described in this chapter, scsh also provides
|
||||||
|
\begin{itemize}
|
||||||
|
\itum{Regular expressions (chapter~\ref{chapt:sre})}
|
||||||
|
A complete regular-expression system.
|
||||||
|
|
||||||
|
\itum{Field parsing, delimited record I/O and the awk loop
|
||||||
|
(chapter~\ref{chapt:fr-awk})}
|
||||||
|
These procedures let you read in chunks of text delimited by selected
|
||||||
|
characters, and
|
||||||
|
parse each record into fields based on regular expressions
|
||||||
|
(for example, splitting a string at every occurrence of colon or
|
||||||
|
white-space).
|
||||||
|
The \ex{awk} form allows you to loop over streams of these records
|
||||||
|
in a convenient way.
|
||||||
|
|
||||||
|
\itum{The SRFI-13 string libraries}
|
||||||
|
This pair of libraries contains procedures that create, fold, iterate over,
|
||||||
|
search, compare, assemble, cut, hash, case-map, and otherwise manipulate
|
||||||
|
strings.
|
||||||
|
They are provided by the \ex{string-lib} and \ex{string-lib-internals}
|
||||||
|
packages, and are also available in the default \ex{scsh} package.
|
||||||
|
|
||||||
|
More documentation on these procedures can be found at URLs
|
||||||
|
\begin{tightinset}
|
||||||
|
% The gratuitous mbox makes xdvi render the hyperlinks better.
|
||||||
|
\mbox{\url{http://srfi.schemers.org/srfi-13/srfi-13.html}}\\
|
||||||
|
\url{http://srfi.schemers.org/srfi-13/srfi-13.txt}
|
||||||
|
\end{tightinset}
|
||||||
|
|
||||||
|
\itum{The SRFI-14 character-set library}
|
||||||
|
This library provides a set-of-characters abstraction, which is frequently
|
||||||
|
useful when searching, parsing, filtering or otherwise operating on
|
||||||
|
strings and character data. The SRFI is provided by the \ex{char-set-lib}
|
||||||
|
package; its bindings are also available in the default \ex{scsh} package.
|
||||||
|
|
||||||
|
More documentation on this library can be found at URLs
|
||||||
|
\begin{tightinset}
|
||||||
|
% The gratuitous mbox makes xdvi render the hyperlinks better.
|
||||||
|
\mbox{\url{http://srfi.schemers.org/srfi-14/srfi-14.html}}\\
|
||||||
|
\url{http://srfi.schemers.org/srfi-14/srfi-14.txt}
|
||||||
|
\end{tightinset}
|
||||||
|
|
||||||
|
\end{itemize}
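A few calls give the flavour of the SRFI-13 and SRFI-14 bindings listed above.
This is only a sketch: the procedures shown are taken from the SRFI documents
rather than from this manual, and the sample strings are made up for
illustration.
\begin{verbatim}
;; SRFI-13: searching, comparing and assembling strings.
(string-index "hello" #\l)                 ; => 2
(string-prefix? "foo" "foobar")            ; => #t
(string-join '("usr" "local" "bin") "/")   ; => "usr/local/bin"

;; SRFI-14: membership test; the set comes first, then the character.
(char-set-contains? char-set:digit #\5)    ; => #t
\end{verbatim}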
|
||||||
|
|
||||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||||
\subsection{Manipulating file-names}
|
\section{Manipulating file names}
|
||||||
\label{sec:filenames}
|
\label{sec:filenames}
|
||||||
|
|
||||||
These procedures do not access the file-system at all; they merely operate
|
These procedures do not access the file-system at all; they merely operate
|
||||||
|
@ -30,7 +60,7 @@ design. Perhaps a more sophisticated system would be better, something
|
||||||
like the pathname abstractions of {\CommonLisp} or MIT Scheme. However,
|
like the pathname abstractions of {\CommonLisp} or MIT Scheme. However,
|
||||||
being {\Unix}-specific, we can be a little less general.
|
being {\Unix}-specific, we can be a little less general.
|
||||||
|
|
||||||
\subsubsection{Terminology}
|
\subsection{Terminology}
|
||||||
These procedures carefully adhere to the {\Posix} standard for file-name
|
These procedures carefully adhere to the {\Posix} standard for file-name
|
||||||
resolution, which occasionally entails some slightly odd things.
|
resolution, which occasionally entails some slightly odd things.
|
||||||
This section will describe these rules, and give some basic terminology.
|
This section will describe these rules, and give some basic terminology.
|
||||||
|
@ -95,7 +125,7 @@ interpreted in file-name form, \ie, as root.
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
\subsubsection{Procedures}
|
\subsection{Procedures}
|
||||||
|
|
||||||
\defun {file-name-directory?} {fname} \boolean
|
\defun {file-name-directory?} {fname} \boolean
|
||||||
\defunx {file-name-non-directory?} {fname} \boolean
|
\defunx {file-name-non-directory?} {fname} \boolean
|
||||||
|
@ -355,38 +385,7 @@ is also frequently useful for expanding file-names.
|
||||||
|
|
||||||
|
|
||||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||||
\subsection{Other string manipulation facilities}
|
\section{Other string manipulation facilities}
|
||||||
|
|
||||||
\defun {index} {string char [start]} {{\fixnum} or false}
|
|
||||||
\defunx {rindex} {string char [start]} {{\fixnum} or false}
|
|
||||||
\begin{desc}
|
|
||||||
These procedures search through \var{string} looking for an occurrence
|
|
||||||
of character \var{char}. \ex{index} searches left-to-right; \ex{rindex}
|
|
||||||
searches right-to-left.
|
|
||||||
|
|
||||||
\ex{index} returns the smallest index $i$ of \var{string} greater
|
|
||||||
than or equal to \var{start} such that $\var{string}[i] = \var{char}$.
|
|
||||||
The default for \var{start} is zero. If there is no such match,
|
|
||||||
\ex{index} returns false.
|
|
||||||
|
|
||||||
\ex{rindex} returns the largest index $i$ of \var{string} less than
|
|
||||||
\var{start} such that $\var{string}[i] = \var{char}$.
|
|
||||||
The default for \var{start} is \ex{(string-length \var{string})}.
|
|
||||||
If there is no such match, \ex{rindex} returns false.
|
|
||||||
\end{desc}
|
|
||||||
|
|
||||||
I should probably snarf all the MIT Scheme string functions, and stick them
|
|
||||||
in a package. {\Unix} programs need to mung character strings a lot.
|
|
||||||
|
|
||||||
MIT string match commands:
|
|
||||||
\begin{tightcode}
|
|
||||||
[sub]string-match-{forward,backward}[-ci]
|
|
||||||
[sub]string-{prefix,suffix}[-ci]?
|
|
||||||
[sub]string-find-{next,previous}-char[-ci]
|
|
||||||
[sub]string-find-{next,previous}-char-in-set
|
|
||||||
[sub]string-replace[!]
|
|
||||||
\ldots\etc\end{tightcode}
|
|
||||||
These are not currently provided.
|
|
||||||
|
|
||||||
\begin{defundesc} {substitute-env-vars} {fname} \str
|
\begin{defundesc} {substitute-env-vars} {fname} \str
|
||||||
Replace occurrences of environment variables with their values.
|
Replace occurrences of environment variables with their values.
|
||||||
|
@ -412,315 +411,72 @@ These are not currently provided.
|
||||||
\end{desc}
|
\end{desc}
|
||||||
|
|
||||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||||
\section{Character sets}
|
\section{Character predicates}
|
||||||
\label{sec:char-sets}
|
|
||||||
|
|
||||||
Scsh provides a \ex{char-set} type for expressing sets of characters.
|
\defun {char-letter?}\character\boolean
|
||||||
These sets are used by some of the delimited-input procedures
|
|
||||||
(section~\ref{sec:field-reader}).
|
|
||||||
Scsh's character set package was adapted and extended from
|
|
||||||
Project Mac's MIT Scheme package.
|
|
||||||
Note that the character type used in the current implementation corresponds
|
|
||||||
to the ASCII character set---but you would be wise not to build this
|
|
||||||
assumption into your code if you can help it.\footnote{
|
|
||||||
Actually, it's slightly uglier than that, albeit somewhat more
|
|
||||||
useful. The current character type corresponds to an eight-bit
|
|
||||||
superset of ASCII. The \ex{ascii->char} and \ex{char->ascii}
|
|
||||||
functions will preserve this eighth bit. However, none of the
|
|
||||||
high 128 characters appear in any of the standard character
|
|
||||||
sets defined in section~\ref{sec:std-csets}, except for
|
|
||||||
\ex{char-set:full}. If someone would email the authors a listing
|
|
||||||
of the full Latin-1 definition, we'll be happy to upgrade these
|
|
||||||
sets' definitions to make them Latin-1 compliant.}
|
|
||||||
|
|
||||||
\defun{char-set?}{x}\boolean
|
|
||||||
\begin{desc}
|
|
||||||
Is the object \var{x} a character set?
|
|
||||||
\end{desc}
|
|
||||||
|
|
||||||
\defun{char-set=}{\vari{cs}1 \vari{cs}2\ldots}\boolean
|
|
||||||
\begin{desc}
|
|
||||||
Are the character sets equal?
|
|
||||||
\end{desc}
|
|
||||||
|
|
||||||
\defun{char-set<=}{\vari{cs}1 \vari{cs}2\ldots}\boolean
|
|
||||||
\begin{desc}
|
|
||||||
Returns true if every character set \vari{cs}{i} is
|
|
||||||
a subset of character set \vari{cs}{i+1}.
|
|
||||||
\end{desc}
|
|
||||||
|
|
||||||
\defun{char-set-fold}{kons knil cs}\object
|
|
||||||
\begin{desc}
|
|
||||||
This is the fundamental iterator for character sets.
|
|
||||||
Applies the function \var{kons} across the character set \var{cs} using
|
|
||||||
initial state value \var{knil}.
|
|
||||||
That is, if \var{cs} is the empty set, the procedure returns \var{knil}.
|
|
||||||
Otherwise, some element \var{c} of \var{cs} is chosen; let \var{cs'} be
|
|
||||||
the remaining, unchosen characters.
|
|
||||||
The procedure returns
|
|
||||||
\begin{tightcode}
|
|
||||||
(char-set-fold \var{kons} (\var{kons} \var{c} \var{knil}) \var{cs'})\end{tightcode}
|
|
||||||
For example, we could define \ex{char-set-members} (see below)
|
|
||||||
as
|
|
||||||
\begin{tightcode}
|
|
||||||
(lambda (cs) (char-set-fold cons '() cs))\end{tightcode}
|
|
||||||
|
|
||||||
\remark{This procedure was formerly named \texttt{\indx{reduce-char-set}}.
|
|
||||||
The old binding is still provided, but is deprecated and will
|
|
||||||
probably vanish in a future release.}
|
|
||||||
\end{desc}
|
|
||||||
|
|
||||||
\defun{char-set-for-each}{p cs}{\undefined}
|
|
||||||
\begin{desc}
|
|
||||||
Apply procedure \var{p} to each character in the character set \var{cs}.
|
|
||||||
Note that the order in which \var{p} is applied to the characters in the
|
|
||||||
set is not specified, and may even change from application to application.
|
|
||||||
\end{desc}
|
|
||||||
|
|
||||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
|
||||||
\subsection{Creating character sets}
|
|
||||||
|
|
||||||
\defun{char-set}{\vari{char}1\ldots}{char-set}
|
|
||||||
\begin{desc}
|
|
||||||
Return a character set containing the given characters.
|
|
||||||
\end{desc}
|
|
||||||
|
|
||||||
\defun{chars->char-set}{chars}{char-set}
|
|
||||||
\begin{desc}
|
|
||||||
Return a character set containing the characters in the list \var{chars}.
|
|
||||||
\end{desc}
|
|
||||||
|
|
||||||
\defun{string->char-set}{s}{char-set}
|
|
||||||
\begin{desc}
|
|
||||||
Return a character set containing the characters in the string \var{s}.
|
|
||||||
\end{desc}
|
|
||||||
|
|
||||||
\defun{predicate->char-set}{pred}{char-set}
|
|
||||||
\begin{desc}
|
|
||||||
Returns a character set containing every character \var{c} such that
|
|
||||||
\ex{(\var{pred} \var{c})} returns true.
|
|
||||||
\end{desc}
|
|
||||||
|
|
||||||
\defun{ascii-range->char-set}{lower upper}{char-set}
|
|
||||||
\begin{desc}
|
|
||||||
Returns a character set containing every character whose {\Ascii}
|
|
||||||
code lies in the half-open range $[\var{lower},\var{upper})$.
|
|
||||||
\end{desc}
|
|
||||||
|
|
||||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
|
||||||
\subsection{Querying character sets}
|
|
||||||
\defun {char-set-members}{char-set}{character-list}
|
|
||||||
\begin{desc}
|
|
||||||
This procedure returns a list of the members of \var{char-set}.
|
|
||||||
\end{desc}
|
|
||||||
|
|
||||||
\defunx{char-set-contains?}{char-set char}\boolean
|
|
||||||
\begin{desc}
|
|
||||||
This procedure tests \var{char} for membership in set \var{char-set}.
|
|
||||||
\remark{Previous releases of scsh called this procedure \ex{char-set-member?},
|
|
||||||
reversing the order of the arguments.
|
|
||||||
This made sense, but was unfortunately the reverse order in which the
|
|
||||||
arguments appear in MIT Scheme.
|
|
||||||
A reasonable argument order was not backwards-compatible with MIT Scheme;
|
|
||||||
on the other hand, the MIT Scheme argument order was counter-intuitive
|
|
||||||
and at odds with common mathematical notation and the \ex{member} family
|
|
||||||
of R4RS procedures.
|
|
||||||
|
|
||||||
We sought to escape the dilemma by shifting to a new name.}
|
|
||||||
\end{desc}
|
|
||||||
|
|
||||||
\defun{char-set-size}{cs}\integer
|
|
||||||
\begin{desc}
|
|
||||||
Returns the number of elements in character set \var{cs}.
|
|
||||||
\end{desc}
|
|
||||||
|
|
||||||
\defun{char-set-every?}{pred cs}\boolean
|
|
||||||
\defunx{char-set-any?}{pred cs}\object
|
|
||||||
\begin{desc}
|
|
||||||
The \ex{char-set-every?} procedure returns true if predicate \var{pred}
|
|
||||||
returns true of every character in the character set \var{cs}.
|
|
||||||
|
|
||||||
Likewise, \ex{char-set-any?} applies \var{pred} to every character in
|
|
||||||
character set \var{cs}, and returns the first true value it finds.
|
|
||||||
If no character produces a true value, it returns false.
|
|
||||||
|
|
||||||
The order in which these procedures sequence through the elements of
|
|
||||||
\var{cs} is not specified.
|
|
||||||
\end{desc}
|
|
||||||
|
|
||||||
|
|
||||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
|
||||||
\subsection{Character-set algebra}
|
|
||||||
\defun {char-set-invert}{char-set}{char-set}
|
|
||||||
\defunx{char-set-union}{\vari{char-set}1\ldots}{char-set}
|
|
||||||
\defunx{char-set-intersection}{\vari{char-set}1 \vari{char-set}2\ldots}{char-set}
|
|
||||||
\defunx{char-set-difference}{\vari{char-set}1 \vari{char-set}2\ldots}{char-set}
|
|
||||||
\begin{desc}
|
|
||||||
These procedures implement set complement, union, intersection, and difference
|
|
||||||
for character sets.
|
|
||||||
The union, intersection, and difference operations are n-ary, associating
|
|
||||||
to the left; the difference function requires at least one argument, while
|
|
||||||
union and intersection may be applied to zero arguments.
|
|
||||||
\end{desc}
|
|
||||||
|
|
||||||
\defun {char-set-adjoin}{cs \vari{char}1\ldots}{char-set}
|
|
||||||
\defunx{char-set-delete}{cs \vari{char}1\ldots}{char-set}
|
|
||||||
\begin{desc}
|
|
||||||
Add/delete the \vari{char}i characters to/from character set \var{cs}.
|
|
||||||
\end{desc}
|
|
||||||
|
|
||||||
|
|
||||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
|
||||||
\subsection{Standard character sets}
|
|
||||||
\label{sec:std-csets}
|
|
||||||
Several character sets are predefined for convenience:
|
|
||||||
|
|
||||||
\begin{center}
|
|
||||||
\newcommand{\entry}[1]{\ex{#1}\index{#1}}
|
|
||||||
\begin{tabular}{|ll|}
|
|
||||||
\hline
|
|
||||||
\entry{char-set:lower-case} & Lower-case alphabetic chars \\
|
|
||||||
\entry{char-set:upper-case} & Upper-case alphabetic chars \\
|
|
||||||
\entry{char-set:alphabetic} & Alphabetic chars \\
|
|
||||||
\entry{char-set:numeric} & Decimal digits: 0--9 \\
|
|
||||||
\entry{char-set:alphanumeric} & Alphabetic or numeric \\
|
|
||||||
\entry{char-set:graphic} & Printing characters except space \\
|
|
||||||
\entry{char-set:printing} & Printing characters including space \\
|
|
||||||
\entry{char-set:whitespace} & Whitespace characters \\
|
|
||||||
\entry{char-set:control} & Control characters \\
|
|
||||||
\entry{char-set:punctuation} & Punctuation characters \\
|
|
||||||
\entry{char-set:hex-digit} & A hexadecimal digit: 0--9, A--F, a--f \\
|
|
||||||
\entry{char-set:blank} & Blank characters \\
|
|
||||||
\entry{char-set:ascii} & A character in the ASCII set. \\
|
|
||||||
\entry{char-set:empty} & Empty set \\
|
|
||||||
\entry{char-set:full} & All characters \\
|
|
||||||
\hline
|
|
||||||
\end{tabular}
|
|
||||||
\end{center}
|
|
||||||
The first eleven of these correspond to the character classes defined in
|
|
||||||
Posix.
|
|
||||||
Note that there may be characters in \ex{char-set:alphabetic} that are
|
|
||||||
neither upper nor lower case---this might occur in implementations that
|
|
||||||
use a character type richer than ASCII, such as Unicode.
|
|
||||||
A ``graphic character'' is one that would put ink on your page.
|
|
||||||
While the exact composition of these sets may vary depending upon the
|
|
||||||
character type provided by the Scheme system upon which scsh is running,
|
|
||||||
here are the definitions for some of the sets in an ASCII character set:
|
|
||||||
\begin{center}
|
|
||||||
\newcommand{\entry}[1]{\ex{#1}\index{#1}}
|
|
||||||
\begin{tabular}{|ll|}
|
|
||||||
\hline
|
|
||||||
char-set:alphabetic & A--Z and a--z \\
|
|
||||||
char-set:lower-case & a--z \\
|
|
||||||
char-set:upper-case & A--Z \\
|
|
||||||
char-set:graphic & Alphanumeric + punctuation \\
|
|
||||||
char-set:whitespace & Space, newline, tab, page,
|
|
||||||
vertical tab, carriage return \\
|
|
||||||
char-set:blank & Space and tab \\
|
|
||||||
char-set:control & ASCII 0--31 and 127 \\
|
|
||||||
char-set:punctuation & \verb|!"#$%&'()*+,-./:;<=>|\verb#?@[\]^_`{|}~# \\
|
|
||||||
\hline
|
|
||||||
\end{tabular}
|
|
||||||
\end{center}
|
|
||||||
|
|
||||||
|
|
||||||
\defun {char-alphabetic?}\character\boolean
|
|
||||||
\defunx{char-lower-case?}\character\boolean
|
\defunx{char-lower-case?}\character\boolean
|
||||||
\defunx{char-upper-case?}\character\boolean
|
\defunx{char-upper-case?}\character\boolean
|
||||||
\defunx{char-numeric? }\character\boolean
|
\defunx{char-title-case?}\character\boolean
|
||||||
\defunx{char-alphanumeric?}\character\boolean
|
\defunx{char-digit?}\character\boolean
|
||||||
|
\defunx{char-letter+digit?}\character\boolean
|
||||||
\defunx{char-graphic?}\character\boolean
|
\defunx{char-graphic?}\character\boolean
|
||||||
\defunx{char-printing?}\character\boolean
|
\defunx{char-printing?}\character\boolean
|
||||||
\defunx{char-whitespace?}\character\boolean
|
\defunx{char-whitespace?}\character\boolean
|
||||||
\defunx{char-blank?}\character\boolean
|
\defunx{char-blank?}\character\boolean
|
||||||
\defunx{char-control?}\character\boolean
|
\defunx{char-iso-control?}\character\boolean
|
||||||
\defunx{char-punctuation?}\character\boolean
|
\defunx{char-punctuation?}\character\boolean
|
||||||
\defunx{char-hex-digit?}\character\boolean
|
\defunx{char-hex-digit?}\character\boolean
|
||||||
\defunx{char-ascii?}\character\boolean
|
\defunx{char-ascii?}\character\boolean
|
||||||
\begin{desc}
|
\begin{desc}
|
||||||
These predicates are defined in terms of the above character sets.
|
Each of these predicates tests for membership in one of the standard
|
||||||
|
character sets provided by the SRFI-14 character-set library.
|
||||||
|
Additionally, the following redundant bindings are provided for {R5RS}
|
||||||
|
compatibility:
|
||||||
|
\begin{inset}
|
||||||
|
\begin{tabular}{ll}
|
||||||
|
{R5RS} name & scsh definition \\ \hline
|
||||||
|
\ex{char-alphabetic?} & \ex{char-letter?} \\
|
||||||
|
\ex{char-numeric?} & \ex{char-digit?} \\
|
||||||
|
\ex{char-alphanumeric?} & \ex{char-letter+digit?}
|
||||||
|
\end{tabular}
|
||||||
|
\end{inset}
|
||||||
\end{desc}
|
\end{desc}
|
||||||
|
|
||||||
|
|
||||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||||
\subsection{Linear-update character-set operations}
|
\section{Deprecated character-set procedures}
|
||||||
These procedures have a hybrid pure-functional/side-effecting semantics:
|
\label{sec:char-sets}
|
||||||
they are allowed, but not required, to side-effect one of their parameters
|
|
||||||
in order to construct their result.
|
|
||||||
An implementation may legally implement these procedures as pure,
|
|
||||||
side-effect-free functions, or it may implement them using side effects,
|
|
||||||
depending upon the details of what is the most efficient or simple to
|
|
||||||
implement in terms of the underlying representation.
|
|
||||||
|
|
||||||
What this means is that clients of these procedures \emph{may not} rely
|
The SRFI-14 character-set library grew out of an earlier library developed
|
||||||
upon these procedures working by side effect.
|
for scsh.
|
||||||
For example, this is not guaranteed to work:
|
However, the SRFI standardisation process introduced incompatibilities with
|
||||||
\begin{verbatim}
|
the original scsh bindings.
|
||||||
(let ((cs (char-set #\a #\b #\c)))
|
The current version of scsh provides the library
|
||||||
(char-set-adjoin! cs #\d)
|
\ex{obsolete-char-set-lib}, which contains the old bindings found in
|
||||||
cs) ; Could be either {a,b,c} or {a,b,c,d}.
|
previous releases of scsh.
|
||||||
\end{verbatim}
|
The following table lists the members of this library, along with
|
||||||
However, this is well-defined:
|
the equivalent SRFI-14 binding. This obsolete library is deprecated and
|
||||||
\begin{verbatim}
|
\emph{not} open by default in the standard \ex{scsh} environment;
|
||||||
(let ((cs (char-set #\a #\b #\c)))
|
new code should use the SRFI-14 bindings.
|
||||||
(char-set-adjoin! cs #\d)) ; {a,b,c,d}
|
\begin{inset}
|
||||||
\end{verbatim}
|
\begin{tabular}{ll}
|
||||||
So clients of these procedures write in a functional style, but must
|
Old \ex{obsolete-char-set-lib} & SRFI-14 \ex{char-set-lib} \\ \hline
|
||||||
additionally be sure that, when the procedure is called, there are no
|
|
||||||
other live pointers to the potentially-modified character set (hence the term
|
|
||||||
``linear update'').
|
|
||||||
|
|
||||||
There are two benefits to this convention:
|
\ex{char-set-members} & \ex{char-set->list} \\
|
||||||
\begin{itemize}
|
\ex{chars->char-set} & \ex{list->char-set} \\
|
||||||
\item Implementations are free to provide the most efficient possible
|
\ex{ascii-range->char-set} & \ex{ucs-range->char-set} (not exact) \\
|
||||||
implementation, either functional or side-effecting.
|
\ex{predicate->char-set} & \ex{char-set-filter} (not exact) \\
|
||||||
\item Programmers may nonetheless continue to assume that character sets
|
\ex{char-set-every?} & \ex{char-set-every} \\
|
||||||
are purely functional data structures: they may be reliably shared
|
\ex{char-set-any?} & \ex{char-set-any} \\
|
||||||
without needing to be copied, uniquified, and so forth.
|
\\
|
||||||
\end{itemize}
|
\ex{char-set-invert} & \ex{char-set-complement} \\
|
||||||
|
\ex{char-set-invert!} & \ex{char-set-complement!} \\
|
||||||
In practice, these procedures are most useful for efficiently constructing
|
\\
|
||||||
character sets in a side-effecting manner, in some limited local context,
|
\ex{char-set:alphabetic} & \ex{char-set:letter} \\
|
||||||
before passing the character set outside the local construction scope to be
|
\ex{char-set:numeric} & \ex{char-set:digit} \\
|
||||||
used in a functional manner.
|
\ex{char-set:alphanumeric} & \ex{char-set:letter+digit} \\
|
||||||
|
\ex{char-set:control} & \ex{char-set:iso-control}
|
||||||
Scsh provides no assistance in checking the linearity of the potentially
|
\end{tabular}
|
||||||
side-effected parameters passed to these functions --- there's no linear
|
\end{inset}
|
||||||
type checker or run-time mechanism for detecting violations.
|
Note also that the \ex{->char-set} procedure no longer handles a predicate
|
||||||
|
argument.
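
As a rough migration sketch (not taken from the scsh sources), calls to the
old bindings might be rewritten along the following lines.
The names \ex{cs}, \ex{chars} and \ex{pred} are placeholders, and filtering
\ex{char-set:full} is an assumption made here so that the result ranges over
every character, as \ex{predicate->char-set} did.
\begin{code}
;;; Hypothetical rewrites; cs, chars and pred are placeholder names.
(char-set->list cs)                   ; was (char-set-members cs)
(list->char-set chars)                ; was (chars->char-set chars)
(char-set-complement cs)              ; was (char-set-invert cs)

;;; The inexact cases: the new procedures take different arguments.
(char-set-filter pred char-set:full)  ; was (predicate->char-set pred)
(ucs-range->char-set 48 58)           ; was (ascii-range->char-set 48 58)\end{code}
The two inexact entries differ because \ex{char-set-filter} filters an
existing set rather than enumerating all characters, and
\ex{ucs-range->char-set} takes character codes that agree with {\Ascii}
codes only within the {\Ascii} range.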
|
||||||
\defun{char-set-copy}{cs}{char-set}
|
|
||||||
\begin{desc}
|
|
||||||
Returns a copy of the character set \var{cs}.
|
|
||||||
``Copy'' means that if either the input parameter or the
|
|
||||||
result value of this procedure is passed to one of the linear-update
|
|
||||||
procedures described below, the other character set is guaranteed
|
|
||||||
not to be altered.
|
|
||||||
(A system that provides pure-functional implementations of the rest of
|
|
||||||
the linear-operator suite could implement this procedure as the
|
|
||||||
identity function.)
|
|
||||||
\end{desc}
|
|
||||||
|
|
||||||
\defun{char-set-adjoin!}{cs \vari{char}1\ldots}{char-set}
|
|
||||||
\begin{desc}
|
|
||||||
Add the \vari{char}i characters to character set \var{cs}, and
|
|
||||||
return the result.
|
|
||||||
This procedure is allowed, but not required, to side-effect \var{cs}.
|
|
||||||
\end{desc}
|
|
||||||
|
|
||||||
\defun{char-set-delete!}{cs \vari{char}1\ldots}{char-set}
|
|
||||||
\begin{desc}
|
|
||||||
Remove the \vari{char}i characters from character set \var{cs}, and
|
|
||||||
return the result.
|
|
||||||
This procedure is allowed, but not required, to side-effect \var{cs}.
|
|
||||||
\end{desc}
|
|
||||||
|
|
||||||
\defun {char-set-invert!}{char-set}{char-set}
|
|
||||||
\defunx{char-set-union!}{\vari{char-set}1 \vari{char-set}2\ldots}{char-set}
|
|
||||||
\defunx{char-set-intersection!}{\vari{char-set}1 \vari{char-set}2\ldots}{char-set}
|
|
||||||
\defunx{char-set-difference!}{\vari{char-set}1 \vari{char-set}2\ldots}{char-set}
|
|
||||||
\begin{desc}
|
|
||||||
These procedures implement set complement, union, intersection, and difference
|
|
||||||
for character sets.
|
|
||||||
They are allowed, but not required, to side-effect their first parameter.
|
|
||||||
The union, intersection, and difference operations are n-ary, associating
|
|
||||||
to the left.
|
|
||||||
\end{desc}
|
|
||||||
|
|
|
@ -963,11 +963,6 @@ Note that once a Scheme port is revealed in scsh, the runtime will not
|
||||||
shift the port around with \ex{dup()} and \ex{close()}.
|
shift the port around with \ex{dup()} and \ex{close()}.
|
||||||
This means the file-locking procedures can then be applied to the port's
|
This means the file-locking procedures can then be applied to the port's
|
||||||
associated file descriptor.
|
associated file descriptor.
|
||||||
|
|
||||||
NeXTSTEP users should also note that even minimalist {\Posix} file locking
|
|
||||||
is not supported for NFS-mounted files in NeXTSTEP; NeXT claims they will
|
|
||||||
fix this in NS release 4.
|
|
||||||
We'd appreciate hearing from users when and if this happens.
|
|
||||||
}
|
}
|
||||||
|
|
||||||
{\Posix} allows the user to lock a region of a file with either
|
{\Posix} allows the user to lock a region of a file with either
|
||||||
|
@ -1392,8 +1387,8 @@ Returns:
|
||||||
|
|
||||||
Note that the rules of backslash for {\Scheme} strings and glob patterns
|
Note that the rules of backslash for {\Scheme} strings and glob patterns
|
||||||
work together to require four backslashes in a row to specify a
|
work together to require four backslashes in a row to specify a
|
||||||
single literal backslash. Fortunately, this should be a rare
|
single literal backslash. Fortunately, it is very rare that a backslash
|
||||||
occurrence.
|
occurs in a Unix file name.
|
||||||
|
|
||||||
A glob subpattern will not match against dot files unless the first
|
A glob subpattern will not match against dot files unless the first
|
||||||
character of the subpattern is a literal ``\ex{.}''.
|
character of the subpattern is a literal ``\ex{.}''.
|
||||||
|
@ -2623,9 +2618,6 @@ all of the complexity is optional,
|
||||||
and defaulting all the optional arguments reduces the system
|
and defaulting all the optional arguments reduces the system
|
||||||
to a simple interface.
|
to a simple interface.
|
||||||
|
|
||||||
\remark{This time package does not currently work with NeXTSTEP, as NeXTSTEP
|
|
||||||
does not provide a {\Posix}-compliant time library that will even link.}
|
|
||||||
|
|
||||||
\subsection{Terminology}
|
\subsection{Terminology}
|
||||||
``UTC'' and ``UCT'' stand for ``universal coordinated time,'' which is the
|
``UTC'' and ``UCT'' stand for ``universal coordinated time,'' which is the
|
||||||
official name for what is colloquially referred to as ``Greenwich Mean
|
official name for what is colloquially referred to as ``Greenwich Mean
|
||||||
|
@ -2992,7 +2984,8 @@ If \var{val} is {\sharpf}, then any entry for \var{var} is deleted.
|
||||||
an alist, \eg,
|
an alist, \eg,
|
||||||
\begin{code}
|
\begin{code}
|
||||||
(("TERM" . "vt100")
|
(("TERM" . "vt100")
|
||||||
("SHELL" . "/bin/csh")
|
("SHELL" . "/usr/local/bin/scsh")
|
||||||
|
("PATH" . "/sbin:/usr/sbin:/bin:/usr/bin")
|
||||||
("EDITOR" . "emacs")
|
("EDITOR" . "emacs")
|
||||||
\ldots)\end{code}
|
\ldots)\end{code}
|
||||||
\end{desc}
|
\end{desc}
|
||||||
|
@ -3005,6 +2998,21 @@ If \var{val} is {\sharpf}, then any entry for \var{var} is deleted.
|
||||||
environment (\ie, converted to a null-terminated C vector of
|
environment (\ie, converted to a null-terminated C vector of
|
||||||
\ex{"\var{var}=\var{val}"} strings which is assigned to the global
|
\ex{"\var{var}=\var{val}"} strings which is assigned to the global
|
||||||
\ex{char **environ}).
|
\ex{char **environ}).
|
||||||
|
|
||||||
|
\begin{code}
|
||||||
|
;;; Note $PATH entry is converted
|
||||||
|
;;; to /sbin:/usr/sbin:/bin:/usr/bin.
|
||||||
|
(alist->env '(("TERM" . "vt100")
|
||||||
|
("PATH" "/sbin" "/usr/sbin" "/bin" "/usr/bin")
|
||||||
|
("SHELL" . "/usr/local/bin/scsh")))
|
||||||
|
\end{code}
|
||||||
|
|
||||||
|
Note that \ex{env->alist} and \ex{alist->env} are not exact
|
||||||
|
inverses---\ex{alist->env} will convert a list value into a single
|
||||||
|
colon-separated string, but \ex{env->alist} will not parse colon-separated
|
||||||
|
values into lists. (See the \ex{\$PATH} element in the examples given for
|
||||||
|
each procedure.)
|
||||||
|
|
||||||
\end{desc}
|
\end{desc}
|
||||||
|
|
||||||
The following three functions help the programmer manipulate alist
|
The following three functions help the programmer manipulate alist
|
||||||
|
@ -3082,18 +3090,30 @@ Example: These four pieces of code all run the mailer with special
|
||||||
|
|
||||||
\subsection{Path lists and colon lists}
|
\subsection{Path lists and colon lists}
|
||||||
|
|
||||||
Environment variables such as \ex{\$PATH} encode a list of strings
|
When environment variables such as \ex{\$PATH} need to encode a list of
|
||||||
by separating the list elements with colon delimiters.
|
strings (such as a list of directories to be searched),
|
||||||
Once parsed into actual lists, these ordered lists can be manipulated
|
the common Unix convention is to separate the list elements with
|
||||||
with the following two functions.
|
colon delimiters.\footnote{\ldots and hope the individual list elements
|
||||||
|
don't contain colons themselves.}
|
||||||
To convert between the colon-separated string encoding and the
|
To convert between the colon-separated string encoding and the
|
||||||
list-of-strings representation, see the \ex{field-reader} and
|
list-of-strings representation, see the \ex{infix-splitter} function
|
||||||
\ex{join-strings} functions in section~\ref{sec:field-reader}.
|
(section~\ref{sec:field-splitter}) and the string library's
|
||||||
\remark{An earlier release of scsh provided the \ex{split-colon-list}
|
\ex{string-join} function.
|
||||||
and \ex{string-list->colon-list} functions. These have been
|
For example,
|
||||||
removed from scsh, and are replaced by the more general
|
\begin{code}
|
||||||
parsers and unparsers of the field-reader module.}
|
(define split (infix-splitter (rx ":")))
|
||||||
|
(split "/sbin:/bin::/usr/bin") {\evalsto}
|
||||||
|
'("/sbin" "/bin" "" "/usr/bin")
|
||||||
|
(string-join '("/sbin" "/bin" "" "/usr/bin") ":") {\evalsto}
|
||||||
|
"/sbin:/bin::/usr/bin"\end{code}
|
||||||
|
The following two functions are useful for manipulating these ordered lists,
|
||||||
|
once they have been parsed from their colon-separated form.
|
||||||
|
|
||||||
|
%\remark{An earlier release of scsh provided the \ex{split-colon-list}
|
||||||
|
% and \ex{string-list->colon-list} functions. These have been
|
||||||
|
% removed from scsh, and are replaced by the more general
|
||||||
|
% parsers and unparsers of the field-reader module.}
|
||||||
|
%
|
||||||
%\defun {split-colon-list} {string} {{\str} list}
|
%\defun {split-colon-list} {string} {{\str} list}
|
||||||
%\defunx {string-list->colon-list} {string-list} \str
|
%\defunx {string-list->colon-list} {string-list} \str
|
||||||
%\begin{desc}
|
%\begin{desc}
|
||||||
|
@ -3146,15 +3166,18 @@ Scsh never uses \cd{$USER} at all.
|
||||||
It computes \ex{(user-login-name)} from the system call \ex{(user-uid)}.
|
It computes \ex{(user-login-name)} from the system call \ex{(user-uid)}.
|
||||||
|
|
||||||
\defvar {home-directory} \str
|
\defvar {home-directory} \str
|
||||||
\defvarx {exec-path-list} {{\str} list}
|
\defvarx {exec-path-list} {{\str} list fluid}
|
||||||
\begin{desc}
|
\begin{desc}
|
||||||
Scsh accesses \cd{$HOME} at start-up time, and stores the value in the
|
Scsh accesses \cd{$HOME} at start-up time, and stores the value in the
|
||||||
global variable \ex{home-directory}. It uses this value for \ex{\~}
|
global variable \ex{home-directory}. It uses this value for \ex{\~}
|
||||||
lookups and for returning to home on \ex{(chdir)}.
|
lookups and for returning to home on \ex{(chdir)}.
|
||||||
|
|
||||||
Scsh accesses \cd{$PATH} at start-up time, colon-splits the path list, and
|
Scsh accesses \cd{$PATH} at start-up time, colon-splits the path list, and
|
||||||
stores the value in the global variable \ex{exec-path-list}. This list is
|
stores the value in the fluid \ex{exec-path-list}. This list is
|
||||||
used for \ex{exec-path} and \ex{exec-path/env} searches.
|
used for \ex{exec-path} and \ex{exec-path/env} searches.
|
||||||
|
|
||||||
|
To access, rebind or side-effect fluid cells, you must open
|
||||||
|
the \ex{fluids} package.
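
A minimal sketch of reading and temporarily rebinding the search path,
assuming \ex{exec-path-list} behaves as an ordinary Scheme~48 fluid and that
the \ex{fluids} package has already been opened (for example with
\ex{,open fluids} at the REPL).
The directory \ex{/opt/tools/bin} and the program name \ex{mytool} are
hypothetical.
\begin{code}
;;; Read the current value of the fluid.
(fluid exec-path-list)

;;; Search a hypothetical extra directory first, but only
;;; for the dynamic extent of the thunk.
(let-fluid exec-path-list
           (cons "/opt/tools/bin" (fluid exec-path-list))
  (lambda () (run (mytool))))\end{code}
Rebinding with \ex{let-fluid} rather than assigning with \ex{set-fluid!}
keeps the change local, so the original path list is back in effect once the
thunk returns.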
|
||||||
\end{desc}
|
\end{desc}
|
||||||
|
|
||||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||||
|
|
|
@ -17,19 +17,8 @@ an elegant language; go wild.
|
||||||
\item An X gui interface. (Needs threads.)
|
\item An X gui interface. (Needs threads.)
|
||||||
\item A better C function/data-structure interface. This is not easy.
|
\item A better C function/data-structure interface. This is not easy.
|
||||||
\item More network protocols. Telnet and ftp would be the most important.
|
\item More network protocols. Telnet and ftp would be the most important.
|
||||||
\item An ILU interface.
|
|
||||||
\item An RPC system, with ``tail-recursion.''
|
|
||||||
\item Interfaces to relational db's.
|
|
||||||
This would be quite useful for Web servers.
|
|
||||||
An s-expression embedding of SQL would be a key design component
|
|
||||||
of such a system, along the lines of scsh's process notation or
|
|
||||||
\ex{awk} notation.
|
|
||||||
\item Port Edwin, an emacs text editor written in MIT Scheme, to scsh.
|
\item Port Edwin, an emacs text editor written in MIT Scheme, to scsh.
|
||||||
Combine it with scsh's OS interfaces to make a visual shell.
|
Combine it with scsh's OS interfaces to make a visual shell.
|
||||||
\item An \ex{expect} knock-off.
|
|
||||||
\item A \ex{make} replacement, using scsh's process notation in the build
|
|
||||||
rules.
|
|
||||||
|
|
||||||
\item Manual hacking.
|
\item Manual hacking.
|
||||||
\begin{itemize}
|
\begin{itemize}
|
||||||
\item The {\LaTeX} hackery needs yet another serious pass. Most importantly,
|
\item The {\LaTeX} hackery needs yet another serious pass. Most importantly,
|
||||||
|
|