%&latex -*- latex -*-

\chapter{Process notation}
\label{sec:proc-forms}
Scsh has a notation for controlling {\Unix} processes that takes the
form of s-expressions; this notation can then be embedded inside of
standard {\Scheme} code.
The basic elements of this notation are \emph{process forms},
\emph{extended process forms}, and \emph{redirections}.
\section{Extended process forms and i/o redirections}
An \emph{extended process form} is a specification of a {\Unix} process to
run, in a particular I/O environment:
\codex{\var{epf} {\synteq} (\var{pf} $ \var{redir}_1$ {\ldots} $ \var{redir}_n $)}
where \var{pf} is a process form and the $\var{redir}_i$ are redirection specs.
A \emph{redirection spec} is one of:
\begin{inset}
\begin{tabular}{@{}l@{\qquad{\tt; }}l@{}}
\ex{(< \var{[fdes]} \var{file-name})} & \ex{Open file for read.}
\\\ex{(> \var{[fdes]} \var{file-name})} & \ex{Open file create/truncate.}
\\\ex{(<< \var{[fdes]} \var{object})} & \ex{Use \var{object}'s printed rep.}
\\\ex{(>> \var{[fdes]} \var{file-name})} & \ex{Open file for append.}
\\\ex{(= \var{fdes} \var{fdes/port})} & \ex{Dup2}
\\\ex{(- \var{fdes/port})} & \ex{Close \var{fdes/port}.}
\\\ex{stdports} & \ex{0,1,2 dup'd from standard ports.}
\end{tabular}
\end{inset}
The input redirections default to file descriptor 0;
the output redirections default to file descriptor 1.

The subforms of a redirection are implicitly backquoted,
and symbols stand for their print-names.
So \ex{(> ,x)} means
``output to the file named by {\Scheme} variable \ex{x},''
and \ex{(< /usr/shivers/.login)} means ``read from \ex{/usr/shivers/.login}.''

\pagebreak
Here are two more examples of i/o redirection:
%
\begin{center}
\begin{codebox}
(< ,(vector-ref fv i))
(>> 2 /tmp/buf)\end{codebox}
\end{center}
%
These two redirections cause the file \ex{fv[i]} to be opened on stdin, and
\ex{/tmp/buf} to be opened for append writes on stderr.

The redirection \ex{(<< \var{object})} causes input to come from the
printed representation of \var{object}.
For example,
\codex{(<< "The quick brown fox jumped over the lazy dog.")}
causes reads from stdin to produce the characters of the above string.
The object is converted to its printed representation using the \ex{display}
procedure, so
\codex{(<< (A five element list))}
is the same as
\codex{(<< "(A five element list)")}
is the same as
\codex{(<< ,(reverse '(list element five A))){\rm.}}
(Here we use the implicit backquoting feature to compute the list to
be printed.)

The redirection \ex{(= \var{fdes} \var{fdes/port})} causes \var{fdes/port}
to be dup'd into file descriptor \var{fdes}.
For example, the redirection
\codex{(= 2 1)}
causes stderr to be the same as stdout.
\var{fdes/port} can also be a port, for example:
\codex{(= 2 ,(current-output-port))}
causes stderr to be dup'd from the current output port.
In this case, it is an error if the port is not a file port
(\eg, if it is a string port).
More complex redirections can be accomplished using the \ex{begin}
process form, discussed below, which gives the programmer full control
of i/o redirection from {\Scheme}.
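
As a minimal sketch of these redirections in use: the \ex{run} form is
described later in this chapter, \ex{log-file} and the paths are hypothetical
names chosen for illustration, and the sketch assumes that the redirections
of a form are processed from left to right.
\begin{code}
(define log-file "/tmp/listing.txt")   ; hypothetical file name

;; Send ls's stdout to the file named by log-file, then make
;; stderr a dup of stdout, so both streams land in that file.
(run (ls -l /usr)
     (> ,log-file)
     (= 2 1))

;; Feed the printed representation of a list to cat's stdin.
(run (cat) (<< ,(list 1 2 3)))\end{code}
%
The second form should make \ex{cat} echo the text \ex{(1 2 3)}, since the
object is printed with \ex{display} before being handed to the child.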
\subsection{Port and file descriptor sync}
\begin{sloppypar}
It's important to remember that rebinding {\Scheme}'s current I/O ports
(\eg, using \ex{with-input-from-file} to rebind the value of
\ex{(current-input-port)})
does \emph{not} automatically ``rebind'' the file referenced by the
{\Unix} stdio file descriptors 0, 1, and 2.
This is impossible to do in general, since some {\Scheme} ports are
not representable as {\Unix} file descriptors.
For example, many {\Scheme} implementations provide ``string ports,''
that is, ports that collect characters sent to them into memory buffers.
The accumulated string can later be retrieved from the port as a string.
If a user were to bind \ex{(current-output-port)} to such a port, it would
be impossible to associate file descriptor 1 with this port, as it
cannot be represented in {\Unix}.
So, if the user subsequently forked off some other program as a subprocess,
that program would of course not see the {\Scheme} string port as its standard
output.
\end{sloppypar}
To keep stdio synced with the values of {\Scheme}'s current i/o ports,
use the special redirection \ex{stdports}.
This causes 0, 1, 2 to be redirected from the current {\Scheme} standard ports.
It is equivalent to the three redirections:
\begin{code}
(= 0 ,(current-input-port))
(= 1 ,(current-output-port))
(= 2 ,(error-output-port))\end{code}
%
The redirections are done in the indicated order. This will cause an error if
one of the current i/o ports isn't a {\Unix} port (\eg, if one is a string
port).
This {\Scheme}/{\Unix} i/o synchronisation can also be had in {\Scheme} code
(as opposed to a redirection spec) with the \ex{(stdports->stdio)}
procedure.
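
The following sketch shows where \ex{stdports} matters.
It assumes scsh's \ex{with-current-output-port} form for rebinding the
current output port, and the file name is hypothetical.
\begin{code}
(call-with-output-file "listing.txt"     ; hypothetical file name
  (lambda (port)
    (with-current-output-port port
      ;; The current output port is now the file, but fd 1 is not.
      ;; The stdports redirection dups the current ports into
      ;; fds 0, 1, and 2 of the child, so ls writes to the file.
      (run (ls) stdports))))\end{code}
%
Without the \ex{stdports} redirection, the \ex{ls} child would be expected to
inherit the original file descriptor 1 and write there instead.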
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\section{Process forms}
A \emph{process form} specifies a computation to perform as an independent
{\Unix} process. It can be one of the following:
%
\begin{leftinset}
\begin{codebox}
(begin . \var{scheme-code})
(| \vari{pf}{\!1} {\ldots} \vari{pf}{\!n})
(|+ \var{connect-list} \vari{pf}{\!1} {\ldots} \vari{pf}{\!n})
(epf . \var{epf})
(\var{prog} \vari{arg}{1} {\ldots} \vari{arg}{n})
\end{codebox}
\qquad
\begin{codebox}
; Run \var{scheme-code} in a fork.
; Simple pipeline
; Complex pipeline
; An extended process form.
; Default: exec the program.
\end{codebox}
\end{leftinset}
%
The default case \ex{(\var{prog} \vari{arg}1 {\ldots} \vari{arg}n)}
is also implicitly backquoted.
That is, it is equivalent to:
%
\codex{(begin (apply exec-path `(\var{prog} \vari{arg}1 {\ldots} \vari{arg}n)))}
%
\ex{Exec-path} is the version of the \ex{\urlh{http://www.FreeBSD.org/cgi/man.cgi?query=exec&apropos=0&sektion=0&manpath=FreeBSD+4.3-RELEASE&format=html}{exec()}} system call that
uses scsh's path list to search for an executable.
The program and the arguments must be either strings, symbols, or integers.
Symbols and integers are coerced to strings.
A symbol's print-name is used.
Integers are converted to strings in base 10.
Using symbols instead of strings is convenient, since it suppresses the
clutter of the surrounding \ex{"\ldots"} quotation marks.
To aid this purpose, scsh reads symbols in a case-sensitive manner,
so that you can say
\codex{(more Readme)}
and get the right file.

A \var{connect-list} is a specification of how two processes are to be wired
together by pipes.
It has the form \ex{((\vari{from}1 \vari{from}2 {\ldots} \var{to}) \ldots)}
and is implicitly backquoted.
For example,
%
\codex{(|+ ((1 2 0) (3 1)) \vari{pf}{\!1} \vari{pf}{\!2})}
%
runs \vari{pf}{\!1} and \vari{pf}{\!2}.
The first clause \ex{(1 2 0)} causes \vari{pf}{\!1}'s
stdout (1) and stderr (2) to be connected via pipe
to \vari{pf}{\!2}'s stdin (0).
The second clause \ex{(3 1)} causes \vari{pf}{\!1}'s file descriptor 3 to be
connected to \vari{pf}{\!2}'s file descriptor 1.
%---this is unusual, and not expected to occur very often.

The \ex{begin} process form does a \ex{stdio->stdports} synchronisation
in the child process before executing the body of the form.
This guarantees that the \ex{begin} form, like all other process forms,
``sees'' the effects of any associated I/O redirections.

Note that {\R4RS} does not specify whether or not \ex{|} and \ex{|+}
are readable symbols. Scsh does.
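
To illustrate the process forms of this section, here is a small sketch.
The variables \ex{pattern} and \ex{file} are hypothetical, and the \ex{run}
form used to launch each pipeline is described in the next section.
\begin{code}
(define pattern "define")     ; hypothetical search string
(define file "scsh.scm")      ; hypothetical file name

;; Simple pipeline: grep's stdout feeds wc's stdin.
;; The unquoted variables are evaluated; grep, wc, and -l
;; are symbols that stand for their print-names.
(run (| (grep ,pattern ,file)
        (wc -l)))

;; A begin process form runs Scheme code in the child; its
;; current output port is synced with the pipe to tr.
(run (| (begin (display "hello, world") (newline))
        (tr a-z A-Z)))\end{code}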
\section{Using extended process forms in \Scheme}
Process forms and extended process forms are \emph{not} {\Scheme}.
They are a different notation for expressing computation that, like {\Scheme},
is based upon s-expressions.
Extended process forms are used in {\Scheme} programs by embedding them inside
special {\Scheme} forms.
There are three basic {\Scheme} forms that use extended process forms:
\ex{exec-epf}, \cd{&}, and \ex{run}.
\dfn {exec-epf} {. \var{epf}} {\noreturn} {syntax}
\dfnx {\&} {. \var{epf}} {proc} {syntax}
\dfnx {run} {. \var{epf}} {status} {syntax}
\begin{desc}
\index{exec-epf} \index{\&} \index{run}
The \ex{(exec-epf . \var{epf})} form nukes the current process: it establishes
the i/o redirections and then overlays the current process with the requested
computation.

The \ex{(\& . \var{epf})} form is similar, except that the process is forked
off in background. The form returns the subprocess' process object.

The \ex{(run . \var{epf})} form runs the process in foreground:
after forking off the computation, it waits for the subprocess to exit,
and returns its exit status.

These special forms are macros that expand into the equivalent
series of system calls.
The definition of the \ex{exec-epf} macro is non-trivial,
as it produces the code to handle i/o redirections and set up pipelines.
However, the definitions of the \cd{&} and \ex{run} macros are very simple:
\begin{leftinset}
\begin{tabular}{@{}l@{\quad$\equiv$\quad}l@{}}
\cd{(& . \var{epf})} & \ex{(fork (\l{} (exec-epf . \var{epf})))} \\
\ex{(run . \var{epf})} & \cd{(wait (& . \var{epf}))}
\end{tabular}
\end{leftinset}
\end{desc}
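
A short sketch of the foreground and background forms follows; the program
names and the file \ex{prog.c} are placeholders chosen for illustration.
\begin{code}
;; Foreground: wait for the compile to finish and keep its
;; exit status.
(define compile-status (run (cc -c prog.c)))

;; Background: the form returns at once with a process object ...
(define child (\& (sleep 5)))

;; ... which can be handed to wait later to collect the status.
(define child-status (wait child))\end{code}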
\subsection{Procedures and special forms}
It is a general design principle in scsh that all functionality
made available through special syntax is also available in a
straightforward procedural form.
So there are procedural equivalents for all of the process notation.
In this way, the programmer is not restricted by the particular details of
the syntax.
Here are some of the syntax/procedure equivalents:
\begin{inset}
\begin{tabular}{@{}|ll|@{}}
\hline
Notation & Procedure \\ \hline \hline
\ex{|} & \ex{fork/pipe} \\
\ex{|+} & \ex{fork/pipe+} \\
\ex{exec-epf} & \ex{exec-path} \\
redirection & \ex{open}, \ex{dup} \\
\cd{&} & \ex{fork} \\
\ex{run} & $\ex{wait} + \ex{fork}$ \\
\hline
\end{tabular}
\end{inset}
%
Having a solid procedural foundation also allows for general notational
experimentation using {\Scheme}'s macros.
For example, the programmer can build his own pipeline notation on top of the
\ex{fork} and \ex{fork/pipe} procedures.
Chapter~\ref{chapt:syscalls} gives the full story on all the procedures
in the syscall library.
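
As an illustration of the table, the special form \ex{(run (ls -l))} can be
spelled out with the procedures alone.
This is only a sketch of the correspondence, not the macro's actual
expansion, and \ex{run-ls-l} is a hypothetical name.
\begin{code}
;; Fork a child whose thunk execs ls, then wait for it.
;; Unlike the process notation, the procedural form takes
;; ordinary strings: nothing is implicitly backquoted.
(define (run-ls-l)
  (wait (fork (lambda () (exec-path "ls" "-l")))))\end{code}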
\subsection{Interfacing process output to {\Scheme}}
\label{sec:io-interface}
There is a family of procedures and special forms that can be used
to capture the output of processes as {\Scheme} data.
%
\dfn {run/port} {. \var{epf}} {port} {syntax}
\dfnx{run/file} {. \var{epf}} {\str} {syntax}
\dfnx{run/string} {. \var{epf}} {\str} {syntax}
\dfnx{run/strings} {. \var{epf}} {{\str} list} {syntax}
\dfnx{run/sexp} {. \var{epf}} {object} {syntax}
\dfnx{run/sexps} {. \var{epf}} {list} {syntax}
\begin{desc}
These forms all fork off subprocesses, collecting the process' output
to stdout in some form or another.
The subprocess runs with file descriptor 1 and the current output port
bound to a pipe.
\begin{desctable}{0.7\linewidth}
\ex{run/port} & Value is a port open on process's stdout.
Returns immediately after forking child. \\
\ex{run/file} & Value is name of a temp file containing process's output.
Returns when process exits. \\
\ex{run/string} & Value is a string containing process' output.
Returns when eof read. \\
\ex{run/strings}& Splits process' output into a list of
newline-delimited strings. Returns when eof read. \\
\ex{run/sexp} & Reads a single object from process' stdout with \ex{read}.
Returns as soon as the read completes. \\
\ex{run/sexps} & Repeatedly reads objects from process' stdout with \ex{read}.
Returns accumulated list upon eof.
\end{desctable}
The delimiting newlines are not included in the strings returned by
\ex{run/strings}.

These special forms just expand into calls to the following analogous
procedures.
\end{desc}
\defun {run/port*} {thunk} {port}
\defunx {run/file*} {thunk} {\str}
\defunx {run/string*} {thunk} {\str}
\defunx {run/strings*} {thunk} {{\str} list}
\defunx {run/sexp*} {thunk} {object}
\defunx {run/sexps*} {thunk} {object list}
\begin{desc}
For example, \ex{(run/port . \var{epf})} expands into
\codex{(run/port* (\l{} (exec-epf . \var{epf}))).}
\end{desc}

The following procedures are also generally useful for parsing
input streams in scsh:
\defun {port->string} {port} {\str}
\defunx {port->sexp-list} {port} {list}
\defunx {port->string-list} {port} {{\str} list}
\defunx {port->list} {reader port} {list}
\begin{desc}
\ex{Port->string} reads the port until eof,
then returns the accumulated string.
\ex{Port->sexp-list} repeatedly reads data from the port until eof,
then returns the accumulated list of items.
\ex{Port->string-list} repeatedly reads newline-terminated strings from the
port until eof, then returns the accumulated list of strings.
The delimiting newlines are not part of the returned strings.
\ex{Port->list} generalises the two preceding procedures.
It uses \var{reader} to repeatedly read objects from a port.
It accumulates these objects into a list, which is returned upon eof.
The \ex{port->string-list} and \ex{port->sexp-list} procedures
are trivial to define, being merely \ex{port->list} curried with
the appropriate parsers:
\begin{code}\cddollar
(port->string-list \var{port}) $\equiv$ (port->list read-line \var{port})
(port->sexp-list \var{port}) $\equiv$ (port->list read \var{port})\end{code}
%
The following compositions also hold:
\begin{code}\cddollar
run/string* $\equiv$ port->string $\circ$ run/port*
run/strings* $\equiv$ port->string-list $\circ$ run/port*
run/sexp* $\equiv$ read $\circ$ run/port*
run/sexps* $\equiv$ port->sexp-list $\circ$ run/port*\end{code}
\end{desc}
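
As a sketch of how these pieces fit together, the two hypothetical procedures
below should return the same list of lines: the first uses the
\ex{run/strings} form directly, the second spells out the composition with
\ex{run/port} and \ex{port->list}.
\begin{code}
;; List a directory as a list of file-name strings.
(define (directory-lines dir)
  (run/strings (ls ,dir)))

;; The same result, built from run/port and port->list.
(define (directory-lines/port dir)
  (let* ((port (run/port (ls ,dir)))
         (lines (port->list read-line port)))
    (close-input-port port)
    lines))\end{code}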
\defun{port-fold}{port reader op . seeds} {\object\star}
\begin{desc}
This procedure can be used to perform a variety of iterative operations
over an input stream.
It repeatedly uses \var{reader} to read an object from \var{port}.
If the first read returns eof, then the entire \ex{port-fold}
operation returns the seeds as multiple values.
If the first read operation returns some other value $v$, then
\var{op} is applied to $v$ and the seeds:
\ex{(\var{op} \var{v} . \var{seeds})}.
This should return a new set of seed values, and the reduction then loops,
reading a new value from the port, and so forth.
(If multiple seed values are used, then \var{op} must return multiple values.)

For example, \ex{(port->list \var{reader} \var{port})}
could be defined as
\codex{(reverse (port-fold \var{port} \var{reader} cons '()))}

An imperative way to look at \ex{port-fold} is to say that it
abstracts the idea of a loop over a stream of values read from
some port, where the seed values express the loop state.

\remark{This procedure was formerly named \texttt{\indx{reduce-port}}.
The old binding is still provided, but is deprecated and will
probably vanish in a future release.}
\end{desc}
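
Two further sketches of \ex{port-fold}, one with a single seed and one with
two seeds, may help; the procedure names are hypothetical.
\begin{code}
;; One seed: count the lines readable from port.
(define (count-lines port)
  (port-fold port read-line
             (lambda (line count) (+ count 1))
             0))

;; Two seeds: count lines and characters in one pass.
;; The op must return both new seeds as multiple values.
(define (line-and-char-counts port)
  (port-fold port read-line
             (lambda (line nlines nchars)
               (values (+ nlines 1)
                       (+ nchars (string-length line))))
             0 0))\end{code}
%
In the second procedure the character count excludes the newlines, since
\ex{read-line} does not return them as part of the line.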
\section{More complex process operations}
The procedures and special forms in the previous section provide for the
common case, where the programmer is only interested in the output of the
process.
The special forms and procedures in this section provide more complicated
facilities for manipulating processes.
\subsection{Pids and ports together}
\dfn {run/port+proc} {. \var{epf}} {[port proc]} {syntax}
\defunx {run/port+proc*} {thunk} {[port proc]}
\begin{desc}
This special form and its analogous procedure can be used
if the programmer also wishes access to the process' pid, exit status,
or other information.
They both fork off a subprocess, returning two values:
a port open on the process' stdout (and current output port),
and the subprocess's process object.
A process object encapsulates the subprocess' process id and exit code;
it is the value passed to the \ex{wait} system call.

For example, to uncompress a tech report, read the uncompressed
data into scsh, and still be able to track the exit status of
the decompression process, use the following:
\begin{code}
(receive (port child) (run/port+proc (zcat tr91-145.tex.Z))
  (let* ((paper (port->string port))
         (status (wait child)))
    {\rm\ldots{}use \ex{paper}, \ex{status}, and \ex{child} here\ldots}))\end{code}
%
Note that you must \emph{first} do the \ex{port->string} and
\emph{then} do the wait---the other way around may lock up when the
zcat fills up its output pipe buffer.
\end{desc}
\subsection{Multiple stream capture}

Occasionally, the programmer may want to capture multiple distinct output
streams from a process. For instance, he may wish to read the stdout and
stderr streams into two distinct strings. This is accomplished with the
\ex{run/collecting} form and its analogous procedure, \ex{run/collecting*}.
%
\dfn {run/collecting} {fds . epf} {[status port\ldots]} {syntax}
\defunx {run/collecting*} {fds thunk} {[status port\ldots]}
\begin{desc}
\ex{Run/collecting} and \ex{run/collecting*} run processes that produce
multiple output streams and return ports open on these streams. To avoid
issues of deadlock, \ex{run/collecting} doesn't use pipes. Instead, it first
runs the process with output to temp files, then returns ports open on the
temp files. For example,
%
\codex{(run/collecting (1 2) (ls))}
%
runs \ex{ls} with stdout (fd 1) and stderr (fd 2) redirected to temporary
files.
When the \ex{ls} is done, \ex{run/collecting} returns three values: the
\ex{ls} process' exit status, and two ports open on the temporary files. The
files are deleted before \ex{run/collecting} returns, so they vanish once the
ports are closed. The \ex{fds} list of file descriptors is implicitly
backquoted by the special-form version.

For example, if Kaiming has his mailbox protected, then
\begin{code}
(receive (status out err)
         (run/collecting (1 2) (cat /usr/kmshea/mbox))
  (list status (port->string out) (port->string err)))\end{code}
%
might produce the list
\codex{(256 "" "cat: /usr/kmshea/mbox: Permission denied")}

What is the deadlock hazard that causes \ex{run/collecting} to use temp files?
Processes with multiple output streams can lock up if they use pipes
to communicate with {\Scheme} i/o readers. For example, suppose
some {\Unix} program \ex{myprog} does the following:
\begin{enumerate}
\item First, outputs a single ``\ex{(}'' to stderr.
\item Then, outputs a megabyte of data to stdout.
\item Finally, outputs a single ``\ex{)}'' to stderr, and exits.
\end{enumerate}

Our scsh programmer decides to run \ex{myprog} with stdout and stderr redirected
\emph{via {\Unix} pipes} to the ports \ex{port1} and \ex{port2}, respectively.
He gets into trouble when he subsequently says \ex{(read port2)}.
The {\Scheme} \ex{read} routine reads the open paren, and then hangs in a
\ex{\urlh{http://www.FreeBSD.org/cgi/man.cgi?query=read&apropos=0&sektion=0&manpath=FreeBSD+4.3-RELEASE&format=html}{read()}} system call trying to read a matching close paren.
But before \ex{myprog} sends the close paren down the stderr
pipe, it first tries to write a megabyte of data to the stdout pipe.
However, {\Scheme} is not reading that pipe---it's stuck waiting for input on
stderr.
So the stdout pipe quickly fills up, and \ex{myprog} hangs, waiting for the
pipe to drain.
The \ex{myprog} child is stuck in a stdout/\ex{port1} write;
the {\Scheme} parent is stuck in a stderr/\ex{port2} read.
Deadlock.

Here's a concrete example of a subprocess that behaves exactly as described
above. As written, it is run under \ex{run/collecting}, which buffers the
output in temp files; the deadlock would arise if \ex{port1} and \ex{port2}
were instead built over pipes:
\begin{code}
(receive (status port1 port2)
         (run/collecting (1 2)
             (begin
               ;; Write an open paren to stderr.
               (run (echo "(") (= 1 2))
               ;; Copy a lot of stuff to stdout.
               (run (cat /usr/dict/words))
               ;; Write a close paren to stderr.
               (run (echo ")") (= 1 2))))

   ;; OK. Here, PORT1 is open on the BEGIN subproc's
   ;; buffered stdout, and PORT2 is open on the BEGIN
   ;; subproc's buffered stderr. Had both been built
   ;; over pipes, the READ below would hang.
   (read port2) ; Should return the empty list.
   (port->string port1)) ; Should return a big string.\end{code}
%
In order to avoid this problem, \ex{run/collecting} and \ex{run/collecting*}
first run the child process to completion, buffering all the output
streams in temp files (using the \ex{temp-file-channel} procedure, see below).
When the child process exits, ports open on the buffered output are returned.
This approach has two disadvantages compared with using pipes:
\begin{itemize}
\item The total output from the child is temporarily written
to the disk before returning from \ex{run/collecting}. If this output
is some large intermediate result, the disk could fill up.

\item The child producer and {\Scheme} consumer are serialised; there is
no concurrency overlap in their execution.
\end{itemize}
%
However, it remains a simple solution that avoids deadlock. More
sophisticated solutions can easily be programmed up as
needed---\ex{run/collecting*} itself is only 12 lines of simple code.

See \ex{temp-file-channel} for more information on creating temp files
as communication channels.
\end{desc}
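
As a sketch of the procedural variant, the call below is meant to mirror
\ex{(run/collecting (1 2) (ls -l))}.
Because only the special form backquotes its \ex{fds} argument, the list is
quoted explicitly here; the variable names are hypothetical.
\begin{code}
(receive (status out err)
         (run/collecting* '(1 2)
                          (lambda () (exec-epf (ls -l))))
  ;; Both ports are open on temp files, so they can be read
  ;; in either order without risk of deadlock.
  (let ((out-text (port->string out))
        (err-text (port->string err)))
    (close-input-port out)
    (close-input-port err)
    (list status out-text err-text)))\end{code}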
\section{Conditional process sequencing forms}
These forms allow conditional execution of a sequence of processes.

\dfn{||} {\vari{pf}1 \ldots \vari{pf}n} {\boolean} {syntax}
\begin{desc}
Run each proc until one completes successfully (\ie, exit status zero).
Return true if some proc completes successfully; otherwise \sharpf.
\end{desc}

\dfn{\&\&} {\vari{pf}1 \ldots \vari{pf}n} {\boolean} {syntax}
\begin{desc}
Run each proc until one fails (\ie, exit status non-zero).
Return true if all procs complete successfully; otherwise \sharpf.
\end{desc}
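
A brief sketch of both forms follows; the directory name and the \ex{make}
targets are placeholders chosen for illustration.
\begin{code}
;; Create the directory only if it does not already exist:
;; mkdir runs only when the test exits with non-zero status.
(|| (test -d build)
    (mkdir build))

;; Run the tests only if the build succeeds, and branch on
;; the boolean result in ordinary Scheme code.
(if (\&\& (make) (make test))
    (display "build and tests passed")
    (display "something failed"))\end{code}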
\section{Process filters}

These procedures are useful for forking off processes to filter
text streams.

\begin{defundesc}{char-filter}{filter}{\proc}
The \var{filter} argument is a character$\rightarrow$character procedure.
Returns a procedure that, when called, repeatedly reads a character
from the current input port, applies \var{filter} to the character,
and writes the result to the current output port.
The procedure returns upon reaching eof on the input port.

For example, to downcase a stream of text in a spell-checking pipeline,
instead of using the {\Unix} \ex{tr A-Z a-z} command, we can say:
\begin{code}
(run (| (delatex)
        (begin ((char-filter char-downcase))) ; tr A-Z a-z
        (spell)
        (sort)
        (uniq))
     (< scsh.tex)
     (> spell-errors.txt))\end{code}
\end{defundesc}
\begin{defundesc}{string-filter}{filter [buflen]}{\proc}
The \var{filter} argument is a string$\rightarrow$string procedure.
Returns a procedure that, when called, repeatedly reads a string
from the current input port, applies \var{filter} to the string,
and writes the result to the current output port.
The procedure returns upon reaching eof on the input port.

The optional \var{buflen} argument controls the number of characters
each internal read operation requests; this means that \var{filter}
will never be applied to a string longer than \var{buflen} chars.
The default \var{buflen} value is 1024.
\end{defundesc}
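
As a sketch of \ex{string-filter} in a pipeline, the example below upcases a
text stream before sorting it.
The helper \ex{upcase-string} and the file names are hypothetical; the helper
is written with standard {\Scheme} procedures only.
\begin{code}
;; A string->string procedure built from char-upcase.
(define (upcase-string s)
  (list->string (map char-upcase (string->list s))))

;; The begin form calls the procedure returned by
;; string-filter, which copies stdin to stdout in chunks,
;; applying upcase-string to each chunk.
(run (| (begin ((string-filter upcase-string)))
        (sort))
     (< notes.txt)
     (> sorted-notes.txt))\end{code}
%
Since the filter sees the input in chunks of at most \var{buflen} characters,
it should only be given procedures, such as this one, whose result does not
depend on where the chunk boundaries fall.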