%&latex -*- latex -*-
|
|
|
|
\chapter{System Calls}
|
|
\label{chapt:syscalls}
|
|
|
|
Scsh provides (almost) complete access to the basic {\Unix} kernel services:
|
|
processes, files, signals and so forth. These procedures comprise a
|
|
{\Scheme} binding for {\Posix}, with a few of the more standard extensions
|
|
thrown in (\eg, symbolic links, \ex{fchown}, \ex{fstat}, sockets).
|
|
|
|
|
|
\section{Errors}
|
|
Scsh syscalls never return error codes, and do not use a global
|
|
\ex{errno} variable to report errors.
|
|
Errors are consistently reported by raising exceptions.
|
|
This frees up the procedures to return useful values,
|
|
and allows the programmer to assume that
|
|
\emph{if a syscall returns, it succeeded.}
|
|
This greatly simplifies the flow of the code from the programmer's point
|
|
of view.
|
|
|
|
Since {\Scheme} does not yet have a standard exception system, the scsh
|
|
definition remains somewhat vague on the actual form of exceptions
|
|
and exception handlers. When a standard exception system is defined,
|
|
scsh will move to it. For now, scsh uses the {\scm} exception system,
|
|
with a simple sugaring on top to hide the details in the common case.
|
|
|
|
System call error exceptions contain the {\Unix} \ex{errno} code reported by
|
|
the system call. Unlike C, the \ex{errno} value is a part of the exception
packet; it is \emph{not} accessed through a global variable.
|
|
|
|
For reference purposes, the {\Unix} \ex{errno} numbers
|
|
are bound to the variables \ex{errno/perm}, \ex{errno/noent}, {\etc}
|
|
System calls never return \ex{errno/intr}---they
|
|
automatically retry.
|
|
|
|
\begin{dfndesc}
|
|
{errno-error}{errno syscall .\ data}{\noreturn}{procedure}
|
|
Raises a {\Unix} error exception for {\Unix} error number \var{errno}.
|
|
The \var{syscall} and \var{data} arguments are packaged up in the exception
|
|
packet passed to the exception handler.
|
|
\end{dfndesc}
|
|
|
|
\defunx{with-errno-handler*}{handler thunk}{value(s) of thunk}
|
|
\begin{dfndescx}
|
|
{with-errno-handler}{handler-spec . body}{\valueofbody}{syntax}
|
|
{\Unix} syscalls raise error exceptions by calling \ex{errno-error}.
|
|
Programs can use \ex{with-errno-handler*} to establish
|
|
handlers for these exceptions.
|
|
|
|
If a {\Unix} error arises while \var{thunk} is executing,
|
|
\var{handler} is called on two arguments:
|
|
\codex{(\var{handler} \var{errno} \var{packet})}
|
|
\var{packet} is a list of the form
|
|
$$\var{packet} = \ex{(\var{errno-msg} \var{syscall} . \var{data})},$$
|
|
where \var{errno-msg} is the standard {\Unix} error message for the error,
|
|
\var{syscall} is the procedure that generated the error,
|
|
and \var{data} is a list of information generated by the error,
|
|
which varies from syscall to syscall.
|
|
|
|
If \var{handler} returns, the handler search continues upwards.
|
|
\var{Handler} can acquire the exception by invoking a saved continuation.
|
|
This procedure can be sugared over with the following syntax:
|
|
%
|
|
\begin{code}
|
|
(with-errno-handler
|
|
((\var{errno} \var{packet}) \var{clause} \ldots)
|
|
\var{body1}
|
|
\var{body2}
|
|
\ldots)\end{code}
|
|
%
|
|
This form executes the body forms with a particular errno handler installed.
|
|
When an errno error is raised, the handler search machinery will
|
|
bind variable \var{errno} to the error's integer code, and variable
|
|
\var{packet} to the error's auxiliary data packet.
|
|
Then, the clauses will be checked for a match.
|
|
The first clause that matches is executed, and its value is the
|
|
value of the entire \ex{with-errno-handler} form.
|
|
If no clause matches, the handler search continues.
|
|
|
|
Error clauses have two forms:
|
|
%
|
|
\begin{code}
|
|
((\var{errno} \ldots) \var{body} \ldots)
|
|
(else \var{body} \ldots)\end{code}
|
|
%
|
|
In the first type of clause, the \var{errno} forms are integer expressions.
|
|
They are evaluated and compared to the error's errno value.
|
|
An \ex{else} clause matches any errno value.
|
|
Note that the \var{errno} and \var{packet}
|
|
variables are lexically visible to the error clauses.
|
|
|
|
Example:
|
|
\begin{code}
|
|
(with-errno-handler
|
|
((errno packet) ; Only handle 3 particular errors.
|
|
((errno/wouldblock errno/again)
|
|
(loop))
|
|
((errno/acces)
|
|
(format #t "Not allowed access!")
|
|
#f))
|
|
|
|
(foo frobbotz)
|
|
(blatz garglemumph))\end{code}
|
|
%
|
|
It is not defined what dynamic context the handler executes in,
|
|
so fluid variables cannot reliably be referenced.
|
|
|
|
Note that Scsh system calls always retry when interrupted, so that
|
|
the \ex{errno/intr} exception is never raised.
|
|
If the programmer wishes to abort a system call on an interrupt, he
|
|
should have the interrupt handler explicitly raise an exception or
|
|
invoke a stored continuation to throw out of the system call.
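
As an illustration of the procedural interface, here is a minimal sketch
(not part of the scsh definition; the name \ex{open-input-file-or-false}
is hypothetical). The handler acquires an \ex{errno/noent} exception by
throwing to a saved continuation, and declines every other error by
simply returning:
\begin{code}
(define (open-input-file-or-false fname)
  (call-with-current-continuation
    (lambda (k)
      (with-errno-handler*
        (lambda (errno packet)
          (if (= errno errno/noent)
              (k #f)))          ; Acquire: throw out to k with false.
                                ; Otherwise return; the search continues.
        (lambda () (open-input-file fname))))))\end{code}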
|
|
\end{dfndescx}
|
|
|
|
|
|
\subsection{Interactive mode and error handling}
|
|
Scsh runs in two modes: interactive and script mode. It starts up in
|
|
interactive mode if the scsh interpreter is started up with no script
|
|
argument. Otherwise, scsh starts up in script mode. The mode determines
|
|
whether scsh prints prompts in between reading and evaluating forms, and it
|
|
affects the default error handler. In interactive mode, the default error
|
|
handler will report the error, and generate an interactive breakpoint so that
|
|
the user can interact with the system to examine, fix, or dismiss from the
|
|
error. In script mode, the default error handler causes the scsh process to
|
|
exit.
|
|
|
|
When scsh forks a child with \ex{(fork)}, the child resets to script mode.
|
|
This can be overridden if the programmer wishes.
|
|
|
|
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
|
\section{I/O}
|
|
|
|
\subsection{Standard {\R4RS} I/O procedures}
|
|
In scsh, most standard {\R4RS} i/o operations (such as \ex{display} or
|
|
\ex{read-char}) work on both integer file descriptors and {\Scheme} ports.
|
|
When doing i/o with a file descriptor, the i/o operation is done
|
|
directly on the file, bypassing any buffered data that may have
|
|
accumulated in an associated port.
|
|
Note that character-at-a-time operations such as \ex{read-char}
|
|
are likely to be quite slow when performed directly upon file
|
|
descriptors.
|
|
|
|
The standard {\R4RS} procedures \ex{read-char}, \ex{char-ready?}, \ex{write},
|
|
\ex{display}, \ex{newline},
|
|
and \ex{write-char} are all generic, accepting integer file descriptor
|
|
arguments as well as ports.
|
|
Scsh also mandates the availability of \ex{format}, and further requires
|
|
\ex{format} to accept file descriptor arguments as well as ports.
|
|
|
|
The procedures \ex{peek-char} and \ex{read} do \emph{not} accept
|
|
file descriptor arguments, since these functions require the ability to
|
|
read ahead in the input stream, a feature not supported by {\Unix} I/O.
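
For instance, assuming file descriptor 1 is the process's standard output,
the following sketch writes the same text once directly through the
descriptor and once through a port:
\begin{code}
(display "Hello, world" 1)                      ; Unbuffered, straight to fd 1.
(newline 1)
(display "Hello, world" (current-output-port))  ; Via a Scheme port.
(newline (current-output-port))\end{code}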
|
|
|
|
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
|
\subsection{Port manipulation and standard ports}
|
|
\defun {close-after} {port consumer} {value(s) of consumer}
|
|
\begin{desc}
|
|
Returns \ex{(\var{consumer} \var{port})}, but closes the port on return.
|
|
No dynamic-wind magic. \remark{Is there a less-awkward name?}
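
A small usage sketch (the procedure name \ex{file-prefix} is hypothetical)
that reads up to \var{nbytes} characters from the front of a file and
closes the port when the consumer returns normally:
\begin{code}
(define (file-prefix fname nbytes)
  (close-after (open-input-file fname)
               (lambda (port) (read-string nbytes port))))\end{code}
Since there is no \ex{dynamic-wind} protection, the port is not closed
if the consumer throws out of the call.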
|
|
\end{desc}
|
|
|
|
\defun {error-output-port}{} {port}
|
|
\begin{desc}
|
|
This procedure is analogous to \ex{current-output-port}, but produces
|
|
a port used for error messages---the scsh equivalent of stderr.
|
|
\end{desc}
|
|
|
|
\defun {with-current-input-port*} {port thunk} {value(s) of thunk}
|
|
\defunx {with-current-output-port*} {port thunk} {value(s) of thunk}
|
|
\defunx {with-error-output-port*} {port thunk} {value(s) of thunk}
|
|
\begin{desc}
|
|
These procedures install \var{port} as the current input, current output,
|
|
and error output port, respectively, for the duration of a call to
|
|
\var{thunk}.
|
|
\end{desc}
|
|
|
|
\dfn {with-current-input-port} {port . body} {value(s) of body} {syntax}
|
|
\dfnx {with-current-output-port} {port . body} {value(s) of body} {syntax}
|
|
\dfnx {with-error-output-port} {port . body} {value(s) of body} {syntax}
|
|
\begin{desc}
|
|
These special forms are simply syntactic sugar for the
|
|
{\ttt with\=current\=input\=port*} procedure and friends.
|
|
\end{desc}
|
|
|
|
\defun {set-current-input-port!} {port}{\undefined}
|
|
\defunx{set-current-output-port!}{port}{\undefined}
|
|
\defunx{set-error-output-port!} {port}{\undefined}
|
|
\begin{desc}
|
|
These procedures alter the dynamic binding of the current I/O port procedures
|
|
to new values.
|
|
\end{desc}
|
|
|
|
\defun {close} {fd/port} {\boolean}
|
|
\begin{desc}
|
|
Close the port or file descriptor.
|
|
|
|
If \var{fd/port} is a file descriptor, and it has a port allocated to it,
|
|
the port is shifted to a new file descriptor created with \ex{(dup
|
|
fd/port)} before closing \ex{fd/port}. The port then has its revealed
|
|
count set to zero. This reflects the design criterion that ports are not
|
|
associated with file descriptors, but with open files.
|
|
|
|
To close a file descriptor, and any associated port it might have, you
|
|
must instead say one of (as appropriate):
|
|
\begin{code}
|
|
(close (fdes->inport fd))
|
|
(close (fdes->outport fd))\end{code}
|
|
|
|
The procedure returns true if it closed an open port.
|
|
If the port was already closed, it returns false;
|
|
this is not an error.
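
The return-value convention can be sketched as follows (the file name is
only an illustration):
\begin{code}
(let ((port (open-input-file "/etc/passwd")))
  (close port)     ; True: this call closed an open port.
  (close port))    ; False: the port was already closed.\end{code}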
|
|
\end{desc}
|
|
|
|
\defun {stdports->stdio}{} {\undefined}
|
|
\defunx {stdio->stdports}{} {\undefined}
|
|
\begin{desc}
|
|
These two procedures are used to synchronise {\Unix}'s standard I/O
file descriptors and {\Scheme}'s current I/O ports.
|
|
|
|
\ex{(stdports->stdio)} causes the standard I/O file descriptors
|
|
(0, 1, and 2) to take their values from the current I/O ports.
|
|
It is exactly equivalent to the series of
|
|
redirections:\footnote{Why not \ex{move->fdes}?
|
|
Because the current output port and error port
|
|
might be the same port.}
|
|
\begin{code}
|
|
(dup (current-input-port) 0)
|
|
(dup (current-output-port) 1)
|
|
(dup (error-output-port) 2)\end{code}
|
|
%
|
|
\ex{stdio->stdports} causes the bindings of the current I/O ports
|
|
to be changed to ports constructed over the standard I/O file
|
|
descriptors.
|
|
It is exactly equivalent to the series of assignments
|
|
\begin{code}
|
|
(set-current-input-port! (fdes->inport 0))
|
|
(set-current-output-port! (fdes->outport 1))
|
|
(set-error-output-port! (fdes->outport 2))\end{code}
|
|
However, you are more likely to find the dynamic-extent variant,
|
|
\ex{with-stdio-ports*}, below, to be of use in general programming.
|
|
\end{desc}
|
|
|
|
\defun{with-stdio-ports*} {thunk} {value(s) of thunk}
|
|
\dfnx {with-stdio-ports} {body \ldots} {value(s) of body}{syntax}
|
|
\begin{desc}
|
|
\ex{with-stdio-ports*} binds the standard ports \ex{(current-input-port)},
|
|
\ex{(current-output-port)}, and \ex{(error-output-port)} to be ports
|
|
on file descriptors 0, 1, 2, and then calls \var{thunk}.
|
|
It is equivalent to:
|
|
\begin{code}
|
|
(with-current-input-port (fdes->inport 0)
|
|
(with-current-output-port (fdes->outport 1)
|
|
(with-error-output-port (fdes->outport 2)
|
|
(thunk))))\end{code}
|
|
%
|
|
The \ex{with-stdio-ports} special form is merely syntactic sugar.
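
A minimal usage sketch of the special form: whatever the current output
port was bound to outside the form, the text below should go to file
descriptor 1.
\begin{code}
(with-stdio-ports
  (display "This goes to file descriptor 1.")
  (newline))\end{code}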
|
|
\end{desc}
|
|
|
|
|
|
|
|
|
|
\subsection{String ports}
|
|
{\scm} has string ports, which you can use. Scsh has not committed to the
|
|
particular interface or names that {\scm} uses, so be warned that the
|
|
interface described herein may be liable to change.
|
|
|
|
\defun {make-string-input-port} {string} {\port}
|
|
\begin{desc}
|
|
Returns a port that reads characters from the supplied string.
|
|
\end{desc}
|
|
|
|
\defun {make-string-output-port} {} {\port}
|
|
\defunx {string-output-port-output} {port} {\str}
|
|
\begin{desc}
|
|
A string output port is a port that collects the characters given to it into
|
|
a string.
|
|
The accumulated string is retrieved by applying \ex{string-output-port-output}
|
|
to the port.
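
For example, under the interface described here:
\begin{code}
(let ((port (make-string-output-port)))
  (display "abc" port)
  (write 123 port)
  (string-output-port-output port))   ; The string "abc123".\end{code}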
|
|
\end{desc}
|
|
|
|
\defun {call-with-string-output-port} {procedure} {\str}
|
|
\begin{desc}
|
|
The \var{procedure} value is called on a port. When it returns,
|
|
\ex{call-with-string-output-port} returns a string containing the
|
|
characters that were written to that port during the execution
|
|
of \var{procedure}.
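
Combined with \ex{with-current-output-port}, this gives a simple way to
capture printed output as a string. The following is a sketch; the name
\ex{display->string} is hypothetical.
\begin{code}
(define (display->string x)
  (call-with-string-output-port
    (lambda (port)
      (with-current-output-port port
        (display x)))))\end{code}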
|
|
\end{desc}
|
|
|
|
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
|
\subsection{Revealed ports and file descriptors}
|
|
|
|
The material in this section and the following one is not critical for most
|
|
applications.
|
|
You may safely skim or completely skip this section on a first reading.
|
|
|
|
Dealing with {\Unix} file descriptors in a {\Scheme} environment is difficult.
|
|
In {\Unix}, open files are part of the process environment, and are referenced
|
|
by small integers called \emph{file descriptors}. Open file descriptors are
|
|
the fundamental way i/o redirections are passed to subprocesses, since
|
|
file descriptors are preserved across fork's and exec's.
|
|
|
|
{\Scheme}, on the other hand, uses ports for specifying i/o sources. Ports are
|
|
garbage-collected {\Scheme} objects, not integers. Ports can be garbage
|
|
collected; when a port is collected, it is also closed. Because file
|
|
descriptors are just integers, it's impossible to garbage collect them---you
|
|
wouldn't be able to close file descriptor 3 unless there were no 3's in the
|
|
system, and you could further prove that your program would never again
|
|
compute a 3. This is difficult at best.
|
|
|
|
If a {\Scheme} program only used {\Scheme} ports, and never actually used
|
|
file descriptors, this would not be a problem. But {\Scheme} code
|
|
must descend to the file descriptor level in at least two circumstances:
|
|
%
|
|
\begin{itemize}
|
|
\item when interfacing to foreign code
|
|
\item when interfacing to a subprocess.
|
|
\end{itemize}
|
|
%
|
|
This causes a problem. Suppose we have a {\Scheme} port constructed
|
|
on top of file descriptor 2. We intend to fork off a program that
|
|
will inherit this file descriptor. If we drop references to the port,
|
|
the garbage collector may prematurely close file 2 before we fork
|
|
the subprocess. The interface described below is intended to fix this and
|
|
other problems arising from the mismatch between ports and file descriptors.
|
|
|
|
The {\Scheme} kernel maintains a port table that maps a file descriptor
|
|
to the {\Scheme} port allocated for it (or, {\sharpf} if there is no port
|
|
allocated for this file descriptor). This is used to ensure that
|
|
there is at most one open port for each open file descriptor.
|
|
|
|
The port data structure for file ports has two fields besides the descriptor:
|
|
\var{revealed} and \var{closed?}.
|
|
When a file port is closed with \ex{(close port)},
|
|
the port's file descriptor is closed, its entry in the port table is cleared,
|
|
and the port's \var{closed?} field is set to true.
|
|
|
|
When a file descriptor is closed with \ex{(close fdes)}, any associated
|
|
port is shifted to a new file descriptor created with \ex{(dup fdes)}.
|
|
The port has its revealed count reset to zero (and hence becomes eligible
|
|
for closing on GC). See discussion below.
|
|
To really put a stake through a descriptor's heart without waiting for
|
|
associated ports to be GC'd, you must say one of
|
|
%
|
|
\begin{code}
|
|
(close (fdes->inport fdes))
|
|
(close (fdes->outport fdes))\end{code}
|
|
|
|
The \var{revealed} field is an aid to garbage collection. It is an integer
|
|
semaphore. If it is zero, the port's file descriptor can be closed when
|
|
the port is collected. Essentially, the \var{revealed} field reflects whether
|
|
or not the port's file descriptor has escaped to the {\Scheme} user. If
|
|
the {\Scheme} user doesn't know what file descriptor is associated with
|
|
a given port, then he can't possibly retain an ``integer handle'' on the
|
|
port after dropping pointers to the port itself, so the garbage collector
|
|
is free to close the file.
|
|
|
|
Ports allocated with \ex{open-output-file} and \ex{open-input-file} are
|
|
unrevealed ports---\ie, \var{revealed} is initialised to 0.
|
|
No one knows the port's file descriptor, so the file descriptor can be closed
|
|
when the port is collected.
|
|
|
|
The functions \ex{fdes->outport}, \ex{fdes->inport}, and \ex{port->fdes}
|
|
are used to shift back and forth between file descriptors and ports. When
|
|
\ex{port->fdes} reveals a port's file descriptor, it increments the port's
|
|
\var{revealed} field. When the user is through with the file descriptor, he
|
|
can call \ex{(release-port-handle \var{port})}, which decrements the count.
|
|
The function \ex{(call/fdes fd/port \var{proc})} automates this protocol.
|
|
\ex{call/fdes} uses \ex{dynamic-wind} to enforce the protocol.
|
|
If \var{proc} throws out of the \ex{call/fdes} application,
|
|
the unwind handler releases the descriptor handle;
|
|
if the user subsequently tries to throw \emph{back} into \var{proc}'s
|
|
context, the wind handler raises an error. When the user maps a file
|
|
descriptor to a port with \ex{fdes->outport} or \ex{fdes->inport}, the port
|
|
has its revealed field incremented.
|
|
|
|
Not all file descriptors are created by requests to make ports. Some are
|
|
inherited on process invocation via \ex{exec(2)}, and are simply part of the
|
|
global environment. Subprocesses may depend upon them, so if a port is later
|
|
allocated for these file descriptors, it should be considered as a revealed
|
|
port. For example, when the {\Scheme} shell's process starts up, it opens ports
|
|
on file descriptors 0, 1, and 2 for the initial values of
|
|
\ex{(current-input-port)}, \ex{(current-output-port)}, and
|
|
\ex{(error-output-port)}.
|
|
These ports are initialised with \var{revealed} set to 1,
|
|
so that stdin, stdout, and stderr are not closed even if the user drops the
|
|
port.
|
|
|
|
Unrevealed file ports have the nice property that they can be closed when all
|
|
pointers to the port are dropped. This can happen during gc, or at an
|
|
\ex{exec()}---since all memory is dropped at an \ex{exec()}. No one knows the
|
|
file descriptor associated with the port, so the exec'd process certainly
|
|
can't refer to it.
|
|
|
|
This facility preserves the transparent close-on-collect property
|
|
for file ports that are used in straightforward ways, yet allows
|
|
access to the underlying {\Unix} substrate without interference from
|
|
the garbage collector. This is critical, since shell programming
|
|
absolutely requires access to the {\Unix} file descriptors, as their
|
|
numerical values are a critical part of the process interface.
|
|
|
|
A port's underlying file descriptor can be shifted around with \ex{dup(2)}
|
|
when convenient. That is, the actual file descriptor on top of which a port is
|
|
constructed can be shifted around underneath the port by the scsh kernel when
|
|
necessary. This is important, because when the user is setting up file
|
|
descriptors prior to an \ex{exec(2)}, he may explicitly use a file descriptor
|
|
that has already been allocated to some port. In this case, the scsh kernel
|
|
just shifts the port's file descriptor to some new location with \ex{dup},
|
|
freeing up its old descriptor. This prevents errors from happening in the
|
|
following scenario. Suppose we have a file open on port \ex{f}. Now we want
|
|
to run a program that reads input on file 0, writes output to file 1, errors
|
|
to file 2, and logs execution information on file 3. We want to run this
|
|
program with input from \ex{f}.
|
|
So we write:
|
|
%
|
|
\begin{code}
|
|
(run (/usr/shivers/bin/prog)
|
|
(> 1 output.txt)
|
|
(> 2 error.log)
|
|
(> 3 trace.log)
|
|
(= 0 ,f))\end{code}
|
|
%
|
|
Now, suppose by ill chance that, unbeknownst to us, when the operating system
|
|
opened \ex{f}'s file, it allocated descriptor 3 for it. If we blindly redirect
|
|
\ex{trace.log} into file descriptor 3, we'll clobber \ex{f}! However, the
|
|
port-shuffling machinery saves us: when the \ex{run} form tries to dup
|
|
\ex{trace.log}'s file descriptor to 3, \ex{dup} will notice that file
|
|
descriptor 3 is already associated with an unrevealed port (\ie, \ex{f}). So,
|
|
it will first move \ex{f} to some other file descriptor. This keeps \ex{f}
|
|
alive and well so that it can subsequently be dup'd into descriptor 0 for
|
|
\ex{prog}'s stdin.
|
|
|
|
The port-shifting machinery makes the following guarantee: a port is only
|
|
moved when the underlying file descriptor is closed, either by a \ex{close()}
|
|
or a \ex{dup2()} operation. Otherwise a port/file-descriptor association is
|
|
stable.
|
|
|
|
Under normal circumstances, all this machinery just works behind the scenes to
|
|
keep things straightened out. The only time the user has to think about it is
|
|
when he starts accessing file descriptors from ports, which he should almost
|
|
never have to do. If a user starts asking what file descriptors have been
|
|
allocated to what ports, he has to take responsibility for managing this
|
|
information.
|
|
|
|
\subsection{Port-mapping machinery}
|
|
|
|
The procedures provided in this section are almost never needed.
|
|
You may safely skim or completely skip this section on a first reading.
|
|
|
|
Here are the routines for manipulating ports in scsh. The important
|
|
points to remember are:
|
|
\begin{itemize}
|
|
\item A file port is associated with an open file, not a particular file
|
|
descriptor.
|
|
\item The association between a file port and a particular file descriptor
|
|
is never changed \emph{except} when the file descriptor is explicitly
|
|
closed. ``Closing'' includes being used as the target of a \ex{dup2}, so
|
|
the set of procedures below that close their targets are
|
|
\ex{close}, two-argument \ex{dup}, and \ex{move->fdes}.
|
|
If the target file descriptor of one of these routines has an
|
|
allocated port, the port will be shifted to another freshly-allocated
|
|
file descriptor, and marked as unrevealed, thus preserving the port
|
|
but freeing its old file descriptor.
|
|
\end{itemize}
|
|
These rules are what is necessary to ``make things work out'' with no
|
|
surprises in the general case.
|
|
|
|
\defun {fdes->inport} {fd} {port}
|
|
\defunx {fdes->outport} {fd} {port}
|
|
\defunx {port->fdes} {port} {\fixnum}
|
|
\begin{desc}
|
|
These increment the port's revealed count.
|
|
\end{desc}
|
|
|
|
\defun {port-revealed} {port} {{\integer} or \sharpf}
|
|
\begin{desc}
|
|
Return the port's revealed count if positive, otherwise \sharpf.
|
|
\end{desc}
|
|
|
|
\defun{release-port-handle} {port} {\undefined}
|
|
\begin{desc}
|
|
Decrement the port's revealed count.
|
|
\end{desc}
|
|
|
|
\defun {call/fdes} {fd/port consumer} {value(s) of consumer}
|
|
\begin{desc}
|
|
Calls \var{consumer} on a file descriptor;
|
|
takes care of revealed bookkeeping.
|
|
If \var{fd/port} is a file descriptor, this is just
|
|
\ex{(\var{consumer} \var{fd/port})}.
|
|
If \var{fd/port} is a port,
|
|
calls \var{consumer} on its underlying file descriptor.
|
|
While \var{consumer} is running, the port's revealed count is incremented.
|
|
|
|
When \ex{call/fdes} is called with a port argument, you are not allowed to
|
|
throw into \var{consumer} with a stored continuation, as that would violate
|
|
the revealed-count bookkeeping.
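
The bookkeeping that \ex{call/fdes} automates can be sketched by hand as
follows. Here \ex{port} is assumed to be a file port and \ex{foreign-op}
is a hypothetical procedure that wants an integer file descriptor.
\begin{code}
;; Manual protocol.
(let ((fd (port->fdes port)))    ; Increments the revealed count.
  (foreign-op fd)
  (release-port-handle port))    ; Decrements it again.

;; Roughly the same thing, with the bookkeeping automated and
;; the handle also released if foreign-op throws out.
(call/fdes port foreign-op)\end{code}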
|
|
\end{desc}
|
|
|
|
\defun{move->fdes} {fd/port target-fd} {port or fdes}
|
|
\begin{desc}
|
|
Maps fd$\rightarrow$fd and port$\rightarrow$port.
|
|
|
|
If \var{fd/port} is a file-descriptor not equal to \var{target-fd},
|
|
dup it to \var{target-fd} and close it. Returns \var{target-fd}.
|
|
|
|
If \var{fd/port} is a port, it is shifted to \var{target-fd},
|
|
by duping its underlying file-descriptor if necessary.
|
|
\var{Fd/port}'s original file descriptor is
|
|
closed (if it was different from \var{target-fd}).
|
|
Returns the port.
|
|
This operation resets \var{fd/port}'s revealed count to 1.
|
|
|
|
In all cases when \var{fd/port} is actually shifted, if there is a port
|
|
already using \var{target-fd}, it is first relocated to some other file
|
|
descriptor.
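
As a sketch (the file name is illustrative), the following places a log
file on file descriptor 3 before running a subprocess that expects it
there:
\begin{code}
(let ((log (open-output-file "trace.log")))
  (move->fdes log 3))   ; Returns log, now on fd 3, revealed count 1.\end{code}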
|
|
\end{desc}
|
|
|
|
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
|
\subsection{{\Unix} I/O}
|
|
|
|
\defun {dup} {fd/port [newfd]} {fd/port}
|
|
\defunx{dup->inport} {fd/port [newfd]} {port}
|
|
\defunx{dup->outport} {fd/port [newfd]} {port}
|
|
\defunx{dup->fdes} {fd/port [newfd]} {fd}
|
|
\begin{desc}
|
|
These procedures provide the functionality of C's \ex{dup()} and \ex{dup2()}.
|
|
The different routines return different types of values:
|
|
\ex{dup->inport}, \ex{dup->outport}, and \ex{dup->fdes} return
|
|
input ports, output ports, and integer file descriptors, respectively.
|
|
\ex{dup}'s return value depends on the type of
|
|
\var{fd/port}---it maps fd$\rightarrow$fd and port$\rightarrow$port.
|
|
|
|
These procedures use the {\Unix} \ex{dup()} syscall to replicate
|
|
the file descriptor or file port \var{fd/port}.
|
|
If a \var{newfd} file descriptor is given, it is used as the target of
|
|
the dup operation, \ie, the operation is a \ex{dup2()}.
|
|
In this case, procedures that return a port (such as \ex{dup->inport})
|
|
will return one with the revealed count set to one.
|
|
For example, \ex{(dup (current-input-port) 5)} produces
|
|
a new port with underlying file descriptor 5, whose revealed count is 1.
|
|
If \var{newfd} is not specified,
|
|
then the operating system chooses the file descriptor,
|
|
and any returned port is marked as unrevealed.
|
|
|
|
If the \var{newfd} target is given,
|
|
and some port is already using that file descriptor,
|
|
the port is first quietly shifted (with another \ex{dup})
|
|
to some other file descriptor (zeroing its revealed count).
|
|
|
|
Since {\Scheme} doesn't provide read/write ports,
|
|
\ex{dup->inport} and \ex{dup->outport} can be useful for
|
|
getting an output version of an input port, or \emph{vice versa}.
|
|
For example, if \ex{p} is an input port open on a tty, and
|
|
we would like to do output to that tty, we can simply use
|
|
\ex{(dup->outport p)} to produce an equivalent output port for the tty.
|
|
\end{desc}
|
|
|
|
\defun {seek} {fd/port offset [whence]} {\integer}
|
|
\begin{desc}
|
|
Reposition the I/O cursor for a file descriptor or port.
|
|
\var{whence} is one of \{\ex{seek/set}, \ex{seek/delta}, \ex{seek/end}\},
|
|
and defaults to \ex{seek/set}.
|
|
If \ex{seek/set}, then \var{offset} is an absolute index into the file;
|
|
if \ex{seek/delta}, then \var{offset} is a relative offset from the current
|
|
I/O cursor;
|
|
if \ex{seek/end}, then \var{offset} is a relative offset from the end of file.
|
|
The \var{fd/port} argument may be a port or an integer file descriptor.
|
|
Not all such values are seekable;
|
|
this is dependent on the OS implementation.
|
|
The return value is the resulting position of the I/O cursor in the I/O stream.
|
|
\oops{The current implementation doesn't handle \var{offset} arguments
|
|
that are not immediate integers (\ie, representable in 30 bits).}
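
Some typical calls, assuming \ex{port} is open on a seekable file:
\begin{code}
(seek port 0)               ; Rewind; seek/set is the default.
(seek port 10 seek/delta)   ; Skip forward 10 positions.
(seek port 0 seek/end)      ; Move to end of file; returns that position.\end{code}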
|
|
\end{desc}
|
|
|
|
|
|
\defun {tell} {fd/port} {\integer}
|
|
\begin{desc}
|
|
Returns the position of the I/O cursor in the I/O stream.
|
|
Not all file descriptors or ports support cursor-reporting;
|
|
this is dependent on the OS implementation.
|
|
\end{desc}
|
|
|
|
\begin{defundesc} {open-file} {fname flags [perms]} {\port}
|
|
\var{Perms} defaults to \cd{#o666}.
|
|
\var{Flags} is an integer bitmask, composed by or'ing together constants
|
|
listed in table~\ref{table:fdes-status-flags}
|
|
(page~\pageref{table:fdes-status-flags}).
|
|
You must use exactly one of the \ex{open/read}, \ex{open/write}, or
|
|
\ex{open/read+write} flags.
|
|
%
|
|
The returned port is an input port if the \var{flags} permit it,
|
|
otherwise an output port. \R4RS/\scm/scsh do not have input/output ports,
|
|
so it's one or the other. This should be fixed. (You can hack simultaneous
|
|
i/o on a file by opening it r/w, taking the result input port,
|
|
and duping it to an output port with \ex{dup->outport}.)
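
The hack just described might be packaged up as follows. This is only a
sketch; the name \ex{open-read+write} is hypothetical.
\begin{code}
(define (open-read+write fname)
  (let ((in (open-file fname
                       (bitwise-ior open/read+write open/create))))
    (values in (dup->outport in))))   ; Input port, then output port.\end{code}
Because the second port is made with \ex{dup}, both ports share one
open file, and hence one I/O cursor.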
|
|
\end{defundesc}
|
|
|
|
\defun{open-input-file}{fname [flags]}\port
|
|
\begin{defundescx}{open-output-file}{fname [flags perms]}\port
|
|
These are equivalent to \ex{open-file}, after first setting the
|
|
read/write bits of the \var{flags} argument to \ex{open/read} or
|
|
\ex{open/write}, respectively.
|
|
\var{Flags} defaults to zero for \ex{open-input-file},
|
|
and
|
|
\codex{(bitwise-ior open/create open/truncate)}
|
|
for \ex{open-output-file}.
|
|
These defaults make the procedures backwards-compatible with their
|
|
unary {\R4RS} definitions.
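
Supplying \var{flags} explicitly overrides the default. For example, this
sketch (with an illustrative file name) opens a log file for appending
rather than truncating it:
\begin{code}
(open-output-file "log.txt"
                  (bitwise-ior open/create open/append))\end{code}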
|
|
\end{defundescx}
|
|
|
|
\begin{defundesc} {open-fdes} {fname flags [perms]} \integer
|
|
Returns a file descriptor.
|
|
\end{defundesc}
|
|
|
|
\defun{fdes-flags}{fd/port}{\integer}
|
|
\begin{defundescx}{set-fdes-flags}{fd/port \integer}{\undefined}
|
|
These procedures allow reading and writing of an open file's flags.
|
|
The only such flag defined by {\Posix} is \ex{fdflags/close-on-exec};
|
|
your {\Unix} implementation may provide others.
|
|
|
|
These procedures should not be particularly useful to the programmer,
|
|
as the scsh runtime already provides automatic control of the close-on-exec
|
|
property.
|
|
Unrevealed ports always have their file descriptors marked
|
|
close-on-exec, as they can be closed when the scsh process execs a new program.
|
|
Whenever the user reveals or unreveals a port's file descriptor,
|
|
the runtime automatically sets or clears the flag for the programmer.
|
|
Programmers that manipulate this flag should be aware of these extra, automatic
|
|
operations.
|
|
\end{defundescx}
|
|
|
|
\defun{fdes-status}{fd/port}{\integer}
|
|
\begin{defundescx}{set-fdes-status}{fd/port \integer}{\undefined}
|
|
These procedures allow reading and writing of an open file's status flags
|
|
(table~\ref{table:fdes-status-flags}).
|
|
%
|
|
\begin{table}
|
|
\begin{center}
|
|
\begin{tabular}{@{}rp{1.5in}>{\ttfamily}l@{}}
|
|
& Allowed operations & Status flag \\ \cline{2-3}
|
|
\textbf{Open+Get+Set} &
|
|
\parbox[t]{1.5in}{\raggedright
|
|
These flags can be used in \ex{open-file}, \ex{fdes-status},
|
|
and \ex{set-fdes-status} calls.} &
|
|
%
|
|
\begin{tabular}[t]{@{}>{\ttfamily}l@{}}
|
|
%% These are gettable and settable
|
|
open/append \\
|
|
open/non-blocking \\
|
|
open/async \textrm{(Non-\Posix)} \\
|
|
open/fsync \textrm{(Non-\Posix)}
|
|
\end{tabular}
|
|
\\\cline{2-3}
|
|
\textbf{Open+Get} &
|
|
\parbox[t]{1.5in}{\raggedright
|
|
These flags can be used in \ex{open-file} and \ex{fdes-status} calls,
|
|
but are ignored by \ex{set-fdes-status}.\strut} &
|
|
%
|
|
\begin{tabular}[t]{@{}>{\ttfamily}l@{}}
|
|
%% These are gettable, not settable
|
|
open/read \\
|
|
open/write \\
|
|
open/read+write \\
|
|
open/access-mask
|
|
\end{tabular}
|
|
\\\cline{2-3}
|
|
\textbf{Open} &
|
|
\parbox[t]{1.5in}{\raggedright
|
|
These flags are only relevant in
|
|
\ex{open-file} calls;
|
|
they are ignored by \ex{fdes-status} and \ex{set-fdes-status} calls.} &
|
|
%
|
|
\begin{tabular}[t]{@{}>{\ttfamily}l@{}}
|
|
%% These are neither gettable nor settable.
|
|
open/create \\
|
|
open/exclusive \\
|
|
open/no-control-tty \\
|
|
open/truncate
|
|
\end{tabular}
|
|
\end{tabular}
|
|
\end{center}
|
|
\caption{Status flags for \texttt{open-file},
|
|
\texttt{fdes-status} and \texttt{set-fdes-status}.
|
|
Only {\Posix} flags are guaranteed to be present;
|
|
your operating system may define others.
|
|
The \texttt{open/access-mask} value is not an actual flag,
|
|
but a bit mask used to select the field for the \texttt{open/read},
|
|
\texttt{open/write} and \texttt{open/read+write} bits.
|
|
}
|
|
\label{table:fdes-status-flags}
|
|
\end{table}
|
|
|
|
Note that this file-descriptor state is shared between file descriptors
|
|
created by \ex{dup}---if you create port \var{b} by applying \ex{dup}
|
|
to port \var{a}, and change {\var{b}}'s status flags, you will also have
|
|
changed {\var{a}}'s status flags.
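
As a rough sketch of how the two calls combine, the fragment below adds one
flag to a descriptor's status flags while leaving the others untouched.
It assumes that the {\scm} procedure \ex{bitwise-ior} is visible in the
current package; \ex{add-status-flag} is a name made up for this illustration.
\begin{code}
;; Hypothetical helper: add FLAG to the status flags of FD/PORT,
;; preserving whatever flags are already set.
(define (add-status-flag fd/port flag)
  (set-fdes-status fd/port
                   (bitwise-ior (fdes-status fd/port) flag)))

;; Example use: put the read end of a fresh pipe in non-blocking mode.
(receive (rport wport) (pipe)
  (add-status-flag rport open/non-blocking)
  (fdes-status rport))\end{code}
Reading the flags first matters because \ex{set-fdes-status} replaces the
whole flag word; passing \ex{open/non-blocking} alone would clear
\ex{open/append} if it had been set.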
|
|
\end{defundescx}
|
|
|
|
\begin{defundesc}{pipe}{} {[\var{rport} \var{wport}]}
|
|
Returns two ports, the read and write end-points of a {\Unix} pipe.
|
|
\end{defundesc}
|
|
|
|
\defun{read-string}{nbytes [fd/port]} {{\str} or \sharpf}
|
|
\dfnix{read-string!} {str [fd/port start end]} {nread or \sharpf}{procedure}
|
|
{read-string"!@\texttt{read-string"!}}
|
|
\begin{desc}
|
|
These calls read exactly as much data as you requested, unless
|
|
there is not enough data (eof).
|
|
\ex{read-string!} reads the data into string \var{str}
|
|
at the indices in the half-open interval $[\var{start},\var{end})$;
|
|
the default interval is the whole string: $\var{start}=0$ and
|
|
$\var{end}=\ex{(string-length \var{str})}$.
|
|
They will persistently retry on partial reads and when interrupted
|
|
until (1) error, (2) eof, or (3) the input request is completely
|
|
satisfied.
|
|
Partial reads can occur when reading from an intermittent source,
|
|
such as a pipe or tty.
|
|
|
|
\ex{read-string} returns the string read; \ex{read-string!} returns
|
|
the number of characters read. They both return false at eof.
|
|
A request to read zero bytes returns immediately, with no eof check.
|
|
|
|
The values of \var{start} and \var{end} must specify a well-defined
|
|
interval in \var{str},
|
|
\ie, $0 \le \var{start} \le \var{end} \le \ex{(string-length \var{str})}$.
|
|
|
|
Any partially-read data is included in the error exception packet.
|
|
Error returns on non-blocking input are considered an error.
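
The eof behaviour can be illustrated with a pipe whose write end is closed
after five characters.
This is only a sketch: it assumes the \ex{receive} form for multiple values
and the \ex{close} procedure, and \ex{pipe-eof-demo} is a made-up name.
\begin{code}
(define (pipe-eof-demo)
  (receive (rport wport) (pipe)
    (write-string "hello" wport)
    (close wport)                      ; flush, then the reader sees eof
    (let* ((s1 (read-string 10 rport)) ; asks for 10, only 5 arrive
           (s2 (read-string 10 rport))); nothing left to read
      (close rport)
      (list s1 s2))))\end{code}
Going by the description above, the first call should return the five
characters that were available, and the second should return {\sharpf}.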
|
|
\end{desc}
|
|
|
|
\defun {read-string/partial} {nbytes [fd/port]} {{\str} or \sharpf}
|
|
\dfnix{read-string!/partial} {str [fd/port start end]} {nread or \sharpf}
|
|
{procedure}{read-string"!/partial@\texttt{read-string"!/partial}}
|
|
\begin{desc}
|
|
%
|
|
These are atomic best-effort/forward-progress calls.
|
|
Best effort: they may read less than you request if there is a
|
|
lesser amount of data immediately available (\eg, because you
|
|
are reading from a pipe or a tty).
|
|
Forward progress: if no data is immediately available
|
|
(\eg, empty pipe), they will block.
|
|
Therefore, if you request an $n>0$ byte read,
|
|
while you may not get everything you asked for, you will always get something
|
|
(barring eof).
|
|
|
|
There is one case in which the forward-progress guarantee is cancelled:
|
|
when the programmer explicitly sets the port to non-blocking i/o.
|
|
In this case, if no data is immediately available,
|
|
the procedure will not block, but will immediately return a zero-byte read.
|
|
|
|
\ex{read-string/partial} reads the data into a freshly allocated string,
|
|
which it returns as its value.
|
|
\ex{read-string!/partial} reads the data into string \var{str}
|
|
at the indices in the half-open interval $[\var{start},\var{end})$;
|
|
the default interval is the whole string: $\var{start}=0$ and
|
|
$\var{end}=\ex{(string-length \var{str})}$.
|
|
The values of \var{start} and \var{end} must specify a well-defined
|
|
interval in \var{str},
|
|
\ie, $0 \le \var{start} \le \var{end} \le \ex{(string-length \var{str})}$.
|
|
It returns the number of bytes read.
|
|
|
|
A request to read zero bytes returns immediately, with no eof check.
|
|
|
|
In sum, there are only three ways you can get a zero-byte read:
|
|
(1) you request one, (2) you turn on non-blocking i/o, or (3) you
|
|
try to read at eof.
|
|
|
|
These are the routines to use for non-blocking input.
|
|
They are also useful when you wish to efficiently process data
|
|
in large blocks, and your algorithm is insensitive to the block size
|
|
of any particular read operation.
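
A block-copy loop is a typical case where the block size of each read does
not matter.
The sketch below uses the made-up name \ex{copy-blocks} and an arbitrary
4096-character buffer, and it assumes the input is in blocking mode, so that
a zero-byte read cannot occur.
\begin{code}
;; Hypothetical helper: copy everything from IN to OUT, block by block.
(define (copy-blocks in out)
  (let ((buf (make-string 4096)))
    (let loop ()
      (let ((n (read-string!/partial buf in)))
        (if n                          ; #f means eof
            (begin (write-string buf out 0 n)
                   (loop)))))))\end{code}
Each pass writes exactly the \ex{n} characters that the read delivered, so
short reads from a pipe or tty are handled without special cases.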
|
|
\end{desc}
|
|
|
|
\defun {select }{rvec wvec evec [timeout]}{[rvec' wvec' evec']}
|
|
\defunx{select!}{rvec wvec evec [timeout]}{[nr nw ne]}
|
|
\begin{desc}
|
|
The \ex{select} procedure allows a process to block and wait for events on
|
|
multiple I/O channels.
|
|
The \var{rvec} and \var{evec} arguments are vectors of input ports and
|
|
integer file descriptors; \var{wvec} is a vector of output ports and
|
|
integer file descriptors.
|
|
The procedure returns three vectors whose elements are subsets of the
|
|
corresponding arguments.
|
|
Every element of \var{rvec'} is ready for input;
|
|
every element of \var{wvec'} is ready for output;
|
|
every element of \var{evec'} has an exceptional condition pending.
|
|
|
|
The \ex{select} call will block until at least one of the I/O channels
|
|
passed to it is ready for operation.
|
|
The \var{timeout} value can be used to force the call to time-out
|
|
after a given number of seconds. It defaults to the special value
|
|
\ex{\#f}, meaning wait indefinitely. A zero value can be used to poll
|
|
the I/O channels.
|
|
|
|
If an I/O channel appears more than once in a given vector---perhaps
|
|
occurring once as a Scheme port, and once as the port's underlying
|
|
integer file descriptor---only one of these two references may appear
|
|
in the returned vector.
|
|
Buffered I/O ports are handled specially---if an input port's buffer is
|
|
not empty, or an output port's buffer is not yet full, then these
|
|
ports are immediately considered eligible for I/O without using
|
|
the actual, primitive \ex{select} system call to check the underlying
|
|
file descriptor.
|
|
This works pretty well for buffered input ports, but is a little
|
|
problematic for buffered output ports.
|
|
|
|
The \ex{select!} procedure is similar, but indicates the subset
|
|
of active I/O channels by side-effecting the argument vectors.
|
|
Non-active I/O channels in the argument vectors are overwritten with
|
|
{\sharpf} values.
|
|
The call returns the number of active elements remaining in each
|
|
vector.
|
|
As a convenience, the vectors passed in to \ex{select!} are
|
|
allowed to contain {\sharpf} values as well as integers and ports.
|
|
|
|
\remark{I have found the \ex{select!} interface to be the more
|
|
useful of the two. After the system call, it allows you
|
|
to check a specific I/O channel in constant time.}
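
The following sketch polls two pipes with \ex{select!} and a zero timeout.
The name \ex{poll-demo} and the result symbols are made up, and the code
assumes the \ex{receive} form for multiple values.
\begin{code}
(define (poll-demo)
  (receive (r1 w1) (pipe)
    (receive (r2 w2) (pipe)
      (write-string "x" w2)
      (force-output w2)              ; push the character into the pipe
      (let ((rvec (vector r1 r2)))
        (receive (nr nw ne) (select! rvec (vector) (vector) 0)
          ;; Inactive entries of RVEC are now #f, so each channel
          ;; can be tested by its index.
          (list nr
                (if (vector-ref rvec 0) 'r1-ready 'r1-idle)
                (if (vector-ref rvec 1) 'r2-ready 'r2-idle)))))))\end{code}
Since only the second pipe holds data, one would expect a single active
input channel, found at index 1 of \ex{rvec}.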
|
|
\end{desc}
|
|
|
|
|
|
\begin{defundescx}{write-string}{string [fd/port start end]}\undefined
|
|
This procedure writes all the data requested.
|
|
If the procedure cannot perform the write with a single kernel call
|
|
(due to interrupts or partial writes),
|
|
it will perform multiple write operations until all the data is written
|
|
or an error has occurred.
|
|
A non-blocking i/o error is considered an error.
|
|
(Error exception packets for this syscall include the amount of
|
|
data partially transferred before the error occurred.)
|
|
|
|
The data written are the characters of \var{string} in the half-open
|
|
interval $[\var{start},\var{end})$.
|
|
The default interval is the whole string: $\var{start}=0$ and
|
|
$\var{end}=\ex{(string-length \var{string})}$.
|
|
The values of \var{start} and \var{end} must specify a well-defined
|
|
interval in \var{string},
\ie, $0 \le \var{start} \le \var{end} \le \ex{(string-length \var{string})}$.
|
|
A zero-byte write returns immediately, with no error.
|
|
|
|
Output to buffered ports: \ex{write-string}'s efforts end as soon
|
|
as all the data has been placed in the output buffer.
|
|
Errors and true output may not happen until a later time, of course.
|
|
\end{defundescx}
|
|
|
|
\begin{defundescx}{write-string/partial}{string [fd/port start end]}{nwritten}
|
|
This routine is the atomic best-effort/forward-progress analog
|
|
to \ex{write-string}.
|
|
It returns the number of bytes written, which may be less than you
|
|
asked for.
|
|
Partial writes can occur when (1) we write off the physical end of
|
|
the media, (2) the write is interrupted, or (3) the file descriptor
|
|
is set for non-blocking i/o.
|
|
|
|
If the file descriptor is not set up for non-blocking i/o, then
|
|
a successful return from these procedures makes a forward progress
|
|
guarantee---that is, a partial write took place of at least one byte:
|
|
\begin{itemize}
|
|
\item If we are at the end of physical media, and no write takes place,
|
|
an error exception is raised.
|
|
So a return implies we wrote \emph{something}.
|
|
\item If the call is interrupted after a partial transfer, it returns
|
|
immediately. But if the call is interrupted before any data transfer,
|
|
then the write is retried.
|
|
\end{itemize}
|
|
|
|
If we request a zero-byte write, then the call immediately returns 0.
|
|
If the file descriptor is set for non-blocking i/o, then the call
|
|
may return 0 if it was unable to immediately write anything
|
|
(\eg, full pipe).
|
|
Barring these two cases, a write either returns $\var{nwritten} > 0$,
|
|
or raises an error exception.
|
|
|
|
Non-blocking i/o is only available on file descriptors and unbuffered
|
|
ports. Doing non-blocking i/o to a buffered port is not well-defined,
|
|
and is an error (the problem is the subsequent flush operation).
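
The forward-progress guarantee is what makes a simple retry loop terminate.
The sketch below, with the made-up name \ex{write-all}, assumes \var{fd} is
an integer file descriptor in blocking mode; it does roughly what
\ex{write-string} already does on the caller's behalf.
\begin{code}
;; Hypothetical helper: keep writing until all of STR has gone out.
(define (write-all str fd)
  (let ((end (string-length str)))
    (let loop ((start 0))
      (if (< start end)
          ;; Each call writes at least one character or raises an
          ;; error, so START strictly increases.
          (loop (+ start
                   (write-string/partial str fd start end)))))))\end{code}
With a non-blocking descriptor the call may return 0, and the same loop
would spin; such code should wait with \ex{select} before retrying.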
|
|
\end{defundescx}
|
|
|
|
\subsection{Buffered I/O}
|
|
|
|
{\scm} ports use buffered I/O---data is transferred to or from the
|
|
OS in blocks. Scsh provides control of this mechanism: the programmer
|
|
may force saved-up output data to be transferred to the OS when
|
|
he chooses,
|
|
and may also choose which I/O buffering policy to employ for a given
|
|
port (or turn buffering off completely).
|
|
|
|
It can be useful to turn I/O buffering off in some cases, for example
|
|
when an I/O stream is to be shared by multiple subprocesses.
|
|
For this reason, scsh allocates an unbuffered port for file descriptor 0
|
|
at start-up time.
|
|
Because shells frequently share stdin with subprocesses, if the shell
|
|
does buffered reads, it might ``steal'' input intended for a subprocess. For
|
|
this reason, all shells, including sh, csh, and scsh, read stdin unbuffered.
|
|
Applications that can tolerate buffered input on stdin can reset
|
|
\ex{(current-input-port)} to block buffering for higher performance.
|
|
|
|
\begin{defundesc}{set-port-buffering}{port policy [size]}\undefined
|
|
This procedure allows the programmer to assign a particular I/O buffering
|
|
policy to a port, and to choose the size of the associated buffer.
|
|
It may only be used on new ports, \ie, before I/O is performed on the port.
|
|
There are three buffering policies that may be chosen:
|
|
\begin{inset}
|
|
\begin{tabular}{l@{\qquad}l}
|
|
\ex{bufpol/block} & General block buffering (general default) \\
|
|
\ex{bufpol/line} & Line buffering (tty default) \\
|
|
\ex{bufpol/none} & Direct I/O---no buffering
|
|
\end{tabular}
|
|
\end{inset}
|
|
The line buffering policy flushes output whenever a newline is output;
|
|
whenever the buffer is full; or whenever an input is read from stdin.
|
|
Line buffering is the default for ports open on terminal devices.
|
|
|
|
The \var{size} argument requests an I/O buffer of \var{size} bytes.
|
|
If not given, a reasonable default is used; if given and zero,
|
|
buffering is turned off
|
|
(\ie, $\var{size} = 0$ for any policy is equivalent to
|
|
$\var{policy} = \ex{bufpol/none}$).
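
For instance, a log file can be given line buffering so that each completed
line reaches the OS promptly.
This is a sketch only; the file name is invented for the example.
\begin{code}
(let ((logport (open-output-file "/tmp/example.log")))
  ;; The policy must be chosen before any I/O is done on the port.
  (set-port-buffering logport bufpol/line)
  (write-string "started" logport)
  (newline logport)                  ; the newline triggers a flush
  (close logport))\end{code}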
|
|
\end{defundesc}
|
|
|
|
\begin{defundesc}{force-output} {[fd/port]}{\undefined}
|
|
This procedure does nothing when applied to an integer file descriptor
|
|
or unbuffered port.
|
|
It flushes buffered output when applied to a buffered port,
|
|
and raises a write-error exception on error. Returns no value.
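
A common use is to make a prompt visible before the program blocks waiting
for input.
The sketch below uses the made-up name \ex{ask} and assumes the
\ex{read-line} procedure.
\begin{code}
;; Hypothetical helper: print QUESTION, then read the user's reply.
(define (ask question)
  (write-string question)
  (force-output)           ; defaults to the current output port
  (read-line))\end{code}
Without the explicit flush, a prompt that does not end in a newline may sit
in the output buffer while the program waits for a reply.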
|
|
\end{defundesc}
|
|
|
|
\begin{defundesc}{flush-all-ports} {}{\undefined}
|
|
This procedure flushes all open output ports with buffered data.
|
|
\end{defundesc}
|
|
|
|
\subsection{File locking}
|
|
|
|
Scsh provides {\Posix} advisory file locking.
|
|
\emph{Advisory} locks are locks that can be checked by user code,
|
|
but do not affect other I/O operations.
|
|
For example, if a process has an exclusive lock on a region of a file,
|
|
other processes will not be able to obtain locks on that region of the file,
|
|
but they will still be able to read and write the file with no hindrance.
|
|
Using advisory locks requires cooperation amongst the agents accessing
|
|
the shared resource.
|
|
|
|
\remark{
|
|
Unfortunately, {\Posix} file locks are associated with actual files,
|
|
not with associated open file descriptors.
|
|
Once a process locks a file, using some file descriptor \var{fd},
|
|
the next time \emph{any} file descriptor referencing that file is closed,
|
|
all associated locks are released.
|
|
This severely limits the utility of {\Posix} advisory file locks,
|
|
and we'd recommend caution when using them.
|
|
It is not without reason that the FreeBSD man pages refer to {\Posix}
|
|
file locking as ``completely stupid.''
|
|
|
|
Scsh moves Scheme ports from file descriptor to file descriptor with
|
|
\ex{dup()} and \ex{close()} as required by the runtime,
|
|
so it is impossible to keep file locks open across one of these shifts.
|
|
Hence we can only offer {\Posix} advisory file locking directly on raw
|
|
integer file descriptors;
|
|
regrettably, there are no facilities for locking Scheme ports.
|
|
|
|
Note that once a Scheme port is revealed in scsh, the runtime will not
|
|
shift the port around with \ex{dup()} and \ex{close()}.
|
|
This means the file-locking procedures can then be applied to the port's
|
|
associated file descriptor.
|
|
}
|
|
|
|
{\Posix} allows the user to lock a region of a file with either
|
|
an exclusive or shared lock.
|
|
Locked regions are described by the \emph{lock-region} record:
|
|
\begin{code}
|
|
(define-record lock-region
|
|
exclusive?
|
|
start
|
|
len
|
|
whence
|
|
proc)\end{code}%
|
|
\index{lock-region?}%
|
|
\index{lock-region:exclusive?} \index{lock-region:whence}%
|
|
\index{lock-region:start}%
|
|
\index{lock-region:len} \index{lock-region:proc}%
|
|
%
|
|
\begin{itemize}
|
|
\item
|
|
The \ex{exclusive?} field is true if the lock is exclusive;
|
|
false if it is shared.
|
|
|
|
\item
|
|
The \ex{whence} field is one of the values from the \ex{seek} call:
|
|
\ex{seek/set}, \ex{seek/delta}, or \ex{seek/end},
|
|
and determines the interpretation of the \ex{start} field:
|
|
\begin{itemize}
|
|
\item If \ex{seek/set}, the \ex{start} value is simply an absolute index
|
|
into the file.
|
|
\item If \ex{seek/delta}, the \ex{start} value is an offset from the
|
|
file descriptor's current position in the file.
|
|
\item If \ex{seek/end}, the \ex{start} value is an offset from the
|
|
end of the file.
|
|
\end{itemize}
|
|
The region of the file being locked is given by the \ex{start} and \ex{len}
|
|
fields;
|
|
if \ex{len} is zero, it means ``infinity,'' that is, the region extends
|
|
from the starting point through the end of the file, even as the file is
|
|
extended by subsequent write operations.
|
|
|
|
\item
|
|
The \ex{proc} field gives the process object for the process holding the region
|
|
lock, when relevant (see \ex{get-lock-region} below).
|
|
\end{itemize}
|
|
|
|
\begin{defundesc}{make-lock-region}{exclusive? start len [whence]}{lock-region}
|
|
This procedure makes a lock-region record.
|
|
The \ex{whence} field defaults to \ex{seek/set}.
|
|
\end{defundesc}
|
|
|
|
\defun {lock-region}{fdes lock}{\undefined}
|
|
\defunx{lock-region/no-block}{fdes lock}{\boolean}
|
|
\begin{desc}
|
|
These procedures lock a region of the file referenced by file descriptor
|
|
\var{fdes}.
|
|
The \ex{lock-region} procedure blocks until the lock is granted;
|
|
the non-blocking variant returns a boolean indicating whether or not
|
|
the lock was granted.
|
|
To take an exclusive (write) lock, you must have the file descriptor
|
|
open with write access;
|
|
to take a shared (read) lock, you must have the file descriptor
|
|
open with read access.
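
The non-blocking variant supports a ``try-lock'' idiom.
In this sketch the name \ex{try-with-file-lock} is made up, and the lock
covers the whole file (\var{start} 0, \var{len} 0).
\begin{code}
;; Hypothetical helper: run THUNK under an exclusive whole-file lock
;; if the lock is free; return #f at once if it is not.
(define (try-with-file-lock fdes thunk)
  (let ((lock (make-lock-region #t 0 0)))  ; len 0: through end of file
    (and (lock-region/no-block fdes lock)
         (let ((result (thunk)))
           (unlock-region fdes lock)
           result))))\end{code}
Note that a thunk returning {\sharpf} cannot be told apart from a failed
lock attempt, and that the lock is not released if \var{thunk} exits
non-locally.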
|
|
\end{desc}
|
|
|
|
\begin{defundesc}{get-lock-region}{fdes lock}{lock-region or \sharpf}
|
|
Return the first lock region on \var{fdes} that would conflict with
|
|
lock region \var{lock}.
|
|
If there is no such lock region, return false.
|
|
This procedure fills out the \ex{proc} field of the returned lock region,
|
|
and is the only procedure that has anything to do with this field.
|
|
(See section~\ref{sec:proc-objects} for a description of process objects.)
|
|
Note that if you apply this procedure to a file system that is shared
|
|
across multiple operating systems (\ie, an NFS file system), the \ex{proc}
|
|
field may be ambiguous.
|
|
We note, again, that {\Posix} advisory file locking is not a terribly useful
|
|
or well-designed facility.
|
|
\end{defundesc}
|
|
|
|
\begin{defundesc}{unlock-region}{fdes lock}{\undefined}
|
|
Release a lock from a file.
|
|
\end{defundesc}
|
|
|
|
\defun{with-region-lock*}{fdes lock thunk}{value(s) of thunk}
|
|
\dfnx{with-region-lock}{fdes lock body \ldots}{value(s) of body}{syntax}
|
|
\begin{desc}
|
|
This procedure obtains the requested lock, and then calls
|
|
\ex{(\var{thunk})}. When \var{thunk} returns, the lock is released.
|
|
A non-local exit (\eg, throwing to a saved continuation or raising
|
|
an exception) also causes the lock to be released.
|
|
|
|
After a normal return from \var{thunk}, its return values are returned
|
|
by \ex{with-region-lock*}.
|
|
The \ex{with-region-lock} special form is equivalent syntactic sugar.
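
As an illustration, the sketch below appends to a shared file while holding
an exclusive lock on all of it.
The name \ex{append/locked} is made up; \var{fdes} is assumed to be an
integer file descriptor opened with write access.
\begin{code}
;; Hypothetical helper: append STR to the file open on FDES,
;; holding an exclusive lock for the duration of the write.
(define (append/locked fdes str)
  (with-region-lock* fdes (make-lock-region #t 0 0)
    (lambda ()
      (seek fdes 0 seek/end)        ; position at the current end
      (write-string str fdes))))\end{code}
Seeking inside the locked section means the end-of-file position is read
only after competing writers have been excluded, provided they take the
same lock.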
|
|
\end{desc}
|
|
|
|
|
|
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
|
|
|
\section{File system}
|
|
|
|
Besides the following procedures, which allow access to the
|
|
computer's file system, scsh also provides a set of procedures
|
|
which manipulate file \emph{names}. These string-processing
|
|
procedures are documented in section \ref{sec:filenames}.
|
|
|
|
\defun {create-directory} {fname [perms override?]} {\undefined}
|
|
\defunx{create-fifo} {fname [perms override?]} {\undefined}
|
|
\defunx{create-hard-link} {oldname newname [override?]} {\undefined}
|
|
\begin{defundescx}
|
|
{create-symlink} {old-name new-name [override?]} {\undefined}
|
|
|
|
These procedures create objects of various kinds in the file system.
|
|
|
|
The \var{override?} argument controls the action if there is already an
|
|
object in the file system with the new name:
|
|
\begin{optiontable}
|
|
\sharpf & signal an error (default) \\
|
|
'query & prompt the user \\
|
|
\textnormal{\emph{other}}& \parbox[t]{0.7\linewidth}{
|
|
delete the old object (with \ex{delete-file}
|
|
or \ex{delete-directory}, as appropriate) before
|
|
creating the new object.}
|
|
|
|
\end{optiontable}
|
|
|
|
\var{Perms} defaults to \cd{#o777} (but is masked by the current umask).
|
|
|
|
\remark{Currently, if you try to create a hard or symbolic link from a
|
|
file to itself, you will error out with \var{override?} false, and simply
|
|
delete your file with \var{override?} true. Catching this will require
|
|
some sort of true-name procedure, which I currently do not have.}
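
A short sketch of the \var{override?} argument in use; both file names are
invented for the example.
\begin{code}
;; Replace any existing object named "build" with a fresh directory,
;; then make the symlink "latest" refer to it.
(create-directory "build" #o755 #t)
(create-symlink "build" "latest" #t)\end{code}
Because the old object is removed with \ex{delete-directory} when it is a
directory, the first call should be expected to fail if \ex{build} already
exists and is not empty.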
|
|
\end{defundescx}
|
|
|
|
\defun {delete-directory} {fname} \undefined
|
|
\defunx{delete-file} {fname} \undefined
|
|
\begin{defundescx} {delete-filesys-object} {fname} \undefined
|
|
These procedures delete objects from the file system.
|
|
The {\ttt delete\=filesys\=object} procedure will delete an object
|
|
of any type from the file system: files, (empty) directories, symlinks, fifos,
|
|
\etc.
|
|
|
|
If the object being deleted doesn't exist, \ex{delete-directory} and
|
|
\ex{delete-file} raise an error,
|
|
while \ex{delete-filesys-object} simply returns.
|
|
\end{defundescx}
|
|
|
|
\begin{defundescx}{read-symlink}{fname} \str
|
|
Return the filename referenced by symbolic link \var{fname}.
|
|
\end{defundescx}
|
|
|
|
\begin{defundescx} {rename-file} {old-fname new-fname [override?]} \undefined
|
|
If you override an existing object, then \var{old-fname}
|
|
and \var{new-fname} must type-match---either both directories,
|
|
or both non-directories.
|
|
This is required by the semantics of {\Unix} \ex{rename()}.
|
|
|
|
\remark{
|
|
There is an unfortunate atomicity problem with the \ex{rename-file}
|
|
procedure: if you
|
|
specify no-override, but create file \ex{new-fname} sometime between
|
|
\ex{rename-file}'s existence check and the actual rename operation,
|
|
your file will be clobbered with \ex{old-fname}. There is no way to fix
|
|
this problem, given the semantics of {\Unix} \ex{rename()};
|
|
at least it is highly unlikely to occur in practice.
|
|
}
|
|
\end{defundescx}
|
|
|
|
\defun {set-file-mode} {fname/fd/port mode} \undefined
|
|
\defunx{set-file-owner} {fname/fd/port uid} {\undefined}
|
|
\defunx{set-file-group} {fname/fd/port gid} {\undefined}
|
|
\begin{desc}
|
|
These procedures set the permission bits, owner id, and group id of a
|
|
file, respectively.
|
|
The file can be specified by giving the file name, or either an
|
|
integer file descriptor or a port open on the file.
|
|
Setting file user ownership usually requires root privileges.
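
For example, execute permission can be added to a file's existing mode.
The sketch uses the made-up name \ex{make-executable} and assumes the
{\scm} procedures \ex{bitwise-ior} and \ex{bitwise-and} are available.
\begin{code}
;; Hypothetical helper, similar in spirit to chmod a+x.
(define (make-executable fname)
  (set-file-mode fname
                 (bitwise-ior (bitwise-and (file-mode fname) #o7777)
                              #o111)))\end{code}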
|
|
\end{desc}
|
|
|
|
\defun {set-file-times} {fname [access-time mod-time]} {\undefined}
|
|
\begin{desc}
|
|
This procedure sets the access and modified times for the file
|
|
\var{fname} to the supplied values (see section~\ref{sec:time}
|
|
for the scsh representation of time).
|
|
If neither time argument is supplied, they are both taken to be
|
|
the current time. You must provide both times or neither.
|
|
If the procedure completes successfully, the file's time of last
|
|
status-change (\ex{ctime}) is set to the current time.
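
A small sketch of both calling conventions follows.
The name \ex{backdate} is made up, and the code assumes that \ex{(time)}
returns the current time in seconds.
\begin{code}
;; Hypothetical helper: set both timestamps SECONDS into the past.
(define (backdate fname seconds)
  (let ((t (- (time) seconds)))
    (set-file-times fname t t)))

;; With no time arguments the call acts like touch(1) on an
;; existing file:
;;   (set-file-times "notes.txt")\end{code}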
|
|
\end{desc}
|
|
|
|
\defun {sync-file} {fd/port} \undefined
|
|
\defunx{sync-file-system}{} \undefined
|
|
\begin{desc}
|
|
Calling \ex{sync-file}
|
|
causes {\Unix} to update the disk data structures for a given file.
|
|
If \var{fd/port} is a port, any buffered data it may have is first
|
|
flushed.
|
|
Calling \ex{sync-file-system} synchronises the kernel's entire file
|
|
system with the disk.
|
|
|
|
These procedures are not {\Posix}.
|
|
Interestingly enough, \ex{sync\=file\=system} doesn't actually
|
|
do what it is claimed to do. We just threw it in for humor value.
|
|
See the \ex{sync(2)} man page for {\Unix} enlightenment.
|
|
\end{desc}
|
|
|
|
\begin{defundesc} {truncate-file} {fname/fd/port len} \undefined
|
|
The specified file is truncated to \var{len} bytes in length.
|
|
\end{defundesc}
|
|
|
|
\begin{defundesc}{file-info} {fname/fd/port [chase?]} {file-info-record}
|
|
The \ex{file-info} procedure
|
|
returns a record structure containing everything
|
|
there is to know about a file. If the \var{chase?} flag is true
|
|
(the default), then the procedure chases symlinks and reports on
|
|
the files to which they refer. If \var{chase?} is false, then
|
|
the procedure checks the actual file itself, even if it's a symlink.
|
|
The \var{chase?} flag is ignored if the file argument is a file descriptor
|
|
or port.
|
|
|
|
The value returned is a \emph{file-info record}, defined to have the
|
|
following structure:
|
|
\begin{code}
|
|
(define-record file-info
|
|
type ; \{block-special, char-special, directory,
|
|
; fifo, regular, socket, symlink\}
|
|
device ; Device file resides on.
|
|
inode ; File's inode.
|
|
mode ; File's mode bits: permissions, setuid, setgid
|
|
nlinks ; Number of hard links to this file.
|
|
uid ; Owner of file.
|
|
gid ; File's group id.
|
|
size ; Size of file, in bytes.
|
|
atime ; Time of last access.
|
|
mtime ; Time of last mod.
|
|
ctime) ; Time of last status change.\end{code}
|
|
\index{file-info:type}\index{file-info:device}\index{file-info:inode}%
|
|
\index{file-info:mode}\index{file-info:nlinks}\index{file-info:uid}%
|
|
\index{file-info:gid}\index{file-info:size}\index{file-info:atime}%
|
|
\index{file-info:mtime}\index{file-info:ctime}%
|
|
%
|
|
The uid field of a file-info record is accessed with the procedure
|
|
\codex{(file-info:uid x)}
|
|
and similarly for the other fields.
|
|
The \ex{type} field is a symbol; all other fields are integers.
|
|
A file-info record is discriminated with the \ex{file-info?} predicate.
|
|
|
|
The following procedures all return selected information about
|
|
a file; they are built on top of \ex{file-info}, and are
|
|
called with the same arguments that are passed to it.
|
|
\begin{inset}
|
|
\newcommand{\Ex}[1]{\ex{#1}\index{#1@{\tt{#1}}}}
|
|
\begin{tabular}{ll}
|
|
Procedure & returns \\\hline
|
|
\Ex{file-type} & type \\
|
|
\Ex{file-inode} & inode \\
|
|
\Ex{file-mode} & mode \\
|
|
\Ex{file-nlinks} & nlinks \\
|
|
\Ex{file-owner} & uid \\
|
|
\Ex{file-group} & gid \\
|
|
\Ex{file-size} & size \\
|
|
\Ex{file-last-access} & atime \\
|
|
\Ex{file-last-mod} & mtime \\
|
|
\Ex{file-last-status-change} & ctime
|
|
\end{tabular}
|
|
\end{inset}
|
|
%
|
|
Example:
|
|
\begin{code}
|
|
;; All my files in /usr/tmp.
;; directory-files returns bare names, so work inside the directory.
(with-cwd "/usr/tmp"
  (filter (\l{f} (= (file-owner f) (user-uid)))
          (directory-files)))\end{code}
|
|
|
|
\remark{\ex{file-info} was named \ex{file-attributes} in releases of scsh
|
|
prior to release 0.4. We changed the name to \ex{file-info} for
|
|
consistency with the other information-retrieval procedures in
|
|
scsh: \ex{user-info}, \ex{group-info}, \ex{host-info},
|
|
\ex{network-info}, \ex{service-info}, and \ex{protocol-info}.
|
|
|
|
The \ex{file-attributes} binding is still supported in the current
|
|
release of scsh, but is deprecated, and may go away in a future
|
|
release.}
|
|
\end{defundesc}
|
|
|
|
\defun {file-directory?}{fname/fd/port [chase?]}{\boolean}
|
|
\defunx {file-fifo?}{fname/fd/port [chase?]}{\boolean}
|
|
\defunx {file-regular?}{fname/fd/port [chase?]}{\boolean}
|
|
\defunx {file-socket?}{fname/fd/port [chase?]}{\boolean}
|
|
\defunx {file-special?}{fname/fd/port [chase?]}{\boolean}
|
|
\defunx {file-symlink?}{fname/fd/port}{\boolean}
|
|
\begin{desc}
|
|
These procedures are file-type predicates that test the
|
|
type of a given file.
|
|
They are applied to the same arguments to which \ex{file-info} is applied;
|
|
the sole exception is \ex{file-symlink?}, which does not take
|
|
the optional \var{chase?} second argument.
|
|
\begin{inset}
|
|
\newcommand{\Ex}[1]{\ex{#1}\index{\tt{#1}}}
|
|
\begin{tabular}{l@{\qquad}l}
|
|
\end{tabular}
|
|
\end{inset}
|
|
For example,
|
|
\codex{(file-directory? "/usr/dalbertz")\qquad\evalto\qquad\sharpt}
|
|
\end{desc}
|
|
|
|
\defun {file-not-readable?} {fname} \boolean
|
|
\defunx{file-not-writable?} {fname} \boolean
|
|
\defunx{file-not-executable?} {fname} \boolean
|
|
\begin{desc}
|
|
Returns:
|
|
\begin{optiontable}
|
|
\textnormal{Value} & meaning \\ \hline
|
|
\sharpf & Access permitted \\
|
|
'search-denied & {\renewcommand{\arraystretch}{1}%
|
|
\begin{tabular}[t]{@{}l@{}}
|
|
Can't stat---a protected directory \\
|
|
is blocking access.\end{tabular}} \\
|
|
'permission & Permission denied. \\
|
|
'no-directory & Some directory doesn't exist. \\
|
|
'nonexistent & File doesn't exist.
|
|
\end{optiontable}
|
|
%
|
|
A file is considered writeable if either (1) it exists and is writeable
|
|
or (2) it doesn't exist and the directory is writeable.
|
|
Since symlink permission bits are ignored by the filesystem, these
|
|
calls do not take a \var{chase?} flag.
|
|
|
|
Note that these procedures use the process' \emph{effective} user
|
|
and group ids for permission checking. {\Posix} defines an \ex{access()}
|
|
function that uses the process' real uid and gids. This is handy
|
|
for setuid programs that would like to find out if the actual user
|
|
has specific rights; scsh ought to provide this functionality (but doesn't
|
|
at the current time).
|
|
|
|
There are several problems with these procedures. First, there's an
|
|
atomicity issue. In between checking permissions for a file and then trying
|
|
an operation on the file, another process could change the permissions,
|
|
so a return value from these functions guarantees nothing. Second,
|
|
the code special-cases permission checking when the uid is root---if
|
|
the file exists, root is assumed to have the requested permission.
|
|
However, not even root can write a file that is on a read-only file system,
|
|
such as a CD ROM. In this case, \ex{file-not-writable?} will lie, saying
|
|
that root has write access, when in fact opening the file for write
|
|
access will fail.
|
|
Finally, write permission confounds write access and create access.
|
|
These should be disentangled.
|
|
|
|
Some of these problems could be avoided if {\Posix} had an effective-uid
|
|
variant of the \ex{access()} call we could use, but the atomicity
|
|
issue is still a problem. In the final analysis, the only way to
|
|
find out if you have the right to perform an operation on a file
|
|
is to try and open it for the desired operation. These permission-checking
|
|
functions are mostly intended for script-writing, where loose guarantees
|
|
are tolerated.
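
Because the result is either {\sharpf} or a symbol naming the reason, it
can be dispatched on directly.
The following sketch, with the made-up name \ex{why-unreadable}, turns the
result into a message for a script to print.
\begin{code}
(define (why-unreadable fname)
  (case (file-not-readable? fname)
    ((#f)            "readable")
    ((search-denied) "a protected directory blocks the lookup")
    ((permission)    "permission denied")
    ((no-directory)  "some directory on the path does not exist")
    ((nonexistent)   "no such file")
    (else            "unexpected result")))\end{code}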
|
|
\end{desc}
|
|
|
|
\defun {file-readable?} {fname} \boolean
|
|
\defunx {file-writable?} {fname} \boolean
|
|
\defunx {file-executable?} {fname} \boolean
|
|
\begin{desc}
|
|
These procedures are the logical negation of the
|
|
preceding \ex{file-not-\ldots?} procedures.
|
|
Refer to them for a discussion of their problems and limitations.
|
|
\end{desc}
|
|
|
|
\begin{defundesc}{file-not-exists?} {fname [chase?]} \object
|
|
Returns:
|
|
\begin{optiontable}
|
|
\sharpf & Exists. \\
|
|
\sharpt & Doesn't exist. \\
|
|
'search-denied & \parbox[t]{0.5\linewidth}{\sloppy\raggedright
|
|
Some protected directory
|
|
is blocking the search.}
|
|
\end{optiontable}
|
|
\end{defundesc}
|
|
|
|
\begin{defundesc}{file-exists?} {fname [chase?]} \boolean
|
|
This is simply
|
|
\ex{(not (file-not-exists? \var{fname} \var{[chase?]}))}
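
One consequence of this definition is that \ex{file-exists?} returns
{\sharpf} both when the file is missing and when the search was denied.
Code that needs to tell the cases apart can test the three-valued result
itself, as in this sketch (the name \ex{existence} and the result symbols
are made up):
\begin{code}
(define (existence fname)
  (let ((r (file-not-exists? fname)))
    (cond ((eq? r #f) 'exists)
          ((eq? r #t) 'missing)
          (else       'unknown))))   ; r is the symbol search-denied\end{code}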
|
|
\end{defundesc}
|
|
|
|
\defun {directory-files} {[dir dotfiles?]} {string list}
|
|
\begin{desc}
|
|
Return the list of files in directory \var{dir},
|
|
which defaults to the current working directory.
|
|
The \var{dotfiles?} flag (default {\sharpf}) causes dot files to be
|
|
included in the list.
|
|
Regardless of the value of \var{dotfiles?}, the two files \ex{.} and
|
|
\ex{..} are \emph{never} returned.
|
|
|
|
The directory \var{dir} is not prepended to each file name in the
|
|
result list. That is,
|
|
\codex{(directory-files "/etc")}
|
|
returns
|
|
\codex{("chown" "exports" "fstab" \ldots)}
|
|
\emph{not}
|
|
\codex{("/etc/chown" "/etc/exports" "/etc/fstab" \ldots)}
|
|
To use the files in the returned list, the programmer can either manually
|
|
prepend the directory:
|
|
\codex{(map (\l{f} (string-append dir "/" f)) files)}
|
|
or cd to the directory before using the file names:
|
|
%
|
|
\begin{code}
|
|
(with-cwd dir
|
|
(for-each delete-file (directory-files)))\end{code}
|
|
%
|
|
or use the \ex{glob} procedure, defined below.
|
|
|
|
A directory list can be generated by \ex{(run/strings (ls))}, but this
|
|
is unreliable, as filenames with whitespace in their names will be
|
|
split into separate entries. Using \ex{directory-files} is reliable.
|
|
\end{desc}
|
|
|
|
\defun {glob} {\vari{pat}1 \ldots} {string list}
|
|
\begin{desc}
|
|
Glob each pattern against the filesystem and return the sorted list.
|
|
Duplicates are not removed. Patterns matching nothing are not included
|
|
literally.\footnote{Why bother to mention such a silly possibility?
|
|
Because that is what sh does.}
|
|
C shell \verb|{a,b,c}| patterns are expanded. Backslash quotes
|
|
characters, turning off the special meaning of
|
|
\verb|{|, \verb|}|, \cd{*}, \verb|[|, \verb|]|, and \verb|?|.
|
|
|
|
Note that the rules of backslash for {\Scheme} strings and glob patterns
|
|
work together to require four backslashes in a row to specify a
|
|
single literal backslash. Fortunately, it is very rare that a backslash
|
|
occurs in a Unix file name.
|
|
|
|
A glob subpattern will not match against dot files unless the first
|
|
character of the subpattern is a literal ``\ex{.}''.
|
|
Further, a dot subpattern will not match the files \ex{.} or \ex{..}
|
|
unless it is a constant pattern, as in \ex{(glob "../*/*.c")}.
|
|
So a directory's dot files can be reliably generated
|
|
with the simple glob pattern \ex{".*"}.
|
|
|
|
Some examples:
|
|
\begin{inset}
|
|
\begin{verbatim}
|
|
(glob "*.c" "*.h")
|
|
;; All the C and #include files in my directory.
|
|
|
|
(glob "*.c" "*/*.c")
|
|
;; All the C files in this directory and
|
|
;; its immediate subdirectories.
|
|
|
|
(glob "lexer/*.c" "parser/*.c")
|
|
(glob "{lexer,parser}/*.c")
|
|
;; All the C files in the lexer and parser dirs.
|
|
|
|
(glob "\\{lexer,parser\\}/*.c")
|
|
;; All the C files in the strange
|
|
;; directory "{lexer,parser}".
|
|
|
|
(glob "*\\*")
|
|
;; All the files ending in "*", e.g.
|
|
;; ("foo*" "bar*")
|
|
|
|
(glob "*lexer*")
|
|
("mylexer.c" "lexer1.notes")
|
|
;; All files containing the string "lexer".
|
|
|
|
(glob "lexer")
|
|
;; Either ("lexer") or ().\end{verbatim}
|
|
\end{inset}
|
|
%
|
|
If the first character of the pattern (after expanding braces) is a slash,
|
|
the search begins at root; otherwise, the search begins in the current
|
|
working directory.
|
|
|
|
If the last character of the pattern (after expanding braces) is a slash,
|
|
then the result matches must be directories, \eg,
|
|
\begin{code}
|
|
(glob "/usr/man/man?/") \evalto
|
|
("/usr/man/man1/" "/usr/man/man2/" \ldots)\end{code}
|
|
|
|
Globbing can sometimes be useful when we need a list of a directory's files
|
|
where each element in the list includes the pathname for the file.
|
|
Compare:
|
|
\begin{code}
|
|
(directory-files "../include") \evalto
|
|
("cig.h" "decls.h" \ldots)
|
|
|
|
(glob "../include/*") \evalto
|
|
("../include/cig.h" "../include/decls.h" \ldots)\end{code}
|
|
\end{desc}
|
|
|
|
\defun{glob-quote}{str}\str
|
|
\begin{desc}
|
|
Returns a constant glob pattern that exactly matches \var{str}.
|
|
All wild-card characters in \var{str} are quoted with a backslash.
|
|
\begin{code}
|
|
(glob-quote "Any *.c files?")
|
|
{\evalto}"Any \\*.c files\\?"\end{code}
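
A sketch of a typical use, assuming a hypothetical variable \ex{dir} that
holds a directory name which may itself contain wild-card characters.
Quoting the directory part keeps \ex{glob} from interpreting it, while the
rest of the pattern stays live:
\begin{code}
;; dir is a hypothetical variable holding a directory name.
(glob (string-append (glob-quote dir) "/*.c"))\end{code}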
|
|
\end{desc}
|
|
|
|
|
|
\begin{defundesc}{file-match}{root dot-files? \vari{pat}1 \vari{pat}2 {\ldots} \vari{pat}n}{string list}
|
|
\note{This procedure is deprecated, and will probably either go away or
|
|
be substantially altered in a future release. New code should not
|
|
call this procedure. The problem is that it relies upon
|
|
Posix-notation regular expressions; the rest of scsh has been
|
|
converted over to the new SRE notation.}
|
|
|
|
\ex{file-match} provides a more powerful file-matching service, at the
|
|
expense of a less convenient notation. It is intermediate in
|
|
power between most shell matching machinery and recursive \ex{find(1)}.
|
|
|
|
Each pattern is a regexp. The procedure searches from \var{root},
|
|
matching the first-level files against pattern \vari{pat}1, the
|
|
second-level files against \vari{pat}2, and so forth.
|
|
The list of files matching the whole path pattern is returned,
|
|
in sorted order.
|
|
The matcher uses Spencer's regular expression package.
|
|
|
|
The files \ex{.} and \ex{..} are never matched. Other dot files are only
|
|
matched if the \var{dot-files?} argument is \sharpt.
|
|
|
|
A given \vari{pat}i pattern is matched as a regexp, so it is not forced
|
|
to match the entire file name. \Eg, pattern \ex{"t"} matches any
|
|
file containing a ``t'' in its name, while pattern \verb|"^t$"| matches
|
|
only a file whose entire name is ``\ex{t}''.
|
|
|
|
The \vari{pat}i patterns can be more general than stated above.
|
|
\begin{itemize}
|
|
\item A single pattern can specify multiple levels of the path by
|
|
embedding \ex{/} characters within the pattern. For example,
|
|
the pattern \ex{"a/b/c"} gives a match equivalent to the
|
|
list of patterns \ex{"a" "b" "c"}.
|
|
|
|
\item A \vari{pat}i pattern can be a procedure,
|
|
which is used as a match predicate.
|
|
It will be repeatedly called with a candidate file-name to test.
|
|
The file-name will be the entire path accumulated.
|
|
If the procedure raises an error condition, \ex{file-match} will
|
|
catch the error and treat it as a failed match.
|
|
This keeps \ex{file-match} from being blown out of the water
|
|
by applying tests to dangling symlinks and other similar situations.
|
|
|
|
\end{itemize}
|
|
|
|
Some examples:
|
|
%% UGH. Because we are using code instead of verbatim, we have to
|
|
%% double up on backslashes.
|
|
\begin{tightleftinset}
|
|
\begin{code}
|
|
(file-match "/usr/lib" #f "m$" "^tab") \evalto
|
|
("/usr/lib/term/tab300" "/usr/lib/term/tab300-12" \ldots)
|
|
\cb
|
|
(file-match "." #f "^lex|parse|codegen$" "\\\\.c$") \evalto
("lex/lex.c" "lex/lexinit.c" "lex/test.c"
 "parse/actions.c" "parse/error.c" "parse/test.c"
|
|
"codegen/io.c" "codegen/walk.c")
|
|
\cb
|
|
(file-match "." #f "^lex|parse|codegen$/\\\\.c$")
|
|
;; The same.
|
|
\cb
|
|
(file-match "." #f file-directory?)
|
|
;; Return all subdirs of the current directory.
|
|
\cb
|
|
(file-match "/" #f file-directory?) \evalto
|
|
("/bin" "/dev" "/etc" "/tmp" "/usr")
|
|
;; All subdirs of root.
|
|
\cb
|
|
(file-match "." #f "\\\\.c")
|
|
;; All the C files in my directory.
|
|
\cb
|
|
(define (ext extension)
|
|
(\l{fn} (string-suffix? fn extension)))
|
|
\cb
|
|
(define (true . x) #t)
|
|
\cb
|
|
(file-match "." #f "./\\\\.c")
|
|
(file-match "." #f "" "\\\\.c")
|
|
(file-match "." #f true "\\\\.c")
|
|
(file-match "." #f true (ext "c"))
|
|
;; All the C files of all my immediate subdirs.
|
|
\cb
|
|
(file-match "." #f "lexer") \evalto
|
|
("mylexer.c" "lexer.notes")
|
|
;; Compare with (glob "lexer"), above.\end{code}
|
|
\end{tightleftinset}
|
|
|
|
Note that when \var{root} is the current working directory (\ex{"."}),
|
|
when it is converted to directory form, it becomes \ex{""}, and doesn't
|
|
show up in the result file-names.
|
|
|
|
It is regrettable that the regexp wild card char, ``\ex{.}'',
|
|
is such an important file name literal, as dot-file prefix and extension
|
|
delimiter.
|
|
\end{defundesc}
|
|
|
|
\begin{defundesc} {create-temp-file} {[prefix]} \str
|
|
\ex{Create-temp-file} creates a new temporary file and returns its name.
|
|
The optional argument specifies the filename prefix to use, and defaults
|
|
to \ex{"/usr/tmp/\var{pid}"}, where \var{pid} is the current process' id.
|
|
The procedure generates a sequence of filenames that have \var{prefix} as
|
|
a common prefix, looking for a filename that doesn't already exist in the
|
|
file system. When it finds one, it creates it, with permission \cd{#o600}
|
|
and returns the filename. (The file's permissions can be loosened
with \ex{set-file-mode} after the file has been created.)
|
|
|
|
This file is guaranteed to be brand new. No other process will have it
|
|
open. This procedure does not simply return a filename that is very
|
|
likely to be unused. It returns a filename that definitely did not exist
|
|
at the moment \ex{create-temp-file} created it.
|
|
|
|
It is not necessary for the process' pid to be a part of the filename
|
|
for the uniqueness guarantees to hold. The pid component of the default
|
|
prefix simply serves to scatter the name searches into sparse regions, so
|
|
that collisions are less likely to occur. This speeds things up, but does
|
|
not affect correctness.
|
|
|
|
Security note: doing i/o to files created this way in \ex{/usr/tmp/} is
|
|
not necessarily secure. General users have write access to \ex{/usr/tmp/},
|
|
so even if an attacker cannot access the new temp file, he can delete it
|
|
and replace it with one of his own. A subsequent open of this filename
|
|
will then give you his file, to which he has access rights. There are
|
|
several ways to defeat this attack:
|
|
\begin{enumerate}
|
|
\item Use \ex{temp-file-iterate}, below, to return the file descriptor
|
|
allocated when the file is opened. This will work if the file
|
|
only needs to be opened once.
|
|
\item If the file needs to be opened twice or more, create it in a
|
|
protected directory, \eg, \verb|$HOME|.
|
|
\item Ensure that \ex{/usr/tmp} has its sticky bit set. This
|
|
requires system administrator privileges.
|
|
\end{enumerate}
|
|
The actual default prefix used is controlled by the dynamic variable
|
|
\ex{*temp-file-template*}, and can be overridden for increased security.
|
|
See \ex{temp-file-iterate}.
|
|
\end{defundesc}
|
|
|
|
\defunx {temp-file-iterate} {maker [template]} {\object\+}
|
|
\defvarx {*temp-file-template*} \str
|
|
\begin{desc}
|
|
This procedure can be used to perform certain atomic transactions on
|
|
the file system involving filenames. Some examples:
|
|
\begin{itemize}
|
|
\item Linking a file to a fresh backup temp name.
|
|
\item Creating and opening an unused, secure temp file.
|
|
\item Creating an unused temporary directory.
|
|
\end{itemize}
|
|
|
|
This procedure uses \var{template} to generate a series of trial file
|
|
names.
|
|
\var{Template} is a \ex{format} control string, and defaults to
|
|
\codex{"/usr/tmp/\var{pid}.\~a"}
|
|
where \var{pid} is the current process' process id.
|
|
File names are generated by calling \ex{format} to instantiate the
|
|
template's \verb|~a| field with a varying string.
|
|
|
|
\var{Maker} is a procedure which is serially called on each file name
|
|
generated. It must return at least one value; it may return multiple
|
|
values. If the first return value is {\sharpf} or if \var{maker} raises the
|
|
\ex{errno/exist} errno exception, \ex{temp-file-iterate} will loop,
|
|
generating a new file name and calling \var{maker} again. If the first
|
|
return value is true, the loop is terminated, returning whatever value(s)
|
|
\var{maker} returned.
|
|
|
|
After a number of unsuccessful trials, \ex{temp-file-iterate} may give up
|
|
and signal an error.
|
|
|
|
Thus, if we ignore its optional \var{prefix} argument,
|
|
\ex{create-temp-file} could be defined as:
|
|
\begin{code}
|
|
(define (create-temp-file)
|
|
(let ((flags (bitwise-ior open/create open/exclusive)))
|
|
(temp-file-iterate
|
|
(\l{f}
|
|
(close (open-output-file f flags #o600))
|
|
f))))\end{code}
|
|
|
|
To rename a file to a temporary name:
|
|
\begin{code}
|
|
(temp-file-iterate (\l{backup}
|
|
(create-hard-link old-file backup)
|
|
backup)
|
|
".#temp.\~a") ; Keep link in cwd.
|
|
(delete-file old-file)\end{code}
|
|
Recall that scsh reports syscall failure by raising an error
|
|
exception, not by returning an error code. This is critical
to this example---the programmer can assume that if the
\ex{temp-file-iterate} call returns, it returns successfully.
|
|
So the following \ex{delete-file} call can be reliably invoked,
|
|
safe in the knowledge that the backup link has definitely been established.
|
|
|
|
To create a unique temporary directory:
|
|
\begin{code}
|
|
(temp-file-iterate (\l{dir} (create-directory dir) dir)
|
|
"/usr/tmp/tempdir.\~a")\end{code}
|
|
%
|
|
Similar operations can be used to generate unique symlinks and fifos,
|
|
or to return values other than the new filename (\eg, an open file
|
|
descriptor or port).
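
For instance, the following sketch returns an open output port together
with the file name, so that the file never has to be reopened by name.
It reuses the flags from the \ex{create-temp-file} definition above and
assumes the \ex{receive} form for binding multiple values; the variable
names and the data written are illustrative only.
\begin{code}
(receive (port fname)
    (temp-file-iterate
      (\l{f}
        (let ((flags (bitwise-ior open/create open/exclusive)))
          ;; The port comes first: it is a true value, so the loop stops.
          (values (open-output-file f flags #o600) f))))
  (write-string "scratch data" port)
  (close port)
  fname)\end{code}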
|
|
|
|
The default template is in fact taken from the value of the dynamic
|
|
variable \ex{*temp-file-template*}, which itself defaults to
|
|
\ex{"/usr/tmp/\var{pid}.\~a"}, where \var{pid} is the scsh process'
|
|
pid.
|
|
For increased security, a user may wish to change the template
|
|
to use a directory not allowing world write access
|
|
(\eg, his home directory).
|
|
\end{desc}
|
|
|
|
\defun{temp-file-channel}{} {[inp outp]}
|
|
\begin{desc}
|
|
This procedure can be used to provide an interprocess communications
|
|
channel with arbitrary-sized buffering. It returns two values, an input
|
|
port and an output port, both open on a new temp file. The temp file
|
|
itself is deleted from the {\Unix} file tree before \ex{temp-file-channel}
|
|
returns, so the file is essentially unnamed, and its disk storage is
|
|
reclaimed as soon as the two ports are closed.
|
|
|
|
\ex{Temp-file-channel} is analogous to \ex{port-pipe} with two exceptions:
|
|
\begin{itemize}
|
|
\item If the writer process gets ahead of the reader process, it will
|
|
not hang waiting for some small pipe buffer to drain. It will simply
|
|
buffer the data on disk. This is good.
|
|
|
|
\item If the reader process gets ahead of the writer process, it will
|
|
also not hang waiting for data from the writer process. It will
|
|
simply see and report an end of file. This is bad.
|
|
|
|
In order to ensure that an end-of-file returned to the reader is
|
|
legitimate, the reader and writer must serialise their i/o. The
|
|
simplest way to do this is for the reader to delay doing input
|
|
until the writer has completely finished doing output, or exited.
|
|
\end{itemize}
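
A minimal sketch of the serialised usage, in which the parent reads only
after the child writer has exited. The message text is arbitrary, and the
explicit \ex{force-output} call is a precaution rather than a documented
requirement:
\begin{code}
(receive (inp outp) (temp-file-channel)
  (wait (fork (\l{}
                (write-string "Hello, world." outp)
                (newline outp)
                (force-output outp))))  ; Push the data out to the file.
  ;; The writer has exited, so an end-of-file seen here is legitimate.
  (read-line inp))\end{code}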
|
|
\end{desc}
|
|
|
|
\section{Processes}
|
|
|
|
\defun {exec} {prog arg1 \ldots argn} \noreturn
|
|
\defunx {exec-path} {prog arg1 \ldots argn} \noreturn
|
|
\defunx {exec/env} {prog env arg1 \ldots argn} \noreturn
|
|
\defunx {exec-path/env} {prog env arg1 \ldots argn} \noreturn
|
|
\begin{desc}
|
|
|
|
The \ex{\ldots/env} variants take an environment specified as a
|
|
string$\rightarrow$string alist.
|
|
An environment of {\sharpt} is taken to mean the current process' environment
|
|
(\ie, the value of the external char \ex{**environ}).
|
|
|
|
[Rationale: {\sharpf} is a more convenient marker for the current environment
|
|
than {\sharpt}, but would cause an ambiguity on Schemes that identify
|
|
{\sharpf} and \ex{()}.]
|
|
|
|
The path-searching variants search the directories in the list
|
|
{\ttt exec\=path\=list} for the program.
|
|
A path-search is not performed if the program name contains
|
|
a slash character---it is used directly. So a program with a name like
|
|
\ex{"bin/prog"} always executes the program \ex{bin/prog} in the current working
|
|
directory. See \verb|$path| and \verb|exec-path-list|, below.
|
|
|
|
Note that there is no analog to the C function \ex{execv()}.
|
|
To get the effect just do
|
|
\codex{(apply exec prog arglist)}
|
|
|
|
All of these procedures flush buffered output and close unrevealed ports
|
|
before executing the new binary.
|
|
To avoid flushing buffered output, see \verb|%exec| below.
|
|
|
|
Note that the C \ex{execve()} procedure allows the zeroth element of the
argument vector to be different from the file being executed, \eg
%
\begin{inset}
\begin{verbatim}
char *argv[] = {"-", "-f", 0};
execve("/bin/csh", argv, envp);\end{verbatim}
|
|
\end{inset}
|
|
%
|
|
The scsh \ex{exec}, \ex{exec-path}, \ex{exec/env}, and \ex{exec-path/env}
|
|
procedures do not give this functionality---element 0 of the arg vector is
|
|
always identical to the \ex{prog} argument. In the rare case the user wishes
|
|
to differentiate these two items, he can use the low-level \verb|%exec| and
|
|
\verb|exec-path-search| procedures.
|
|
These procedures never return under any circumstances.
|
|
As with any other system call, if there is an error, they raise
|
|
an exception.
|
|
\end{desc}
|
|
|
|
|
|
\defun {\%exec} {prog arglist env} \undefined
|
|
\defunx{exec-path-search} {fname pathlist} {{\str} or \sharpf}
|
|
\begin{desc}
|
|
The \ex{\%exec} procedure is the low-level interface to the system call.
|
|
The \var{arglist} parameter is a list of arguments;
|
|
\var{env} is either a string$\rightarrow$string alist or {\sharpt}.
|
|
The new program's \cd{argv[0]} will be taken from \ex{(car \var{arglist})},
|
|
\emph{not} from \var{prog}.
|
|
An environment of {\sharpt} means the current process' environment.
|
|
\verb|%exec| does not flush buffered output
|
|
(see \ex{flush-all-ports}).
|
|
|
|
All exec procedures, including \verb|%exec|, coerce the \cd{prog} and \cd{arg}
|
|
values to strings using the usual conversion rules: numbers are converted to
|
|
decimal numerals, and symbols converted to their print-names.
|
|
|
|
\ex{exec-path-search} searches the directories of \var{pathlist} looking for
|
|
an occurrence of file \ex{fname}. If no executable file is found, it returns
|
|
{\sharpf}. If \ex{fname} contains a slash character, the path search is
|
|
short-circuited, but the procedure still checks to ensure that the file exists
|
|
and is executable---if not, it still returns {\sharpf}.
|
|
Users of this procedure should be aware that it invites a potential race
|
|
condition: between checking the file with \ex{exec-path-search} and executing
|
|
it with \ex{\%exec}, the file's status might change.
|
|
The only atomic way to do the search is to loop over the candidate
|
|
file names, exec'ing each one and looping when the exec operation fails.
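
As a sketch of how the two procedures combine, the following starts
\ex{csh} with a zeroth argument of \ex{"-"}, as in the C fragment shown
for \ex{exec} above. The path list is an arbitrary example, and the code is
subject to the race just described.
\begin{code}
(cond ((exec-path-search "csh" '("/bin" "/usr/bin"))
       => (\l{binary}
            ;; argv[0] is "-", not the file name; #t = current environment.
            (\%exec binary '("-" "-f") #t)))
      (else (error "csh not found")))\end{code}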
|
|
|
|
See \cd{$path} and \ex{exec-path-list}, below.
|
|
\end{desc}
|
|
|
|
\defun {exit} {[status]} \noreturn
|
|
\defunx {\%exit} {[status]} \noreturn
|
|
\begin{desc}
|
|
These procedures terminate the current process with a given exit status.
|
|
The default exit status is 0.
|
|
The low-level \verb|%exit| procedure immediately terminates the process
|
|
without flushing buffered output.
|
|
\end{desc}
|
|
|
|
\begin{defundesc} {call-terminally} {thunk} \noreturn
|
|
\ex{call-terminally} calls its thunk. When the thunk returns, the process
|
|
exits. Although \ex{call-terminally} could be implemented as
|
|
\codex{(\l{thunk} (thunk) (exit 0))}
|
|
an implementation can take advantage of the fact that this procedure never
|
|
returns. For example, the runtime can start with a fresh stack and also
|
|
start with a fresh dynamic environment, where shadowed bindings are
|
|
discarded. This can allow the old stack and dynamic environment to be
|
|
collected (assuming this data is not reachable through some live
|
|
continuation).
|
|
\end{defundesc}
|
|
|
|
\begin{defundesc}{suspend}{} \undefined
|
|
Suspend the current process with a SIGSTOP signal.
|
|
\end{defundesc}
|
|
|
|
\defun {fork} {[thunk]} {proc or \sharpf}
|
|
\defunx {\%fork} {[thunk]} {proc or \sharpf}
|
|
\begin{desc}
|
|
\ex{fork} with no arguments is like C \ex{fork()}.
|
|
In the parent process, it returns the child's \emph{process object}
|
|
(see below for more information on process objects).
|
|
In the child process, it returns {\sharpf}.
|
|
|
|
\ex{fork} with an argument only returns in the parent process, returning
|
|
the child's process object.
|
|
The child process calls \var{thunk} and then exits.
|
|
|
|
\ex{fork} flushes buffered output before forking, and sets the child
|
|
process to non-interactive. \verb|%fork| does not perform this bookkeeping;
|
|
it simply forks.
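
A sketch of both calling conventions; the program run is an arbitrary
example:
\begin{code}
;; No argument: both processes return from fork.
(cond ((fork) => wait)                ; Parent: wait for the child.
      (else (exec-path "ls" "-l")))   ; Child: fork returned #f.

;; Thunk argument: only the parent returns.
(wait (fork (\l{} (exec-path "ls" "-l"))))\end{code}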
|
|
\end{desc}
|
|
|
|
\defun {fork/pipe} {[thunk]} {proc or \sharpf}
|
|
\defunx{\%fork/pipe} {[thunk]} {proc or \sharpf}
|
|
\begin{desc}
|
|
Like \ex{fork} and \ex{\%fork}, but the parent and child communicate via a
|
|
pipe connecting the parent's stdin to the child's stdout. These procedures
|
|
side-effect the parent by changing his stdin.
|
|
|
|
In effect, \ex{fork/pipe} splices a process into the data stream
|
|
immediately upstream of the current process.
|
|
This is the basic function for creating pipelines.
|
|
Long pipelines are built by performing a sequence of \ex{fork/pipe} calls.
|
|
For example, to create a background two-process pipe \ex{a | b}, we write:
|
|
%
|
|
\begin{code}
|
|
(fork (\l{} (fork/pipe a) (b)))\end{code}
|
|
%
|
|
which returns the process object for \ex{b}'s process.
|
|
|
|
To create a background three-process pipe \ex{a | b | c}, we write:
|
|
%
|
|
\begin{code}
|
|
(fork (\l{} (fork/pipe a)
|
|
(fork/pipe b)
|
|
(c)))\end{code}
|
|
%
|
|
which returns the process object for \ex{c}'s process.
|
|
|
|
Note that these procedures affect file descriptors, not ports.
|
|
That is, the pipe is allocated connecting the child's file descriptor
|
|
1 to the parent's file descriptor 0.
|
|
\emph{Any previous Scheme port built over these affected file descriptors
|
|
is shifted to a new, unused file descriptor with \ex{dup} before
|
|
allocating the I/O pipe.}
|
|
This means, for example, that the ports bound to \ex{(current-input-port)}
|
|
and \ex{(current-output-port)} in either process are not affected---they
|
|
still refer to the same I/O sources and sinks as before.
|
|
Remember the simple scsh rule: Scheme ports are bound to I/O sources
|
|
and sinks, \emph{not} particular file descriptors.
|
|
|
|
If the child process wishes to rebind the current output port
|
|
to the pipe on file descriptor 1, it can do this using
|
|
\ex{with-current-output-port} or a related form.
|
|
Similarly, if the parent wishes to change the current input port
|
|
to the pipe on file descriptor 0, it can do this using
|
|
\ex{set-current-input-port!} or a related form.
|
|
Here is an example showing how to set up the I/O ports on both sides
|
|
of the pipe:
|
|
\begin{code}
|
|
(fork/pipe (\l{}
|
|
(with-current-output-port (fdes->outport 1)
|
|
(display "Hello, world.\\n"))))
|
|
|
|
(set-current-input-port! (fdes->inport 0))
|
|
(read-line) ; Read the string output by the child.\end{code}
|
|
None of this is necessary when the I/O is performed by an exec'd
|
|
program in the child or parent process, only when the pipe will
|
|
be referenced by Scheme code through one of the default current I/O
|
|
ports.
|
|
\end{desc}
|
|
|
|
\defun {fork/pipe+} {conns [thunk]} {proc or \sharpf}
|
|
\defunx {\%fork/pipe+} {conns [thunk]} {proc or \sharpf}
|
|
\begin{desc}
|
|
Like \ex{fork/pipe}, but the pipe connections between the child and parent
|
|
are specified by the connection list \var{conns}.
|
|
See the
|
|
\codex{(|+ \var{conns} \vari{pf}{\!1} \ldots{} \vari{pf}{\!n})}
|
|
process form for a description of connection lists.
|
|
\end{desc}
|
|
|
|
\subsection{Process objects and process reaping}
|
|
\label{sec:proc-objects}
|
|
Scsh uses \emph{process objects} to represent Unix processes.
|
|
They are created by the \ex{fork} procedure, and have the following
|
|
exposed structure:
|
|
\begin{code}
|
|
(define-record proc
|
|
pid)\end{code}
|
|
\index{proc}\index{proc?}\index{proc:pid}
|
|
The only exposed slot in a proc record is the process' pid,
|
|
the integer id assigned by Unix to the process.
|
|
The only exported primitive procedures for manipulating process objects
|
|
are \ex{proc?} and \ex{proc:pid}.
|
|
Process objects are created with the \ex{fork} procedure.
|
|
|
|
\begin{defundesc}{pid->proc}{pid [probe?]}{proc}
|
|
This procedure maps integer Unix process ids to scsh process objects.
|
|
It is intended for use in interactive and debugging code,
|
|
and is deprecated for use in production code.
|
|
If there is no process object in the system indexed by the given pid,
|
|
\ex{pid->proc}'s action is determined by the \var{probe?} parameter
|
|
(default \sharpf):
|
|
\begin{center}
|
|
\begin{tabular}{|l|l|}
|
|
\hline
|
|
\var{probe?} & Return \\ \hline\hline
|
|
\sharpf & \emph{signal error condition.} \\ \hline
|
|
\ex{'create} & Create new proc object. \\ \hline
|
|
True value & \sharpf \\ \hline
|
|
\end{tabular}
|
|
\end{center}
|
|
\end{defundesc}
|
|
|
|
Sometime after a child process terminates, scsh will perform a \ex{wait}
|
|
system call on the child in background, caching the process' exit status
|
|
in the child's proc object.
|
|
This is called ``reaping'' the process.
|
|
Once the child has been waited, the Unix kernel can free the storage allocated
|
|
for the dead process' exit information, so process reaping prevents the process
|
|
table from becoming cluttered with un-waited dead child processes
|
|
(a.k.a. ``zombies'').
|
|
This can be especially severe if the scsh process never waits on child
|
|
processes at all; if the process table overflows with forgotten zombies,
|
|
the OS may be unable to fork further processes.
|
|
|
|
Reaping a child process moves its exit status information from the kernel
|
|
into the scsh process, where it is cached inside the child's process object.
|
|
If the scsh user drops all pointers to the process object, it will simply be
|
|
garbage collected.
|
|
On the other hand, if the scsh program retains a pointer to the process object,
|
|
it can use scsh's \ex{wait} system call to synchronise with the child and
|
|
retrieve its exit status multiple times (this is not possible with simple
|
|
Unix integer pids in C---the programmer can only wait on a pid once).
|
|
|
|
Thus, process objects allow the scsh programmer to do two things not allowed
|
|
in other programming environments:
|
|
\begin{itemize}
|
|
\item Subprocesses that are never waited on are still removed from the
|
|
process table, and their associated exit status data is eventually
|
|
automatically garbage collected.
|
|
\item Subprocesses can be waited on multiple times.
|
|
\end{itemize}
|
|
|
|
However, note that once a child has exited, if the scsh programmer
|
|
drops all pointers to the child's proc object, the child's exit status
|
|
will be reaped and thrown away.
|
|
This is the intended behaviour, and it means that integer pids are not
|
|
enough to cause a process's exit status to be retained by the scsh runtime.
|
|
(This is because it is clearly impossible to GC data referenced by integers.)
|
|
|
|
As a convenience for interactive use and debugging, all procedures that
|
|
take process objects will also accept integer Unix pids as arguments,
|
|
coercing them to the corresponding process objects.
|
|
Since integer process ids are not reliable ways to keep a child's exit
|
|
status from being reaped and garbage collected, programmers are encouraged
|
|
to use process objects in production code.
|
|
|
|
\begin{defundesc}{autoreap-policy}{[policy]}{old-policy}
|
|
The scsh programmer can choose different policies for automatic
|
|
process reaping.
|
|
The policy is determined by applying this procedure to one of the
|
|
values \ex{'early}, \ex{'late}, or {\sharpf} (\ie, no autoreap).
|
|
\begin{description}
|
|
\item [early]
|
|
The child is reaped from the {\Unix} kernel's process table
|
|
into scsh as soon as it dies. This is done by having a
|
|
signal handler for the \ex{SIGCHLD} signal reap the process.
|
|
\emph{
|
|
If a scsh program sets its own handler for the \ex{SIGCHLD}
|
|
signal, the handler must reap dead children
|
|
by calling \ex{wait}, \ex{wait-any}, or \ex{reap-zombies}.}
|
|
We deprecate interrupt-driven code, and hope to provide
|
|
alternative tools in a future, multi-threaded release of scsh.
|
|
|
|
\item [late]
|
|
The child is not autoreaped until it dies \emph{and} the scsh program
|
|
drops all pointers to its process object. That is, the process
|
|
table is cleaned out during garbage collection.
|
|
\oops{The \ex{late} policy is not supported under the current
|
|
release of scsh. It requires more sophisticated gc hooks than
|
|
we can get from the release of {\scm} that we use.}
|
|
|
|
\item [\sharpf]
|
|
If autoreaping is turned off, process reaping is completely under
|
|
control of the programmer, who can force outstanding zombies to
|
|
be reaped by manually calling the \ex{reap-zombies} procedure
|
|
(see below).
|
|
\end{description}
|
|
Note that under any of the autoreap policies, a particular process $p$ can
|
|
be manually reaped into scsh by simply calling \ex{(wait $p$)}.
|
|
\emph{All} zombies can be manually reaped with \ex{reap-zombies}.
|
|
|
|
The \ex{autoreap-policy} procedure returns the policy's previous value.
|
|
Calling \ex{autoreap-policy} with no arguments returns the current
|
|
policy without changing it.
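
For example, a program that prefers to reap explicitly might save and
restore the policy around a region of code. This is a sketch only;
\ex{make} stands in for any subprocess work.
\begin{code}
(let ((old-policy (autoreap-policy #f)))  ; Turn autoreaping off.
  (run (make all))                        ; run waits for its own child.
  (reap-zombies)                          ; Reap anything else that exited.
  (autoreap-policy old-policy))           ; Restore the previous policy.\end{code}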
|
|
\end{defundesc}
|
|
|
|
|
|
\begin{defundesc}{reap-zombies}{}{\boolean}
|
|
This procedure reaps all outstanding exited child processes into scsh.
|
|
It returns true if there are no more child processes to wait on, and
|
|
false if there are outstanding processes still running or suspended.
|
|
\end{defundesc}
|
|
|
|
\subsubsection{Issues with process reaping}
|
|
Reaping a process does not reveal its process group at the time of
|
|
death; this information is lost when the process is reaped.
|
|
This means that a dead, reaped process is \emph{not eligible} as a return
|
|
value for a future \ex{wait-process-group} call.
|
|
This is not likely to be a problem for most code, as programs almost
|
|
never wait on exited processes by process group.
|
|
Process group waiting is usually applied to \emph{stopped} processes,
|
|
which are never reaped.
|
|
So it is unlikely that this will be a problem for most programs.
|
|
|
|
%%% Actually, this is *not* a problem if you stick with proc objects, instead
|
|
%%% of using pids, so I commented it out.
|
|
%
|
|
%\paragraph{Pid aliasing}
|
|
%Second, once a process has been reaped, its 16-bit process id becomes
|
|
%available to Unix for re-use.
|
|
%So it is conceivable that a long time in the future, a \ex{fork} operation
|
|
%could produce a subprocess with the identical pid, causing \ex{wait}
|
|
%operations on the old, dead, reaped child, and the new child to become
|
|
%confused.
|
|
%This kind of pid aliasing is intrinsic to the nature of Unix's single-use pid
|
|
%deallocation policy,
|
|
%but is very, very unlikely to happen in practice,
|
|
%given the 16-bit size of the pid space.
|
|
%Scsh will detect occurences of pid aliasing,
|
|
%in the unlikely event that one occurs.
|
|
%When \ex{fork} creates a proc object, it checks to see if the scsh heap
|
|
%contains an already existing proc object with the same pid as the newly forked
|
|
%process.
|
|
%If so, an exception is raised; if not handled by the program, this will stop
|
|
%the program, either killing the process or invoking an interactive debugger.
|
|
|
|
Automatic process reaping is a useful programming convenience.
|
|
However, if a program is careful to wait for all children, and does not wish
|
|
automatic reaping to happen, the programmer can simply turn process
|
|
autoreaping off.
|
|
|
|
Programs that do not wish to use automatic process reaping should be
|
|
aware that some scsh routines create subprocesses but do not return
|
|
the child's pid: \ex{run/port*}, and its related procedures and
|
|
special forms (\ex{run/strings}, \emph{et al.}).
|
|
Automatic process reaping will clean the child processes created by
|
|
these procedures out of the kernel's process table.
|
|
If a program doesn't use process reaping, it should either avoid these
|
|
forms, or use \ex{wait-any} to wait for the children to exit.
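
A sketch of such a loop, which blocks until every child has exited
(it assumes the \ex{receive} form for binding the two return values):
\begin{code}
(let lp ()
  (receive (proc status) (wait-any)
    ;; proc is #f once there are no children left to wait for.
    (if proc (lp))))\end{code}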
|
|
|
|
\subsection{Process waiting}
|
|
|
|
\defun {wait} {proc/pid [flags]} {status}
|
|
\begin{desc}
|
|
This procedure waits until a child process exits, and returns its
|
|
exit code. The \var{proc/pid} argument is either a process object
|
|
(section \ref{sec:proc-objects}) or an integer process id.
|
|
\ex{Wait} returns the child's exit status code (or suspension code,
|
|
if the \ex{wait/stopped-children} option is used, see below).
|
|
Status values can be queried with the procedures in section
|
|
\ref{sec:wait-codes}.
|
|
|
|
The \var{flags} argument is an integer whose bits specify
|
|
additional options. It is composed by or'ing together the following
|
|
flags:
|
|
\begin{center}
|
|
\begin{tabular}{|l|l|}
|
|
\hline
|
|
Flag & Meaning \\ \hline \hline
|
|
\ex{wait/poll} & Return {\sharpf} immediately if
|
|
child still active. \\ \hline
|
|
\ex{wait/stopped-children} & Wait for suspend as well as exit. \\ \hline
|
|
\end{tabular}
|
|
\end{center}
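
A sketch of polling with \ex{wait/poll}. Here \ex{do-other-work} is a
hypothetical procedure standing for whatever the parent does in the
meantime, and the \ex{sleep} program is an arbitrary child:
\begin{code}
(let ((child (fork (\l{} (exec-path "sleep" "2")))))
  (let lp ()
    (or (wait child wait/poll)     ; #f while the child is still running.
        (begin (do-other-work)     ; Hypothetical placeholder.
               (lp)))))\end{code}
The loop returns the child's status once it becomes available.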
|
|
\end{desc}
|
|
|
|
\begin{defundesc} {wait-any} {[flags]} {[proc status]}
|
|
The optional \var{flags} argument is as for \ex{wait}.
|
|
This procedure waits for any child process to exit (or stop, if the
|
|
\ex{wait/stopped-children} flag is used).
|
|
It returns the process' process object and status code.
|
|
If there are no children left for which to wait, the two values
|
|
\ex{[{\sharpf} {\sharpt}]} are returned.
|
|
If the \ex{wait/poll} flag is used, and none of the children
|
|
are immediately eligible for waiting,
|
|
then the values \ex{[{\sharpf} {\sharpf}]} are returned:
|
|
\begin{center}
|
|
\begin{tabular}{|l|l|}
|
|
\hline
|
|
[{\sharpf} {\sharpf}] & Poll, none ready \\ \hline
|
|
[{\sharpf} {\sharpt}] & No children \\ \hline
|
|
\end{tabular}
|
|
\end{center}
|
|
|
|
\ex{Wait-any} will not return a process that has been previously waited
|
|
by any other process-wait procedure (\ex{wait}, \ex{wait-any},
|
|
and \ex{wait-process-group}).
|
|
It will return reaped processes that haven't yet been waited for.
|
|
|
|
The use of \ex{wait-any} is deprecated.
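
The three possible outcomes of a polling call can be told apart by
inspecting the two return values. The following is only a sketch; the
symbols it returns are illustrative and not part of scsh:
\begin{code}
(receive (proc status) (wait-any wait/poll)
  (cond (proc   'reaped-a-child)     ; proc is a process object
        (status 'no-children-left)   ; the values were [#f #t]
        (else   'none-ready-yet)))   ; the values were [#f #f]\end{code}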
|
|
\end{defundesc}
|
|
|
|
\begin{defundesc} {wait-process-group} {proc/pid [flags]} {[proc status]}
|
|
This procedure waits for any child whose process group is \var{proc/pid}
|
|
(either a process object or a pid).
|
|
The \var{flags} argument is as for \ex{wait}.
|
|
|
|
Note that if the programmer wishes to wait for exited processes
|
|
by process group, the program should take care not to use process
|
|
reaping (section \ref{sec:proc-objects}), as this loses
|
|
process group information. However, most process-group waiting is
|
|
for stopped processes (to implement job control), so this is rarely
|
|
an issue, as stopped processes are not subject to reaping.
|
|
\end{defundesc}
|
|
|
|
|
|
\subsection{Analysing process status codes}
|
|
\label{sec:wait-codes}
|
|
When a child process dies (or is suspended), its parent can call the \ex{wait}
|
|
procedure to recover the exit (or suspension) status of the child.
|
|
The exit status is a small integer that encodes information
|
|
describing how the child terminated.
|
|
The bit-level format of the exit status is not defined by {\Posix};
|
|
you must use the following three functions to decode one.
|
|
However, if a child terminates normally with exit code 0,
|
|
{\Posix} does require \ex{wait} to return an exit status that is exactly
|
|
zero.
|
|
So \ex{(zero? \var{status})} is a correct way to test for non-error,
|
|
normal termination, \eg,
|
|
\begin{code}
|
|
(if (zero? (run (rcp scsh.tar.gz lambda.csd.hku.hk:)))
|
|
(delete-file "scsh.tar.gz"))\end{code}
|
|
|
|
\defun {status:exit-val}{status}{{\integer} or \sharpf}
|
|
\defunx{status:stop-sig}{status}{{\integer} or \sharpf}
|
|
\defunx{status:term-sig}{status}{{\integer} or \sharpf}
|
|
\begin{desc}
|
|
For a given status value produced by calling \ex{wait},
|
|
exactly one of these routines will return a true value.
|
|
|
|
If the child process exited normally, \ex{status:exit-val} returns the
|
|
exit code for the child process (\ie, the value the child passed to \ex{exit}
|
|
or returned from \ex{main}). Otherwise, this function returns false.
|
|
|
|
If the child process was suspended by a signal, \ex{status:stop-sig}
|
|
returns the signal that suspended the child.
|
|
Otherwise, this function returns false.
|
|
|
|
If the child process terminated abnormally, \ex{status:term-sig}
|
|
returns the signal that terminated the child.
|
|
Otherwise, this function returns false.
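
Since exactly one of the three decoders returns a true value, a status can
be classified with a single \ex{cond}. The procedure name
\ex{describe-status} and the result lists below are hypothetical, shown only
as a sketch. Note that an exit code of 0 is a true value in {\Scheme}, so
the first clause also fires for successful exits:
\begin{code}
(define (describe-status status)
  (cond ((status:exit-val status)
         => (lambda (code) (list 'exited code)))
        ((status:term-sig status)
         => (lambda (sig) (list 'killed-by sig)))
        ((status:stop-sig status)
         => (lambda (sig) (list 'stopped-by sig)))))\end{code}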
|
|
\end{desc}
|
|
|
|
%% Dereleased until we have a more portable implementation.
|
|
|
|
%\defun{halts?}{proc}\boolean
|
|
%\begin{desc}
|
|
%This procedure, ported from early T implementations,
|
|
%returns true iff \ex{(\var{proc})} returns at all.
|
|
%\remark{The current implementation is a constant function returning {\sharpt},
|
|
% which suffices for all {\Unix} implementations of which we are aware.}
|
|
%\end{desc}
|
|
|
|
\section{Process state}
|
|
|
|
\defun {umask}{} \fixnum
|
|
\defunx {set-umask} {perms} \undefined
|
|
\defunx {with-umask*} {perms thunk} {value(s) of thunk}
|
|
\dfnx {with-umask} {perms . body} {value(s) of body} {syntax}
|
|
\begin{desc}
|
|
The process' current umask is retrieved with \ex{umask}, and set with
|
|
\ex{(set-umask \var{perms})}. Calling \ex{with-umask*} changes the umask
|
|
to \var{perms} for the duration of the call to \var{thunk}. If the
|
|
program throws out of \var{thunk} by invoking a continuation, the umask is
|
|
reset to its external value. If the program throws back into \var{thunk}
|
|
by calling a stored continuation, the umask is restored to the \var{perms}
|
|
value. The special form \ex{with-umask} is equivalent in effect to
|
|
the procedure \ex{with-umask*}, but does not require the programmer
|
|
to explicitly wrap a \ex{(\l{} \ldots)} around the body of the code
|
|
to be executed.
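
For example, assuming the scsh \ex{create-directory} procedure and a
hypothetical directory name, the following two forms should be equivalent
ways to create a directory that is inaccessible to group and others:
\begin{code}
;; Special form: the body is wrapped for us.
(with-umask #o077
  (create-directory "private"))

;; Procedure: we supply the thunk explicitly.
(with-umask* #o077
  (lambda () (create-directory "private")))\end{code}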
|
|
\end{desc}
|
|
|
|
|
|
|
|
\defun {chdir} {[fname]} \undefined
|
|
\defunx {cwd}{} \str
|
|
\defunx {with-cwd*} {fname thunk} {value(s) of thunk}
|
|
\dfnx {with-cwd} {fname . body} {value(s) of body} {syntax}
|
|
\begin{desc}
|
|
These forms manipulate the current working directory.
|
|
The cwd can be changed with \ex{chdir}
|
|
(although in most cases, \ex{with-cwd} is preferable).
|
|
If \ex{chdir} is called with no arguments, it changes the cwd to
|
|
the user's home directory.
|
|
The \ex{with-cwd*} procedure calls \var{thunk} with the cwd temporarily
|
|
set to \var{fname}; when \var{thunk} returns, or is exited in a non-local
|
|
fashion (\eg, by raising an exception or by invoking a continuation),
|
|
the cwd is returned to its original value.
|
|
The special form \ex{with-cwd} is simply syntactic sugar for \ex{with-cwd*}.
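
A minimal sketch, assuming \ex{/tmp} exists on the system at hand:
\begin{code}
(define original (cwd))

(with-cwd "/tmp"
  (run (ls -l)))              ; ls runs with /tmp as its cwd

(string=? original (cwd))     ; true: the cwd has been restored\end{code}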
|
|
\end{desc}
|
|
|
|
\defun {pid}{} \fixnum
|
|
\defunx {parent-pid}{} \fixnum
|
|
\defunx {process-group} {} \fixnum
|
|
\defunx {set-process-group} {[proc/pid] pgrp} \undefined % [not implemented]
|
|
\begin{desc}
|
|
\ex{(pid)} and \ex{(parent-pid)} retrieve the process id for the
|
|
current process and its parent.
|
|
\ex{(process-group)} returns the process group of the current process.
|
|
A process' process-group can be set with \ex{set-process-group};
|
|
the value \var{proc/pid} specifies the affected process. It may be either
|
|
a process object or an integer process id, and defaults to the current
|
|
process.
|
|
\end{desc}
|
|
|
|
\defun {set-priority} {which who priority} \undefined %; priority stuff unimplemented
|
|
\defunx {priority} {which who} \fixnum % ; not implemented
|
|
\defunx {nice} {[proc/pid delta]} \undefined %; not implemented
|
|
\begin{desc}
|
|
These procedures set and access the priority of processes.
|
|
They are not yet implemented, and the interfaces of \ex{set-priority} and \ex{priority} are not yet documented.
|
|
\end{desc}
|
|
|
|
\defun {user-login-name}{} \str
|
|
\defunx {user-uid}{} \fixnum
|
|
\defunx {user-effective-uid}{} \fixnum
|
|
\defunx {user-gid}{} \fixnum
|
|
\defunx {user-effective-gid}{} \fixnum
|
|
\defunx {user-supplementary-gids}{} {{\fixnum} list}
|
|
\defunx {set-uid} {uid} \undefined
|
|
\defunx {set-gid} {gid} \undefined
|
|
\begin{desc}
|
|
These routines get and set the effective and real user and group ids.
|
|
The \ex{set-uid} and \ex{set-gid} routines correspond to the {\Posix}
|
|
\ex{setuid()} and \ex{setgid()} procedures.
|
|
\end{desc}
|
|
|
|
|
|
\defun {process-times} {} {[{\fixnum} {\fixnum} {\fixnum} \fixnum]}
|
|
\begin{desc}
|
|
Returns four values:
|
|
\begin{tightinset}
|
|
\begin{flushleft}
|
|
user CPU time in clock-ticks \\
|
|
system CPU time in clock-ticks \\
|
|
user CPU time of all descendant processes \\
|
|
system CPU time of all descendant processes
|
|
\end{flushleft}
|
|
\end{tightinset}
|
|
Note that CPU time clock resolution is not the same as
|
|
the real-time clock resolution provided by \ex{time+ticks}.
|
|
That's Unix.
|
|
\end{desc}
|
|
|
|
\defun{cpu-ticks/sec}{} {integer}
|
|
\begin{desc}
|
|
Returns the resolution of the CPU timer in clock ticks per second.
|
|
This can be used to convert the times reported by \ex{process-times}
|
|
to seconds.
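
For instance, the total CPU time consumed so far by the current process
(user plus system) could be converted to seconds along these lines. This is
a sketch; the variable names are arbitrary:
\begin{code}
(receive (user sys child-user child-sys) (process-times)
  ;; Ticks divided by ticks-per-second gives seconds.
  (exact->inexact (/ (+ user sys) (cpu-ticks/sec))))\end{code}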
|
|
\end{desc}
|
|
|
|
|
|
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
|
\section{User and group database access}
|
|
These procedures are used to access the user and group databases
|
|
(\eg, the ones traditionally stored in \ex{/etc/passwd} and \ex{/etc/group}).
|
|
|
|
\defun {user-info} {uid/name} {record}
|
|
\begin{desc}
|
|
Return a \ex{user-info} record giving the recorded information for a
|
|
particular user:
|
|
\index{user-info}
|
|
\index{user-info:name}
|
|
\index{user-info:uid}
|
|
\index{user-info:gid}
|
|
\index{user-info:home-dir}
|
|
\index{user-info:shell}
|
|
\begin{code}
|
|
(define-record user-info
|
|
name uid gid home-dir shell)\end{code}
|
|
The \var{uid/name} argument is either an integer uid or a string user-name.
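
As an illustration, the record for the user running the program can be
fetched and inspected with the field accessors indexed above:
\begin{code}
(let ((me (user-info (user-uid))))
  (display (user-info:name me))     (newline)
  (display (user-info:home-dir me)) (newline)
  (display (user-info:shell me))    (newline))\end{code}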
|
|
\end{desc}
|
|
|
|
\defun {->uid} {uid/name} \fixnum
|
|
\defunx {->username} {uid/name} \str
|
|
\begin{desc}
|
|
These two procedures coerce integer uid's and user names to a particular
|
|
form.
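
A sketch of the intended use, assuming the conventional \ex{root} account
with uid 0 exists on the system:
\begin{code}
(->uid "root")       ; look up the uid for a login name
(->uid 0)            ; already a uid
(->username 0)       ; look up the login name for a uid
(->username "root")  ; already a user name\end{code}
Either procedure thus lets a program accept a user in whichever form the
caller finds convenient.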
|
|
\end{desc}
|
|
|
|
\defun {group-info} {gid/name} {record}
|
|
\begin{desc}
|
|
Return a \ex{group-info} record giving the recorded information for a
|
|
particular group:
|
|
\index{group-info}
|
|
\index{group-info:name}
|
|
\index{group-info:gid}
|
|
\index{group-info:members}
|
|
\begin{code}
|
|
(define-record group-info
|
|
name gid members)\end{code}
|
|
The \var{gid/name} argument is either an integer gid or a string group-name.
|
|
\end{desc}
|
|
|
|
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
|
\section{Accessing command-line arguments}
|
|
|
|
\defvar {command-line-arguments}{{\str} list}
|
|
\defunx {command-line}{} {{\str} list}
|
|
\begin{desc}
|
|
The list of strings \ex{command-line-arguments} contains the arguments
|
|
passed to the scsh process on the command line.
|
|
Calling \ex{(command-line)} returns the complete \ex{argv}
|
|
string list, including the program. So if we run a scsh program
|
|
\codex{/usr/shivers/bin/myls -CF src}
|
|
then \ex{command-line-arguments} is
|
|
\codex{("-CF" "src")}
|
|
and \ex{(command-line)} returns
|
|
\codex{("/usr/shivers/bin/myls" "-CF" "src")}
|
|
\ex{command-line} returns a fresh list each time it is called.
|
|
In this way, the programmer can get a fresh copy of the original
|
|
argument list if \ex{command-line-arguments} has been modified or is lexically
|
|
shadowed.
|
|
\end{desc}
|
|
|
|
\defun {arg} {arglist n [default]} \str
|
|
\defunx {arg*} {arglist n [default-thunk]} \str
|
|
\defunx {argv} {n [default]} \str
|
|
\begin{desc}
|
|
These procedures are useful for accessing arguments from argument
|
|
lists.
|
|
\ex{arg} returns the $n^{\rm{th}}$ element of \var{arglist}.
|
|
The index is 1-based.
|
|
If \var{n} is too large, \var{default} is returned;
|
|
if no \var{default}, then an error is signaled.
|
|
|
|
\ex{arg*} is similar, except that the \var{default-thunk} is called to generate
|
|
the default value.
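
The following sketch shows the 1-based indexing and the two defaulting
styles on a made-up argument list:
\begin{code}
(define args '("-v" "out.txt"))

(arg args 1)                        ; "-v"
(arg args 2)                        ; "out.txt"
(arg args 3 "none")                 ; "none": only two elements
(arg* args 3 (lambda () "none"))    ; the thunk supplies the default\end{code}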
|
|
|
|
\ex{(argv \var{n})} is simply \ex{(arg (command-line) (+ \var{n} 1))}.
|
|
The +1 offset ensures that the two forms
|
|
%
|
|
\begin{code}
|
|
(arg command-line-arguments \var{n})
|
|
(argv \var{n})\end{code}
|
|
%
|
|
return the same argument
|
|
(assuming the user has not rebound or modified \ex{command-line-arguments}).
|
|
|
|
Example:
|
|
%
|
|
\begin{code}
|
|
(if (null? command-line-arguments)
|
|
(& (xterm -n ,host -title ,host
|
|
-name ,(string-append "xterm_" host)))
|
|
(let* ((progname (file-name-nondirectory (argv 1)))
|
|
(title (string-append host ":" progname)))
|
|
(& (xterm -n ,title
|
|
-title ,title
|
|
-e ,@command-line-arguments))))\end{code}
|
|
%
|
|
A subtlety: when the scsh interpreter is used to execute a scsh program,
|
|
the program name reported in the head of the \ex{(command-line)} list
|
|
is the scsh program, {\em not} the interpreter.
|
|
For example, if we have a shell script in file \ex{fullecho}:
|
|
\begin{code}
|
|
#!/usr/local/bin/scsh -s
|
|
!#
|
|
(for-each (\l{arg} (display arg) (display " "))
|
|
(command-line))\end{code}
|
|
and we run the program
|
|
\codex{fullecho hello world}
|
|
the program will print out
|
|
\codex{fullecho hello world}
|
|
not
|
|
\codex{/usr/local/bin/scsh -s fullecho hello world}
|
|
|
|
This argument line processing ensures that if a scsh program is subsequently
|
|
compiled into a standalone executable or byte-compiled to a heap-image
|
|
executable by the {\scm} virtual machine, its semantics will be
|
|
unchanged---the arglist processing is invariant. In effect, the
|
|
\codex{/usr/local/bin/scsh -s}
|
|
is not part of the program;
|
|
it's a specification for the machine to execute the program on, so it is
|
|
not properly part of the program's argument list.
|
|
|
|
\end{desc}
|
|
|
|
\section{System parameters}
|
|
|
|
%\defun {maximum-fds}{}\fixnum
|
|
%\defunx {page-size}{} \fixnum
|
|
\defun {system-name}{} \str
|
|
\begin{desc}
|
|
Returns the name of the host on which we are executing.
|
|
This may be a local name, such as ``solar,'' as opposed to a
|
|
fully-qualified domain name such as ``solar.csie.ntu.edu.tw.''
|
|
\end{desc}
|
|
|
|
\section{Signal system}
|
|
|
|
Signal numbers are bound to the variables \ex{signal/hup}, \ex{signal/int},
|
|
\ldots. See tables~\ref{table:signals-and-interrupts} and
|
|
\ref{table:uncatchable-signals} for the full list.
|
|
|
|
\defun {signal-process} {proc sig} \undefined
|
|
\defunx {signal-process-group} {prgrp sig} \undefined
|
|
\begin{desc}
|
|
These two procedures send signals to a specific process, and all the processes
|
|
in a specific process group, respectively.
|
|
The \var{proc} and \var{prgrp} arguments are either process objects
|
|
or integer process ids.
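
As a sketch, a background job can be started, signalled, and then reaped.
The choice of the \ex{sleep} program as the job is arbitrary:
\begin{code}
(let ((child (& (sleep 60))))          ; run sleep in the background
  (signal-process child signal/term)   ; ask it to terminate
  (status:term-sig (wait child)))      ; normally signal/term\end{code}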
|
|
\end{desc}
|
|
|
|
\defun{itimer}{???} \undefined
|
|
\defunx{pause-until-interrupt}{} \undefined
|
|
|
|
\defun{sleep}{secs} \undefined
|
|
\defunx{sleep-until}{time}\undefined
|
|
\begin{desc}
|
|
The \ex{sleep} procedure causes the process to sleep for \var{secs} seconds.
|
|
The \ex{sleep-until} procedure causes the process to sleep until \var{time}
|
|
(see section~\ref{sec:time}).
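
For example, assuming that \var{time} is a time value as returned by
\ex{(time)}, the two forms below both pause for about a minute:
\begin{code}
(sleep 60)                     ; relative: sixty seconds from now
(sleep-until (+ (time) 60))    ; absolute: a specific instant\end{code}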
|
|
\end{desc}
|
|
|
|
\subsubsection{Interrupt handlers}
|
|
Scsh interrupt handlers are complicated by the fact that scsh is implemented on
|
|
top of the {\scm} virtual machine, which has its own interrupt system,
|
|
independent of the {\Unix} signal system.
|
|
This means that {\Unix} signals are delivered in two stages: first,
|
|
{\Unix} delivers the signal to the {\scm} virtual machine, then
|
|
the {\scm} virtual machine delivers the signal to the executing Scheme program
|
|
as a {\scm} interrupt.
|
|
This ensures that signal delivery happens between two vm instructions,
|
|
keeping individual instructions atomic.
|
|
|
|
The {\scm} machine has its own set of interrupts, which includes the
|
|
asynchronous {\Unix} signals (table~\ref{table:signals-and-interrupts}).
|
|
\begin{table}
|
|
\begin{minipage}{\textwidth}
|
|
\begin{center}
|
|
\newcommand{\kwd}[1]{\index{\texttt{#1}}\texttt{#1}}
|
|
\begin{tabular}{lll}\hline
|
|
Interrupt & Unix signal & OS Variant \\ \hline\hline
|
|
\kwd{interrupt/alrm}\footnote{Also bound to {\scm} interrupt
|
|
\kwd{interrupt/alarm}.}
|
|
& \kwd{signal/alrm} & \Posix \\
|
|
%
|
|
\kwd{interrupt/int}\footnote{Also bound to {\scm} interrupt
|
|
\kwd{interrupt/keyboard}.}
|
|
& \kwd{signal/int} & \Posix \\
|
|
%
|
|
\kwd{interrupt/memory-shortage} & N/A & \\
|
|
\kwd{interrupt/chld} & \kwd{signal/chld} & \Posix \\
|
|
\kwd{interrupt/cont} & \kwd{signal/cont} & \Posix \\
|
|
\kwd{interrupt/hup} & \kwd{signal/hup} & \Posix \\
|
|
\kwd{interrupt/quit} & \kwd{signal/quit} & \Posix \\
|
|
\kwd{interrupt/term} & \kwd{signal/term} & \Posix \\
|
|
\kwd{interrupt/tstp} & \kwd{signal/tstp} & \Posix \\
|
|
\kwd{interrupt/usr1} & \kwd{signal/usr1} & \Posix \\
|
|
\kwd{interrupt/usr2} & \kwd{signal/usr2} & \Posix \\
|
|
\\
|
|
\kwd{interrupt/info} & \kwd{signal/info} & BSD only \\
|
|
\kwd{interrupt/io} & \kwd{signal/io} & BSD + SVR4 \\
|
|
\kwd{interrupt/poll} & \kwd{signal/poll} & SVR4 only \\
|
|
\kwd{interrupt/prof} & \kwd{signal/prof} & BSD + SVR4 \\
|
|
\kwd{interrupt/pwr} & \kwd{signal/pwr} & SVR4 only \\
|
|
\kwd{interrupt/urg} & \kwd{signal/urg} & BSD + SVR4 \\
|
|
\kwd{interrupt/vtalrm} & \kwd{signal/vtalrm} & BSD + SVR4 \\
|
|
\kwd{interrupt/winch} & \kwd{signal/winch} & BSD + SVR4 \\
|
|
\kwd{interrupt/xcpu} & \kwd{signal/xcpu} & BSD + SVR4 \\
|
|
\kwd{interrupt/xfsz} & \kwd{signal/xfsz} & BSD + SVR4 \\
|
|
\end{tabular}
|
|
\end{center}
|
|
\caption{{\scm} virtual-machine interrupts and related {\Unix} signals.
|
|
Only the {\Posix} signals are guaranteed to be defined; however,
|
|
your implementation and OS may define other signals and
|
|
interrupts not listed here.}
|
|
\end{minipage}
|
|
\label{table:signals-and-interrupts}
|
|
\end{table}
|
|
%
|
|
\begin{table}
|
|
\newcommand{\kwd}[1]{\index{\texttt{#1}}\texttt{#1}}
|
|
\begin{center}
|
|
\begin{tabular}{lll}\hline
|
|
Unix signal & Type & OS Variant \\ \hline\hline
|
|
\kwd{signal/stop} & Uncatchable & \Posix \\
|
|
\kwd{signal/kill} & Uncatchable & \Posix \\
|
|
\\
|
|
\kwd{signal/abrt} & Synchronous & \Posix \\
|
|
\kwd{signal/fpe} & Synchronous & \Posix \\
|
|
\kwd{signal/ill} & Synchronous & \Posix \\
|
|
\kwd{signal/pipe} & Synchronous & \Posix \\
|
|
\kwd{signal/segv} & Synchronous & \Posix \\
|
|
\kwd{signal/ttin} & Synchronous & \Posix \\
|
|
\kwd{signal/ttou} & Synchronous & \Posix \\
|
|
\\
|
|
\kwd{signal/bus} & Synchronous & BSD + SVR4 \\
|
|
\kwd{signal/emt} & Synchronous & BSD + SVR4 \\
|
|
\kwd{signal/iot} & Synchronous & BSD + SVR4 \\
|
|
\kwd{signal/sys} & Synchronous & BSD + SVR4 \\
|
|
\kwd{signal/trap} & Synchronous & BSD + SVR4 \\
|
|
\end{tabular}
|
|
\end{center}
|
|
\caption{Uncatchable and synchronous {\Unix} signals. While these signals
|
|
may be sent with \texttt{signal-process} or
|
|
\texttt{signal-process-group},
|
|
there are no corresponding scsh interrupt handlers.
|
|
Only the {\Posix} signals are guaranteed to be defined; however,
|
|
your implementation and OS may define other signals not listed
|
|
here.}
|
|
\label{table:uncatchable-signals}
|
|
\end{table}
|
|
Note that scsh does \emph{not} support signal handlers for ``synchronous''
|
|
{\Unix} signals, such as \ex{signal/ill} or \ex{signal/pipe}
|
|
(see table~\ref{table:uncatchable-signals}).
|
|
Synchronous occurrences of these signals are better handled by raising
|
|
a Scheme exception.
|
|
We recommend you avoid using signal handlers unless you absolutely have
|
|
to; we intend to provide a better, higher-level interface to {\Unix}
|
|
signals after scsh has been ported to a multi-threaded platform.
|
|
|
|
\begin{defundesc}{signal->interrupt}{\integer}{\integer}
|
|
The programmer maps from {\Unix} signals to {\scm} interrupts with the
|
|
\ex{signal->interrupt} procedure.
|
|
If the signal does not have a defined {\scm} interrupt, an error is signaled.
|
|
\end{defundesc}
|
|
|
|
|
|
\begin{defundesc}{interrupt-set}{\zeroormore{\integer}}{\integer}
|
|
This procedure builds interrupt sets from its interrupt arguments.
|
|
A set is represented as an integer using a two's-complement representation of
|
|
the bit set.
|
|
\end{defundesc}
|
|
|
|
|
|
\defun{enabled-interrupts}{}{interrupt-set}
|
|
\defunx{set-enabled-interrupts}{interrupt-set}{interrupt-set}
|
|
\begin{desc}
|
|
Get and set the value of the enabled-interrupt set.
|
|
Only interrupts in this set have their handlers called when delivered.
|
|
When a disabled interrupt is delivered to the {\scm} machine, it is
|
|
held pending until it becomes enabled, at which time its handler is invoked.
|
|
|
|
Interrupt sets are represented as integer bit sets (constructed with
|
|
the \ex{interrupt-set} function).
|
|
The \ex{set-enabled-interrupts} procedure returns the previous value of
|
|
the enabled-interrupt set.
|
|
\end{desc}
|
|
|
|
\dfn {with-enabled-interrupts} {interrupt-set . body} {value(s) of body} {syntax}
|
|
\defunx{with-enabled-interrupts*}{interrupt-set thunk} {value(s) of thunk}
|
|
\begin{desc}
|
|
Run code with a given set of interrupts enabled.
|
|
Note that ``enabling'' an interrupt means enabling delivery from
|
|
the {\scm} vm to the scsh program.
|
|
Using the {\scm} interrupt system is fairly lightweight, and does not involve
|
|
actually making a system call.
|
|
Note that enabling an interrupt means that the assigned interrupt handler
|
|
is allowed to run when the interrupt is delivered.
|
|
Interrupts not enabled are held pending when delivered.
|
|
|
|
Interrupt sets are represented as integer bit sets (constructed with
|
|
the \ex{interrupt-set} function).
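
One plausible use is to hold all interrupts pending during a short critical
section. The sketch below assumes that calling \ex{interrupt-set} with no
arguments yields the empty set; \ex{update-shared-table!} is a hypothetical
procedure standing in for the protected code:
\begin{code}
(with-enabled-interrupts (interrupt-set)   ; nothing enabled
  (update-shared-table!))                  ; runs without interruption\end{code}
Interrupts that arrive during the body are held pending and are delivered
once the previous enabled set is back in effect.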
|
|
\end{desc}
|
|
|
|
|
|
\begin{defundesc}{set-interrupt-handler}{interrupt handler}{old-handler}
|
|
Assigns a handler for a given interrupt,
|
|
and returns the interrupt's old handler.
|
|
The \var{handler} argument is \ex{\#f} (ignore), \ex{\#t} (default), or a
|
|
procedure taking an integer argument;
|
|
the return value follows the same conventions.
|
|
Note that the \var{interrupt} argument is an interrupt value,
|
|
not a signal value.
|
|
An interrupt is delivered to the {\scm} machine by (1) blocking all interrupts,
|
|
and (2) applying the handler procedure to the set of interrupts
|
|
that were enabled prior to the interrupt delivery.
|
|
If the procedure returns normally (\ie, it doesn't throw to a continuation),
|
|
the set of enabled interrupts will be returned to its previous value.
|
|
(To restore the enabled-interrupt set before throwing out of an interrupt
|
|
handler, see \ex{set-enabled-interrupts}.)
|
|
|
|
\note{If you set a handler for the \ex{interrupt/chld} interrupt,
|
|
you may break scsh's autoreaping process machinery. See the
|
|
discussion of autoreaping in section~\ref{sec:proc-objects}.}
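
A minimal sketch of installing a handler and later restoring the previous
one. The message text is arbitrary, and the first call goes through
\ex{signal->interrupt} only to show the mapping from signal to interrupt:
\begin{code}
(define old-handler
  (set-interrupt-handler (signal->interrupt signal/usr1)
    (lambda (enabled)        ; enabled: the set before delivery
      (display "caught USR1")
      (newline))))

;; Later, put the previous handler back.
(set-interrupt-handler interrupt/usr1 old-handler)\end{code}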
|
|
\end{defundesc}
|
|
|
|
\begin{defundesc}{interrupt-handler}{interrupt}{handler}
|
|
Return the handler for a given interrupt.
|
|
Note that the argument is an interrupt value, not a signal value.
|
|
A handler is either \ex{\#f} (ignore), \ex{\#t} (default), or a
|
|
procedure taking an integer argument.
|
|
\end{defundesc}
|
|
|
|
% %set-unix-signal-handler
|
|
% %unix-signal-handler
|
|
|
|
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
|
\section{Time}
|
|
\label{sec:time}
|
|
|
|
Scsh's time system is fairly sophisticated, particularly with respect
|
|
to its careful treatment of time zones.
|
|
However, casual users shouldn't be intimidated;
|
|
all of the complexity is optional,
|
|
and defaulting all the optional arguments reduces the system
|
|
to a simple interface.
|
|
|
|
\subsection{Terminology}
|
|
``UTC'' and ``UCT'' stand for ``coordinated universal time,'' which is the
|
|
official name for what is colloquially referred to as ``Greenwich Mean
|
|
Time.''
|
|
|
|
{\Posix} allows a single time zone to specify \emph{two} different offsets
|
|
from UTC: one standard one, and one for ``summer time.''
|
|
Summer time is frequently some sort of daylight savings time.
|
|
|
|
The scsh time package consistently uses this terminology: we never say
|
|
``gmt'' or ``dst;'' we always say ``utc'' and ``summer time.''
|
|
|
|
\subsection{Basic data types}
|
|
We have two types: \emph{time} and \emph{date}.
|
|
|
|
\index{time}
|
|
A \emph{time} specifies an instant in the history of the universe.
|
|
It is location and time-zone independent.\footnote{Physics pedants please note:
|
|
The scsh authors live in a Newtonian universe. We disclaim responsibility
|
|
for calculations performed in non-ANSI standard light-cones.}
|
|
A time is a real value
|
|
giving the number of elapsed seconds since the Unix ``epoch''
|
|
(Midnight, January 1, 1970 UTC).
|
|
Time values provide arbitrary time resolution,
|
|
limited only by the number system of the underlying Scheme system.
|
|
|
|
\index{date}
|
|
A \emph{date} is a name for an instant in time that is specified
|
|
relative to some location/time-zone in the world, \eg:
|
|
\begin{tightinset}
|
|
Monday October 31, 1994 3:47:21 pm EST.
|
|
\end{tightinset}
|
|
Dates provide one-second resolution,
|
|
and are expressed with the following record type:
|
|
%
|
|
\begin{code}\index{date}
|
|
(define-record date ; A Posix tm struct
|
|
seconds ; Seconds after the minute [0-59]
|
|
minute ; Minutes after the hour [0-59]
|
|
hour ; Hours since midnight [0-23]
|
|
month-day ; Day of the month [1-31]
|
|
month ; Months since January [0-11]
|
|
year ; Years since 1900
|
|
tz-name ; Time-zone name: #f or a string.
|
|
tz-secs ; Time-zone offset: #f or an integer.
|
|
summer? ; Summer (Daylight Savings) time in effect?
|
|
week-day ; Days since Sunday [0-6]
|
|
year-day) ; Days since Jan. 1 [0-365]\end{code}
|
|
%
|
|
If the \ex{tz-secs} field is given, it specifies the time-zone's offset from
|
|
UTC in seconds. If it is specified, the \ex{tz-name} and \ex{summer?}
|
|
fields are ignored when using the date structure to determine a specific
|
|
instant in time.
|
|
|
|
If the \ex{tz-name} field is given, it is a time-zone string such as
|
|
\ex{"EST"} or \ex{"HKT"} understood by the OS.
|
|
Since {\Posix} time-zone strings can specify dual standard/summer time-zones
|
|
(\eg, \ex{"EST5EDT"} specifies U.S. Eastern Standard/Eastern Daylight Time),
|
|
the value of the \ex{summer?} field is used to resolve the ambiguous
|
|
boundary cases. For example, on the morning of the Fall daylight savings
|
|
change-over, 1:00am--2:00am happens twice. Hence the date 1:30 am
|
|
on this morning can specify two different seconds;
|
|
the \ex{summer?} flag says which one.
|
|
|
|
A date with $\ex{tz-name} = \ex{tz-secs} = \ex{\#f}$ is a date that
|
|
is specified in terms of the system's current time zone.
|
|
|
|
There is redundancy in the \ex{date} data structure.
|
|
For example, the \ex{year-day} field is redundant
|
|
with the \ex{month-day} and \ex{month} fields.
|
|
Either of these, together with the year, implies the value of the \ex{week-day} field.
|
|
The \ex{summer?} and \ex{tz-name} fields are redundant with the \ex{tz-secs}
|
|
field in terms of specifying an instant in time.
|
|
This redundancy is provided because consumers of dates may want it broken out
|
|
in different ways.
|
|
The scsh procedures that produce date records fill them out completely.
|
|
However, when date records produced by the programmer are passed to
|
|
scsh procedures, the redundancy is resolved by ignoring some of the
|
|
secondary fields.
|
|
This is described for each procedure below.
|
|
|
|
\defun{make-date} {s min h mday mon y [tzn tzs summ? wday yday]} {date}
|
|
\begin{desc}
|
|
When making a \ex{date} record, the last five elements of the record
|
|
are optional, and default to \ex{\#f}, \ex{\#f}, \ex{\#f}, 0,
|
|
and 0 respectively.
|
|
This is useful when creating a \ex{date} record to pass as an
|
|
argument to \ex{time}.
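
For example, the date 3:47:21 pm on October 31, 1994, five hours behind UTC,
might be built and converted as follows. This is a sketch; the variable name
is arbitrary. Note the zero-based month and the year counted from 1900:
\begin{code}
(define halloween
  (make-date 21 47 15     ; seconds, minute, hour
             31 9 94      ; month-day, month (October), year
             #f -18000))  ; tz-name, tz-secs

(time halloween)          ; the corresponding time value\end{code}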
|
|
\end{desc}
|
|
|
|
\subsection{Time zones}
|
|
Several time procedures take time zones as arguments. When optional,
|
|
the time zone defaults to local time zone. Otherwise the time zone
|
|
can be one of:
|
|
\begin{inset}
|
|
\begin{tabular}{lp{0.7\linewidth}}
|
|
\ex{\#f} & Local time \\
|
|
Integer & Seconds of offset from UTC. For example,
|
|
New York City is -18000 (-5 hours), San Francisco
|
|
is -28800 (-8 hours). \\
|
|
String & A {\Posix} time zone string understood by the OS
|
|
(\ie, the sort of time zone assigned to the \ex{\$TZ}
|
|
environment variable).
|
|
\end{tabular}
|
|
\end{inset}
|
|
An integer time zone gives the number of seconds you must add to UTC
|
|
to get time in that zone. It is \emph{not} ``seconds west'' of UTC---that
|
|
flips the sign.
|
|
|
|
To get UTC time, use a time zone of either 0 or \ex{"UCT0"}.
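
The three kinds of time-zone argument can be illustrated with the \ex{date}
procedure described below. This is a sketch; the zone string is assumed to
be one the OS understands:
\begin{code}
(date (time) #f)         ; local time
(date (time) 0)          ; UTC
(date (time) -18000)     ; five hours behind UTC
(date (time) "EST5EDT")  ; a Posix time-zone string\end{code}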
|
|
|
|
\subsection{Procedures}
|
|
\defun {time+ticks} {} {[secs ticks]}
|
|
\defunx{ticks/sec} {} \real
|
|
\begin{desc}
|
|
The current time, with sub-second resolution.
|
|
Sub-second resolution is not provided by {\Posix},
|
|
but is available on many systems.
|
|
The time is returned as elapsed seconds since the Unix epoch, plus
|
|
a number of sub-second ``ticks.''
|
|
The length of a tick may vary from implementation to implementation;
|
|
it can be determined from \ex{(ticks/sec)}.
|
|
|
|
The system clock is not required to report time at the full resolution
|
|
given by \ex{(ticks/sec)}. For example, on BSD, time is reported at
|
|
$1\mu$s resolution, so \ex{(ticks/sec)} is 1,000,000. That doesn't mean
|
|
the system clock has micro-second resolution.
|
|
|
|
If the OS does not support sub-second resolution, the \var{ticks} value
|
|
is always 0, and \ex{(ticks/sec)} returns 1.
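
As a sketch of how the two return values combine, the hypothetical procedure
\ex{elapsed-seconds} below times a thunk by taking two readings and
converting the difference to seconds:
\begin{code}
(define (elapsed-seconds thunk)
  (receive (s0 t0) (time+ticks)
    (thunk)
    (receive (s1 t1) (time+ticks)
      ;; Whole seconds plus the fractional part from the ticks.
      (+ (- s1 s0)
         (/ (- t1 t0) (ticks/sec))))))\end{code}
The result is an exact rational when \ex{(ticks/sec)} is an exact integer;
apply \ex{exact->inexact} to it if a decimal value is wanted.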
|
|
|
|
\begin{remarkenv}
|
|
I chose to represent system clock resolution as ticks/sec
|
|
instead of sec/tick to increase the odds that the value could
|
|
be represented as an exact integer, increasing efficiency and
|
|
making it easier for Scheme implementations that don't have
|
|
sophisticated numeric support to deal with the quantity.
|
|
|
|
You can convert seconds and ticks to seconds with the expression
|
|
\codex{(+ \var{secs} (/ \var{ticks} (ticks/sec)))}
|
|
Given that, why not have the fine-grain time procedure just
|
|
return a non-integer real for time? Following Common Lisp, I chose to
|
|
allow the system clock to report sub-second time in its own units to
|
|
lower the overhead of determining the time. This would be important
|
|
for a system that wanted to precisely time the duration of some
|
|
event. Time stamps could be collected with little overhead, deferring
|
|
the overhead of precisely calculating with them until after collection.
|
|
|
|
This is all a bit academic for the {\scm} implementation, where
|
|
we determine time with a heavyweight system call, but it's nice
|
|
to plan for the future.
|
|
\end{remarkenv}
|
|
\end{desc}
|
|
|
|
\defun {date} {} {date-record}
|
|
\defunx{date} {[time tz]} {date-record}
|
|
\begin{desc}
|
|
Simple \ex{(date)} returns the current date, in the local time zone.
|
|
|
|
With the optional arguments, \ex{date} converts the time to the date as
|
|
specified by the time zone \var{tz}.
|
|
\var{Time} defaults to the current time; \var{tz} defaults to local time,
|
|
and is as described in the time-zone section.
|
|
|
|
If the \var{tz} argument is an integer, the date's \ex{tz-name}
|
|
field is a {\Posix} time zone of the form
|
|
``\ex{UTC+\emph{hh}:\emph{mm}:\emph{ss}}'';
|
|
the trailing \ex{:\emph{mm}:\emph{ss}} portion is deleted if it is zeroes.
|
|
|
|
\oops{The Posix facility for converting dates to times, \ex{mktime()},
|
|
has a broken design: it indicates an error by returning -1, which
|
|
is also a legal return value (for date 23:59:59 UCT, 12/31/1969).
|
|
Scsh resolves the ambiguity in a paranoid fashion: it always
|
|
reports an error if the underlying Unix facility returns -1.
|
|
We feel your pain.
|
|
}
|
|
\end{desc}
|
|
|
|
\defun {time} {} \integer
|
|
\defunx{time} {[date]} \integer
|
|
\begin{desc}
|
|
Simple \ex{(time)} returns the current time.
|
|
|
|
With the optional date argument, \ex{time} converts a date to a time.
|
|
\var{Date} defaults to the current date.
|
|
|
|
Note that the input \var{date} record is overconstrained.
|
|
\ex{time} ignores \var{date}'s \ex{week-day} and \ex{year-day} fields.
|
|
If the date's \ex{tz-secs} field is set, the \ex{tz-name} and
|
|
\ex{summer?} fields are ignored.
|
|
|
|
If the \ex{tz-secs} field is \ex{\#f}, then the time-zone is taken
|
|
from the \ex{tz-name} field. A false \ex{tz-name} means the system's
|
|
current time zone. When calculating with time-zones, the date's
|
|
\ex{summer?} field is used to resolve ambiguities:
|
|
\begin{tightinset}
|
|
\begin{tabular}{ll}
|
|
\ex{\#f} & Resolve an ambiguous time in favor of non-summer time. \\
|
|
true & Resolve an ambiguous time in favor of summer time.
|
|
\end{tabular}
|
|
\end{tightinset}
|
|
This is useful in boundary cases during the change-over. For example,
|
|
in the Fall, when US daylight savings time changes over at 2:00 am,
|
|
1:30 am happens twice---it names two instants in time, an hour apart.
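
The ambiguity can be sketched with two dates that differ only in their
\ex{summer?} field. The example assumes the OS applies the 1994 U.S. rules
to \ex{"EST5EDT"}, under which the Fall change-over fell on October 30; the
procedure name \ex{one-thirty} is hypothetical:
\begin{code}
(define (one-thirty summer?)
  (make-date 0 30 1 30 9 94 "EST5EDT" #f summer?))

;; The standard-time 1:30 am comes after the summer-time one,
;; so this difference should be one hour, in seconds.
(- (time (one-thirty #f))
   (time (one-thirty #t)))\end{code}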
|
|
|
|
Outside of these boundary cases, the \ex{summer?} flag is ignored. For
|
|
example, if the standard/summer change-overs happen in the Fall and the
|
|
Spring, then the value of \ex{summer?} is ignored for a January or
|
|
July date. A January date would be resolved with standard time, and a
|
|
July date with summer time, regardless of the \ex{summer?} value.
|
|
|
|
The \ex{summer?} flag is also ignored if the time zone doesn't have
|
|
a summer time---for example, simple UTC.
|
|
\end{desc}
|
|
|
|
|
|
\defun {date->string} {date} \str
|
|
\defunx{format-date} {fmt date} \str
|
|
\begin{desc}
|
|
\ex{Date->string} formats the date as a 24-character string of the
|
|
form:
|
|
\begin{tightinset}
|
|
Sun Sep 16 01:03:52 1973
|
|
\end{tightinset}
|
|
|
|
\ex{Format-date} formats the date according to the format string
|
|
\var{fmt}. The format string is copied verbatim, except that tilde
|
|
characters indicate conversion specifiers that are replaced by fields from
|
|
the date record. Figure \ref{fig:dateconv} gives the full set of
|
|
conversion specifiers supported by \ex{format-date}.
|
|
|
|
\begin{boxedfigure}{tbp}
|
|
\renewcommand{\arraystretch}{1.25}
|
|
\begin{tabular}{l>{\raggedrightparbox}p{0.9\linewidth}}
|
|
\verb|~~| & Converted to the \verb|~| character. \\
|
|
\verb|~a| & abbreviated weekday name \\
|
|
\verb|~A| & full weekday name \\
|
|
\verb|~b| & abbreviated month name \\
|
|
\verb|~B| & full month name \\
|
|
\verb|~c| & time and date using the time and date representation
|
|
for the locale (\verb|~X ~x|) \\
|
|
\verb|~d| & day of the month as a decimal number (01-31) \\
|
|
\verb|~H| & hour based on a 24-hour clock
|
|
as a decimal number (00-23) \\
|
|
\verb|~I| & hour based on a 12-hour clock
|
|
as a decimal number (01-12) \\
|
|
\verb|~j| & day of the year as a decimal number (001-366) \\
|
|
\verb|~m| & month as a decimal number (01-12) \\
|
|
\verb|~M| & minute as a decimal number (00-59) \\
|
|
\verb|~p| & AM/PM designation associated with a 12-hour clock \\
|
|
\verb|~S| & second as a decimal number (00-61) \\
|
|
\verb|~U| & week number of the year;
|
|
Sunday is first day of week (00-53) \\
|
|
\verb|~w| & weekday as a decimal number (0-6), where Sunday is 0 \\
|
|
\verb|~W| & week number of the year;
|
|
Monday is first day of week (00-53) \\
|
|
\verb|~x| & date using the date representation for the locale \\
|
|
\verb|~X| & time using the time representation for the locale \\
|
|
\verb|~y| & year without century (00-99) \\
|
|
\verb|~Y| & year with century (\eg, 1990) \\
|
|
\verb|~Z| & time zone name or abbreviation, or no characters
|
|
if no time zone is determinable
|
|
\end{tabular}
|
|
|
|
\caption{\texttt{format-date} conversion specifiers}
|
|
\label{fig:dateconv}
|
|
\end{boxedfigure}
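
A few hedged usage sketches follow. They assume that \ex{(date)} with no
arguments returns the current date record; the output of the
locale-dependent specifiers will vary from system to system.
\begin{code}
;;; Numeric stamp built from fixed-width fields,
;;; of the general shape 1973-09-16 01:03:52.
(format-date "~Y-~m-~d ~H:~M:~S" (date))

;;; A doubled tilde yields one literal tilde in the output.
(format-date "~~ ~H:~M" (date))

;;; The fixed 24-character form.
(date->string (date))\end{code}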
|
|
\end{desc}
|
|
|
|
%\defun{utc-offset} {[time tz]} \integer
|
|
%\begin{desc}
|
|
% Returns the offset from UTC of time zone \var{tz} at instant \var{time}.
|
|
% \var{time} defaults to the current time; \var{tz} defaults to local time,
|
|
% and is as described in the time-zone section.
|
|
%
|
|
% The offset is the number of seconds you add to UTC time to get
|
|
% local time.
|
|
%
|
|
% Note: Be aware that other time interfaces (\eg, the BSD C interface)
|
|
% give offsets as seconds \emph{west} of UTC, which flips the sign. The scsh
|
|
% definition is chosen for arithmetic simplicity. It's easy to remember
|
|
% the definition of the offset: what you add to UTC to get local.
|
|
%\end{desc}
|
|
%
|
|
%\defun{time-zone} {[summer? tz]} \str
|
|
%\begin{desc}
|
|
% Returns the name of the time zone as a string. \var{Summer?} is
|
|
% used to choose between the summer name and the standard name
|
|
% (\eg, ``EST'' and ``EDT'')\@. \var{Summer?} is interpreted as follows:
|
|
% \begin{inset}
|
|
% \begin{tabular}{lp{0.7\linewidth}}
|
|
% Integer & A time value.
|
|
% The variant in use at that time is returned. \\
|
|
% \ex{\#f} & The standard time name is returned. \\
|
|
% \emph{Otherwise} & The summer time name is returned.
|
|
% \end{tabular}
|
|
% \end{inset}
|
|
% \ex{Summer?} defaults to the case that pertains at the time of the call.
|
|
% It is ignored if the time zone doesn't have a summer variant.
|
|
%\end{desc}
|
|
|
|
\dfni {fill-in-date!}{date}{date}{procedure}
|
|
{fill-in-date"!@\texttt{fill-in-date"!}}
|
|
\begin{desc}
|
|
This procedure fills in missing, redundant slots in a date record.
|
|
In decreasing order of priority:
|
|
\begin{itemize}
|
|
\itum{year, month, month-day $\Rightarrow$ year-day}
|
|
If the \ex{year}, \ex{month}, and \ex{month-day} fields are all
|
|
defined (are all integers), the \ex{year-day}
|
|
field is set to the corresponding value.
|
|
\itum{year, year-day $\Rightarrow$ month, month-day}
|
|
If the \ex{month} and \ex{month-day} fields aren't set, but
|
|
the \ex{year} and \ex{year-day} fields are set, then
|
|
\ex{month} and \ex{month-day} are calculated.
|
|
\itum{year, month, month-day, year-day $\Rightarrow$ week-day}
|
|
If either of the above rules is able to determine what day it is,
|
|
the \ex{week-day} field is then set.
|
|
\itum{tz-secs $\Rightarrow$ tz-name}
|
|
If \ex{tz-secs} is defined, but \ex{tz-name} is not, \ex{tz-name} is assigned
|
|
a time-zone name of the form ``\ex{UTC+\emph{hh}:\emph{mm}:\emph{ss}}'';
|
|
the trailing \ex{:\emph{mm}:\emph{ss}} portion is deleted if it
|
|
is zeroes.
|
|
\itum{tz-name, date, summer? $\Rightarrow$ tz-secs, summer?}
|
|
If the date information is provided up to second resolution,
|
|
\ex{tz-name} is also provided, and \ex{tz-secs} is not set,
|
|
then \ex{tz-secs} and \ex{summer?} are set to their correct values.
|
|
Summer-time ambiguities are resolved using the original value of
|
|
\ex{summer?}. If the time zone doesn't have a
|
|
summer time variant, then \ex{summer?} is set to \ex{\#f}.
|
|
\itum{local time, date, summer? $\Rightarrow$ tz-name, tz-secs, summer?}
|
|
If the date information is provided up to second resolution,
|
|
but no time zone information is provided (both \ex{tz-name} and
|
|
\ex{tz-secs} aren't set), then we proceed as in the above case,
|
|
except the system's current time zone is used.
|
|
\end{itemize}
|
|
These rules allow one particular ambiguity to escape:
|
|
if both \ex{tz-name} and \ex{tz-secs} are set, they are not brought
|
|
into agreement. It isn't clear how to do this, nor is it clear which
|
|
one should take precedence.
|
|
|
|
\oops{\ex{fill-in-date!} isn't implemented yet.}
|
|
|
|
\end{desc}
|
|
|
|
|
|
\section{Environment variables}
|
|
|
|
\defun {setenv} {var val} \undefined
|
|
\defunx {getenv} {var} \str
|
|
\begin{desc}
|
|
These functions get and set the process environment, stored in the
|
|
external C variable \ex{char **environ}.
|
|
An environment variable \var{var} is a string.
|
|
If an environment variable is set to a string \var{val},
|
|
then the process' global environment structure is altered with an entry
|
|
of the form \ex{"\var{var}=\var{val}"}.
|
|
If \var{val} is {\sharpf}, then any entry for \var{var} is deleted.
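
A minimal sketch of the three cases just described (the variable and its
value are chosen only for illustration):
\begin{code}
(setenv "EDITOR" "emacs")   ; adds or replaces the EDITOR=emacs entry
(getenv "EDITOR")           {\evalto} "emacs"
(setenv "EDITOR" #f)        ; deletes any EDITOR entry\end{code}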
|
|
\end{desc}
|
|
|
|
\defun {env->alist}{} {{\str$\rightarrow$\str} alist}
|
|
\begin{desc}
|
|
The \ex{env->alist} procedure converts the entire environment into
|
|
an alist, \eg,
|
|
\begin{code}
|
|
(("TERM" . "vt100")
|
|
("SHELL" . "/usr/local/bin/scsh")
|
|
("PATH" . "/sbin:/usr/sbin:/bin:/usr/bin")
|
|
("EDITOR" . "emacs")
|
|
\ldots)\end{code}
|
|
\end{desc}
|
|
|
|
\defun {alist->env} {alist} \undefined
|
|
\begin{desc}
|
|
\var{Alist} must be an alist whose keys are all strings, and whose values
|
|
are all either strings or string lists. String lists are converted to
|
|
colon lists (see below). The alist is installed as the current {\Unix}
|
|
environment (\ie, converted to a null-terminated C vector of
|
|
\ex{"\var{var}=\var{val}"} strings which is assigned to the global
|
|
\ex{char **environ}).
|
|
|
|
\begin{code}
|
|
;;; Note $PATH entry is converted
|
|
;;; to /sbin:/usr/sbin:/bin.
|
|
(alist->env '(("TERM" . "vt100")
|
|
("PATH" "/sbin" "/usr/sbin" "/bin")
|
|
("SHELL" . "/usr/local/bin/scsh")))
|
|
\end{code}
|
|
|
|
Note that \ex{env->alist} and \ex{alist->env} are not exact
|
|
inverses---\ex{alist->env} will convert a list value into a single
|
|
colon-separated string, but \ex{env->alist} will not parse colon-separated
|
|
values into lists. (See the \ex{\$PATH} element in the examples given for
|
|
each procedure.)
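
The asymmetry can be seen in a short round trip. This is only a sketch;
note that running it replaces the entire environment of the scsh process.
\begin{code}
;;; The PATH value goes in as a list of strings...
(alist->env '(("TERM" . "vt100")
              ("PATH" "/sbin" "/usr/sbin" "/bin")))

;;; ...and comes back out as one colon-separated string.
(getenv "PATH")
    {\evalto} "/sbin:/usr/sbin:/bin"
(assoc "PATH" (env->alist))
    {\evalto} ("PATH" . "/sbin:/usr/sbin:/bin")\end{code}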
|
|
|
|
\end{desc}
|
|
|
|
The following three functions help the programmer manipulate alist
|
|
tables in some generally useful ways. They are all defined using
|
|
\ex{equal?} for key comparison.
|
|
|
|
\begin{defundesc} {alist-delete} {key alist} {alist}
|
|
Delete any entry labelled by value \var{key}.
|
|
\end{defundesc}
|
|
|
|
\begin{defundesc} {alist-update} {key val alist} {alist}
|
|
Delete \var{key} from \var{alist}, then cons on a
|
|
\ex{(\var{key} . \var{val})} entry.
|
|
\end{defundesc}
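
For instance (a small sketch; the alist \ex{al} is made up for illustration):
\begin{code}
(define al '(("TERM" . "vt100") ("EDITOR" . "vi")))

;;; Drop the TERM entry.
(alist-delete "TERM" al)
    {\evalto} (("EDITOR" . "vi"))

;;; Delete the old EDITOR entry, then cons the new one on the front.
(alist-update "EDITOR" "emacs" al)
    {\evalto} (("EDITOR" . "emacs") ("TERM" . "vt100"))\end{code}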
|
|
|
|
\defun{alist-compress} {alist} {alist}
|
|
\begin{desc}
|
|
Compresses \var{alist} by removing shadowed entries.
|
|
Example:
|
|
\begin{code}
|
|
;;; Shadowed (1 . c) entry removed.
|
|
(alist-compress '( (1 . a) (2 . b) (1 . c) (3 . d) ))
|
|
{\evalto} ((1 . a) (2 . b) (3 . d))\end{code}
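
One way such a procedure could be written in portable {\Scheme} is sketched
below. The name \ex{my-alist-compress} is hypothetical, and this is not
claimed to be scsh's actual definition; it simply keeps the first entry
seen for each key and compares keys with \ex{equal?} (via \ex{member}).
\begin{code}
(define (my-alist-compress alist)
  ;; KEYS holds the keys seen so far; ACC holds kept entries, reversed.
  (let loop ((al alist) (keys '()) (acc '()))
    (cond ((null? al) (reverse acc))
          ;; A later entry with an already-seen key is shadowed: skip it.
          ((member (caar al) keys) (loop (cdr al) keys acc))
          (else (loop (cdr al)
                      (cons (caar al) keys)
                      (cons (car al) acc))))))

(my-alist-compress '((1 . a) (2 . b) (1 . c) (3 . d)))
    {\evalto} ((1 . a) (2 . b) (3 . d))\end{code}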
|
|
\end{desc}
|
|
|
|
\defun {with-env*} {env-alist-delta thunk} {value(s) of thunk}
|
|
\defunx {with-total-env*} {env-alist thunk} {value(s) of thunk}
|
|
\begin{desc}
|
|
These procedures call \var{thunk} in the context of an altered
|
|
environment. They return whatever values \var{thunk} returns.
|
|
Non-local returns restore the environment to its outer value;
|
|
throwing back into the thunk by invoking a stored continuation
|
|
restores the environment back to its inner value.
|
|
|
|
The \var{env-alist-delta} argument specifies
|
|
a \emph{modification} to the current en\-vi\-ron\-ment---\var{thunk}'s
|
|
environment is the original environment overridden with the
|
|
bindings specified by the alist delta.
|
|
|
|
The \var{env-alist} argument specifies a complete environment
|
|
that is installed for \var{thunk}.
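
The following sketch illustrates the scoping; the variable values are
chosen only for illustration.
\begin{code}
(setenv "TERM" "vt100")

;;; Inside the thunk, the delta overrides TERM.
(with-env* '(("TERM" . "xterm"))
  (lambda () (getenv "TERM")))
    {\evalto} "xterm"

;;; After the thunk returns, the outer value is back.
(getenv "TERM")
    {\evalto} "vt100"

;;; With a total environment, only the listed variables exist
;;; while the thunk runs.
(with-total-env* '(("TERM" . "xterm"))
  (lambda () (env->alist)))
    {\evalto} (("TERM" . "xterm"))\end{code}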
|
|
\end{desc}
|
|
|
|
\dfn {with-env} {env-alist-delta . body} {value(s) of body} {syntax}
|
|
\dfnx {with-total-env} {env-alist . body} {value(s) of body} {syntax}
|
|
\begin{desc}
|
|
These special forms provide syntactic sugar for \ex{with-env*}
|
|
and {\ttt with\=total\=env*}.
|
|
The env alists are not evaluated positions, but are implicitly backquoted.
|
|
In this way, they tend to resemble binding lists for \ex{let} and
|
|
\ex{let*} forms.
|
|
\end{desc}
|
|
|
|
Example: These four pieces of code all run the mailer with special
|
|
\cd{$TERM} and \cd{$EDITOR} values.
|
|
{\small
|
|
\begin{code}
|
|
(with-env (("TERM" . "xterm") ("EDITOR" . ,my-editor))
|
|
(run (mail shivers@lcs.mit.edu)))
|
|
\cb
|
|
(with-env* `(("TERM" . "xterm") ("EDITOR" . ,my-editor))
|
|
(\l{} (run (mail shivers@csd.hku.hk))))
|
|
\cb
|
|
(run (begin (setenv "TERM" "xterm") ; Env mutation happens
|
|
(setenv "EDITOR" my-editor) ; in the subshell.
|
|
(exec-epf (mail shivers@research.att.com))))
|
|
\cb
|
|
;; In this example, we compute an alternate environment ENV2
|
|
;; as an alist, and install it with an explicit call to the
|
|
;; EXEC-PATH/ENV procedure.
|
|
(let* ((env (env->alist)) ; Get the current environment,
|
|
(env1 (alist-update "TERM" "xterm" env)) ; and compute
|
|
(env2 (alist-update "EDITOR" my-editor env1))) ; the new env.
|
|
(run (begin (exec-path/env "mail" env2 "shivers@cs.cmu.edu"))))\end{code}}
|
|
|
|
\subsection{Path lists and colon lists}
|
|
|
|
When environment variables such as \ex{\$PATH} need to encode a list of
|
|
strings (such as a list of directories to be searched),
|
|
the common Unix convention is to separate the list elements with
|
|
colon delimiters.\footnote{\ldots and hope the individual list elements
|
|
don't contain colons themselves.}
|
|
To convert between the colon-separated string encoding and the
|
|
list-of-strings representation, see the \ex{infix-splitter} function
|
|
(section~\ref{sec:field-splitter}) and the string library's
|
|
\ex{string-join} function.
|
|
For example,
|
|
\begin{code}
|
|
(define split (infix-splitter (rx ":")))
|
|
(split "/sbin:/bin::/usr/bin") {\evalto}
|
|
'("/sbin" "/bin" "" "/usr/bin")
|
|
(string-join '("/sbin" "/bin" "" "/usr/bin") ":") {\evalto}
|
|
"/sbin:/bin::/usr/bin"\end{code}
|
|
The following two functions are useful for manipulating these ordered lists,
|
|
once they have been parsed from their colon-separated form.
|
|
|
|
%\remark{An earlier release of scsh provided the \ex{split-colon-list}
|
|
% and \ex{string-list->colon-list} functions. These have been
|
|
% removed from scsh, and are replaced by the more general
|
|
% parsers and unparsers of the field-reader module.}
|
|
%
|
|
%\defun {split-colon-list} {string} {{\str} list}
|
|
%\defunx {string-list->colon-list} {string-list} \str
|
|
%\begin{desc}
|
|
% Many {\Unix} lists, such as the \cd{$PATH} search path,
|
|
% are stored as ``colon lists.''
|
|
% A colon list is a string containing elements delimited by colon characters.
|
|
% These functions provide conversions between colon lists and true
|
|
% {\Scheme} lists.
|
|
%%
|
|
%\begin{code}
|
|
%(split-colon-list "/foo:/bar::/usr/tmp") \evalto
|
|
% ("/foo" "/bar" "" "/usr/tmp")\end{code}
|
|
%%
|
|
% \ex{string-list->colon-list} is the inverse function.
|
|
%
|
|
% \ex{with-env*}, \ex{with-total-env*}, and \ex{alist->env} all coerce
|
|
% string lists to colon lists where appropriate.
|
|
%\end{desc}
|
|
|
|
\defun {add-before} {elt before list} {list}
|
|
\defunx {add-after} {elt after list} {list}
|
|
\begin{desc}
|
|
These functions are for modifying search-path lists, where element order
|
|
is significant.
|
|
|
|
\ex{add-before} adds \var{elt} to the list immediately
|
|
before the first occurrence of \var{before} in the list.
|
|
If \var{before} is not in the list, \var{elt} is added to the end
|
|
of the list.
|
|
|
|
\ex{add-after} is similar:
|
|
\var{elt} is added after the last occurrence of \var{after}.
|
|
If \var{after} is not found,
|
|
\var{elt} is added to the beginning of the list.
|
|
|
|
Neither function destructively alters the original path-list.
|
|
The result may share structure with the original list.
|
|
Both functions use \ex{equal?} for comparing elements.
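
The rules above work out as follows on a small, made-up path list:
\begin{code}
(define path '("/sbin" "/bin" "/usr/bin"))

;;; Insert relative to an element that is present.
(add-before "/usr/local/bin" "/bin" path)
    {\evalto} ("/sbin" "/usr/local/bin" "/bin" "/usr/bin")
(add-after "/usr/local/bin" "/bin" path)
    {\evalto} ("/sbin" "/bin" "/usr/local/bin" "/usr/bin")

;;; When the reference element is missing, ADD-BEFORE appends
;;; and ADD-AFTER prepends.
(add-before "/usr/local/bin" "/nowhere" path)
    {\evalto} ("/sbin" "/bin" "/usr/bin" "/usr/local/bin")
(add-after "/usr/local/bin" "/nowhere" path)
    {\evalto} ("/usr/local/bin" "/sbin" "/bin" "/usr/bin")\end{code}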
|
|
\end{desc}
|
|
|
|
|
|
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
|
\subsection{\protect{\tt\$USER}, \protect{\tt\$HOME}, and \protect{\tt\$PATH}}
|
|
|
|
Like sh and unlike csh, scsh has \emph{no} interactive dependencies on
|
|
environment variables.
|
|
It does, however, initialize certain internal values at startup time from the
|
|
initial process environment, in particular \cd{$HOME} and \cd{$PATH}.
|
|
Scsh never uses \cd{$USER} at all.
|
|
It computes \ex{(user-login-name)} from the system call \ex{(user-uid)}.
|
|
|
|
\defvar {home-directory} \str
|
|
\defvarx {exec-path-list} {{\str} list fluid}
|
|
\begin{desc}
|
|
Scsh accesses \cd{$HOME} at start-up time, and stores the value in the
|
|
global variable \ex{home-directory}. It uses this value for \ex{\~}
|
|
lookups and for returning to home on \ex{(chdir)}.
|
|
|
|
Scsh accesses \cd{$PATH} at start-up time, colon-splits the path list, and
|
|
stores the value in the fluid \ex{exec-path-list}. This list is
|
|
used for \ex{exec-path} and \ex{exec-path/env} searches.
|
|
|
|
To access, rebind or side-effect fluid cells, you must open
|
|
the \ex{fluids} package.
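
A minimal sketch, assuming that the \ex{fluids} package exports the
{\scm} procedures \ex{fluid} and \ex{let-fluid}, and that the package has
already been opened (for example, with the \ex{,open fluids} command at the
interactive prompt). The directory \ex{/opt/tools/bin} and the program
\ex{mytool} are hypothetical.
\begin{code}
;;; Inspect the current search path.
(fluid exec-path-list)

;;; Search an extra directory first, but only while the thunk runs.
(let-fluid exec-path-list
           (cons "/opt/tools/bin" (fluid exec-path-list))
  (lambda () (run (mytool))))\end{code}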
|
|
\end{desc}
|
|
|
|
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
|
\input{tty}
|