%&latex -*- latex -*-
\chapter{System Calls}
\label{chapter:syscalls}
Scsh provides (almost) complete access to the basic {\Unix} kernel services:
processes, files, signals and so forth. These procedures comprise a first cut
at a {\Scheme} binding for {\Posix}, with a few extras thrown in (\eg,
symbolic links, \ex{fchown}, \ex{fstat}). A few have been punted for the
current release (tty control, ioctl, and a few others).
\section{Errors}
Scsh syscalls never return error codes, and do not use a global
\ex{errno} variable to report errors.
Errors are consistently reported by raising exceptions.
This frees up the procedures to return useful values,
and allows the programmer to assume that
\emph{if a syscall returns, it succeeded.}
This greatly simplifies the flow of the code from the programmer's point
of view.
Since {\Scheme} does not yet have a standard exception system, the scsh
definition remains somewhat vague on the actual form of exceptions
and exception handlers. When a standard exception system is defined,
scsh will move to it. For now, scsh uses the {\scm} exception system,
with a simple sugaring on top to hide the details in the common case.
System call error exceptions contain the {\Unix} \ex{errno} code reported by
the system call. Unlike C, the \ex{errno} value is a part of the exception
packet; it is \emph{not} accessed through a global variable.
For reference purposes, the {\Unix} \ex{errno} numbers
are bound to the variables \ex{errno/perm}, \ex{errno/noent}, {\etc}
System calls never return \ex{errno/intr}---they
automatically retry. (Currently only true for I/O calls.)
\begin{dfndesc}
{errno-error}{errno syscall .\ data}{\noreturn}{procedure}
Raises a {\Unix} error exception for {\Unix} error number \var{errno}.
The \var{syscall} and \var{data} arguments are packaged up in the exception
packet passed to the exception handler.
\end{dfndesc}
\defunx{with-errno-handler*}{handler thunk}{value(s) of thunk}
\begin{dfndescx}
{with-errno-handler}{handler-spec . body}{\valueofbody}{syntax}
{\Unix} syscalls raise error exceptions by calling \ex{errno-error}.
Programs can use \ex{with-errno-handler*} to establish
handlers for these exceptions.
If a {\Unix} error arises while \var{thunk} is executing,
\var{handler} is called on two arguments:
\codex{(\var{handler} \var{errno} \var{packet})}
\var{packet} is a list of the form
$$\var{packet} = \ex{(\var{errno-msg} \var{syscall} . \var{data})},$$
where \var{errno-msg} is the standard {\Unix} error message for the error,
\var{syscall} is the procedure that generated the error,
and \var{data} is a list of information generated by the error,
which varies from syscall to syscall.
If \var{handler} returns, the handler search continues upwards.
\var{Handler} can acquire the exception by invoking a saved continuation.
This procedure can be sugared over with the following syntax:
%
\begin{code}
(with-errno-handler
((\var{errno} \var{packet}) \var{clause} \ldots)
\var{body1}
\var{body2}
\ldots)\end{code}
%
This form executes the body forms with a particular errno handler installed.
When an errno error is raised, the handler search machinery will
bind variable \var{errno} to the error's integer code, and variable
\var{packet} to the error's auxiliary data packet.
Then, the clauses will be checked for a match.
The first clause that matches is executed, and its value is the
value of the entire \ex{with-errno-handler} form.
If no clause matches, the handler search continues.
Error clauses have two forms:
%
\begin{code}
((\var{errno} \ldots) \var{body} \ldots)
(else \var{body} \ldots)\end{code}
%
In the first type of clause, the \var{errno} forms are integer expressions.
They are evaluated and compared to the error's errno value.
An \ex{else} clause matches any errno value.
Note that the \var{errno} and \var{packet}
variables are lexically visible to the error clauses.
Example:
\begin{code}
(with-errno-handler
((errno packet) ; Only handle 3 particular errors.
((errno/wouldblock errno/again)
(loop))
((errno/acces)
(format #t "Not allowed access!")
#f))
(foo frobbotz)
(blatz garglemumph))\end{code}
%
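The procedural interface can be used directly in the same way.
The following is a minimal sketch, not a definitive idiom; the name
\ex{open-input-file/false} is hypothetical and not part of scsh.
The handler acquires \ex{errno/noent} errors by invoking a saved continuation,
and declines every other error simply by returning, which lets the handler
search continue upwards:
\begin{code}
(define (open-input-file/false fname)
  (call-with-current-continuation
    (lambda (k)
      (with-errno-handler*
        (lambda (errno packet)
          (if (= errno errno/noent)
              (k #f)))      ; escape, acquiring the exception
        (lambda ()
          (open-input-file fname))))))\end{code}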
It is not defined what dynamic context the handler executes in,
so fluid variables cannot reliably be referenced.
Note that Scsh system calls always retry when interrupted, so that
the \ex{errno/intr} exception is never raised.
If the programmer wishes to abort a system call on an interrupt, he
should have the interrupt handler explicitly raise an exception or
invoke a stored continuation to throw out of the system call.
\remark{This is not strictly true in the current implementation---only
some of the i/o syscalls loop.
But BSD variants never return \ex{EINTR} anyway, unless you explicitly
request it, so we'll live w/it for now.}
\end{dfndescx}
\subsection{Interactive mode and error handling}
Scsh runs in two modes: interactive and script mode. It starts up in
interactive mode if the scsh interpreter is started up with no script
argument. Otherwise, scsh starts up in script mode. The mode determines
whether scsh prints prompts in between reading and evaluating forms, and it
affects the default error handler. In interactive mode, the default error
handler will report the error, and generate an interactive breakpoint so that
the user can interact with the system to examine, fix, or dismiss the
error. In script mode, the default error handler causes the scsh process to
exit.
When scsh forks a child with \ex{(fork)}, the child resets to script mode.
This can be overridden if the programmer wishes.
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\section{I/O}
\subsection{Standard {\R4RS} I/O procedures}
In scsh, most standard {\R4RS} i/o operations (such as \ex{display} or
\ex{read-char}) work on both integer file descriptors and {\Scheme} ports.
When doing i/o with a file descriptor, the i/o operation is done
directly on the file, bypassing any buffered data that may have
accumulated in an associated port.
Note that character-at-a-time operations
(\eg, \ex{read-char} and \ex{read-line})
are likely to be quite slow when performed directly upon file
descriptors.
The standard {\R4RS} procedures \ex{read-char}, \ex{char-ready?}, \ex{write},
\ex{display}, \ex{newline},
and \ex{write-char} are all generic, accepting integer file descriptor
arguments as well as ports.
Scsh also mandates the availability of \ex{format}, and further requires
\ex{format} to accept file descriptor arguments as well as ports.
The procedures \ex{peek-char} and \ex{read} do \emph{not} accept
file descriptor arguments, since these functions require the ability to
read ahead in the input stream, a feature not supported by {\Unix} I/O.
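As a small illustration of the generic procedures, the following sketch
writes directly to file descriptors 1 and 2.
It assumes these descriptors are open for output, as they are in an
ordinary process; the output bypasses any port buffering.
\begin{code}
(display "Written directly to file descriptor 1" 1)
(newline 1)
(write (list 1 2 3) 2)      ; fd 2 is conventionally stderr
(newline 2)\end{code}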
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\subsection{Port manipulation and standard ports}
\defun {close-after} {port consumer} {value(s) of consumer}
\begin{desc}
Returns \ex{(\var{consumer} \var{port})}, but closes the port on return.
No dynamic-wind magic. \remark{Is there a less-awkward name?}
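For example, the following sketch reads the first line of a file and
closes the port afterwards.
The file name is only an assumption made for illustration, and
\ex{read-line} serves as the consumer.
\begin{code}
(define first-line
  (close-after (open-input-file "/etc/passwd") read-line))\end{code}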
\end{desc}
\defun {error-output-port}{} {port}
\begin{desc}
This procedure is analogous to \ex{current-output-port}, but produces
a port used for error messages---the scsh equivalent of stderr.
\end{desc}
\defun {with-current-input-port*} {port thunk} {value(s) of thunk}
\defunx {with-current-output-port*} {port thunk} {value(s) of thunk}
\defunx {with-error-output-port*} {port thunk} {value(s) of thunk}
\begin{desc}
These procedures install \var{port} as the current input, current output,
and error output port, respectively, for the duration of a call to
\var{thunk}.
\end{desc}
\dfn {with-current-input-port} {port . body} {value(s) of body} {syntax}
\dfnx {with-current-output-port} {port . body} {value(s) of body} {syntax}
\dfnx {with-error-output-port} {port . body} {value(s) of body} {syntax}
\begin{desc}
These special forms are simply syntactic sugar for the
{\ttt with\=current\=input\=port*} procedure and friends.
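As an illustrative sketch, the two expressions below should behave
identically, sending their output to the error port:
\begin{code}
(with-current-output-port (error-output-port)
  (display "This text goes to the error port.")
  (newline))

(with-current-output-port* (error-output-port)
  (lambda ()
    (display "This text goes to the error port.")
    (newline)))\end{code}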
\end{desc}
\defun {close} {port/fd} {\undefined}
\begin{desc}
Close the port or file descriptor.
If \var{port/fd} is a file descriptor, and it has a port allocated to it,
the port is shifted to a new file descriptor created with \ex{(dup
port/fd)} before closing \ex{port/fd}. The port then has its revealed
count set to zero. This reflects the design criterion that ports are not
associated with file descriptors, but with open files.
To close a file descriptor, and any associated port it might have, you
must instead say one of (as appropriate):
\begin{code}
(close (fdes->inport fd))
(close (fdes->outport fd))\end{code}
\end{desc}
\defun {stdports->stdio}{} {\undefined}
\defunx {stdio->stdports} {thunk} {value(s) of thunk}
\begin{desc}
\ex{(stdports->stdio)} is exactly equivalent to the series of
redirections:\footnote{Why not \ex{move->fdes}?
Because the current output port and error port
might be the same port.}
\begin{code}
(dup (current-input-port) 0)
(dup (current-output-port) 1)
(dup (error-output-port) 2)\end{code}
%
\ex{stdio->stdports} binds the standard ports \ex{(current-input-port)},
\ex{(current-output-port)}, and \ex{(error-output-port)} to be ports
on file descriptors 0, 1, 2, and then calls \var{thunk}.
It is equivalent to:
\begin{code}
(with-current-input-port (fdes->inport 0)
  (with-current-output-port (fdes->outport 1)
    (with-error-output-port (fdes->outport 2)
      (thunk))))\end{code}
\end{desc}
\subsection{String ports}
{\scm} has string ports, which you can use. Scsh has not committed to the
particular interface or names that {\scm} uses, so be warned that the
interface described herein may be liable to change.
\defun {make-string-input-port} {string} {\port}
\begin{desc}
Returns a port that reads characters from the supplied string.
\end{desc}
\defun {make-string-output-port} {} {\port}
\defunx {string-output-port-output} {port} {\str}
\begin{desc}
A string output port is a port that collects the characters given to it into
a string.
The accumulated string is retrieved by applying \ex{string-output-port-output}
to the port.
\end{desc}
\defun {call-with-string-output-port} {procedure} {\str}
\begin{desc}
The procedure is called on a port. When it returns,
\ex{call-with-string-output-port} returns a string containing the
characters written to the port.
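The following sketch exercises the string-port procedures described in
this section; the return values in the comments follow from the
descriptions above.
\begin{code}
(call-with-string-output-port
  (lambda (port)
    (display "x = " port)
    (write 42 port)))                 ; returns "x = 42"

(let ((port (make-string-output-port)))
  (display "hello" port)
  (string-output-port-output port))   ; returns "hello"

(read (make-string-input-port "(a b c)"))   ; returns the list (a b c)\end{code}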
\end{desc}
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\subsection{Revealed ports and file descriptors}
The material in this section and the following one is not critical for most
applications.
You may safely skim or completely skip this section on a first reading.
Dealing with {\Unix} file descriptors in a {\Scheme} environment is difficult.
In {\Unix}, open files are part of the process environment, and are referenced
by small integers called \emph{file descriptors}. Open file descriptors are
the fundamental way i/o redirections are passed to subprocesses, since
file descriptors are preserved across fork's and exec's.
{\Scheme}, on the other hand, uses ports for specifying i/o sources. Ports are
garbage-collected {\Scheme} objects, not integers. Ports can be garbage
collected; when a port is collected, it is also closed. Because file
descriptors are just integers, it's impossible to garbage collect them---you
wouldn't be able to close file descriptor 3 unless there were no 3's in the
system, and you could further prove that your program would never again
compute a 3. This is difficult at best.
If a {\Scheme} program only used {\Scheme} ports, and never actually used
file descriptors, this would not be a problem. But {\Scheme} code
must descend to the file descriptor level in at least two circumstances:
%
\begin{itemize}
\item when interfacing to foreign code
\item when interfacing to a subprocess.
\end{itemize}
%
This causes a problem. Suppose we have a {\Scheme} port constructed
on top of file descriptor 2. We intend to fork off a program that
will inherit this file descriptor. If we drop references to the port,
the garbage collector may prematurely close file 2 before we fork
the subprocess. The interface described below is intended to fix this and
other problems arising from the mismatch between ports and file descriptors.
The {\Scheme} kernel maintains a port table that maps a file descriptor
to the {\Scheme} port allocated for it (or, {\sharpf} if there is no port
allocated for this file descriptor). This is used to ensure that
there is at most one open port for each open file descriptor.
The port data structure for file ports has two fields besides the descriptor:
revealed and closed?. When a file port is closed with \ex{(close port)}, the
port's file descriptor is closed, its entry in the port table is cleared, and
the port's closed? field is set to true.
When a file descriptor is closed with \ex{(close fdes)}, any associated
port is shifted to a new file descriptor created with \ex{(dup fdes)}.
The port has its revealed count reset to zero. See discussion below.
To really put a stake through a descriptor's heart, you must say one of
%
\begin{code}
(close (fdes->inport fdes))
(close (fdes->outport fdes))\end{code}
The revealed field is an aid to garbage collection. It is an integer
semaphore. If it is zero, the port's file descriptor can be closed when
the port is collected. Essentially, the revealed field reflects whether
or not the port's file descriptor has escaped to the {\Scheme} user. If
the {\Scheme} user doesn't know what file descriptor is associated with
a given port, then he can't possibly retain an ``integer handle'' on the
port after dropping pointers to the port itself, so the garbage collector
is free to close the file.
Ports allocated with \ex{open-output-file} and \ex{open-input-file} are
unrevealed ports---\ie, revealed is initialised to 0. No one knows the port's
file descriptor, so the file descriptor can be closed when the port is
collected.
The functions \ex{fdes->outport}, \ex{fdes->inport}, and \ex{port->fdes}
are used to shift back and forth between file descriptors and ports. When
\ex{port->fdes} reveals a port's file descriptor, it increments the port's
revealed field. When the user is through with the file descriptor, he can
call \ex{(release-port-handle port)}, which decrements the count. The function
\ex{(call/fdes fdes/port proc)} automates this protocol. \ex{call/fdes} uses
\ex{dynamic-wind} to enforce the protocol. If \ex{proc} throws out of the
\ex{call/fdes}, the unwind handler releases the descriptor handle; if the user
subsequently tries to throw \emph{back} into \ex{proc}'s context, the wind handler
raises an error. When the user maps a file descriptor to a port with
\ex{fdes->outport} or \ex{fdes->inport}, the port has its revealed field
incremented.
Not all file descriptors are created by requests to make ports. Some are
inherited on process invocation via \ex{exec(2)}, and are simply part of the
global environment. Subprocesses may depend upon them, so if a port is later
allocated for these file descriptors, it should be considered as a revealed
port. For example, when the {\Scheme} shell's process starts up, it opens ports
on file descriptors 0, 1, and 2 for the initial values of
\ex{(current-input-port)}, \ex{(current-output-port)}, and
\ex{(error-output-port)}. These ports are initialised with revealed set to 1,
so that stdin, stdout, and stderr are not closed even if the user drops the
port. A fine point: the stdin file descriptor is allocated an unbuffered
port. Because shells frequently share stdin with subprocesses, if the shell
does buffered reads, it might ``steal'' input intended for a subprocess. For
this reason, all shells, including sh, csh, and scsh, read stdin unbuffered.
Responsibility for deciding which other files must be opened unbuffered rests
with the shell programmer.
Unrevealed file ports have the nice property that they can be closed when all
pointers to the port are dropped. This can happen during gc, or at an
\ex{exec()}---since all memory is dropped at an \ex{exec()}. No one knows the
file descriptor associated with the port, so the exec'd process certainly
can't refer to it.
This facility preserves the transparent close-on-collect property
for file ports that are used in straightforward ways, yet allows
access to the underlying {\Unix} substrate without interference from
the garbage collector. This is critical, since shell programming
absolutely requires access to the {\Unix} file descriptors, as their
numerical values are a critical part of the process interface.
A port's underlying file descriptor can be shifted around with \ex{dup(2)} when
convenient. That is, the actual fd on top of which a port is constructed can be
shifted around underneath the port by the scsh kernel when necessary. This is
important, because when the user is setting up file descriptors prior to a
\ex{exec(2)}, he may explicitly use a file descriptor that has already been
allocated to some port. In this case, the scsh kernel just shifts the port's
file descriptor to some new location with \ex{dup}, freeing up its old
descriptor. This prevents errors from happening in the following scenario.
Suppose we have a file open on port \ex{f}. Now we want to run a program that
reads input on file 0, writes output to file 1, errors to file 2, and logs
execution information on file 3. We want to run this program with input from
\ex{f}. So we write:
%
\begin{code}
(run (/usr/shivers/bin/prog)
(> 1 output.txt)
(> 2 error.log)
(> 3 trace.log)
(= 0 ,f))\end{code}
%
Now, suppose by ill chance that, unbeknownst to us, when the operating system
opened \ex{f}'s file, it allocated descriptor 3 for it. If we blindly redirect
\ex{trace.log} into file descriptor 3, we'll clobber \ex{f}! However, the
port-shuffling machinery saves us: when the \ex{run} form tries to dup
\ex{trace.log}'s file descriptor to 3, \ex{dup} will notice that file
descriptor 3 is already associated with an unrevealed port (\ie, \ex{f}). So,
it will first move \ex{f} to some other file descriptor. This keeps \ex{f}
alive and well so that it can subsequently be dup'd into descriptor 0 for
\ex{prog}'s stdin.
The port-shifting machinery makes the following guarantee: a port is only
moved when the underlying file descriptor is closed, either by a \ex{close()}
or a \ex{dup2()} operation. Otherwise a port/file-descriptor association is
stable.
Under normal circumstances, all this machinery just works behind the scenes to
keep things straightened out. The only time the user has to think about it is
when he starts accessing file descriptors from ports, which he should almost
never have to do. If a user starts asking what file descriptors have been
allocated to what ports, he has to take responsibility for managing this
information.
\subsection{Port-mapping machinery}
The procedures provided in this section are almost never needed.
You may safely skim or completely skip this section on a first reading.
Here are the routines for manipulating ports in scsh. The important
points to remember are:
\begin{itemize}
\item A file port is associated with an open file, not a particular file
descriptor.
\item The association between a file port and a particular file descriptor
is never changed \emph{except} when the file descriptor is explicitly
closed. ``Closing'' includes being used as the target of a \ex{dup2}, so
the procedures below that close their targets are
\ex{close}, two-argument \ex{dup}, and \ex{move->fdes}.
If the target file descriptor of one of these routines has an
allocated port, the port will be shifted to another freshly-allocated
file descriptor, and marked as unrevealed, thus preserving the port
but freeing its old file descriptor.
\end{itemize}
These rules are what is necessary to ``make things work out'' with no
surprises in the general case.
\defun {fdes->inport} {fd} {port}
\defunx {fdes->outport} {fd} {port}
\defunx {port->fdes} {port} {\fixnum}
\begin{desc}
These increment the port's revealed count.
\end{desc}
\defun {port-revealed} {port} {{\integer} or \sharpf}
\begin{desc}
Return the port's revealed count if positive, otherwise \sharpf.
\end{desc}
\defun{release-port-handle} {port} {\undefined}
\begin{desc}
Decrement the port's revealed count.
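The reveal/release protocol can be sketched as follows.
The file name is an assumption made for illustration, and the values in
the comments are those implied by the descriptions of \ex{port->fdes} and
\ex{port-revealed} above.
\begin{code}
(define p (open-input-file "/etc/passwd"))  ; an unrevealed port
(port-revealed p)                ; #f: the count is zero
(define fd (port->fdes p))       ; reveals the descriptor
(port-revealed p)                ; 1
(release-port-handle p)          ; finished with fd
(port-revealed p)                ; #f again\end{code}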
\end{desc}
\defun {call/fdes} {fd/port consumer} {value(s) of consumer}
\begin{desc}
Calls \var{consumer} on a file descriptor;
takes care of revealed bookkeeping.
If \var{fd/port} is a file descriptor, this is just
\ex{(\var{consumer} \var{fd/port})}.
If \var{fd/port} is a port,
calls \var{consumer} on its underlying file descriptor.
While \var{consumer} is running, the port's revealed count is incremented.
When \ex{call/fdes} is called with port argument, you are not allowed to
throw into \var{consumer} with a stored continuation, as that would violate
the revealed-count bookkeeping.
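A minimal sketch of its use: the consumer below receives the descriptor
underlying the current output port and writes to it directly.
Such writes bypass any data buffered in the port itself.
\begin{code}
(call/fdes (current-output-port)
  (lambda (fd)
    (write-string "written via the raw descriptor" fd)
    (newline fd)))\end{code}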
\end{desc}
\defun{move->fdes} {fd/port target-fd} {port or fdes}
\begin{desc}
Maps fd$\rightarrow$fd and port$\rightarrow$port.
If \var{fd/port} is a file-descriptor not equal to \var{target-fd},
dup it to \var{target-fd} and close it. Returns \var{target-fd}.
If \var{fd/port} is a port, it is shifted to \var{target-fd},
by duping its underlying file-descriptor if necessary.
\var{Fd/port}'s original file descriptor is
closed (if it was different from \var{target-fd}).
Returns the port.
This operation resets \var{fd/port}'s revealed count to 1.
In all cases when \var{fd/port} is actually shifted, if there is a port
already using \var{target-fd}, it is first relocated to some other file
descriptor.
\end{desc}
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\subsection{{\Unix} I/O}
\defun {dup} {port/fd [newfd]} {port/fd}
\defunx{dup->inport} {port/fd [newfd]} {port}
\defunx{dup->outport} {port/fd [newfd]} {port}
\defunx{dup->fdes} {port/fd [newfd]} {fd}
\begin{desc}
These procedures subsume the functionality of C's \ex{dup()} and \ex{dup2()}.
The different routines return different types of values:
\ex{dup->inport}, \ex{dup->outport}, and \ex{dup->fdes} return
input ports, output ports, and integer file descriptors, respectively.
\ex{dup}'s return value depends on the type of
\var{port/fd}---it maps fd$\rightarrow$fd and port$\rightarrow$port.
These procedures use the {\Unix} \ex{dup()} syscall to replicate
the file descriptor or file port \var{port/fd}.
If a \var{newfd} file descriptor is given, it is used as the target of
the dup operation, \ie, the operation is a \ex{dup2()}.
In this case, procedures that return a port (such as \ex{dup->inport})
will return one with the revealed count set to one.
For example, \ex{(dup (current-input-port) 5)} produces
a new port with underlying file descriptor 5, whose revealed count is 1.
If \var{newfd} is not specified,
then the operating system chooses the file descriptor,
and any returned port is marked as unrevealed.
If the \var{newfd} target is given,
and some port is already using that file descriptor,
the port is first quietly shifted (with another \ex{dup})
to some other file descriptor (zeroing its revealed count).
Since {\Scheme} doesn't provide read/write ports,
\ex{dup->inport} and \ex{dup->outport} can be useful for
getting an output version of an input port, or \emph{vice versa}.
For example, if \ex{p} is an input port open on a tty, and
we would like to do output to that tty, we can simply use
\ex{(dup->outport p)} to produce an equivalent output port for the tty.
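The two calling conventions can be sketched as follows.
The target descriptor 7 is an arbitrary choice made for illustration.
\begin{code}
;; No target: the operating system picks the new descriptor.
(define fd (dup->fdes (current-output-port)))

;; Explicit target: a dup2-style operation onto descriptor 7.
;; Per the text above, the new port has a revealed count of one.
(define err (dup->outport (error-output-port) 7))\end{code}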
\end{desc}
\begin{defundesc} {file-seek} {fd/port offset whence} {\undefined}
\var{whence} is one of \{\ex{seek/set}, \ex{seek/delta}, \ex{seek/end}\}.
\oops{The current implementation doesn't handle \var{offset} arguments
that are not immediate integers (\ie, representable in 30 bits).}
\end{defundesc}
\begin{defundesc} {open-file} {fname flags [perms]} {\port}
\var{Perms} defaults to \cd{#o666}.
\var{Flags} is an integer bitmask, composed by or'ing together the following
constants:
\begin{code}\codeallowbreaks
open/read ; You may only
open/write ; choose one
open/read+write ; of these three
open/no-control-tty
open/nonblocking
open/append
open/create
open/truncate
open/exclusive
. ; Your Unix may have
. ; a few more.\end{code}
%
Returns a port. The port is an input port if the \ex{flags} permit it,
otherwise an output port. \R4RS/\scm/scsh do not have input/output ports,
so it's one or the other. This should be fixed. (You can hack simultaneous
i/o on a file by opening it r/w, taking the result input port,
and duping it to an output port with \ex{dup->outport}.)
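The r/w hack just described might look like the following sketch.
It assumes the process has a controlling terminal available as
\ex{/dev/tty}; the names \ex{tty-in}, \ex{tty-out} and \ex{name} are
illustrative only.
\begin{code}
(define tty-in  (open-file "/dev/tty" open/read+write)) ; an input port
(define tty-out (dup->outport tty-in))   ; output port on the same open file
(write-string "Name: " tty-out)
(force-output tty-out)                   ; in case tty-out is buffered
(define name (read-line tty-in))\end{code}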
\end{defundesc}
\defun{open-input-file}{fname [flags]}\port
\begin{defundescx}{open-output-file}{fname [flags perms]}\port
These are equivalent to \ex{open-file}, after first setting the
read/write bits of the \var{flags} argument to \ex{open/read} or
\ex{open/write}, respectively.
\var{Flags} defaults to zero for \ex{open-input-file},
and
\codex{(bitwise-ior open/create open/truncate)}
for \ex{open-output-file}.
These defaults make the procedures backwards-compatible with their
unary {\R4RS} definitions.
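For instance, the following sketch opens a log file for appending rather
than truncating.
The file name and permission bits are assumptions chosen for illustration.
\begin{code}
(define log
  (open-output-file "build.log"
                    (bitwise-ior open/create open/append)
                    #o644))
(write-string "build started" log)
(newline log)
(close log)\end{code}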
\end{defundescx}
\begin{defundesc} {open-fdes} {fname flags [perms]} \integer
Returns a file descriptor.
\end{defundesc}
\begin{defundesc}{pipe}{} {[\var{rport} \var{wport}]}
Returns two ports, the read and write end-points of a {\Unix} pipe.
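A minimal single-process sketch follows.
It assumes the underlying Scheme provides \ex{call-with-values} for
receiving the two return values; the short message fits comfortably in
the pipe, so the write does not block.
\begin{code}
(call-with-values pipe
  (lambda (rport wport)
    (write-string "ping" wport)
    (newline wport)
    (force-output wport)        ; in case wport is buffered
    (close wport)
    (let ((line (read-line rport)))
      (close rport)
      line)))\end{code}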
\end{defundesc}
\begin{defundesc} {read-line} {[fd/port retain-newline?]} {{\str} or eof-object}
Reads and returns one line of text; on eof, returns the eof object.
A line is terminated by newline or eof.
\var{retain-newline?}\
defaults to {\sharpf}; if true, a terminating newline is included in the
result string, otherwise it is trimmed.
Using this argument allows one to tell whether or not the last line of
input in a file is newline terminated.
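A typical use is a loop that runs until the eof object appears.
The procedure name \ex{count-lines} in this sketch is hypothetical.
\begin{code}
(define (count-lines port)
  (let loop ((n 0))
    (if (eof-object? (read-line port))
        n
        (loop (+ n 1)))))\end{code}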
\end{defundesc}
\defun{read-string}{nbytes [fd/port]} {{\str} or \sharpf}
\begin{defundescx}
{read-string!} {str [fd/port start end]} {nread or \sharpf}
These calls read exactly as much data as you requested, unless
there is not enough data (eof).
\ex{read-string!} reads the data into string \var{str}
at the indices in the half-open interval $[\var{start},\var{end})$;
the default interval is the whole string: $\var{start}=0$ and
$\var{end}=\ex{(string-length \var{str})}$.
They will persistently retry on partial reads and when interrupted
until (1) error, (2) eof, or (3) the input request is completely
satisfied.
Partial reads can occur when reading from an intermittent source,
such as a pipe or tty.
\ex{read-string} returns the string read; \ex{read-string!} returns
the number of characters read. They both return false at eof.
A request to read zero bytes returns immediately, with no eof check.
The values of \var{start} and \var{end} must specify a well-defined
interval in \var{str},
\ie, $0 \le \var{start} \le \var{end} \le \ex{(string-length \var{str})}$.
Any partially-read data is included in the error exception packet.
Error returns on non-blocking input are considered an error.
\end{defundescx}
\defun {read-string/partial} {nbytes [fd/port]} {{\str} or \sharpf}
\begin{defundescx}
{read-string!/partial} {str [fd/port start end]} {nread or \sharpf}
%
These are atomic best-effort/forward-progress calls.
Best effort: they may read less than you request if there is a
lesser amount of data immediately available (\eg, because you
are reading from a pipe or a tty).
Forward progress: if no data is immediately available
(\eg, empty pipe), they will block.
Therefore, if you request an $n>0$ byte read,
while you may not get everything you asked for, you will always get something
(barring eof).
There is one case in which the forward-progress guarantee is cancelled:
when the programmer explicitly sets the port to non-blocking i/o.
In this case, if no data is immediately available,
the procedure will not block, but will immediately return a zero-byte read.
\ex{read-string/partial} reads the data into a freshly allocated string,
which it returns as its value.
\ex{read-string!/partial} reads the data into string \var{str}
at the indices in the half-open interval $[\var{start},\var{end})$;
the default interval is the whole string: $\var{start}=0$ and
$\var{end}=\ex{(string-length \var{str})}$.
The values of \var{start} and \var{end} must specify a well-defined
interval in \var{str},
\ie, $0 \le \var{start} \le \var{end} \le \ex{(string-length \var{str})}$.
It returns the number of bytes read.
A request to read zero bytes returns immediately, with no eof check.
In sum, there are only three ways you can get a zero-byte read:
(1) you request one, (2) you turn on non-blocking i/o, or (3) you
try to read at eof.
These are the routines to use for non-blocking input.
They are also useful when you wish to efficiently process data
in large blocks, and your algorithm is insensitive to the block size
of any particular read operation.
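A block-copy loop built on these calls might be sketched as follows.
The name \ex{copy-port} and the 4096-byte block size are assumptions, and
the input is assumed to be in blocking mode.
Under that assumption a false or zero-length result can only mean eof,
so the sketch stops on either.
\begin{code}
(define (copy-port in out)
  (let loop ()
    (let ((chunk (read-string/partial 4096 in)))
      (if (and chunk (> (string-length chunk) 0))
          (begin
            (write-string chunk out)
            (loop))))))\end{code}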
\end{defundescx}
\begin{defundesc}
{select}{readfds writefds exceptfds timeout}{rfds wfds efds}
%
\remark{Unimplemented. Should we implement a set-of abstraction first,
or just use a twos-complement bitvector encoding with bignums?}
\end{defundesc}
\begin{defundescx}{write-string}{string [fd/port start end]}\undefined
This procedure writes all the data requested.
If the procedure cannot perform the write with a single kernel call
(due to interrupts or partial writes),
it will perform multiple write operations until all the data is written
or an error has occurred.
A non-blocking i/o error is considered an error.
(Error exception packets for this syscall include the amount of
data partially transferred before the error occurred.)
The data written are the characters of \var{string} in the half-open
interval $[\var{start},\var{end})$.
The default interval is the whole string: $\var{start}=0$ and
$\var{end}=\ex{(string-length \var{string})}$.
The values of \var{start} and \var{end} must specify a well-defined
interval in \var{string},
\ie, $0 \le \var{start} \le \var{end} \le \ex{(string-length \var{string})}$.
A zero-byte write returns immediately, with no error.
Output to buffered ports: \ex{write-string}'s efforts end as soon
as all the data has been placed in the output buffer.
Errors and true output may not happen until a later time, of course.
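As a small sketch of the interval arguments, the call below writes only
the characters at indices 0 through 4 of the string:
\begin{code}
(write-string "hello, world" (current-output-port) 0 5)  ; writes "hello"
(newline)\end{code}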
\end{defundescx}
\begin{defundescx}{write-string/partial}{string [fd/port start end]}{nwritten}
This routine is the atomic best-effort/forward-progress analog
to \ex{write-string}.
It returns the number of bytes written, which may be less than you
asked for.
Partial writes can occur when (1) we write off the physical end of
the media, (2) the write is interrupted, or (3) the file descriptor
is set for non-blocking i/o.
If the file descriptor is not set up for non-blocking i/o, then
a successful return from these procedures makes a forward progress
guarantee---that is, a partial write took place of at least one byte:
\begin{itemize}
\item If we are at the end of physical media, and no write takes place,
an error exception is raised.
So a return implies we wrote \emph{something}.
\item If the call is interrupted after a partial transfer, it returns
immediately. But if the call is interrupted before any data transfer,
then the write is retried.
\end{itemize}
If we request a zero-byte write, then the call immediately returns 0.
If the file descriptor is set for non-blocking i/o, then the call
may return 0 if it was unable to immediately write anything
(\eg, full pipe).
Barring these two cases, a write either returns $\var{nwritten} > 0$,
or raises an error exception.
Non-blocking i/o is only available on file descriptors and unbuffered
ports. Doing non-blocking i/o to a buffered port is not well-defined,
and is an error (the problem is the subsequent flush operation).
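To illustrate the forward-progress guarantee, the following sketch
rebuilds a complete write out of partial ones.
The name \ex{write-all} is hypothetical, and \var{fd} is assumed to be in
blocking mode, so every call advances \ex{start} by at least one byte.
With non-blocking i/o the loop could spin on zero-byte writes.
\begin{code}
(define (write-all str fd)
  (let loop ((start 0))
    (if (< start (string-length str))
        (loop (+ start
                 (write-string/partial str fd start))))))\end{code}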
\end{defundescx}
\begin{defundesc}{force-output} {[fd/port]}{\noreturn}
This procedure does nothing when applied to an integer file descriptor
or unbuffered port.
It flushes buffered output when applied to a buffered port,
and raises a write-error exception on error. Returns no value.
\end{defundesc}
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\section{File system}
Besides the following procedures, which allow access to the
computer's file system, scsh also provides a set of procedures
which manipulate file \emph{names}. These string-processing
procedures are documented in section \ref{sec:filenames}.
\defun {create-directory} {fname [perms override?]} {\undefined}
\defunx{create-fifo} {fname [perms override?]} {\undefined}
\defunx{create-hard-link} {oldname newname [override?]} {\undefined}
\begin{defundescx}
{create-symlink} {old-name new-name [override?]} {\undefined}
These procedures create objects of various kinds in the file system.
The \var{override?} argument controls the action if there is already an
object in the file system with the new name:
\begin{optiontable}
\sharpf & signal an error (default) \\
'query & prompt the user \\
\textnormal{\emph{other}}& \parbox[t]{0.7\linewidth}{
delete the old object (with \ex{delete-file}
or \ex{delete-directory}, as appropriate) before
creating the new object.}
\end{optiontable}
\var{Perms} defaults to \cd{#o777} (but is masked by the current umask).
\remark{Currently, if you try to create a hard or symbolic link from a
file to itself, you will error out with \var{override?} false, and simply
delete your file with \var{override?} true. Catching this will require
some sort of true-name procedure, which I currently do not have.}
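For illustration, assuming a directory name \ex{"build"} and a link name
\ex{"latest"} invented for this example, the \var{override?} argument
might be used as follows:
\begin{code}
;; Raises an error exception if "build" already exists.
(create-directory "build" #o755)
;; Deletes any existing "latest" before making the link.
(create-symlink "build" "latest" #t)\end{code}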
\end{defundescx}
\defun {delete-directory} {fname} \undefined
\defunx{delete-file} {fname} \undefined
\begin{defundescx} {delete-filesys-object} {fname} \undefined
These procedures delete objects from the file system.
The {\ttt delete\=filesys\=object} procedure will delete an object
of any type from the file system: files, (empty) directories, symlinks, fifos,
\etc.
\end{defundescx}
\begin{defundescx}{read-symlink}{fname} \str
Return the filename referenced by symbolic link \var{fname}.
\end{defundescx}
\begin{defundescx} {rename-file} {old-fname new-fname [override?]} \undefined
If you override an existing object, then \var{old-fname}
and \var{new-fname} must type-match---either both directories,
or both non-directories.
This is required by the semantics of {\Unix} \ex{rename()}.
\remark{
There is an unfortunate atomicity problem with the \ex{rename-file}
procedure: if you
specify no-override, but a file \var{new-fname} is created sometime between
\ex{rename-file}'s existence check and the actual rename operation,
that file will be clobbered with \var{old-fname}. There is no way to fix
this problem, given the semantics of {\Unix} \ex{rename()};
at least it is highly unlikely to occur in practice.
}
\end{defundescx}
\defun {set-file-mode} {fname/fd/port mode} \undefined
\defunx{set-file-owner} {fname/fd/port uid} {\undefined}
\defunx{set-file-group} {fname/fd/port gid} {\undefined}
\begin{desc}
These procedures set the permission bits, owner id, and group id of a
file, respectively.
The file can be specified by giving the file name, or either an
integer file descriptor or a port open on the file.
Setting file user or group ownership usually requires root privileges.
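A minimal sketch of combining \ex{set-file-mode} with the \ex{file-mode}
procedure described below, in order to add the execute bits to a file's
existing permissions. The name \ex{make-executable} is hypothetical.
\begin{code}
;; Hypothetical helper: turn on the user, group and other
;; execute bits of FNAME, leaving its other mode bits alone.
(define (make-executable fname)
  (set-file-mode fname
                 (bitwise-ior (file-mode fname) #o111)))\end{code}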
\end{desc}
\defun {sync-file} {fd/port} \undefined
\defunx{sync-file-system}{} \undefined
\begin{desc}
Calling \ex{sync-file}
causes {\Unix} to update the disk data structures for a given file.
If \var{fd/port} is a port, any buffered data it may have is first
flushed.
Calling \ex{sync-file-system} synchronises the kernel's entire file
system with the disk.
These procedures are not {\Posix}.
Interestingly enough, \ex{sync\=file\=system} doesn't actually
do what it is claimed to do. We just threw it in for humor value.
See the \ex{sync(2)} man page for {\Unix} enlightenment.
\end{desc}
\begin{defundesc} {truncate-file} {fname/fd/port len} \undefined
The specified file is truncated to \var{len} bytes in length.
\end{defundesc}
\begin{defundesc}{file-attributes} {fname/fd/port [chase?]} {file-info-record}
The \ex{file-attributes} procedure
returns a record structure containing everything
there is to know about a file. If the \var{chase?} flag is true
(the default), then the procedure chases symlinks and reports on
the files to which they refer. If \var{chase?} is false, then
the procedure checks the actual file itself, even if it's a symlink.
The \var{chase?} flag is ignored if the file argument is a file descriptor
or port.
The value returned is a \emph{file-info record}, defined to have the
following structure:
\begin{code}
(define-record file-info
type ; \{block-special, char-special, directory,
; fifo, regular, socket, symlink\}
device ; Device file resides on.
inode ; File's inode.
mode ; File's mode bits: permissions, setuid, setgid
nlinks ; Number of hard links to this file.
uid ; Owner of file.
gid ; File's group id.
size ; Size of file, in bytes.
atime ; Last access time.
mtime ; Last modification time.
ctime) ; Last status-change time.\end{code}
\index{file-info}
\index{file-info:type}\index{file-info:device}\index{file-info:inode}%
\index{file-info:mode}\index{file-info:nlinks}\index{file-info:uid}%
\index{file-info:gid}\index{file-info:size}\index{file-info:atime}%
\index{file-info:mtime}\index{file-info:ctime}%
%
The uid field of a file-info record is accessed with the procedure
\codex{(file-info:uid x)}
and similarly for the other fields.
The \ex{type} field is a symbol; all other fields are integers.
A file-info record is discriminated with the \ex{file-info?} predicate.
The following procedures all return selected information about
a file; they are built on top of \ex{file-attributes}, and are
called with the same arguments that are passed to it.
\begin{inset}
\newcommand{\Ex}[1]{\ex{#1}\index{\tt{#1}}}
\begin{tabular}{ll}
Procedure & returns \\\hline
\Ex{file-type} & type \\
\Ex{file-inode} & inode \\
\Ex{file-mode} & mode \\
\Ex{file-nlinks} & nlinks \\
\Ex{file-owner} & uid \\
\Ex{file-group} & gid \\
\Ex{file-size} & size \\
\Ex{file-last-access} & atime \\
\Ex{file-last-mod} & mtime \\
\Ex{file-last-status-change} & ctime
\end{tabular}
\end{inset}
%
Example:
\begin{code}
;; All my files in /usr/tmp. The names returned by
;; directory-files are relative, so we cd there first.
(with-cwd "/usr/tmp"
  (filter (\l{f} (= (file-owner f) (user-uid)))
          (directory-files)))\end{code}
\end{defundesc}
\defun {file-directory?}{fname/fd/port [chase?]}{\boolean}
\defunx {file-fifo?}{fname/fd/port [chase?]}{\boolean}
\defunx {file-regular?}{fname/fd/port [chase?]}{\boolean}
\defunx {file-socket?}{fname/fd/port [chase?]}{\boolean}
\defunx {file-special?}{fname/fd/port [chase?]}{\boolean}
\defunx {file-symlink?}{fname/fd/port}{\boolean}
\begin{desc}
These procedures are file-type predicates that test the
type of a given file.
They are applied to the same arguments to which \ex{file-attributes} is applied;
the sole exception is \ex{file-symlink?}, which does not take
the optional \var{chase?} second argument.
For example,
\codex{(file-directory? "/usr/dalbertz")\qquad\evalto\qquad\sharpt}
\end{desc}
\defun {file-not-readable?} {fname} \boolean
\defunx{file-not-writeable?} {fname} \boolean
\defunx{file-not-executable?} {fname} \boolean
\begin{desc}
Returns:
\begin{optiontable}
\textnormal{Value} & meaning \\ \hline
\sharpf & Access permitted \\
'search-denied & {\renewcommand{\arraystretch}{1}%
\begin{tabular}[t]{@{}l@{}}
Can't stat---a protected directory \\
is blocking access.\end{tabular}} \\
'permission & Permission denied. \\
'no-directory & Some directory doesn't exist. \\
'nonexistent & File doesn't exist.
\end{optiontable}
%
A file is considered writeable if either (1) it exists and is writeable
or (2) it doesn't exist and the directory is writeable.
Since symlink permission bits are ignored by the filesystem, these
calls do not take a \var{chase?} flag.
\oops{\ex{file-not-writeable?} does not currently do the directory
check.}
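Because the result is false when access is permitted and a descriptive
symbol otherwise, it can be passed straight into an error report. The
following is only a sketch; the name \ex{check-readable} is hypothetical,
and it assumes the usual {\scm} \ex{error} procedure taking a message
followed by irritants.
\begin{code}
;; Hypothetical helper: return FNAME if it is readable,
;; otherwise signal an error that includes the reason symbol.
(define (check-readable fname)
  (let ((why (file-not-readable? fname)))
    (if why
        (error "Cannot read file" fname why)
        fname)))\end{code}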
\end{desc}
\defun {file-readable?} {fname} \boolean
\defunx {file-writable?} {fname} \boolean
\defunx {file-executable?} {fname} \boolean
\begin{desc}
These procedures are the logical negation of the
preceding \ex{file-not-\ldots?} procedures.
\end{desc}
\begin{defundesc}{file-not-exists?} {fname [chase?]} \object
Returns:
\begin{optiontable}
\sharpf & Exists. \\
\sharpt & Doesn't exist. \\
'search-denied & \parbox[t]{0.5\linewidth}{\sloppy\raggedright
Some protected directory
is blocking the search.}
\end{optiontable}
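Since the procedure has three possible results, a caller that cares about
the difference between a missing file and an unsearchable directory might
dispatch on the value as sketched here. The name \ex{existence} and the
symbols it returns are invented for this example.
\begin{code}
;; Hypothetical helper: classify FNAME as exists, missing,
;; or unknown (a protected directory blocked the search).
(define (existence fname)
  (case (file-not-exists? fname)
    ((#f) 'exists)
    ((#t) 'missing)
    (else 'unknown)))\end{code}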
\end{defundesc}
\begin{defundesc}{file-exists?} {fname [chase?]} \boolean
This is simply
\ex{(not (file-not-exists? \var{fname} \var{[chase?]}))}
\end{defundesc}
\defun {directory-files} {[dir dotfiles?]} {string list}
\begin{desc}
Return the list of files in directory \var{dir},
which defaults to the current working directory.
If the \var{dotfiles?} flag is true (it defaults to {\sharpf}), dot files
are included in the list.
Regardless of the value of \var{dotfiles?}, the two files \ex{.} and
\ex{..} are \emph{never} returned.
The directory \var{dir} is not prepended to each file name in the
result list. That is,
\codex{(directory-files "/etc")}
returns
\codex{("chown" "exports" "fstab" \ldots)}
\emph{not}
\codex{("/etc/chown" "/etc/exports" "/etc/fstab" \ldots)}
To use the files in the returned list, the programmer can either manually
prepend the directory:
\codex{(map (\l{f} (string-append dir "/" f)) files)}
or cd to the directory before using the file names:
%
\begin{code}
(with-cwd dir
(for-each delete-file (directory-files)))\end{code}
%
or use the \ex{glob} procedure, defined below.
A directory list can be generated by \ex{(run/strings (ls))}, but this
is unreliable, as filenames with whitespace in their names will be
split into separate entries. Using \ex{directory-files} is reliable.
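As a small sketch of the cd-first approach, the following totals the sizes
of the regular files found directly in a directory, dot files included.
The name \ex{directory-size} is hypothetical; \ex{file-size} and
\ex{file-regular?} are the procedures described earlier in this section.
\begin{code}
;; Hypothetical helper: total size, in bytes, of the regular
;; files in DIR. We cd to DIR so the relative names resolve.
(define (directory-size dir)
  (with-cwd dir
    (apply + (map file-size
                  (filter file-regular?
                          (directory-files "." #t))))))\end{code}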
\end{desc}
\defun {glob} {\vari{pat}1 \ldots} {string list}
\begin{desc}
Glob each pattern against the filesystem and return the sorted list.
Duplicates are not removed. Patterns matching nothing are not included
literally.\footnote{Why bother to mention such a silly possibility?
Because that is what sh does.}
C shell \verb|{a,b,c}| patterns are expanded. Backslash quotes
characters, turning off the special meaning of
\verb|{|, \verb|}|, \cd{*}, \verb|[|, \verb|]|, and \verb|?|.
Note that the rules of backslash for {\Scheme} strings and glob patterns
work together to require four backslashes in a row to specify a
single literal backslash. Fortunately, this should be a rare
occurrence.
A glob subpattern will not match against dot files unless the first
character of the subpattern is a literal ``\ex{.}''.
Further, a dot subpattern will not match the files \ex{.} or \ex{..}
unless it is a constant pattern, as in \ex{(glob "../*/*.c")}.
So a directory's dot files can be reliably generated
with the simple glob pattern \ex{".*"}.
Some examples:
\begin{inset}
\begin{verbatim}
(glob "*.c" "*.h")
;; All the C and #include files in my directory.
(glob "*.c" "*/*.c")
;; All the C files in this directory and
;; its immediate subdirectories.
(glob "lexer/*.c" "parser/*.c")
(glob "{lexer,parser}/*.c")
;; All the C files in the lexer and parser dirs.
(glob "\\{lexer,parser\\}/*.c")
;; All the C files in the strange
;; directory "{lexer,parser}".
(glob "*\\*")
;; All the files ending in "*", e.g.
;; ("foo*" "bar*")
(glob "*lexer*")
("mylexer.c" "lexer1.notes")
;; All files containing the string "lexer".
(glob "lexer")
;; Either ("lexer") or ().\end{verbatim}
\end{inset}
%
If the first character of the pattern (after expanding braces) is a slash,
the search begins at root; otherwise, the search begins in the current
working directory.
If the last character of the pattern (after expanding braces) is a slash,
then the result matches must be directories, \eg,
\begin{code}
(glob "/usr/man/man?/") \evalto
("/usr/man/man1/" "/usr/man/man2/" \ldots)\end{code}
Globbing can sometimes be useful when we need a list of a directory's files
where each element in the list includes the pathname for the file.
Compare:
\begin{code}
(directory-files "../include") \evalto
("cig.h" "decls.h" \ldots)
(glob "../include/*") \evalto
("../include/cig.h" "../include/decls.h" \ldots)\end{code}
\end{desc}
\defun{glob-quote}{str}\str
\begin{desc}
Returns a constant glob pattern that exactly matches \var{str}.
All wild-card characters in \var{str} are quoted with a backslash.
\begin{code}
(glob-quote "Any *.c files?")
{\evalto}"Any \\*.c files\\?"\end{code}
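One plausible use is to build a pattern from a directory name that is only
known at run time and may itself contain wild-card characters. This is a
sketch; the name \ex{c-files} is hypothetical.
\begin{code}
;; Hypothetical helper: the C files in directory DIR, where
;; DIR is matched literally even if it contains * or ?.
(define (c-files dir)
  (glob (string-append (glob-quote dir) "/*.c")))\end{code}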
\end{desc}
\begin{defundesc}{file-match}{root dot-files? \vari{pat}1 \vari{pat}2 {\ldots} \vari{pat}n}{string list}
\ex{file-match} provides a more powerful file-matching service, at the
expense of a less convenient notation. It is intermediate in
power between most shell matching machinery and recursive \ex{find(1)}.
Each pattern is a regexp. The procedure searches from \var{root},
matching the first-level files against pattern \vari{pat}1, the
second-level files against \vari{pat}2, and so forth.
The list of files matching the whole path pattern is returned,
in sorted order.
The matcher uses Spencer's regular expression package.
The files \ex{.} and \ex{..} are never matched. Other dot files are only
matched if the \var{dot-files?} argument is \sharpt.
A given \vari{pat}i pattern is matched as a regexp, so it is not forced
to match the entire file name. \Eg, pattern \ex{"t"} matches any
file containing a ``t'' in its name, while pattern \verb|"^t$"| matches
only a file whose entire name is ``\ex{t}''.
The \vari{pat}i patterns can be more general than stated above.
\begin{itemize}
\item A single pattern can specify multiple levels of the path by
embedding \ex{/} characters within the pattern. For example,
the pattern \ex{"a/b/c"} gives a match equivalent to the
list of patterns \ex{"a" "b" "c"}.
\item A \vari{pat}i pattern can be a procedure,
which is used as a match predicate.
It will be repeatedly called with a candidate file-name to test.
The file-name will be the entire path accumulated.
\end{itemize}
Some examples:
%% UGH. Because we are using code instead of verbatim, we have to
%% double up on backslashes.
\begin{tightleftinset}
\begin{code}
(file-match "/usr/lib" #f "m$" "^tab") \evalto
("/usr/lib/term/tab300" "/usr/lib/term/tab300-12" \ldots)
\cb
(file-match "." #f "^lex|parse|codegen$" "\\\\.c$") \evalto
("lex/lex.c" "lex/lexinit.c" "lex/test.c"
"parse/actions.c" "parse/error.c" "parse/test.c"
"codegen/io.c" "codegen/walk.c")
\cb
(file-match "." #f "^lex|parse|codegen$/\\\\.c$")
;; The same.
\cb
(file-match "." #f file-directory?)
;; Return all subdirs of the current directory.
\cb
(file-match "/" #f file-directory?) \evalto
("/bin" "/dev" "/etc" "/tmp" "/usr")
;; All subdirs of root.
\cb
(file-match "." #f "\\\\.c")
;; All the C files in my directory.
\cb
(define (ext extension)
(\l{fn} (string-suffix? fn extension)))
\cb
(define (true . x) #t)
\cb
(file-match "." #f "./\\\\.c")
(file-match "." #f "" "\\\\.c")
(file-match "." #f true "\\\\.c")
(file-match "." #f true (ext "c"))
;; All the C files of all my immediate subdirs.
\cb
(file-match "." #f "lexer") \evalto
("mylexer.c" "lexer.notes")
;; Compare with (glob "lexer"), above.\end{code}
\end{tightleftinset}
Note that when \var{root} is the current working directory (\ex{"."}),
when it is converted to directory form, it becomes \ex{""}, and doesn't
show up in the result file-names.
It is regrettable that the regexp wild card char, ``\ex{.}'',
is such an important file name literal, as dot-file prefix and extension
delimiter.
\end{defundesc}
\begin{defundesc} {create-temp-file} {[prefix]} \str
\ex{Create-temp-file} creates a new temporary file and returns its name.
The optional argument specifies the filename prefix to use, and defaults
to \ex{"/usr/tmp/\var{pid}"}, where \var{pid} is the current process' id.
The procedure generates a sequence of filenames that have \var{prefix} as
a common prefix, looking for a filename that doesn't already exist in the
file system. When it finds one, it creates it, with permission \cd{#o600}
and returns the filename. (The permissions can be relaxed
with \ex{set-file-mode} after the file has been created.)
This file is guaranteed to be brand new. No other process will have it
open. This procedure does not simply return a filename that is very
likely to be unused. It returns a filename that definitely did not exist
at the moment \ex{create-temp-file} created it.
It is not necessary for the process' pid to be a part of the filename
for the uniqueness guarantees to hold. The pid component of the default
prefix simply serves to scatter the name searches into sparse regions, so
that collisions are less likely to occur. This speeds things up, but does
not affect correctness.
Security note: doing i/o to files created this way in \ex{/usr/tmp/} is
not necessarily secure. General users have write access to \ex{/usr/tmp/},
so even if an attacker cannot access the new temp file, he can delete it
and replace it with one of his own. A subsequent open of this filename
will then give you his file, to which he has access rights. There are
several ways to defeat this attack:
\begin{enumerate}
\item Use \ex{temp-file-iterate}, below, to return the file descriptor
allocated when the file is opened. This will work if the file
only needs to be opened once.
\item If the file needs to be opened twice or more, create it in a
protected directory, \eg, \verb|$HOME|.
\item Ensure that \ex{/usr/tmp} has its sticky bit set. This
requires system administrator privileges.
\end{enumerate}
The actual default prefix used is controlled by the dynamic variable
\ex{*temp-file-template*}, and can be overridden for increased security.
See \ex{temp-file-iterate}.
\end{defundesc}
\defunx {temp-file-iterate} {maker [template]} {\object\+}
\defvarx {*temp-file-template*} \str
\begin{desc}
This procedure can be used to perform certain atomic transactions on
the file system involving filenames. Some examples:
\begin{itemize}
\item Linking a file to a fresh backup temp name.
\item Creating and opening an unused, secure temp file.
\item Creating an unused temporary directory.
\end{itemize}
This procedure uses \var{template} to generate a series of trial file
names.
\var{Template} is a \ex{format} control string, and defaults to
\codex{"/usr/tmp/\var{pid}.\~a"}
where \var{pid} is the current process' process id.
File names are generated by calling \ex{format} to instantiate the
template's \verb|~a| field with a varying string.
\var{Maker} is a procedure which is serially called on each file name
generated. It must return at least one value; it may return multiple
values. If the first return value is {\sharpf} or if \var{maker} raises the
\ex{errno/exist} errno exception, \ex{temp-file-iterate} will loop,
generating a new file name and calling \var{maker} again. If the first
return value is true, the loop is terminated, returning whatever value(s)
\var{maker} returned.
After a number of unsuccessful trials, \ex{temp-file-iterate} may give up
and signal an error.
Thus, if we ignore its optional \var{prefix} argument,
\ex{create-temp-file} could be defined as:
\begin{code}
(define (create-temp-file)
(let ((flags (bitwise-ior open/create open/exclusive)))
(temp-file-iterate
(\l{f}
(close (open-output-file f flags #o600))
f))))\end{code}
To rename a file to a temporary name:
\begin{code}
(temp-file-iterate (\l{backup}
(create-hard-link old-file backup)
backup)
".#temp.\~a") ; Keep link in cwd.
(delete-file old-file)\end{code}
Recall that scsh reports syscall failure by raising an error
exception, not by returning an error code. This is critical
to this example---the programmer can assume that if the
\ex{temp-file-iterate} call returns, it returns successfully.
So the following \ex{delete-file} call can be reliably invoked,
safe in the knowledge that the backup link has definitely been established.
To create a unique temporary directory:
\begin{code}
(temp-file-iterate (\l{dir} (create-directory dir) dir)
"/usr/tmp/tempdir.\~a")\end{code}
%
Similar operations can be used to generate unique symlinks and fifos,
or to return values other than the new filename (\eg, an open file
descriptor or port).
The default template is in fact taken from the value of the dynamic
variable \ex{*temp-file-template*}, which itself defaults to
\ex{"/usr/tmp/\var{pid}.\~a"}, where \var{pid} is the scsh process'
pid.
For increased security, a user may wish to change the template
to use a directory not allowing world write access
(\eg, his home directory).
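As a sketch of the ``creating and opening an unused, secure temp file''
case listed above, \var{maker} can return both the file name and the port
opened on it, so the file never has to be reopened by name. The text
written to the port is a placeholder invented for this example.
\begin{code}
;; MAKER returns two values: the name (true, so the loop
;; stops) and an output port open on the brand-new file.
(receive (fname port)
         (temp-file-iterate
           (\l{f}
             (values f
                     (open-output-file
                       f
                       (bitwise-ior open/create open/exclusive)
                       #o600))))
  (write-string "scratch data" port)
  (close port)
  fname)\end{code}
If the exclusive open fails because the name is already taken, the
resulting \ex{errno/exist} exception causes \ex{temp-file-iterate} to try
the next name.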
\end{desc}
\defun{temp-file-channel}{} {[inp outp]}
\begin{desc}
This procedure can be used to provide an interprocess communications
channel with arbitrary-sized buffering. It returns two values, an input
port and an output port, both open on a new temp file. The temp file
itself is deleted from the {\Unix} file tree before \ex{temp-file-channel}
returns, so the file is essentially unnamed, and its disk storage is
reclaimed as soon as the two ports are closed.
\ex{Temp-file-channel} is analogous to \ex{port-pipe} with two exceptions:
\begin{itemize}
\item If the writer process gets ahead of the reader process, it will
not hang waiting for some small pipe buffer to drain. It will simply
buffer the data on disk. This is good.
\item If the reader process gets ahead of the writer process, it will
also not hang waiting for data from the writer process. It will
simply see and report an end of file. This is bad.
In order to ensure that an end-of-file returned to the reader is
legitimate, the reader and writer must serialise their i/o. The
simplest way to do this is for the reader to delay doing input
until the writer has completely finished doing output, or exited.
\end{itemize}
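The serialisation requirement can be sketched within a single process:
the writer finishes and closes its port before the reader begins. The
datum written is a placeholder for this example; \ex{write} and \ex{read}
are the standard {\Scheme} procedures.
\begin{code}
(receive (inp outp) (temp-file-channel)
  (write '(answer 42) outp)   ; Writer runs to completion,
  (close outp)                ; and closes its port.
  (let ((datum (read inp)))   ; Only then does the reader start.
    (close inp)
    datum))\end{code}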
\end{desc}
\section{Processes}
\defun {exec} {prog arg1 \ldots argn} \noreturn
\defunx {exec-path} {prog arg1 \ldots argn} \noreturn
\defunx {exec/env} {prog env arg1 \ldots argn} \noreturn
\defunx {exec-path/env} {prog env arg1 \ldots argn} \noreturn
\begin{desc}
The \ex{\ldots/env} variants take an environment specified as a
string$\rightarrow$string alist.
An environment of {\sharpt} is taken to mean the current process' environment
(\ie, the value of the external char \ex{**environ}).
[Rationale: {\sharpf} is a more convenient marker for the current environment
than {\sharpt}, but would cause an ambiguity on Schemes that identify
{\sharpf} and \ex{()}.]
The path-searching variants search the directories in the list
{\ttt exec\=path\=list} for the program.
A path-search is not performed if the program name contains
a slash character---it is used directly. So a program with a name like
\ex{"bin/prog"} always executes the program \ex{bin/prog} in the current working
directory. See \verb|$path| and \verb|exec-path-list|, below.
Note that there is no analog to the C function \ex{execv()}.
To get the effect just do
\codex{(apply exec prog arglist)}
All of these procedures flush buffered output and close unrevealed ports
before executing the new binary.
To avoid flushing buffered output, see \verb|%exec| below.
Note that the C \ex{execve()} procedure allows the zeroth element of the
argument vector to be different from the file being executed, \eg
%
\begin{inset}
\begin{verbatim}
char *argv[] = {"-", "-f", 0};
execve("/bin/csh", argv, envp);\end{verbatim}
\end{inset}
%
The scsh \ex{exec}, \ex{exec-path}, \ex{exec/env}, and \ex{exec-path/env}
procedures do not give this functionality---element 0 of the arg vector is
always identical to the \ex{prog} argument. In the rare case the user wishes
to differentiate these two items, he can use the low-level \verb|%exec| and
\verb|exec-path-search| procedures.
These procedures never return under any circumstances.
As with any other system call, if there is an error, they raise
an exception.
\end{desc}
\defun {\%exec} {prog arglist env} \undefined
\defunx {exec-path-search} {fname pathlist} \str
\begin{desc}
\var{Arglist} is a list of arguments;
\var{env} is either a string$\rightarrow$string alist or {\sharpt}.
The new program's \cd{argv[0]} will be taken from \ex{(car \var{arglist})},
\emph{not} from \var{prog}.
An environment of {\sharpt} means the current process' environment.
\verb|%exec| does not flush buffered output
(see \ex{flush-all-ports}).
\ex{exec-path-search} searches the directories of \var{pathlist} looking for
an occurrence of file \var{fname}. If no executable file is found, it returns
{\sharpf}. If \var{fname} contains a slash character, the path search is
short-circuited, but the procedure still checks to ensure that the file exists
and is executable---if not, it still returns {\sharpf}.
See \cd{$path} and \ex{exec-path-list}, below.
All exec procedures, including \verb|%exec|, coerce the \cd{prog} and \cd{arg}
values to strings using the usual conversion rules: numbers are converted to
decimal numerals, and symbols converted to their print-names.
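As a sketch, the two low-level procedures might be combined to start a
program whose \cd{argv[0]} differs from its file name. The program name
and arguments are only an illustration, and the sketch assumes
\ex{exec-path-list} is bound to the list of directories to search.
\begin{code}
;; Run csh with argv[0] set to "-" rather than the file name.
(let ((csh (exec-path-search "csh" exec-path-list)))
  (if csh
      (\%exec csh (list "-" "-f") #t)
      (error "No csh found on the search path")))\end{code}
Since the low-level call does not flush buffered output, a careful caller
would probably call \ex{flush-all-ports} first.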
\end{desc}
\defun {exit} {[status]} \noreturn
\defunx {\%exit} {[status]} \noreturn
\begin{desc}
These procedures terminate the current process with a given exit status.
The default exit status is 0.
The low-level \verb|%exit| procedure immediately terminates the process
without flushing buffered output.
\end{desc}
\begin{defundesc}{suspend}{} \undefined
Suspend the current process with a SIGSTOP signal.
\end{defundesc}
\defun {fork} {[thunk]} {pid or \sharpf}
\defunx {\%fork} {[thunk]} {pid or \sharpf}
\begin{desc}
\ex{fork} with no arguments is like C \ex{fork()}.
In the parent process, it returns
the child's pid. In the child process, it returns {\sharpf}.
\ex{fork} with an argument only returns in the parent process, returning
the child pid. The child process calls \var{thunk} and then exits.
\ex{fork} flushes buffered output before forking, and sets the child
process to non-interactive. \verb|%fork| does not perform this bookkeeping;
it simply forks.
\end{desc}
\defun {fork/pipe} {[thunk]} {pid or \sharpf}
\defunx{\%fork/pipe} {[thunk]} {pid or \sharpf}
\begin{desc}
Like \ex{fork} and \ex{\%fork}, but the parent and child communicate via a
pipe connecting the parent's stdin to the child's stdout. These procedures
side-effect the parent by changing his stdin.
In effect, \ex{fork/pipe} splices a process into the data stream
immediately upstream of the current process.
This is the basic function for creating pipelines.
Long pipelines are built by performing a sequence of \ex{fork/pipe} calls.
For example, to create a background two-process pipe \ex{a | b}, we write:
%
\begin{code}
(fork (\l{} (fork/pipe a) (b)))\end{code}
%
which returns the pid of \ex{b}'s process.
To create a background three-process pipe \ex{a | b | c}, we write:
%
\begin{code}
(fork (\l{} (fork/pipe a)
(fork/pipe b)
(c)))\end{code}
%
which returns the pid of \ex{c}'s process.
\end{desc}
\defun {fork/pipe+} {conns [thunk]} {pid or \sharpf}
\defunx {\%fork/pipe+} {conns [thunk]} {pid or \sharpf}
\begin{desc}
Like \ex{fork/pipe}, but the pipe connections between the child and parent
are specified by the connection list \var{conns}.
See the
\codex{(|+ \var{conns} \vari{pf}{\!1} \ldots{} \vari{pf}{\!n})}
process form for a description of connection lists.
\end{desc}
\begin{defundesc} {wait} {[pid]} {status [pid]}
Simply calling \ex{(wait)} will wait for any child to die, then
return the child's exit status and pid as multiple values.
With an argument, \ex{(wait \var{pid})} waits for that specific process,
then returns its exit status as a single value.
If a candidate child has already exited but not yet been waited for,
\ex{wait} returns immediately.
\remark{Describe the way that wait reaps defunct processes into
the internal table. Document all the architected wait machinery.}
\end{defundesc}
When a child process dies, its parent can call the \ex{wait} procedure
to recover the exit status of the child.
The exit status is a small integer that encodes information
describing how the child terminated.
The bit-level format of the exit status is not defined by {\Posix}
(you must use the following three functions to decode one).
However, if a child terminates normally with exit code 0,
{\Posix} does require \ex{wait} to return an exit status that is exactly
zero.
So \ex{(zero? \var{status})} is a correct way to test for non-error,
normal termination.
\defun {status:exit-val}{status}{{\integer} or \sharpf}
\defunx{status:stop-sig}{status}{{\integer} or \sharpf}
\defunx{status:term-sig}{status}{{\integer} or \sharpf}
\begin{desc}
For a given status value produced by calling \ex{wait},
exactly one of these routines will return a true value.
If the child process exited normally, \ex{status:exit-val} returns the
exit code for the child process (\ie, the value the child passed to \ex{exit}
or returned from \ex{main}). Otherwise, this function returns false.
If the child process was suspended by a signal, \ex{status:stop-sig}
returns the signal that suspended the child.
Otherwise, this function returns false.
If the child process terminated abnormally, \ex{status:term-sig}
returns the signal that terminated the child.
Otherwise, this function returns false.
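A sketch of how these decoders might be used together with \ex{fork},
\ex{exec-path} and \ex{wait}. The name \ex{run-and-report} and the symbols
in its result are invented for this example.
\begin{code}
;; Hypothetical helper: run PROG with ARGS in a child process,
;; wait for it, and describe how it finished.
(define (run-and-report prog . args)
  (let* ((child  (fork (\l{} (apply exec-path prog args))))
         (status (wait child))
         (code   (status:exit-val status))
         (sig    (status:term-sig status)))
    (cond (code (list 'exited code))
          (sig  (list 'killed sig))
          (else (list 'stopped (status:stop-sig status))))))\end{code}
Note that an exit code of 0 is still a true value in {\Scheme}, so the
first \ex{cond} clause is taken for a successful child as well.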
\end{desc}
\begin{defundesc} {call-terminally} {thunk} \noreturn
\ex{call-terminally} calls its thunk. When the thunk returns, the process
exits. Although \ex{call-terminally} could be implemented as
\codex{(\l{thunk} (thunk) (exit 0))}
an implementation can take advantage of the fact that this procedure never
returns. For example, the runtime can start with a fresh stack and also
start with a fresh dynamic environment, where shadowed bindings are
discarded. This can allow the old stack and dynamic environment to be
collected (assuming this data is not reachable through some live
continuation).
\end{defundesc}
%% Dereleased until we have a more portable implementation.
%\defun{halts?}{proc}\boolean
%\begin{desc}
%This procedure, ported from early T implementations,
%returns true iff \ex{(\var{proc})} returns at all.
%\remark{The current implementation is a constant function returning {\sharpt},
% which suffices for all {\Unix} implementations of which we are aware.}
%\end{desc}
\section{Process state}
\defun {umask}{} \fixnum
\defunx {set-umask} {perms} \undefined
\defunx {with-umask*} {perms thunk} {values of thunk}
\dfnx {with-umask} {perms . body} {values of body} {syntax}
\begin{desc}
The process' current umask is retrieved with \ex{umask}, and set with
\ex{(set-umask \var{perms})}. Calling \ex{with-umask*} changes the umask
to \var{perms} for the duration of the call to \var{thunk}. If the
program throws out of \var{thunk} by invoking a continuation, the umask is
reset to its external value. If the program throws back into \var{thunk}
by calling a stored continuation, the umask is restored to the \var{perms}
value. The special form \ex{with-umask} is equivalent in effect to
the procedure \ex{with-umask*}, but does not require the programmer
to explicitly wrap a \ex{(\l{} \ldots)} around the body of the code
to be executed.
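For illustration, the following two forms have the same effect, apart from
the directory names, which are invented for this example. With a umask of
\cd{#o077}, the default creation permission of \cd{#o777} is masked down
to \cd{#o700}.
\begin{code}
(with-umask #o077
  (create-directory "private-a"))
(with-umask* #o077
  (\l{} (create-directory "private-b")))\end{code}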
\end{desc}
\defun {chdir} {[fname]} \undefined
\defunx {cwd}{} \str
\defunx {with-cwd*} {fname thunk} {value(s) of thunk}
\dfnx {with-cwd} {fname . body} {value(s) of body} {syntax}
\begin{desc}
These forms manipulate the current working directory.
The cwd can be changed with \ex{chdir}
(although in most cases, \ex{with-cwd} is preferable).
If \ex{chdir} is called with no arguments, it changes the cwd to
the user's home directory.
The \ex{with-cwd*} procedure calls \var{thunk} with the cwd temporarily
set to \var{fname}; when \var{thunk} returns, or is exited in a non-local
fashion (\eg, by raising an exception or by invoking a continuation),
the cwd is returned to its original value.
The special form \ex{with-cwd} is simply syntactic sugar for \ex{with-cwd*}.
\end{desc}
\defun {pid}{} \fixnum
\defunx {parent-pid}{} \fixnum
\defunx {process-group} {[pid]} \fixnum
\defunx {set-process-group} {[pid] pgrp} \undefined % [not implemented]
\begin{desc}
\ex{(pid)} and \ex{(parent-pid)} retrieve the process id for the
current process and its parent.
If the OS supports process groups, a process' process group can be
retrieved and set with \ex{process-group} and \ex{set-process-group}.
The affected process for these two procedures defaults to the current
process.
\end{desc}
\defun {set-priority} {which who priority} \undefined %; priority stuff unimplemented
\defunx {priority} {which who} \fixnum % ; not implemented
\defunx {nice} {[pid delta]} \undefined %; not implemented
\begin{desc}
These procedures are intended to set and access the priority of processes.
None of them is implemented yet, and the arguments of \ex{set-priority}
and \ex{priority} are not documented for this release.
\end{desc}
\defun {user-login-name}{} \str
\defunx {user-uid}{} \fixnum
\defunx {user-effective-uid}{} \fixnum
\defunx {user-gid}{} \fixnum
\defunx {user-effective-gid}{} \fixnum
\defunx {user-supplementary-gids}{} {{\fixnum} list}
\defunx {set-uid} {uid} \undefined
\defunx {set-gid} {gid} \undefined
\begin{desc}
These routines get and set the effective and real user and group ids.
The \ex{set-uid} and \ex{set-gid} routines correspond to the {\Posix}
\ex{setuid()} and \ex{setgid()} procedures.
\end{desc}
\defun {process-times} {} {[{\fixnum} {\fixnum} {\fixnum} \fixnum]}
\begin{desc}
Returns four values:
\begin{tightinset}
\begin{tabular}{l}
user CPU time in clock-ticks \\
system CPU time in clock-ticks \\
user CPU time of all descendant processes \\
system CPU time of all descendant processes
\end{tabular}
\end{tightinset}
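Since four values are returned, the caller needs a multiple-value binding
form. The sketch below assumes the \ex{receive} form is available for this
purpose; the procedure name \ex{report-cpu-ticks} is made up for the
example:
\begin{code}
(define (report-cpu-ticks)
  (receive (user sys child-user child-sys) (process-times)
    (display "ticks used by this process: ")
    (display (+ user sys))
    (newline)
    (display "ticks used by descendants: ")
    (display (+ child-user child-sys))
    (newline)))\end{code}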
\end{desc}
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\section{User and group db access}
These procedures are used to access the user and group databases
(\eg, the ones traditionally stored in \ex{/etc/passwd} and \ex{/etc/group}).
\defun {user-info} {uid/name} {record}
\begin{desc}
Return a \ex{user-info} record giving the recorded information for a
particular user:
\index{user-info}
\index{user-info:name}
\index{user-info:uid}
\index{user-info:gid}
\index{user-info:home-dir}
\index{user-info:shell}
\begin{code}
(define-record user-info
  name uid gid home-dir shell)\end{code}
The \var{uid/name} argument is either an integer uid or a string user-name.
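The fields of the record are read with accessors of the form
\ex{user-info:name}, \ex{user-info:home-dir}, and so on.
A small sketch, which looks up the user running the current process so
that no particular account name has to be assumed:
\begin{code}
(let ((me (user-info (user-uid))))
  (display (user-info:name me))       ; login name
  (display " lives in ")
  (display (user-info:home-dir me))   ; home directory
  (newline))\end{code}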
\end{desc}
\defun {->uid} {uid/name} \fixnum
\defunx {->username} {uid/name} \str
\begin{desc}
These two procedures coerce integer uids and user names to a particular
form: \ex{->uid} always yields a uid, and \ex{->username} always yields
a user name.
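For example, on a system where the super-user has the conventional name
\ex{root} and uid 0 (an assumption about the local user database, not
something scsh guarantees):
\begin{code}
(->uid "root")        ; the uid for root: 0 on such a system
(->uid 0)             ; given a uid, the result is again 0
(->username 0)        ; "root" on such a system
(->username "root")   ; given a name, the result is again "root"\end{code}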
\end{desc}
\defun {group-info} {gid/name} {record}
\begin{desc}
Return a \ex{group-info} record giving the recorded information for a
particular group:
\index{group-info}
\index{group-info:name}
\index{group-info:gid}
\index{group-info:members}
\begin{code}
(define-record group-info
  name gid members)\end{code}
The \var{gid/name} argument is either an integer gid or a string group-name.
\end{desc}
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\section{Accessing command-line arguments}
\defvar {command-line-arguments}{{\str} list}
\defunx {command-line}{} {{\str} list}
\begin{desc}
The list of strings \ex{command-line-arguments} contains the arguments
passed to the scsh process on the command line.
Calling \ex{(command-line)} returns the complete \ex{argv}
string list, including the program name. So if we run a shell script
\codex{/usr/shivers/bin/myls -CF src}
then \ex{command-line-arguments} is
\codex{("-CF" "src")}
and \ex{(command-line)} returns
\codex{("/usr/shivers/bin/myls" "-CF" "src")}
\ex{command-line} returns a fresh list each time it is called.
In this way, the programmer can get a fresh copy of the original
argument list if \ex{command-line-arguments} has been modified or is lexically
shadowed.
\end{desc}
\defun {arg} {arglist n [default]} \str
\defunx {arg*} {arglist n [default-thunk]} \str
\defunx {argv} {n [default]} \str
\begin{desc}
These procedures are useful for accessing arguments from argument
lists.
\ex{arg} returns the $n^{\rm{th}}$ element of \var{arglist}.
The index is 1-based.
If \var{n} is too large, \var{default} is returned;
if no \var{default}, then an error is signaled.
\ex{arg*} is similar, except that the \var{default-thunk} is called to generate
the default value.
\ex{(argv \var{n})} is simply \ex{(arg (command-line) (+ \var{n} 1))}.
The +1 offset ensures that the two forms
%
\begin{code}
(arg command-line-arguments \var{n})
(argv \var{n})\end{code}
%
return the same argument
(assuming the user has not rebound or modified \ex{command-line-arguments}).
Example:
%
\begin{code}
(if (null? command-line-arguments)
    (& (xterm -n ,host -title ,host
              -name ,(string-append "xterm_" host)))
    (let* ((progname (file-name-nondirectory (argv 1)))
           (title (string-append host ":" progname)))
      (& (xterm -n ,title
                -title ,title
                -e ,@command-line-arguments))))\end{code}
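The indexing rules for \ex{arg} and \ex{arg*} can also be checked on a
literal list (the list and the default value are made up for
illustration):
\begin{code}
(arg '("-CF" "src") 1)                  {\evalto} "-CF"
(arg '("-CF" "src") 2)                  {\evalto} "src"
(arg '("-CF" "src") 3 "none")           {\evalto} "none"
(arg* '("-CF" "src") 3 (\l{} "none"))   {\evalto} "none"\end{code}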
%
A subtlety: there are two ways to invoke a scsh program.
One is as a simple binary,
the other is as an interpreted script via the {\Unix}
\ex{\#!} \ex{exec(2)} feature.
When scsh code is run as a simple binary, \ex{(command-line)} returns exactly
the command line.
However, when the scsh interpreter is invoked with a scsh script
specified on the command line, then the scsh startup code doctors the list
returned by \ex{(command-line)} to make the shell script itself be the program
(\ie, \ex{(argv 0)}), instead of the string \ex{"scsh"},
or whatever the real \ex{(argv 0)} value is.
In addition, scsh will delete scsh-specific flags from the argument
list.
So if we have a shell script in file \ex{fullecho}:
\begin{code}
#!/usr/local/bin/scsh -s
!#
(for-each (\l{arg} (display arg) (display " "))
          (command-line))\end{code}
and we run the program
\codex{fullecho hello world}
the program will print out
\codex{fullecho hello world}
not
\codex{/usr/local/bin/scsh -s fullecho hello world}
This argument line processing ensures that if a scsh script is subsequently
compiled into a standalone executable, its semantics will be
unchanged---the arglist processing is invariant. In effect, the
\codex{/usr/local/bin/scsh -s}
is not part of the program;
it's a specification for the machine to execute the program on, so it is
not properly part of the program's argument list.
\remark{The truth:
The above discussion assumes some things that don't exist:
\begin{itemize}
\item An implementation of scsh that allows scsh scripts to
be compiled to native code binaries.
\item A native code binary implementation of the scsh interpreter.
\end{itemize}
What there is right now is just the {\scm} virtual machine,
invoked with a scsh heap image.
}
\end{desc}
\section{System parameters}
\defun {maximum-fds}{}\fixnum
\defunx {page-size}{} \fixnum
\defunx {system-name}{} \str
\begin{desc}
Only \ex{system-name} is implemented.
\end{desc}
\section{Signal system}
Signal numbers are bound to the variables \ex{signal/hup}, \ex{signal/int},
\ldots
\defun {signal-process} {pid sig} \undefined
\defunx {signal-procgroup} {prgrp sig} \undefined
\begin{desc}
These two procedures send signals to a specific process, and all the processes
in a specific process group, respectively.
\end{desc}
I haven't done signal handlers yet. Should be straightforward: a mechanism
to assign procedures to signals.
\defun{itimer}{???} \undefined
\defunx{pause-until-interrupt}{} \undefined
\defun{sleep}{secs} \undefined
\begin{desc}
Sleeping is defined, but we don't offer a way to sleep for a more precise
interval (\eg, a microsecond timer), as this is not in {\Posix}.
\end{desc}
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\section{Time}
This time package does not currently work with NeXTSTEP, as NeXTSTEP
does not provide a Posix-compliant time interface that will successfully
link.
Scsh's time system is fairly sophisticated, particularly with respect
to its careful treatment of time zones.
However, casual users shouldn't be intimidated;
most of the complexity is optional,
and defaulting all the optional arguments reduces the system
to a simple interface.
\subsection{Terminology}
``UTC'' and ``UCT'' stand for ``coordinated universal time,'' which is the
official name for what is colloquially referred to as ``Greenwich Mean
Time.''
Posix allows a single time zone to specify \emph{two} different offsets from
UTC: one standard one, and one for ``summer time.'' Summer time is frequently
some sort of daylight savings time.
The scsh time package consistently uses this terminology: we never say
``gmt'' or ``dst;'' we always say ``utc'' and ``summer time.''
\subsection{Basic data types}
We have two types: \emph{time} and \emph{date}.
\index{time}
A \emph{time} specifies an instant in the history of the universe.
It is location and time-zone independent. A time is a real value
giving the number of elapsed seconds since the Unix ``epoch''
(Midnight, January 1, 1970 UTC).
Time values provide arbitrary time resolution,
limited only by the number system of the underlying Scheme system.
\index{date}
A \emph{date} is a name for an instant in time that is specified
relative to some location/time-zone in the world, \eg:
\begin{tightinset}
Monday October 31, 1994 3:47:21 pm EST.
\end{tightinset}
Dates provide one-second resolution,
and are expressed with the following record type:
%
\begin{code}\index{date}
(define-record date     ; A Posix tm struct
  seconds       ; Seconds after the minute [0-59]
  minute        ; Minutes after the hour [0-59]
  hour          ; Hours since midnight [0-23]
  month-day     ; Day of the month [1-31]
  month         ; Months since January [0-11]
  year          ; Years since 1900
  tz-name       ; Time-zone name: #f or a string.
  tz-secs       ; Time-zone offset: #f or an integer.
  summer?       ; Summer (Daylight Savings) time in effect?
  week-day      ; Days since Sunday [0-6]
  year-day)     ; Days since Jan. 1 [0-365]\end{code}
%
If the \ex{tz-secs} field is given, it specifies the time-zone's offset from
UTC in seconds. If it is specified, the \ex{tz-name} and \ex{summer?}
fields are ignored when using the date structure to determine a specific
instant in time.
If the \ex{tz-name} field is given, it is a time-zone string such as
\ex{"EST"} or \ex{"HKT"} understood by the OS.
Since Posix time-zone strings can specify dual standard/summer time-zones
(\eg, \ex{"EST5EDT"} specifies U.S. Eastern Standard/Eastern Daylight Time),
the value of the \ex{summer?} field is used to resolve the ambiguous
boundary cases. For example, on the morning of the Fall daylight savings
change-over, 1:00am--2:00am happens twice. Hence the date 1:30 am
on this morning can specify two different instants;
the \ex{summer?} flag says which one.
A date with $\ex{tz-name} = \ex{tz-secs} = \ex{\#f}$ is a date that
is specified in terms of the system's current time zone.
There is redundancy in the \ex{date} data structure.
For example, the \ex{year-day} field is redundant
with the \ex{month-day} and \ex{month} fields.
Either of these implies the values of the \ex{week-day} field.
The \ex{summer?} and \ex{tz-name} fields are redundant with the \ex{tz-secs}
field in terms of specifying an instant in time.
This redundancy is provided because consumers of dates may want it broken out
in different ways.
The scsh procedures that produce date records fill them out completely.
However, when date records produced by the programmer are passed to
scsh procedures, the redundancy is resolved by ignoring some of the
secondary fields.
This is described for each procedure below.
\defun{make-date} {s min h mday mon y [tzn tzs summ? wday yday]} {date}
\begin{desc}
When making a \ex{date} record, the last five elements of the record
are optional, and default to \ex{\#f}, \ex{\#f}, \ex{\#f}, 0,
and 0 respectively.
This is useful when creating a \ex{date} record to pass as an
argument to \ex{time}.
\end{desc}
\subsection{Time zones}
Several time procedures take time zones as arguments. When optional,
the time zone defaults to the local time zone. Otherwise the time zone
can be one of:
\begin{inset}
\begin{tabular}{lp{0.7\linewidth}}
\ex{\#f} & Local time \\
Integer & Seconds of offset from UTC. For example,
New York City is -18000 (-5 hours), San Francisco
is -28800 (-8 hours). \\
String & A Posix time zone string understood by the OS
(\ie, the sort of time zone assigned to the \ex{\$TZ}
environment variable).
\end{tabular}
\end{inset}
An integer time zone gives the number of seconds you must add to UTC
to get time in that zone. It is \emph{not} ``seconds west'' of UTC---that
flips the sign.
To get UTC time, use a time zone of either 0 or \ex{"UCT0"}.
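Integer offsets are easily computed from hours. A sketch, using the
two-argument form of \ex{date} described in the procedures section:
\begin{code}
(* -5 60 60)                 {\evalto} -18000   ; New York City
(* -8 60 60)                 {\evalto} -28800   ; San Francisco
\cb
(date (time) 0)              ; the current instant, as a UTC date
(date (time) (* -5 60 60))   ; the same instant, 5 hours behind UTC
(date (time) "EST5EDT")      ; the same instant, via a Posix zone string\end{code}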
\subsection{Procedures}
\defun {time+ticks} {} {[secs ticks]}
\defunx{ticks/sec} {} \real
\begin{desc}
The current time, with sub-second resolution.
Sub-second resolution is not provided by Posix,
but is available on many systems.
The time is returned as elapsed seconds since the Unix epoch, plus
a number of sub-second ``ticks.''
The length of a tick may vary from implementation to implementation;
it can be determined from \ex{(ticks/sec)}.
The system clock is not required to report time at the full resolution
given by \ex{(ticks/sec)}. For example, on BSD, time is reported at
$1\mu$s resolution, so \ex{(ticks/sec)} is 1,000,000. That doesn't mean
the system clock has microsecond resolution.
If the OS does not support sub-second resolution, the \var{ticks} value
is always 0, and \ex{(ticks/sec)} returns 1.
\begin{remarkenv}
I chose to represent system clock resolution as ticks/sec
instead of sec/tick to increase the odds that the value could
be represented as an exact integer, increasing efficiency and
making it easier for Scheme implementations that don't have
sophisticated numeric support to deal with the quantity.
You can convert seconds and ticks to seconds with the expression
\codex{(+ \var{secs} (/ \var{ticks} (ticks/sec)))}
Given that, why not have the fine-grain time procedure just
return a non-integer real for time? Following Common Lisp, I chose to
allow the system clock to report sub-second time in its own units to
lower the overhead of determining the time. This would be important
for a system that wanted to precisely time the duration of some
event. Time stamps could be collected with little overhead, deferring
the overhead of precisely calculating with them until after collection.
This is all a bit academic for the {\scm} implementation, where
we determine time with a heavyweight system call, but it's nice
to plan for the future.
\end{remarkenv}
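As a sketch of the time-stamp idea, the following hypothetical procedure
(\ex{elapsed-seconds} is not part of scsh) reports how long a thunk takes
to run. It assumes the \ex{receive} form is available for binding multiple
return values:
\begin{code}
(define (elapsed-seconds thunk)
  (receive (secs0 ticks0) (time+ticks)
    (thunk)
    (receive (secs1 ticks1) (time+ticks)
      ;; Difference in whole seconds, plus the difference in
      ;; ticks scaled to seconds.
      (+ (- secs1 secs0)
         (/ (- ticks1 ticks0) (ticks/sec))))))\end{code}
The two time stamps are collected cheaply, and the arithmetic on them
is deferred until after the thunk has finished.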
\end{desc}
\defun {date} {} {date-record}
\defunx{date} {[time tz]} {date-record}
\begin{desc}
Simple \ex{(date)} returns the current date, in the local time zone.
With the optional arguments, \ex{date} converts the time to the date as
specified by the time zone \var{tz}.
\var{Time} defaults to the current time; \var{tz} defaults to local time,
and is as described in the time-zone section.
If the \var{tz} argument is an integer, the date's \ex{tz-name}
field is a Posix time zone of the form
``\ex{UTC+\emph{hh}:\emph{mm}:\emph{ss}}'';
the trailing \ex{:\emph{mm}:\emph{ss}} portion is deleted if it is zeroes.
\end{desc}
\defun {time} {} \integer
\defunx{time} {[date]} \integer
\begin{desc}
Simple \ex{(time)} returns the current time.
With the optional date argument, \ex{time} converts a date to a time.
\var{Date} defaults to the current date.
Note that the input \var{date} record is overconstrained.
\ex{time} ignores \var{date}'s \ex{week-day} and \ex{year-day} fields.
If the date's \ex{tz-secs} field is set, the \ex{tz-name} and
\ex{summer?} fields are ignored.
If the \ex{tz-secs} field is \ex{\#f}, then the time-zone is taken
from the \ex{tz-name} field. A false \ex{tz-name} means the system's
current time zone. When calculating with time-zones, the date's
\ex{summer?} field is used to resolve ambiguities:
\begin{tightinset}
\begin{tabular}{ll}
\ex{\#f} & Resolve an ambiguous time in favor of non-summer time. \\
true & Resolve an ambiguous time in favor of summer time.
\end{tabular}
\end{tightinset}
This is useful in boundary cases during the change-over. For example,
in the Fall, when US daylight savings time changes over at 2:00 am,
1:30 am happens twice---it names two instants in time, an hour apart.
Outside of these boundary cases, the \ex{summer?} flag is ignored. For
example, if the standard/summer change-overs happen in the Fall and the
Spring, then the value of \ex{summer?} is ignored for a January or
July date. A January date would be resolved with standard time, and a
July date with summer time, regardless of the \ex{summer?} value.
The \ex{summer?} flag is also ignored if the time zone doesn't have
a summer time---for example, simple UTC.
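Some small worked cases, following the argument order of \ex{make-date}
and the definition of the epoch given earlier. The last case is only a
sketch: it assumes the OS interprets \ex{"EST5EDT"} with a Fall
change-over on October 30, 1994.
\begin{code}
;; Midnight, January 1, 1970, at offset 0 from UTC: the epoch itself.
(time (make-date 0 0 0 1 0 70 #f 0))      {\evalto} 0
;; 1:00 am on the same day, in a zone one hour ahead of UTC,
;; names the same instant.
(time (make-date 0 0 1 1 0 70 #f 3600))   {\evalto} 0
\cb
;; The two readings of 1:30 am on the morning of the change-over.
(let ((summer   (make-date 0 30 1 30 9 94 "EST5EDT" #f #t))
      (standard (make-date 0 30 1 30 9 94 "EST5EDT" #f #f)))
  ;; If 1:30 am is ambiguous on this date, the standard-time reading
  ;; is the later instant, so one would expect a difference of 3600.
  (- (time standard) (time summer)))\end{code}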
\end{desc}
\defun {date->string} {date} \str
\defunx{format-date} {fmt date} \str
\begin{desc}
\ex{Date->string} formats the date as a 24-character string of the
form:
\begin{tightinset}
Sun Sep 16 01:03:52 1973
\end{tightinset}
\ex{Format-date} formats the date according to the format string
\var{fmt}. The format string is copied verbatim, except that tilde
characters indicate conversion specifiers that are replaced by fields from
the date record. Figure \ref{fig:dateconv} gives the full set of
conversion specifiers supported by \ex{format-date}.
\begin{boxedfigure}{tbp}
\renewcommand{\arraystretch}{1.25}
\begin{tabular}{l>{\raggedrightparbox}p{0.9\linewidth}}
\verb|~~| & Converted to the \verb|~| character. \\
\verb|~a| & abbreviated weekday name \\
\verb|~A| & full weekday name \\
\verb|~b| & abbreviated month name \\
\verb|~B| & full month name \\
\verb|~c| & time and date using the time and date representation
for the locale (\verb|~X ~x|) \\
\verb|~d| & day of the month as a decimal number (01-31) \\
\verb|~H| & hour based on a 24-hour clock
as a decimal number (00-23) \\
\verb|~I| & hour based on a 12-hour clock
as a decimal number (01-12) \\
\verb|~j| & day of the year as a decimal number (001-366) \\
\verb|~m| & month as a decimal number (01-12) \\
\verb|~M| & minute as a decimal number (00-59) \\
\verb|~p| & AM/PM designation associated with a 12-hour clock \\
\verb|~S| & second as a decimal number (00-61) \\
\verb|~U| & week number of the year;
Sunday is first day of week (00-53) \\
\verb|~w| & weekday as a decimal number (0-6), where Sunday is 0 \\
\verb|~W| & week number of the year;
Monday is first day of week (00-53) \\
\verb|~x| & date using the date representation for the locale \\
\verb|~X| & time using the time representation for the locale \\
\verb|~y| & year without century (00-99) \\
\verb|~Y| & year with century (\eg, 1990) \\
\verb|~Z| & time zone name or abbreviation, or no characters
if no time zone is determinable
\end{tabular}
\caption{\texttt{format-date} conversion specifiers}
\label{fig:dateconv}
\end{boxedfigure}
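A short sketch of format strings that combine several of these
specifiers (the particular layouts are just illustrative):
\begin{code}
;; Year, month and day of the current local date, e.g. 1994-10-31.
(format-date "~Y-~m-~d" (date))
\cb
;; A literal tilde is written as two tildes.
(format-date "~~ ~H:~M" (date))\end{code}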
\end{desc}
%\defun{utc-offset} {[time tz]} \integer
%\begin{desc}
% Returns the offset from UTC of time zone \var{tz} at instant \var{time}.
% \var{time} defaults to the current time; \var{tz} defaults to local time,
% and is as described in the time-zone section.
%
% The offset is the number of seconds you add to UTC time to get
% local time.
%
% Note: Be aware that other time interfaces (\eg, the BSD C interface)
% give offsets as seconds \emph{west} of UTC, which flips the sign. The scsh
% definition is chosen for arithmetic simplicity. It's easy to remember
% the definition of the offset: what you add to UTC to get local.
%\end{desc}
%
%\defun{time-zone} {[summer? tz]} \str
%\begin{desc}
% Returns the name of the time zone as a string. \var{Summer?} is
% used to choose between the summer name and the standard name
% (\eg, ``EST'' and ``EDT'')\@. \var{Summer?} is interpreted as follows:
% \begin{inset}
% \begin{tabular}{lp{0.7\linewidth}}
% Integer & A time value.
% The variant in use at that time is returned. \\
% \ex{\#f} & The standard time name is returned. \\
% \emph{Otherwise} & The summer time name is returned.
% \end{tabular}
% \end{inset}
% \ex{Summer?} defaults to the case that pertains at the time of the call.
% It is ignored if the time zone doesn't have a summer variant.
%\end{desc}
\defun {fill-in-date!}{date}{date}
\begin{desc}
This procedure fills in missing, redundant slots in a date record.
In decreasing order of priority:
\begin{itemize}
\itum{year, month, month-day $\Rightarrow$ year-day}
If the \ex{year}, \ex{month}, and \ex{month-day} fields are all
defined (are all integers), the \ex{year-day}
field is set to the corresponding value.
\itum{year, year-day $\Rightarrow$ month, month-day}
If the \ex{month} and \ex{month-day} fields aren't set, but
the \ex{year} and \ex{year-day} fields are set, then
\ex{month} and \ex{month-day} are calculated.
\itum{year, month, month-day, year-day $\Rightarrow$ week-day}
If either of the above rules is able to determine what day it is,
the \ex{week-day} field is then set.
\itum{tz-secs $\Rightarrow$ tz-name}
If \ex{tz-secs} is defined, but \ex{tz-name} is not, it is assigned
a time-zone name of the form ``\ex{UTC+\emph{hh}:\emph{mm}:\emph{ss}}'';
the trailing \ex{:\emph{mm}:\emph{ss}} portion is deleted if it
is zeroes.
\itum{tz-name, date, summer? $\Rightarrow$ tz-secs, summer?}
If the date information is provided up to second resolution,
\ex{tz-name} is also provided, and \ex{tz-secs} is not set,
then \ex{tz-secs} and \ex{summer?} are set to their correct values.
Summer-time ambiguities are resolved using the original value of
\ex{summer?}. If the time zone doesn't have a
summer time variant, then \ex{summer?} is set to \ex{\#f}.
\itum{local time, date, summer? $\Rightarrow$ tz-name, tz-secs, summer?}
If the date information is provided up to second resolution,
but no time zone information is provided (both \ex{tz-name} and
\ex{tz-secs} aren't set), then we proceed as in the above case,
except the system's current time zone is used.
\end{itemize}
These rules allow one particular ambiguity to escape:
if both \ex{tz-name} and \ex{tz-secs} are set, they are not brought
into agreement. It isn't clear how to do this, nor is it clear which
one should take precedence.
\oops{\ex{fill-in-date!} isn't implemented yet.}
\end{desc}
\section{Environment variables}
\defun {setenv} {var val} \undefined
\defunx {getenv} {var} \str
\begin{desc}
These functions get and set the process environment, stored in the
external C variable \ex{char **environ}.
An environment variable \var{var} is a string.
If an environment variable is set to a string \var{val},
then the process' global environment structure is altered with an entry
of the form \ex{"\var{var}=\var{val}"}.
If \var{val} is {\sharpf}, then any entry for \var{var} is deleted.
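For example (the variable and value are illustrative):
\begin{code}
(setenv "EDITOR" "emacs")   ; adds or replaces the entry EDITOR=emacs
(getenv "EDITOR")           {\evalto} "emacs"
(setenv "EDITOR" #f)        ; removes the entry again\end{code}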
\end{desc}
\defun {env->alist}{} {{\str$\rightarrow$\str} alist}
\begin{desc}
The \ex{env->alist} procedure converts the entire environment into
an alist, \eg,
\begin{code}
(("TERM" . "vt100")
 ("SHELL" . "/bin/csh")
 ("EDITOR" . "emacs")
 \ldots)\end{code}
\end{desc}
\defun {alist->env} {alist} \undefined
\begin{desc}
\var{Alist} must be an alist whose keys are all strings, and whose values
are all either strings or string lists. String lists are converted to
colon lists (see below). The alist is installed as the current {\Unix}
environment (\ie, converted to a null-terminated C vector of
\ex{"\var{var}=\var{val}"} strings which is assigned to the global
\ex{char **environ}).
\end{desc}
The following three functions help the programmer manipulate alist
tables in some generally useful ways. They are all defined using
\ex{equal?} for key comparison.
\begin{defundesc} {alist-delete} {key alist} {alist}
Delete any entry labelled by value \var{key}.
\end{defundesc}
\begin{defundesc} {alist-update} {key val alist} {alist}
Delete \var{key} from \var{alist}, then cons on a
\ex{(\var{key} . \var{val})} entry.
\end{defundesc}
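For example, with a small literal alist (the entries and the name
\ex{env} are illustrative):
\begin{code}
(define env '(("TERM" . "vt100") ("SHELL" . "/bin/csh")))
\cb
(alist-delete "TERM" env)
    {\evalto} (("SHELL" . "/bin/csh"))
(alist-update "TERM" "xterm" env)
    {\evalto} (("TERM" . "xterm") ("SHELL" . "/bin/csh"))\end{code}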
\defun{alist-compress} {alist} {alist}
\begin{desc}
Compresses \var{alist} by removing shadowed entries.
Example:
\begin{code}
;;; Shadowed (1 . c) entry removed.
(alist-compress '( (1 . a) (2 . b) (1 . c) (3 . d) ))
{\evalto} ((1 . a) (2 . b) (3 . d))\end{code}
\end{desc}
\defun {with-env*} {env-alist-delta thunk} {value(s) of thunk}
\defunx {with-total-env*} {env-alist thunk} {value(s) of thunk}
\begin{desc}
These procedures call \var{thunk} in the context of an altered
environment. They return whatever values \var{thunk} returns.
Non-local returns restore the environment to its outer value;
throwing back into the thunk by invoking a stored continuation
restores the environment back to its inner value.
The \var{env-alist-delta} argument specifies
a \emph{modification} to the current en\-vi\-ron\-ment---\var{thunk}'s
environment is the original environment overridden with the
bindings specified by the alist delta.
The \var{env-alist} argument specifies a complete environment
that is installed for \var{thunk}.
\end{desc}
\dfn {with-env} {env-alist-delta . body} {value(s) of body} {syntax}
\dfnx {with-total-env} {env-alist . body} {value(s) of body} {syntax}
\begin{desc}
These special forms provide syntactic sugar for \ex{with-env*}
and {\ttt with\=total\=env*}.
The env alists are not evaluated positions, but are implicitly backquoted.
In this way, they tend to resemble binding lists for \ex{let} and
\ex{let*} forms.
\end{desc}
Example: These four pieces of code all run the mailer with special
\cd{$TERM} and \cd{$EDITOR} values.
{\small
\begin{code}
(with-env (("TERM" . "xterm") ("EDITOR" . ,my-editor))
  (run (mail shivers@lcs.mit.edu)))
\cb
(with-env* `(("TERM" . "xterm") ("EDITOR" . ,my-editor))
  (\l{} (run (mail shivers@csd.hku.hk))))
\cb
(run (begin (setenv "TERM" "xterm")       ; Env mutation happens
            (setenv "EDITOR" my-editor)   ; in the subshell.
            (exec-epf (mail shivers@research.att.com))))
\cb
;; In this example, we compute an alternate environment ENV2
;; as an alist, and install it with an explicit call to the
;; EXEC-PATH/ENV procedure.
(let* ((env (env->alist))                    ; Get the current environment,
       (env1 (alist-update "TERM" "xterm" env))         ; and compute
       (env2 (alist-update "EDITOR" my-editor env1)))   ; the new env.
  (run (begin (exec-path/env "mail" env2 "shivers@cs.cmu.edu"))))\end{code}}
\subsection{Path lists and colon lists}
Environment variables such as \ex{\$PATH} encode a list of strings
by separating the list elements with colon delimiters.
Once parsed into actual lists, these ordered lists can be manipulated
with the following two functions.
To convert between the colon-separated string encoding and the
list-of-strings representation, see the \ex{field-reader} and
\ex{join-strings} functions in section~\ref{sec:field-reader}.
\remark{An earlier release of scsh provided the \ex{split-colon-list}
and \ex{string-list->colon-list} functions. These have been
removed from scsh, and are replaced by the more general
parsers and unparsers of the field-reader module.}
%\defun {split-colon-list} {string} {{\str} list}
%\defunx {string-list->colon-list} {string-list} \str
%\begin{desc}
% Many {\Unix} lists, such as the \cd{$PATH} search path,
% are stored as ``colon lists.''
% A colon list is a string containing elements delimited by colon characters.
% These functions provide conversions between colon lists and true
% {\Scheme} lists.
%%
%\begin{code}
%(split-colon-list "/foo:/bar::/usr/tmp") \evalto
% ("/foo" "/bar" "" "/usr/tmp")\end{code}
%%
% \ex{string-list->colon-list} is the inverse function.
%
% \ex{with-env*}, \ex{with-total-env*}, and \ex{alist->env} all coerce
% string lists to colon lists where appropriate.
%\end{desc}
\defun {add-before} {elt before list} {list}
\defunx {add-after} {elt after list} {list}
\begin{desc}
These functions are for modifying search-path lists, where element order
is significant.
\ex{add-before} adds \var{elt} to the list immediately
before the first occurrence of \var{before} in the list.
If \var{before} is not in the list, \var{elt} is added to the end
of the list.
\ex{add-after} is similar:
\var{elt} is added after the last occurrence of \var{after}.
If \var{after} is not found,
\var{elt} is added to the beginning of the list.
Neither function destructively alters the original path-list.
The result may share structure with the original list.
Both functions use \ex{equal?} for comparing elements.
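The four cases can be illustrated on a small search path (the
directories and the name \ex{path} are made up for the example):
\begin{code}
(define path '("/usr/bin" "/bin" "/usr/ucb"))
\cb
(add-before "/usr/local/bin" "/bin" path)
    {\evalto} ("/usr/bin" "/usr/local/bin" "/bin" "/usr/ucb")
(add-after "/usr/local/bin" "/bin" path)
    {\evalto} ("/usr/bin" "/bin" "/usr/local/bin" "/usr/ucb")
;; When the reference element is absent, add-before appends
;; and add-after prepends.
(add-before "/usr/local/bin" "/opt" path)
    {\evalto} ("/usr/bin" "/bin" "/usr/ucb" "/usr/local/bin")
(add-after "/usr/local/bin" "/opt" path)
    {\evalto} ("/usr/local/bin" "/usr/bin" "/bin" "/usr/ucb")\end{code}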
\end{desc}
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\subsection{\protect{\tt\$USER}, \protect{\tt\$HOME}, and \protect{\tt\$PATH}}
Like sh and unlike csh, scsh has \emph{no} interactive dependencies on
environment variables.
It does, however, initialise certain internal values at startup time from the
initial process environment, in particular \cd{$HOME} and \cd{$PATH}.
Scsh never uses \cd{$USER} at all.
It computes \ex{(user-login-name)} from the system call \ex{(user-uid)}.
\defvar {home-directory} \str
\defvarx {exec-path-list} {{\str} list}
\begin{desc}
Scsh accesses \cd{$HOME} at start-up time, and stores the value in the
global variable \ex{home-directory}. It uses this value for \ex{\~}
lookups and for returning to home on \ex{(chdir)}.
Scsh accesses \cd{$PATH} at start-up time, colon-splits the path list, and
stores the value in the global variable \ex{exec-path-list}. This list is
used for \ex{exec-path} and \ex{exec-path/env} searches.
\end{desc}