scsh-install-lib/doc/latex/proposal.tex

600 lines
25 KiB
TeX

%% $Id: proposal.tex,v 1.1 2004/02/11 19:42:30 michel-schinz Exp $
%% TODO
%% - clean up permissions mess
\documentclass[a4paper,12pt]{article}
\usepackage{a4wide, palatino, url, hyperref}
\newcommand{\file}{\begingroup \urlstyle{tt}\Url}
\newcommand{\envvar}[1]{\texttt{#1}}
\newcommand{\cloption}[1]{\texttt{#1}}
\newcommand{\package}[1]{\texttt{#1}}
\newcommand{\layout}[1]{\texttt{#1}}
\newcommand{\location}[1]{\texttt{#1}}
\newcommand{\define}[3]{%
\vspace{1em}%
\noindent%
(\texttt{#1} \textit{#2})\hfill\textit{(#3)}\\[0.5em]%
}
\newcommand{\definep}[2]{\define{#1}{#2}{procedure}}
\newcommand{\defines}[2]{\define{#1}{#2}{syntax}}
\newcommand{\param}[1]{\emph{#1}}
\newenvironment{rationale}%
{\begin{quotation}\noindent\textbf{Rationale}}%
{\end{quotation}}
\newenvironment{example}%
{\begin{quotation}\noindent\textbf{Example}}%
{\end{quotation}}
\begin{document}
\title{A proposal for scsh packages}
\author{Michel Schinz}
\maketitle
\section{Introduction}
\label{sec:introduction}
The aim of the following proposal is to define a standard for the
packaging, distribution, installation, use and removal of libraries
for scsh. Such packaged libraries are called \emph{scsh packages} or
simply \emph{packages} below.
This proposal attempts to cover both libraries containing only Scheme
code and libraries containing additional C code. It does not try to
cover applications written in scsh, which are currently considered to
be outside of its scope.
\subsection{Package identification and naming}
Packages are identified by a globally-unique name. This name should
start with an ASCII letter (a-z or A-Z) and should consist only of
ASCII letters, digits or underscore characters `\verb|_|'. Package
names are case-sensitive, but there should not be two packages with
names which differ only by their capitalisation.
\begin{rationale}
This restriction on package names ensures that they can be used to
name directories on current operating systems.
\end{rationale}
Several versions of a given package can exist. A version is identified
by a sequence of non-negative integers. Versions are ordered
lexicographically.
A version has a printed representation which is obtained by separating
(the printed representation of) its components by dots. For example,
the printed representation of a version composed of the integer 1
followed by the integer 2 is the string \texttt{1.2}. Below, versions
are usually represented using their printed representation for
simplicity, but it is important to keep in mind that versions are
sequences of integers, not strings.
A specific version of a package is therefore identified by a name and
a version. The full name of a version of a package is obtained by
concatenating:
\begin{itemize}
\item the name of the package,
\item a hyphen `\texttt{-}',
\item the printed representation of the version.
\end{itemize}
In what follows, the term \emph{package} is often used to designate a
specific version of a package, but this should be clear from the
context.
\section{Distribution of packages}
Packages are distributed in \texttt{tar} archives, which can
optionally be compressed by \texttt{gzip} or \texttt{bzip2}.
The name of the archive is composed by appending:
\begin{itemize}
\item the full name of the package,
\item the string \texttt{.tar} indicating that it's a \texttt{tar}
archive,
\item either the string \texttt{.gz} if the archive is compressed
using \texttt{gzip}, or the string \texttt{.bz2} if the archive is
compressed using \texttt{bzip2}, or nothing if the archive is not
compressed.
\end{itemize}
\subsection{Archive contents}
The archive is organised so that it contains one top-level directory
whose name is the full name of the package. This directory is called
the \emph{package unpacking directory}. All the files belonging to the
package are stored below it.
The unpacking directory contains at least the following files:
\begin{description}
\item[\file{install-pkg}] a script performing the installation of the
package,
\item[\file{README}] a textual file containing a short description of
the package,
\item[\file{COPYING}] a textual file containing the license of the
package.
\end{description}
\section{Downloading and installation of packages}
A package can be installed on a target machine by downloading its
archive, expanding it and finally running the installation script
located in the unpacking directory.
\subsection{Layouts}
The installation script installs files according to some given
\emph{layout}. A layout maps abstract locations to concrete
directories on the target machine. For example, a layout could map the
abstract location \location{doc}, where documentation is stored, to
the directory \file{/usr/local/share/doc/my_package}.
Currently, the following abstract locations are defined:
\begin{description}
\item[\location{base}] The ``base'' location of a package, where the
package loading script \file{load.scm} resides.
\item[\location{active}] Location containing a symbolic link, with the
same name as the package (excluding the version), pointing to the
base location of the package. This link is used to designate the
\emph{active} version of a package\,---\,the one to load when a
package is requested by giving only its name, without an explicit
version.
\item[\location{scheme}] Location containing all Scheme code. If the
package comes with some examples showing its usage, they are put in
a sub-directory called \file{examples} of this location.
\item[\location{lib}] Location containing platform-dependent files,
like shared libraries. This location contains one sub-directory per
platform for which packages have been installed, and nothing else.
\item[\location{doc}] Location containing all the package
documentation. This location contains one or more sub-directories
which store the documentation in various formats. The contents of
these sub-directories is standardised as follows, to make it easy
for users to find the document they need:
\begin{description}
\item[\file{html}] Directory containing the HTML documentation of
the package, if any; this directory should at least contain one
file called \file{index.html} serving as an entry point to the
documentation.
\item[\file{pdf}] Directory containing the PDF documentation of the
package, if any; this directory should contain at least one file
called \file{<package>.pdf} where \file{<package>} is the name of
the package.
\item[\file{ps}] Directory containing the PostScript documentation
of the package, if any; this directory should contain at least one
file called \file{<package>.ps} where \file{<package>} is the name
of the package.
\item[\file{text}] Directory containing the raw textual
documentation of the package, if any.
\end{description}
\item[\location{misc-shared}] Location containing miscellaneous data
which does not belong to any directory above, and which is
platform-independant.
\end{description}
The directories to which a layout maps these abstract locations are
not absolute directories, but rather relative ones. They are relative
to a \emph{prefix}, specified at installation time using the
\cloption{--prefix} option, as explained in section
\ref{sec:inst-proc}.
\begin{example}
Let's imagine that a user is installing version 1.2 of a package
called \package{foo}. This package contains a file called
\file{COPYING} which has to be installed in sub-directory
\file{license} of the \location{doc} location. If the user chooses
to use the default layout, which maps \location{doc} to directory
\file{<package_full_name>/doc} (see below), and specifies
\file{/usr/local/etc/scsh/modules} as a prefix, then the
\file{COPYING} file will end up in:
\[
\underbrace{\mathtt{/usr/local/etc/scsh/modules/}}_{1}%
\underbrace{\mathtt{foo-1.2/doc/}}_{2}%
\underbrace{\mathtt{license/COPYING}}_{3}
\]
Part 1 is the prefix, part 2 is the layout's mapping for the
\location{doc} location, and part 3 is the file name relative to the
location.
\end{example}
\subsubsection{Predefined layouts}
\label{sec:predefined-layouts}
Every installation script comes with a set of predefined layouts which
serve different aims. They are described below.
\paragraph{The \layout{scsh} layout}
The \layout{scsh} layout is the default layout. It maps all locations
to sub-directories of a single directory, called the package
installation directory, which contains all the files of the package
being installed and nothing else. Its name is simply the full name of
the package in question, and it resides in the \file{prefix}
directory.
The \layout{scsh} layout maps locations as given in the following
table:
\begin{center}
\begin{tabular}{|l|l|}
\hline
\textbf{Location} & \textbf{Directory (relative to prefix)}\\
\hline
\location{base} & \file{<package_full_name>} \\
\location{active} & \file{.} \\
\location{scheme} & \file{<package_full_name>/scheme} \\
\location{lib} & \file{<package_full_name>/lib} \\
\location{doc} & \file{<package_full_name>/doc} \\
\location{misc-shared} & \file{<package_full_name>} \\
\hline
\end{tabular}
\end{center}
This layout is well suited for installations performed without the
assistance of an additional package manager, because it makes many
common operations easy. For example, finding to which package a file
belongs is trivial, as is the removal of an installed package.
\paragraph{The \layout{fhs} layout}
The \layout{fhs} layout maps locations according to the File Hierarchy
Standard (FHS, see \href{http://www.pathname.com/fhs/}%
{http://www.pathname.com/fhs/}), as follows:
\begin{center}
\begin{tabular}{|l|l|}
\hline
\textbf{Location} & \textbf{Directory (relative to prefix)}\\
\hline
\layout{base} & \file{share/scsh/modules/<package_full_name>} \\
\layout{active} & \file{share/scsh/modules} \\
\layout{scheme} & \file{share/scsh/modules/<package_full_name>/scheme} \\
\layout{lib} & \file{lib/scsh/modules/<package_full_name>} \\
\layout{doc} & \file{share/doc/<package_full_name>} \\
\layout{misc-shared} & \file{share/scsh/modules/<package_full_name>} \\
\hline
\end{tabular}
\end{center}
The main advantage of this layout is that it adheres to the FHS
standard, and is therefore compatible with several packaging policies,
like \href{http://www.debian.org/}{Debian}'s,
\href{http://fink.sourceforge.net/}{Fink}'s and others. Its main
drawback is that files belonging to a given package are scattered, and
therefore hard to find when removing or upgrading a package. Its use
should therefore be considered only if third-party tools are available
to track files belonging to a package.
%% \subsection{File permissions}
%% TODO
\subsection{Installation procedure}
\label{sec:inst-proc}
Packages are installed using the \file{install-pkg} script located in
the package archive. This script must be given the name of the prefix
using the \cloption{--prefix} option. It also accepts the following
options:
\begin{center}
\begin{tabular}{lp{.6\textwidth}}
\cloption{--layout} name & Specifies the layout to use (see \ref{sec:predefined-layouts}) \\
\cloption{--verbose} & Print messages about what is being done. \\
\cloption{--dry-run} & Print what actions would be performed to install the package, but do not perform them. \\
\cloption{--inactive} & Do not activate package after installing it. \\
\cloption{--non-shared-only} & Only install platform-dependent files, if any. \\
\cloption{--force} & Overwrite existing files during installation. \\
\end{tabular}
\end{center}
%% \subsection{Creating images}
%% TODO (my current idea is to add support to install-lib to easily
%% create an image containing the package being installed, and maybe some
%% structures opened. Then, at install time, users could say that they
%% want an image to be created, and the install script would do that).
\section{Using packages}
To use a package, its \emph{loading script} must be loaded in
Scheme~48's exec package. The loading script for a package is a file
written in the Scheme 48 exec language, whose name is \file{load.scm}
and which resides in the \location{base} location.
To load this file, one typically uses scsh's \cloption{-lel} option
along with a properly defined \envvar{SCSH\_LIB\_DIRS} environment
variable.
Scsh has a list of directories, called the library directories, in
which it looks for files to load when the options \cloption{-ll} or
\cloption{-lel} are used. This list can be given a default value
during scsh's configuration, and this value can be overridden by
setting the environment variable \envvar{SCSH\_LIB\_DIRS} before running
scsh.
In order for scsh to find the package loading scripts, one must make
sure that scsh's library search path contains the names of all
\location{active} locations which containing packages.
The names of these directories should not end with a slash `\verb|/|',
as this forces scsh to search them recursively. This could
\emph{drastically} slow down scsh when looking for packages.
\begin{example}
Let's imagine a machine on which the system administrator installs
scsh packages according to the \layout{fhs} layout in prefix
directory \file{/usr/local}. The \location{active} location for
these packages corresponds to the directory
\file{/usr/local/share/scsh/modules}, according to the layout
specification above.
Let's also imagine that there is a user called \texttt{john} on this
machine, who installs additional scsh packages for himself in his
home directory, using \file{/home/john/scsh-packages} as a prefix.
To ease their management, he uses the \layout{scsh} layout. The
\location{active} location for these packages corresponds to the
directory \file{/home/john/scsh-packages}, according to the layout
specification above.
In order to be able to use scsh packages installed both by the
administrator and by himself, user \texttt{john} needs to put both
active directories in his \envvar{SCSH\_LIB\_DIRS} environment
variable. The value of this variable will therefore be:
\begin{verbatim}
"/usr/local/share/scsh/modules" "/home/john/scsh-packages"
\end{verbatim}
Now, in order to use packages \package{foo} and \package{bar} in one
of his script, user \texttt{john} just needs to load their loading
script using the \cloption{-lel} option when invoking scsh, as
follows:
\begin{verbatim}
-lel foo/load.scm -lel bar/load.scm
\end{verbatim}
\end{example}
\section{Writing packages}
Once the Scheme and/or C code for a package has been written, the last
step in turning it into a standard package as defined by this proposal
is to write the installation script.
This script could be written fully by the package author, but in order
to simplify this task a small scsh installation framework is provided.
This framework is composed of several files which are meant to be
included in the package archive. These files are:
\begin{description}
\item[\file{install-pkg}] a trivial \texttt{sh} script which launches
scsh on the main function of the installation library, passing it
all the arguments given by the user,
\item[\file{install-lib.scm}] the code for the installation library,
whose public interface is documented below,
\item[\file{install-lib-module.scm}] Scheme 48 interface and structure
definitions for the installation library,
\end{description}
As explained above, when the \file{install-pkg} script is invoked, it
launches scsh on the main function of the installation library, which
does the following:
\begin{enumerate}
\item parse the command line arguments (e.g the \cloption{--prefix}
option),
\item load the package definition file, a (Scheme) file called
\file{pkg-def.scm}, which is supplied by the package author and
which contains the installation procedure for the package,
\item install the package which was defined in the previous step.
\end{enumerate}
It is actually possible to define several packages in
\file{pkg-def.scm}, and all will be installed. It should not be often
useful, though.
The main job of the package author is therefore to write the package
definition file, \file{pkg-def.scm}.
This file is mostly composed of a package definition statement, which
specifies the name, version and installation code for the package. The
package definition statement is expressed using the
\texttt{define-package} form, defined below.
\subsection{Installation library}
\label{sec:install-library}
\defines{define-package}{name version extension body ...}%
Define a package to be installed. \param{Name}, a string, is the
package name, \param{version} its version (a list of integers),
\param{extensions} is an association list of extensions (see below),
and \param{body} is the list of statements to be evaluated in order to
install the package.
The installation statements typically use functions of the
installation library in order to install files in their target
location. The functions currently exported are presented in the
remainder of this section.
\param{Extensions} is currently used only to specify additional
command-line arguments, but in the future it could serve other
purposes. It consists in a list of pairs, each one composed of a
symbol identifying the extension, and extension-specific parameters.
\begin{description}
\item[options] enables the script to define additional command-line
options. It accepts nine parameters in total, with the last three
being optional. These parameters are described below, in the order
in which they should appear:
\begin{description}
\item[\param{name}] (a symbol) is the name of the option, without
the initial double hyphen (\verb|--|),
\item[\param{help-text}] (a string) describes the option for the
user,
\item[\param{arg-help-text}] (a string) describes the option's
argument (if any) for the user,
\item[\param{required-arg?}] (a boolean) says whether this option
requires an argument or not,
\item[\param{optional-arg?}] (a boolean) says whether this option's
argument can be omitted or not,
\item[\param{default}] is the default value for the option,
\item[\param{parser}] (a function from string to anything) parses
the option, i.e. turns its string representation into its internal
value,
\item[\param{unparser}] (a function from anything to string) turns
the internal representation of the option into a string,
\item[\param{transformer}] is a function taking the current value of
the option, the value submitted by the user and returning its new
value.
\end{description}
By default, \param{parser} and \param{unparser} are the identity
function, and \param{transformer} is a function which takes two
arguments and returns the second (i.e. the current value of the
option is simply replaced by the one given).
\end{description}
\definep{install-file}{file location [target-dir]}%
Install the given \param{file} in the sub-directory \param{target-dir}
(which must be a relative directory) of the given \param{location}.
\param{Target-dir} is \file{.} by default.
If the directory in which the file is about to be installed does not
exist, it is created along with all its parents, as needed. If
\param{file} is a string, then the installed file will have the same
name as the original one. If \param{file} is a pair, then its first element
specifies the name of the source file, and its second element the name
it will have once installed. The second element must be a simple file
name, without any directory part.
\definep{install-file}{file-list location [target-dir]}%
Like install-file but for several files, which are specified as a
list. Each element in the list can be either a simple string or a
pair, as explained above.
\definep{install-directory}{directory location [target-dir]}%
Install the given \param{directory} and all its contents, including
sub-directories, in sub-directory \param{target-dir} of
\param{location}. This is similar to what \param{install-file} does,
but for complete hierarchies.
Notice that \param{directory} will be installed as a sub-directory of
\param{target-dir}.
\definep{install-directories}{dir-list location [target-dir]}%
Install several directories in one go.
\definep{install-directory-contents}{directory location [target-dir]}%
Install the \emph{contents} of the given \param{directory} in
sub-directory \param{target} of \param{location}.
\definep{install-string}{string location [target-dir]}%
Install the contents of \param{string} in sub-directory
\param{target-dir} of \param{location}.
\definep{get-directory}{location install?}%
Get the absolute name of the directory to which the current layout
maps the abstract \param{location}. If \param{install?} is true, the
directory is the one valid during installation; If it is false, the
directory is the one valid after installation, that is when the
package is later used.
The distinction between installation-time and usage-time directories
is necessary to support staged installation, as performed by package
managers like Debian's APT.
\definep{get-option-value}{option}%
Return the value of the given command-line \param{option} (a symbol).
This can be used to get the value of predefined options (like
\cloption{--dry-run}) or package-specific options.
\definep{with-output-to-load-script*}{thunk}%
Evaluate \param{thunk} with the current output opened on the loading
script of the current package. If this script was already existing,
its previous contents will be deleted.
\defines{with-output-to-load-script}{body ...}%
Syntactic sugar for \param{with-output-to-load-script*}.
\definep{write-to-load-script}{s-expression}%
Pretty-print the \param{s-expression} to the loading script of the
current package. If this script was already existing, its previous
contents will be deleted.
\begin{example}
A typical package definition file for a simple package called
\package{my\_package} whose version is 1.2 could look like this:
\begin{verbatim}
(define-package "my_package" (1 2) ()
(install-file "load.scm" 'base)
(install-directory-contents "scheme" 'scheme)
(install-file ("LICENSE" . "COPYING") 'doc)
(install-directory-contents "doc" 'doc))
\end{verbatim}
With such a definition, invoking the installation script with
\file{/usr/local/} as prefix and \layout{fhs} as layout would have
the following effects:
\begin{enumerate}
\item The base directory
\file{/usr/local/share/scsh/modules/my_package-1.2} would be created
and file \file{load.scm} would be copied to it.
\item All the contents of the directory called \file{scheme} would be
copied to directory
\file{/usr/local/share/scsh/modules/my_package-1.2/scheme} which
would be created before, if needed.
\item File \file{LICENSE} would be copied to directory
\file{/usr/local/share/doc/my_package-1.2/} with name
\file{COPYING}.
\item All the contents of the directory called \file{doc} would be
copied to directory \file{/usr/local/share/doc/my_package-1.2/}
\item The package would be activated by creating a symbolic link with
name \file{/usr/local/share/scsh/modules/my_package} pointing to
\file{./my_package-1.2}
\end{enumerate}
\end{example}
\subsection{Packages containing C code (for shared libraries)}
Packages containing C code are more challenging to write, since all
the problems related to C's portability and incompatibilities between
the APIs of the various platforms have to be accounted for.
Fortunately, the GNU Autoconf system simplifies the management of
these problems, and authors of scsh packages containing C code are
strongly encouraged to use it.
Integrating Autoconf into the installation procedure should not be a
major problem thanks to scsh's ability to run separate programs.
\section{Packaging packages}
Most important Unix systems today have one (or several) package
management systems which ease the installation of packages on a
system. In order to avoid confusion between these packages and the
scsh packages discussed above, they will be called \emph{system
packages} in what follows.
It makes perfect sense to provide system packages for scsh packages.
System packages should as much as possible try to use the standard
installation script described above to install scsh packages. This
script currently provides some support for staged installations, which
are required by several packaging systems.
This support is provided through an additional option,
\cloption{--dest-dir}, which specifies the root directory in which to
install files. The files will then have to be moved from this location
to their final location by the system packaging tools.
(The \cloption{--dest-dir} option plays the same role as the
\envvar{DESTDIR} variable which is typically given to \texttt{make
install}, for makefiles which support staging directories).
%% \section{Glossary}
%% TODO define the following terms
%% Version
%% Target machine
%% Package
%% (Package) unpacking directory
%% Layout
%% (Abstract) location
%% Package loading script
\end{document}