diff --git a/doc/latex/proposal.tex b/doc/latex/proposal.tex
new file mode 100644
index 0000000..8009ad9
--- /dev/null
+++ b/doc/latex/proposal.tex
@@ -0,0 +1,600 @@
+%% $Id: proposal.tex,v 1.1 2004/02/11 19:42:30 michel-schinz Exp $
+
+%% TODO
+%% - clean up permissions mess
+
+\documentclass[a4paper,12pt]{article}
+
+\usepackage{a4wide, palatino, url, hyperref}
+
+\newcommand{\file}{\begingroup \urlstyle{tt}\Url}
+
+\newcommand{\envvar}[1]{\texttt{#1}}
+\newcommand{\cloption}[1]{\texttt{#1}}
+\newcommand{\package}[1]{\texttt{#1}}
+\newcommand{\layout}[1]{\texttt{#1}}
+\newcommand{\location}[1]{\texttt{#1}}
+
+\newcommand{\define}[3]{%
+  \vspace{1em}%
+  \noindent%
+  (\texttt{#1} \textit{#2})\hfill\textit{(#3)}\\[0.5em]%
+}
+\newcommand{\definep}[2]{\define{#1}{#2}{procedure}}
+\newcommand{\defines}[2]{\define{#1}{#2}{syntax}}
+\newcommand{\param}[1]{\emph{#1}}
+
+\newenvironment{rationale}%
+{\begin{quotation}\noindent\textbf{Rationale}}%
+{\end{quotation}}
+\newenvironment{example}%
+{\begin{quotation}\noindent\textbf{Example}}%
+{\end{quotation}}
+
+\begin{document}
+\title{A proposal for scsh packages}
+\author{Michel Schinz}
+\maketitle
+
+\section{Introduction}
+\label{sec:introduction}
+
+The aim of the following proposal is to define a standard for the
+packaging, distribution, installation, use and removal of libraries
+for scsh. Such packaged libraries are called \emph{scsh packages} or
+simply \emph{packages} below.
+
+This proposal attempts to cover both libraries containing only Scheme
+code and libraries containing additional C code. It does not try to
+cover applications written in scsh, which are currently considered to
+be outside of its scope.
+
+\subsection{Package identification and naming}
+
+Packages are identified by a globally-unique name. This name should
+start with an ASCII letter (a-z or A-Z) and should consist only of
+ASCII letters, digits or underscore characters `\verb|_|'.
+Package names are case-sensitive, but there should not be two packages
+whose names differ only in capitalisation.
+
+\begin{rationale}
+  This restriction on package names ensures that they can be used to
+  name directories on current operating systems.
+\end{rationale}
+
+Several versions of a given package can exist. A version is identified
+by a sequence of non-negative integers. Versions are ordered
+lexicographically, component by component, so that version 1.10 comes
+after version 1.9.
+
+A version has a printed representation which is obtained by separating
+(the printed representation of) its components by dots. For example,
+the printed representation of a version composed of the integer 1
+followed by the integer 2 is the string \texttt{1.2}. Below, versions
+are usually represented using their printed representation for
+simplicity, but it is important to keep in mind that versions are
+sequences of integers, not strings.
+
+A specific version of a package is therefore identified by a name and
+a version. The full name of a version of a package is obtained by
+concatenating:
+\begin{itemize}
+\item the name of the package,
+\item a hyphen `\texttt{-}',
+\item the printed representation of the version.
+\end{itemize}
+
+In what follows, the term \emph{package} is often used to designate a
+specific version of a package, but this should be clear from the
+context.
+
+\section{Distribution of packages}
+
+Packages are distributed in \texttt{tar} archives, which can
+optionally be compressed by \texttt{gzip} or \texttt{bzip2}.
+
+The name of the archive is obtained by concatenating:
+\begin{itemize}
+\item the full name of the package,
+\item the string \texttt{.tar} indicating that it is a \texttt{tar}
+  archive,
+\item either the string \texttt{.gz} if the archive is compressed
+  using \texttt{gzip}, or the string \texttt{.bz2} if the archive is
+  compressed using \texttt{bzip2}, or nothing if the archive is not
+  compressed.
+\end{itemize}
+
+\subsection{Archive contents}
+
+The archive is organised so that it contains one top-level directory
+whose name is the full name of the package. This directory is called
+the \emph{package unpacking directory}. All the files belonging to the
+package are stored below it.
+
+The unpacking directory contains at least the following files:
+\begin{description}
+\item[\file{install-pkg}] a script performing the installation of the
+  package,
+\item[\file{README}] a text file containing a short description of
+  the package,
+\item[\file{COPYING}] a text file containing the license of the
+  package.
+\end{description}
+
+\section{Downloading and installation of packages}
+
+A package can be installed on a target machine by downloading its
+archive, expanding it and finally running the installation script
+located in the unpacking directory.
+
+\subsection{Layouts}
+
+The installation script installs files according to some given
+\emph{layout}. A layout maps abstract locations to concrete
+directories on the target machine. For example, a layout could map the
+abstract location \location{doc}, where documentation is stored, to
+the directory \file{/usr/local/share/doc/my_package}.
+
+Currently, the following abstract locations are defined:
+\begin{description}
+\item[\location{base}] The ``base'' location of a package, where the
+  package loading script \file{load.scm} resides.
+
+\item[\location{active}] Location containing a symbolic link, with the
+  same name as the package (excluding the version), pointing to the
+  base location of the package. This link is used to designate the
+  \emph{active} version of a package\,---\,the one to load when a
+  package is requested by giving only its name, without an explicit
+  version.
+
+\item[\location{scheme}] Location containing all Scheme code. If the
+  package comes with some examples showing its usage, they are put in
+  a sub-directory called \file{examples} of this location.
+
+\item[\location{lib}] Location containing platform-dependent files,
+  like shared libraries. This location contains one sub-directory per
+  platform for which packages have been installed, and nothing else.
+
+\item[\location{doc}] Location containing all the package
+  documentation. This location contains one or more sub-directories
+  which store the documentation in various formats. The contents of
+  these sub-directories are standardised as follows, to make it easy
+  for users to find the document they need:
+  \begin{description}
+  \item[\file{html}] Directory containing the HTML documentation of
+    the package, if any; this directory should contain at least one
+    file called \file{index.html} serving as an entry point to the
+    documentation.
+  \item[\file{pdf}] Directory containing the PDF documentation of the
+    package, if any; this directory should contain at least one file
+    called \file{<package>.pdf} where \file{<package>} is the name of
+    the package.
+  \item[\file{ps}] Directory containing the PostScript documentation
+    of the package, if any; this directory should contain at least one
+    file called \file{<package>.ps} where \file{<package>} is the name
+    of the package.
+  \item[\file{text}] Directory containing the raw textual
+    documentation of the package, if any.
+  \end{description}
+
+\item[\location{misc-shared}] Location containing miscellaneous data
+  which does not belong to any directory above, and which is
+  platform-independent.
+\end{description}
+
+The directories to which a layout maps these abstract locations are
+not absolute directories, but rather relative ones. They are relative
+to a \emph{prefix}, specified at installation time using the
+\cloption{--prefix} option, as explained in
+section~\ref{sec:inst-proc}.
+
+\begin{example}
+  Let's imagine that a user is installing version 1.2 of a package
+  called \package{foo}. This package contains a file called
+  \file{COPYING} which has to be installed in sub-directory
+  \file{license} of the \location{doc} location.
+  If the user chooses to use the default layout, which maps
+  \location{doc} to directory \file{<full-name>/doc} where
+  \file{<full-name>} is the full name of the package (see below), and
+  specifies \file{/usr/local/etc/scsh/modules} as a prefix, then the
+  \file{COPYING} file will end up in:
+\[
+\underbrace{\mathtt{/usr/local/etc/scsh/modules/}}_{1}%
+\underbrace{\mathtt{foo-1.2/doc/}}_{2}%
+\underbrace{\mathtt{license/COPYING}}_{3}
+\]
+Part 1 is the prefix, part 2 is the layout's mapping for the
+\location{doc} location, and part 3 is the file name relative to the
+location.
+\end{example}
+
+\subsubsection{Predefined layouts}
+\label{sec:predefined-layouts}
+
+Every installation script comes with a set of predefined layouts which
+serve different aims. They are described below.
+
+\paragraph{The \layout{scsh} layout}
+
+The \layout{scsh} layout is the default layout. It maps all locations
+to sub-directories of a single directory, called the package
+installation directory, which contains all the files of the package
+being installed and nothing else. Its name is simply the full name of
+the package in question, and it resides in the \file{prefix}
+directory.
+
+The \layout{scsh} layout maps locations as given in the following
+table, in which \file{<full-name>} stands for the full name of the
+package:
+\begin{center}
+  \begin{tabular}{|l|l|}
+    \hline
+    \textbf{Location} & \textbf{Directory (relative to prefix)}\\
+    \hline
+    \location{base} & \file{<full-name>} \\
+    \location{active} & \file{.} \\
+    \location{scheme} & \file{<full-name>/scheme} \\
+    \location{lib} & \file{<full-name>/lib} \\
+    \location{doc} & \file{<full-name>/doc} \\
+    \location{misc-shared} & \file{<full-name>} \\
+    \hline
+  \end{tabular}
+\end{center}
+
+This layout is well suited for installations performed without the
+assistance of an additional package manager, because it makes many
+common operations easy. For example, finding to which package a file
+belongs is trivial, as is the removal of an installed package.
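+
+\begin{example}
+  The mapping performed by the \layout{scsh} layout can be sketched in
+  Scheme as follows. This is only an illustration: the procedures
+  \texttt{scsh-layout} and \texttt{resolve} are hypothetical and are
+  not part of the installation library, and the representation of a
+  layout as an association list is an assumption made for this sketch.
+\begin{verbatim}
+;; Hypothetical sketch, not the installation library's code.
+;; A layout is represented here as an association list mapping
+;; each location to a directory relative to the prefix.
+(define (scsh-layout full-name)
+  (list (cons 'base full-name)
+        (cons 'active ".")
+        (cons 'scheme (string-append full-name "/scheme"))
+        (cons 'lib (string-append full-name "/lib"))
+        (cons 'doc (string-append full-name "/doc"))
+        (cons 'misc-shared full-name)))
+
+;; Concrete name of FILE, which is given relative to LOCATION.
+(define (resolve prefix layout location file)
+  (string-append prefix "/"
+                 (cdr (assq location layout)) "/"
+                 file))
+
+(resolve "/usr/local/etc/scsh/modules"
+         (scsh-layout "foo-1.2")
+         'doc
+         "license/COPYING")
+;; => "/usr/local/etc/scsh/modules/foo-1.2/doc/license/COPYING"
+\end{verbatim}
+  The result is the path obtained in the previous example for version
+  1.2 of package \package{foo}.
+\end{example}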
+
+\paragraph{The \layout{fhs} layout}
+
+The \layout{fhs} layout maps locations according to the Filesystem
+Hierarchy Standard (FHS, see \href{http://www.pathname.com/fhs/}%
+{http://www.pathname.com/fhs/}), as follows, with \file{<full-name>}
+standing for the full name of the package:
+\begin{center}
+  \begin{tabular}{|l|l|}
+    \hline
+    \textbf{Location} & \textbf{Directory (relative to prefix)}\\
+    \hline
+    \location{base} & \file{share/scsh/modules/<full-name>} \\
+    \location{active} & \file{share/scsh/modules} \\
+    \location{scheme} & \file{share/scsh/modules/<full-name>/scheme} \\
+    \location{lib} & \file{lib/scsh/modules/<full-name>} \\
+    \location{doc} & \file{share/doc/<full-name>} \\
+    \location{misc-shared} & \file{share/scsh/modules/<full-name>} \\
+    \hline
+  \end{tabular}
+\end{center}
+
+The main advantage of this layout is that it adheres to the FHS,
+and is therefore compatible with several packaging policies,
+like \href{http://www.debian.org/}{Debian}'s,
+\href{http://fink.sourceforge.net/}{Fink}'s and others. Its main
+drawback is that files belonging to a given package are scattered, and
+therefore hard to find when removing or upgrading a package. Its use
+should therefore be considered only if third-party tools are available
+to track files belonging to a package.
+
+%% \subsection{File permissions}
+
+%% TODO
+
+\subsection{Installation procedure}
+\label{sec:inst-proc}
+
+Packages are installed using the \file{install-pkg} script located in
+the package archive. This script must be given the name of the prefix
+using the \cloption{--prefix} option. It also accepts the following
+options:
+\begin{center}
+  \begin{tabular}{lp{.6\textwidth}}
+    \cloption{--layout} name & Specify the layout to use (see section~\ref{sec:predefined-layouts}). \\
+    \cloption{--verbose} & Print messages about what is being done. \\
+    \cloption{--dry-run} & Print what actions would be performed to install the package, but do not perform them. \\
+    \cloption{--inactive} & Do not activate the package after installing it. \\
+    \cloption{--non-shared-only} & Only install platform-dependent files, if any.
+    \\
+    \cloption{--force} & Overwrite existing files during installation. \\
+  \end{tabular}
+\end{center}
+%% \subsection{Creating images}
+
+%% TODO (my current idea is to add support to install-lib to easily
+%% create an image containing the package being installed, and maybe some
+%% structures opened. Then, at install time, users could say that they
+%% want an image to be created, and the install script would do that).
+
+\section{Using packages}
+
+To use a package, its \emph{loading script} must be loaded in
+Scheme~48's exec package. The loading script for a package is a file
+written in the Scheme~48 exec language, whose name is \file{load.scm}
+and which resides in the \location{base} location.
+
+To load this file, one typically uses scsh's \cloption{-lel} option
+along with a properly defined \envvar{SCSH\_LIB\_DIRS} environment
+variable.
+
+Scsh has a list of directories, called the library directories, in
+which it looks for files to load when the options \cloption{-ll} or
+\cloption{-lel} are used. This list can be given a default value
+during scsh's configuration, and this value can be overridden by
+setting the environment variable \envvar{SCSH\_LIB\_DIRS} before running
+scsh.
+
+In order for scsh to find the package loading scripts, one must make
+sure that scsh's library search path contains the names of all
+\location{active} locations which contain packages.
+
+The names of these directories should not end with a slash `\verb|/|',
+as this forces scsh to search them recursively. This could
+\emph{drastically} slow down scsh when looking for packages.
+
+\begin{example}
+  Let's imagine a machine on which the system administrator installs
+  scsh packages according to the \layout{fhs} layout in prefix
+  directory \file{/usr/local}. The \location{active} location for
+  these packages corresponds to the directory
+  \file{/usr/local/share/scsh/modules}, according to the layout
+  specification above.
+
+  Let's also imagine that there is a user called \texttt{john} on this
+  machine, who installs additional scsh packages for himself in his
+  home directory, using \file{/home/john/scsh-packages} as a prefix.
+  To ease their management, he uses the \layout{scsh} layout. The
+  \location{active} location for these packages corresponds to the
+  directory \file{/home/john/scsh-packages}, according to the layout
+  specification above.
+
+  In order to be able to use scsh packages installed both by the
+  administrator and by himself, user \texttt{john} needs to put both
+  active directories in his \envvar{SCSH\_LIB\_DIRS} environment
+  variable. The value of this variable will therefore be:
+\begin{verbatim}
+"/usr/local/share/scsh/modules" "/home/john/scsh-packages"
+\end{verbatim}
+
+  Now, in order to use packages \package{foo} and \package{bar} in one
+  of his scripts, user \texttt{john} just needs to load their loading
+  scripts using the \cloption{-lel} option when invoking scsh, as
+  follows:
+\begin{verbatim}
+  -lel foo/load.scm -lel bar/load.scm
+\end{verbatim}
+\end{example}
+
+\section{Writing packages}
+
+Once the Scheme and/or C code for a package has been written, the last
+step in turning it into a standard package as defined by this proposal
+is to write the installation script.
+
+This script could be written fully by the package author, but in order
+to simplify this task a small scsh installation framework is provided.
+This framework is composed of several files which are meant to be
+included in the package archive.
+These files are:
+\begin{description}
+\item[\file{install-pkg}] a trivial \texttt{sh} script which launches
+  scsh on the main function of the installation library, passing it
+  all the arguments given by the user,
+\item[\file{install-lib.scm}] the code for the installation library,
+  whose public interface is documented below,
+\item[\file{install-lib-module.scm}] Scheme~48 interface and structure
+  definitions for the installation library.
+\end{description}
+
+As explained above, when the \file{install-pkg} script is invoked, it
+launches scsh on the main function of the installation library, which
+does the following:
+\begin{enumerate}
+\item parse the command line arguments (e.g.\ the \cloption{--prefix}
+  option),
+\item load the package definition file, a (Scheme) file called
+  \file{pkg-def.scm}, which is supplied by the package author and
+  which contains the installation procedure for the package,
+\item install the package which was defined in the previous step.
+\end{enumerate}
+It is actually possible to define several packages in
+\file{pkg-def.scm}, and all will be installed. This should rarely be
+useful, though.
+
+The main job of the package author is therefore to write the package
+definition file, \file{pkg-def.scm}.
+
+This file is mostly composed of a package definition statement, which
+specifies the name, version and installation code for the package. The
+package definition statement is expressed using the
+\texttt{define-package} form, defined below.
+
+\subsection{Installation library}
+\label{sec:install-library}
+\defines{define-package}{name version extensions body ...}%
+Define a package to be installed. \param{Name}, a string, is the
+package name, \param{version} its version (a list of integers),
+\param{extensions} is an association list of extensions (see below),
+and \param{body} is the list of statements to be evaluated in order to
+install the package.
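+
+\begin{example}
+  The \param{name} and \param{version} arguments together determine
+  the full name of the package, as defined in
+  section~\ref{sec:introduction}. The following Scheme sketch shows
+  how a version, being a list of integers, can be printed and
+  compared. All the procedures in it are hypothetical helpers written
+  for this illustration, not part of the installation library, and
+  they assume that a version has at least one component.
+\begin{verbatim}
+;; Hypothetical helpers, for illustration only.
+
+;; Printed representation of a version: components joined by dots.
+(define (version->string version)
+  (let loop ((rest (cdr version))
+             (acc (number->string (car version))))
+    (if (null? rest)
+        acc
+        (loop (cdr rest)
+              (string-append acc "."
+                             (number->string (car rest)))))))
+
+;; Lexicographic ordering on sequences of integers.
+(define (version<? v1 v2)
+  (cond ((null? v2) #f)
+        ((null? v1) #t)
+        ((< (car v1) (car v2)) #t)
+        ((> (car v1) (car v2)) #f)
+        (else (version<? (cdr v1) (cdr v2)))))
+
+;; Full name: package name, a hyphen, then the printed version.
+(define (package-full-name name version)
+  (string-append name "-" (version->string version)))
+
+(package-full-name "my_package" '(1 2))  ; => "my_package-1.2"
+(version<? '(1 9) '(1 10))               ; => #t
+\end{verbatim}
+  Comparing the printed representations as strings would wrongly
+  place 1.10 before 1.9, which is why versions are kept as sequences
+  of integers.
+\end{example}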
+
+The installation statements typically use functions of the
+installation library in order to install files in their target
+location. The functions currently exported are presented in the
+remainder of this section.
+
+\param{Extensions} is currently used only to specify additional
+command-line arguments, but in the future it could serve other
+purposes. It consists of a list of pairs, each one composed of a
+symbol identifying the extension, and extension-specific parameters.
+\begin{description}
+\item[options] enables the script to define additional command-line
+  options. It accepts nine parameters in total, with the last three
+  being optional. These parameters are described below, in the order
+  in which they should appear:
+  \begin{description}
+  \item[\param{name}] (a symbol) is the name of the option, without
+    the initial double hyphen (\verb|--|),
+  \item[\param{help-text}] (a string) describes the option for the
+    user,
+  \item[\param{arg-help-text}] (a string) describes the option's
+    argument (if any) for the user,
+  \item[\param{required-arg?}] (a boolean) says whether this option
+    requires an argument or not,
+  \item[\param{optional-arg?}] (a boolean) says whether this option's
+    argument can be omitted or not,
+  \item[\param{default}] is the default value for the option,
+  \item[\param{parser}] (a function from string to anything) parses
+    the option, i.e. turns its string representation into its internal
+    value,
+  \item[\param{unparser}] (a function from anything to string) turns
+    the internal representation of the option into a string,
+  \item[\param{transformer}] is a function taking the current value of
+    the option and the value submitted by the user, and returning the
+    new value of the option.
+  \end{description}
+  By default, \param{parser} and \param{unparser} are the identity
+  function, and \param{transformer} is a function which takes two
+  arguments and returns the second (i.e.
+  the current value of the option is simply replaced by the one given).
+\end{description}
+
+\definep{install-file}{file location [target-dir]}%
+Install the given \param{file} in the sub-directory \param{target-dir}
+(which must be a relative directory) of the given \param{location}.
+\param{Target-dir} is \file{.} by default.
+
+If the directory in which the file is about to be installed does not
+exist, it is created along with all its parents, as needed. If
+\param{file} is a string, then the installed file will have the same
+name as the original one. If \param{file} is a pair, then its first
+element specifies the name of the source file, and its second element
+the name it will have once installed. The second element must be a
+simple file name, without any directory part.
+
+\definep{install-files}{file-list location [target-dir]}%
+Like \texttt{install-file} but for several files, which are specified
+as a list. Each element in the list can be either a simple string or a
+pair, as explained above.
+
+\definep{install-directory}{directory location [target-dir]}%
+Install the given \param{directory} and all its contents, including
+sub-directories, in sub-directory \param{target-dir} of
+\param{location}. This is similar to what \texttt{install-file} does,
+but for complete hierarchies.
+
+Notice that \param{directory} will be installed as a sub-directory of
+\param{target-dir}.
+
+\definep{install-directories}{dir-list location [target-dir]}%
+Install several directories in one go.
+
+\definep{install-directory-contents}{directory location [target-dir]}%
+Install the \emph{contents} of the given \param{directory} in
+sub-directory \param{target-dir} of \param{location}.
+
+\definep{install-string}{string location [target-dir]}%
+Install the contents of \param{string} in sub-directory
+\param{target-dir} of \param{location}.
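+
+\begin{example}
+  As an illustration of the file installation procedures described
+  above, here is a sketch of a package definition for version 1.2 of a
+  package called \package{foo}. The source names \file{README},
+  \file{LICENSE}, \file{examples} and \file{src} are assumptions made
+  for this example. The comments give the targets one would expect
+  under the default \layout{scsh} layout, relative to the prefix.
+\begin{verbatim}
+(define-package "foo" (1 2) ()
+  ;; Loading script, expected in foo-1.2/load.scm
+  (install-file "load.scm" 'base)
+  ;; Same name as the source, expected in foo-1.2/doc/README
+  (install-file "README" 'doc)
+  ;; Renamed copy in a sub-directory of the location. The pair is
+  ;; quoted because install-file is a procedure.
+  ;; Expected in foo-1.2/doc/license/COPYING
+  (install-file '("LICENSE" . "COPYING") 'doc "license")
+  ;; The directory itself is installed.
+  ;; Expected in foo-1.2/scheme/examples/
+  (install-directory "examples" 'scheme)
+  ;; Only the contents of the directory are installed.
+  ;; Expected directly in foo-1.2/scheme/
+  (install-directory-contents "src" 'scheme))
+\end{verbatim}
+  The last two statements show the difference between
+  \texttt{install-directory}, which keeps the name of the source
+  directory in the target, and \texttt{install-directory-contents},
+  which does not.
+\end{example}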
+
+\definep{get-directory}{location install?}%
+Get the absolute name of the directory to which the current layout
+maps the abstract \param{location}. If \param{install?} is true, the
+directory is the one valid during installation; if it is false, the
+directory is the one valid after installation, that is when the
+package is later used.
+
+The distinction between installation-time and usage-time directories
+is necessary to support staged installation, as performed by package
+managers like Debian's APT.
+
+\definep{get-option-value}{option}%
+Return the value of the given command-line \param{option} (a symbol).
+This can be used to get the value of predefined options (like
+\cloption{--dry-run}) or package-specific options.
+
+\definep{with-output-to-load-script*}{thunk}%
+Evaluate \param{thunk} with the current output opened on the loading
+script of the current package. If this script already exists, its
+previous contents are deleted.
+
+\defines{with-output-to-load-script}{body ...}%
+Syntactic sugar for \texttt{with-output-to-load-script*}.
+
+\definep{write-to-load-script}{s-expression}%
+Pretty-print the \param{s-expression} to the loading script of the
+current package. If this script already exists, its previous
+contents are deleted.
+
+\begin{example}
+  A typical package definition file for a simple package called
+  \package{my\_package} whose version is 1.2 could look like this:
+\begin{verbatim}
+(define-package "my_package" (1 2) ()
+  (install-file "load.scm" 'base)
+  (install-directory-contents "scheme" 'scheme)
+  (install-file '("LICENSE" .
"COPYING") 'doc) + (install-directory-contents "doc" 'doc)) +\end{verbatim} + + With such a definition, invoking the installation script with + \file{/usr/local/} as prefix and \layout{fhs} as layout would have + the following effects: +\begin{enumerate} +\item The base directory + \file{/usr/local/share/scsh/modules/my_package-1.2} would be created + and file \file{load.scm} would be copied to it. +\item All the contents of the directory called \file{scheme} would be + copied to directory + \file{/usr/local/share/scsh/modules/my_package-1.2/scheme} which + would be created before, if needed. +\item File \file{LICENSE} would be copied to directory + \file{/usr/local/share/doc/my_package-1.2/} with name + \file{COPYING}. +\item All the contents of the directory called \file{doc} would be + copied to directory \file{/usr/local/share/doc/my_package-1.2/} +\item The package would be activated by creating a symbolic link with + name \file{/usr/local/share/scsh/modules/my_package} pointing to + \file{./my_package-1.2} +\end{enumerate} +\end{example} + +\subsection{Packages containing C code (for shared libraries)} + +Packages containing C code are more challenging to write, since all +the problems related to C's portability and incompatibilities between +the APIs of the various platforms have to be accounted for. +Fortunately, the GNU Autoconf system simplifies the management of +these problems, and authors of scsh packages containing C code are +strongly encouraged to use it. + +Integrating Autoconf into the installation procedure should not be a +major problem thanks to scsh's ability to run separate programs. + +\section{Packaging packages} + +Most important Unix systems today have one (or several) package +management systems which ease the installation of packages on a +system. In order to avoid confusion between these packages and the +scsh packages discussed above, they will be called \emph{system + packages} in what follows. 
+
+It makes perfect sense to provide system packages for scsh packages.
+System packages should, as far as possible, use the standard
+installation script described above to install scsh packages. This
+script currently provides some support for staged installations, which
+are required by several packaging systems.
+
+This support is provided through an additional option,
+\cloption{--dest-dir}, which specifies the root directory in which to
+install files. The files will then have to be moved from this location
+to their final location by the system packaging tools.
+
+(The \cloption{--dest-dir} option plays the same role as the
+\envvar{DESTDIR} variable which is typically given to \texttt{make
+  install}, for makefiles which support staging directories.)
+
+%% \section{Glossary}
+%% TODO define the following terms
+%% Version
+%% Target machine
+%% Package
+%% (Package) unpacking directory
+%% Layout
+%% (Abstract) location
+%% Package loading script
+
+\end{document}
\ No newline at end of file