Documented static heap linker. Such as it is.

shivers 1997-04-04 22:41:06 +00:00
parent 4c26efe136
commit bf28f0aa86
1 changed files with 121 additions and 8 deletions


@ -750,18 +750,131 @@ One occasionally hears rumours that this is being addressed
by the {\scm} development team.
\section{Statically linking heap images}
Brian Carlstrom has written code to process {\scm} heap images
into \ex{.o} files that can be linked with a virtual machine
binary to produce a standalone machine-code executable.
The static heap linker converts a {\scm} bytecode image contained
in a \ex{.image} file to a C representation. This C code is then compiled and
linked with a virtual machine, producing a single executable.
Some of the benefits are:
\begin{itemize}
\item Instantaneous start-up time.
\item Improved paging; scsh images can be shared between different
processes.
\item Vastly reduced GC copying---the whole initial image
is moved out of the heap, and neither traced nor copied.
\item The resulting program no longer depends on the filesystem for its
initial image.
\end{itemize}
The source code comes with the current distribution, but it has not been
integrated into the system or documented in time for this
release.
The static heap linker takes arguments in the following form:
\codex{scsh-hlink \var{image} \var{executable} [\var{option} \ldots]}
It reads in the heap image \var{image}, translates it into C code,
compiles the C code, and links it against the scsh vm, producing the
standalone binary file \var{executable}.
%Either he integrates it into the system and documents it for release
%0.4, or his body will soon be occupying a shallow grave behind Tech Square.
Each C file represents part of the heap image as a constant C \ex{long} vector
that looks something like this:
{\small\begin{verbatim}
const long p116[]={0x882,0x24,0x19,
0x882,(long)(&p19[785])+7,(long)(&p119[125])+7,
0x882,(long)(&p119[128])+7,(long)(&p119[131])+7,
0x882,(long)(&p102[348])+7,(long)(&p3[114])+7,
0xfc2,0x2030200,0x7100209,0x1091002,0x1c075a,
0x882,(long)(&p29[1562])+7,(long)(&p119[137])+7,
0x882,(long)(&p78[692])+7,(long)(&p119[140])+7,
.
.
.
};
\end{verbatim}}%
%
Translating to a C declaration gives us freedom from the various
object-file formats.\footnote{This idea is due to Jonathan Rees.}
Note that the \ex{const} declaration allows the compiler to put this array in the
text pages of the executable.
The heap is split into parts because many C compilers cannot handle
multi-megabyte initialised vector declarations.
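
The following is a minimal sketch, not the linker's actual output, of how
two such parts can refer to each other once the heap has been split.
The names \ex{part0}, \ex{part1}, \ex{PTR\_TAG}, \ex{untag} and
\ex{check\_links} are hypothetical, and the tag value 3 is an assumption
made for illustration; the real offsets and tags (such as the \ex{+7} above)
are whatever the VM defines. The sketch uses \ex{intptr\_t} where the
generated code uses \ex{long}, and, like the generated code, it relies on the
C compiler accepting a pointer cast to an integer in a static initialiser.
{\small\begin{verbatim}
#include <assert.h>
#include <stdint.h>

/* Assumption for this sketch: the low two bits of a word are a type
   tag, and the value 3 marks a pointer. */
#define PTR_TAG 3

/* Forward declaration, so that part0 can point into part1. */
extern const intptr_t part1[];

const intptr_t part0[] = {
    0x24,
    (intptr_t)(&part1[1]) + PTR_TAG,  /* tagged pointer into part1 */
};

const intptr_t part1[] = {
    0x19,
    0x2a,
    (intptr_t)(&part0[0]) + PTR_TAG,  /* tagged pointer back into part0 */
};

/* Strip the tag to recover the address a tagged word refers to. */
const intptr_t *untag(intptr_t word)
{
    return (const intptr_t *)(word - PTR_TAG);
}

/* The addresses are filled in by the linker and loader; the program
   itself does no pointer fixing before it can follow them. */
void check_links(void)
{
    assert(untag(part0[1]) == &part1[1]);
    assert(*untag(part0[1]) == 0x2a);
    assert(untag(part1[2]) == &part0[0]);
}
\end{verbatim}}%
%
Each part can live in its own C file, as long as every part it points
into has been declared \ex{extern} first.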
The allowed options to the heap linker are:
\begin{itemize}
\def\Item#1{\item{\ex{#1}}\\}
\Item{--temp \var{dir}} The temporary directory to hold \ex{.c} and \ex{.o} files.
The default is typically configured to be
\ex{/usr/tmp}, and can be overridden by the
environment variable \ex{TMPDIR}.
Example:
\codex{--temp /tmp}
\Item{--cc \var{command}} The command to run the C compiler.
The default can be overridden by the environment
variable \ex{CC}.
Example:
\codex{--cc "gcc -g -O"}
\Item{--ld \var{command}} The arguments passed to the C compiler when it is run as a linker.
The default can be overridden by the
environment variable \ex{LDFLAGS}.
Example:
\codex{--ld "-Wl,-E"}
\Item{--libs \var{libs}} The libraries needed to link the VM and heap.
The default can be overridden by the
environment variable \ex{LIBS}.
Example:
\codex{--libs "-ldld -lld -lm"}
\end{itemize}
Be warned that the current heap linker has many shortcomings.
\begin{itemize}
\item It is extremely slow. Really, really slow. Translating the standard
scsh heap image into a standalone binary takes well over an hour on a
40MB/133MHz Pentium system.
A memory-starved 486 could take all night.
\item It cannot be applied to itself. The current implementation
works by replacing some of the heap-dumping code. This means
you cannot load the heap-linker code into a scsh system and
subsequently use \ex{dump-scsh-program} to create a heap-linker
heap image.
\item The interface leaves a lot to be desired.
\begin{itemize}
\item It requires the heap image to be referenced by a file-name;
the linker will not allow you to feed it the input heap image
on a port.
\item The heap-image is linked against the vm contained in
\begin{tightcode}
/usr/local/lib/scsh/libscshvm.a\end{tightcode}
This is wired in at the time scsh is installed on your system.
\item There is no Scheme procedural interface.
\end{itemize}
\item The program produced uses the default VM argv parser \verb|process_args|
from the scsh source file \ex{main.c} to process the command line
before handing it off to the heap image's top-level procedure.
This is not what you want for many programs.
The system needs to be changed to allow users to override this default
with their own VM argument parsers.
\item A possible problem is the Unix limit on the number of command-line
arguments. The heap linker calls the C linker with a large number of
object files. It's conceivable that on some Unix systems this could fail,
either now or if scsh grows in the future. One solution would be to create
library archives of a few dozen object files each and then link the
resulting few dozen library archives to make the executable.
\end{itemize}
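
The archive-batching idea in the last item can be sketched as follows.
This is only an illustration of the arithmetic, not code from the heap
linker: the functions \ex{archive\_count} and \ex{archive\_command}, the
archive names \ex{heap0.a}, \ex{heap1.a}, \ldots, and the object-file names
\ex{p0.o}, \ex{p1.o}, \ldots\ are all hypothetical.
{\small\begin{verbatim}
#include <stdio.h>

/* Number of archives needed for n_objects files, per_archive at a time. */
int archive_count(int n_objects, int per_archive)
{
    return (n_objects + per_archive - 1) / per_archive;
}

/* Write the ar command line for archive k (counting from 0) into buf.
   Returns the length written, or -1 if buf is too small. */
int archive_command(char *buf, size_t size, int k,
                    int n_objects, int per_archive)
{
    int first = k * per_archive;
    int last = first + per_archive;
    int used, i;

    if (last > n_objects)
        last = n_objects;
    used = snprintf(buf, size, "ar rc heap%d.a", k);
    for (i = first; used >= 0 && (size_t)used < size && i < last; i++) {
        int n = snprintf(buf + used, size - used, " p%d.o", i);
        if (n < 0)
            return -1;
        used += n;
    }
    return (used < 0 || (size_t)used >= size) ? -1 : used;
}
\end{verbatim}}%
%
With 120 object files and 30 files per archive, \ex{archive\_count} gives
four archives, so the final link command names four files instead of 120.
Because the heap parts refer to one another, a linker that scans each
archive only once may need the archives listed more than once.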
In spite of these many shortcomings, we are providing the static linker
as it stands in this release so that people may get some experience with
it.
Here is an example of how one might use the heap linker:
\begin{code}
scsh-hlink scsh.image fastscsh\end{code}
We'd love it if someone would dive into the source and improve it.
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\section{Standard file locations}
Because the scshvm binary is intended to be used for writing shell
scripts, it is important that the binary be installed in a standard