%&lplain -*- latex -*-
\documentclass[11pt]{article}
|
|
|
|
\usepackage{ct,boxedminipage,draftfooters,code}
\input{headings}
|
|
|
|
\begin{document}
|
|
|
|
|
|
|
|
\makeatletter
|
|
|
|
%% What you frequently want when you say \tt:
|
|
|
|
\def\ttt{\tt\catcode``=13\@noligs\frenchspacing}
|
|
|
|
|
|
|
|
\newenvironment{inset}
|
|
|
|
{\bgroup\parskip=1ex plus 1ex\begin{list}{}%
|
|
|
|
{\topsep=0pt\rightmargin\leftmargin}%
|
|
|
|
\item[]}%
|
|
|
|
{\end{list}\leavevmode\egroup\global\@ignoretrue}
|
|
|
|
|
|
|
|
\newcommand{\ex}[1]{\mbox{\ttt{#1}}}
|
|
|
|
\newcommand{\eg}{\mbox{\em e.g.}}
|
|
|
|
\newcommand{\Eg}{\mbox{\em E.g.}}
|
|
|
|
\newcommand{\etc}{\mbox{\em etc.}}
|
|
|
|
\newcommand{\codex}[1]% One line, centered.
|
|
|
|
{$$\abovedisplayskip=.75ex plus 1ex minus .5ex%
|
|
|
|
\belowdisplayskip=\abovedisplayskip%
|
|
|
|
\abovedisplayshortskip=0ex plus .5ex%
|
|
|
|
\belowdisplayshortskip=\abovedisplayshortskip%
|
|
|
|
\hbox{\ttt #1}$$}
|
|
|
|
|
|
|
|
%%% Allow linebreaking here -- for use in \tt.
|
|
|
|
\newcommand{\ob}{\linebreak[0]}
|
|
|
|
|
|
|
|
\newcommand{\itum}[1]{\item{\bf #1}\\*}
|
|
|
|
\newcommand{\itam}[1]{\item{\bf #1}} % if following text is math
|
|
|
|
|
|
|
|
\newcommand{\var}[1]{{\mbox{\it{#1}}}}
|
|
|
|
|
|
|
|
\newcommand{\scm}{Scheme 48}
|
|
|
|
\newcommand{\Scheme}{{\sc{Scheme}}}
|
|
|
|
|
|
|
|
\makeatother
|
|
|
|
|
|
|
|
%%%%%% End of preamble %%%%%%
|
|
|
|
|
|
|
|
\author{Olin Shivers \\ {\ttt shivers@lcs.mit.edu}}
|
|
|
|
%\author{Brian Carlstrom \\ {\ttt bdc@lcs.mit.edu}}
|
|
|
|
\date{6/94}
|
|
|
|
\title{Cig---a C Interface Generator for {\scm}\footnote{
|
|
|
|
Copyright (c) 1994 by Olin Shivers.}}
|
|
|
|
\maketitle
|
|
|
|
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
|
|
|
|
|
|
|
Cig is a software package that allows {\scm} programs to link C code
|
|
|
|
into {\scm} binaries.
|
|
|
|
|
|
|
|
\section{{\scm} and Foreign Functions}
|
|
|
|
|
|
|
|
{\scm} is implemented on top of a virtual machine,
|
|
|
|
realised by a byte-code interpreter.
|
|
|
|
The virtual machine is a stack-oriented architecture,
|
|
|
|
designed for portability, dense encodings of {\Scheme} programs,
|
|
|
|
and safe execution of programs.
|
|
|
|
The v.m.\ is written in PreScheme \cite{PreScheme},
|
|
|
|
a statically scoped {\Scheme} subset designed for efficient
|
|
|
|
translation into traditional imperative languages such as C.
|
|
|
|
After being translated into C, the v.m.\ can be compiled by
|
|
|
|
any C compiler.
|
|
|
|
The rest of the {\scm} system is written in {\Scheme},
|
|
|
|
and compiled to machine-independent byte-codes by the {\scm} compiler.
|
|
|
|
|
|
|
|
The {\scm} v.m.\ includes a simple, primitive foreign-function
|
|
|
|
interface.
|
|
|
|
The v.m.\ includes a machine instruction \ex{external-call}
|
|
|
|
which takes an arbitrary number of operands on the stack.
|
|
|
|
The operand on the top of the stack is a pointer to a C function.
|
|
|
|
The rest of the operands are arguments being passed to that
|
|
|
|
C function from {\Scheme}.
|
|
|
|
|
|
|
|
The \ex{external-call} machine instruction calls the C function,
|
|
|
|
which is a two-argument function of the following type:
|
|
|
|
\codex{int f(long nargs, long* argv);}
|
|
|
|
The first argument passed to the C function, \ex{nargs}, is the number of
|
|
|
|
arguments being passed to it from the \ex{external-call} operation.
|
|
|
|
The second argument, \ex{argv}, is a pointer to a vector of the arguments;
|
|
|
|
it is in fact a pointer directly into the {\scm} v.m.\ argument stack.
|
|
|
|
This means that it is not wise to alter the contents of this vector.
|
|
|
|
The {\scm} values are passed as C \ex{long} values;
|
|
|
|
details on the representation of {\scm} values will be discussed later.
|
|
|
|
|
|
|
|
When the C function finishes, the value it returns is passed back
|
|
|
|
to {\Scheme} as the result of the \ex{external-call} operation.
|
|
|
|
Since the C function is architecturally realised as a single v.m.\ instruction,
|
|
|
|
it is atomic with respect to the {\scm} interrupt system, which only
|
|
|
|
services interrupts on v.m.\ instruction boundaries.
|
|
|
|
|
|
|
|
The \ex{external-call} v.m.\ instruction is exposed by the {\scm} compiler
|
|
|
|
directly as a procedure, in the \ex{externals} package:
|
|
|
|
\codex{(external-call \var{c-fun} $\var{arg}_1$ {\ldots} $\var{arg}_n$).}
|
|
|
|
The compiler generates code to push the arguments onto the stack,
|
|
|
|
and execute the instruction.
|
|
|
|
Note that due to the direction of stack growth in the {\scm} stack,
|
|
|
|
the argument vector passed to the C function will contain the arguments
|
|
|
|
in {\em reverse} order.
|
|
|
|
That is, \ex{argv[0]} will retrieve $\var{arg}_n$;
|
|
|
|
\ex{argv[1]} will retrieve $\var{arg}_{n-1}$.
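
The reversed indexing can be made concrete with a short C sketch.
The helper name \verb|sk_nth_arg| is hypothetical and not part of {\scm};
it only illustrates how a function with the \ex{nargs}/\ex{argv} linkage
might recover $\var{arg}_i$ in call order.
The words it returns are still raw {\scm} descriptors.

```c
/* Hypothetical helper, not part of Scheme 48: return the raw descriptor
 * of the i-th argument (1-based, in Scheme call order).  Because the
 * vector is reversed, arg_1 sits at argv[nargs-1] and arg_n at argv[0]. */
long sk_nth_arg(long nargs, const long *argv, long i)
{
    return argv[nargs - i];
}
```

For a three-argument call, \verb|sk_nth_arg(3, argv, 1)| reads \ex{argv[2]}
and \verb|sk_nth_arg(3, argv, 3)| reads \ex{argv[0]}.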
|
|
|
|
|
|
|
|
\section{The externals package}
|
|
|
|
The basic interface to the {\scm} v.m.\ foreign-function system
|
|
|
|
is contained in the \ex{externals} package, which exports three
|
|
|
|
procedures:
|
|
|
|
\begin{description}
|
|
|
|
\item[{(external-call \var{c-fun} $\var{arg}_1$ {\ldots} $\var{arg}_n$)}]
|
|
|
|
This procedure is described above.
|
|
|
|
|
|
|
|
\item[{(get-external \var{symbol})}]
|
|
|
|
This procedure looks up the value of a symbol in the relocated binary that is
|
|
|
|
being executed by the current {\scm} process.
|
|
|
|
The symbol is represented as a string.
|
|
|
|
{\em Do we include the prepended underbar?}
|
|
|
|
It is used to map names of C procedures to their addresses at run-time.
|
|
|
|
The pointer value returned, an absolute address, is encapsulated as a
|
|
|
|
special {\scm} type (what name?) that can be passed to \ex{external-call}
|
|
|
|
as a possible \var{c-fun} value to be called.
|
|
|
|
|
|
|
|
This procedure is merely a compiler-provided {\Scheme} binding of the {\scm}
|
|
|
|
v.m.\ instruction of the same name.
|
|
|
|
|
|
|
|
\item[{(lookup-all-externals)}]
|
|
|
|
When the {\scm} runtime maps a Unix \ex{ld} symbol to a foreign descriptor
|
|
|
|
with the \ex{get-external} procedure call, the symbol and its corresponding
|
|
|
|
descriptor are remembered in a global table.
|
|
|
|
The \ex{lookup-all-externals} procedure causes the runtime to scan this
|
|
|
|
table.
|
|
|
|
Each symbol is looked up again, and the up-to-date pointer value returned
|
|
|
|
is used to destructively update the original foreign descriptor datum.
|
|
|
|
|
|
|
|
This relinking step is important after resuming a {\scm} heap image
|
|
|
|
that was dumped out into a file, since the v.m.\ executable used to resume
|
|
|
|
the heap might be different from the one that was used to dump it---and so
|
|
|
|
the absolute locations of symbols might be different.
|
|
|
|
\end{description}
|
|
|
|
|
|
|
|
With these three basic procedures, we could (somewhat tediously)
|
|
|
|
construct a simple foreign-function call in {\scm} with the
|
|
|
|
code in figure \ref{fig:ffexamp}.
|
|
|
|
\begin{figure}
|
|
|
|
\begin{boxedminipage}{\textwidth}
|
|
|
|
\begin{centercode}
|
|
|
|
;;; Nuke our process -- the Unix exit() syscall.
|
|
|
|
;;; The argument EXIT-CODE will be off by a factor
|
|
|
|
;;; of four, due to 2-bit Scheme 48 lsb type tags.
|
|
|
|
(define my-exit
|
|
|
|
(let ((exit-cookie (get-external "exit"))) ; C exit()
|
|
|
|
(lambda (exit-code)
|
|
|
|
(external-call exit-cookie exit-code))))
|
|
|
|
|
|
|
|
(my-exit 0) ; Goodbye, cruel world.\end{centercode}
|
|
|
|
\vspace{-3ex}
|
|
|
|
\caption{Simple foreign-function call in {\scm}.}
|
|
|
|
\label{fig:ffexamp}
|
|
|
|
\end{boxedminipage}
|
|
|
|
\end{figure}
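
The factor-of-four remark in the comment of figure \ref{fig:ffexamp}
can be sketched in C.
The sketch assumes that a small integer $n$ is represented as the word $4n$,
that is, with two zero tag bits below the value; this is inferred from the
comment, and the function names are hypothetical.

```c
/* Assumption: a small Scheme integer n is stored as the word 4*n
 * (two low tag bits, both zero).  Names are illustrative only. */
long sk_enter_fixnum(long n)   { return n * 4; }  /* C value -> descriptor */
long sk_extract_fixnum(long d) { return d / 4; }  /* descriptor -> C value */
```

With this encoding the {\Scheme} integer 3 crosses the boundary as the
C \ex{long} 12.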
|
|
|
|
|
|
|
|
|
|
|
|
\section{Cig's role}
|
|
|
|
|
|
|
|
{\scm}'s foreign-function interface is simple and robust.
|
|
|
|
It relies only on C procedure semantics---the C code in the v.m.
|
|
|
|
emulator that actually does the foreign call is straightforward:
|
|
|
|
\codex{retval = (*c\_fun)(nargs, tos);}
|
|
|
|
This makes it easy to port the C code for the v.m.\ to new machines.
|
|
|
|
|
|
|
|
However, the price of this robustness and portability is that
|
|
|
|
the interface is primitive and awkward to use.
|
|
|
|
It is the responsibility of the C function to
|
|
|
|
\begin{itemize}
|
|
|
|
\item Check the arity of the {\Scheme} call.
|
|
|
|
\item Unpack the arguments from the \ex{argv} vector.
|
|
|
|
\item Check the types of the arguments.
|
|
|
|
\item Convert the arguments from their {\Scheme} representations
|
|
|
|
to their C representation.
|
|
|
|
\item Convert the return value from its C representation
|
|
|
|
to its {\Scheme} representation.
|
|
|
|
\item If the procedure would like to return multiple values,
|
|
|
|
this must be hand-simulated by arranging for the {\Scheme}
|
|
|
|
code to pass an extra, empty vector as an argument into
|
|
|
|
      which the C code can stash the extra return values.
|
|
|
|
\end{itemize}
|
|
|
|
The representation conversions may require storage allocation
|
|
|
|
and reclamation, either using \ex{malloc}/\ex{free} on the C
|
|
|
|
side, or with {\Scheme} allocators on the {\Scheme} side.
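
As a rough illustration of this bookkeeping, here is what a hand-written
stub for a two-integer routine might look like.
Every name beginning with \verb|sk_| is hypothetical, the tag layout
(two low bits, zero for small integers) is an assumption, and returning
$-1$ on failure is a placeholder for whatever error signalling a real stub
would use.

```c
#define SK_TAG_MASK   3L   /* assumed: the low two bits hold the type tag */
#define SK_FIXNUM_TAG 0L   /* assumed: small integers carry tag 00        */
#define SK_ERROR      (-1) /* placeholder error result for this sketch    */

/* The "actual" C routine being wrapped (hypothetical). */
static long sk_add2(long a, long b) { return a + b; }

static int sk_is_fixnum(long d) { return (d & SK_TAG_MASK) == SK_FIXNUM_TAG; }

/* Hand-written stub with the (nargs, argv) linkage. */
int sk_add2_stub(long nargs, long *argv)
{
    long a, b;

    if (nargs != 2)                       /* check the arity            */
        return SK_ERROR;
    a = argv[1];                          /* unpack: vector is reversed */
    b = argv[0];
    if (!sk_is_fixnum(a) || !sk_is_fixnum(b))   /* check the types      */
        return SK_ERROR;
    /* Scheme -> C conversion, the call, then C -> Scheme conversion.   */
    return (int)(sk_add2(a / 4, b / 4) * 4);
}
```

Even for this trivial routine, the stub is several times longer than the
routine it wraps.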
|
|
|
|
|
|
|
|
This is a lot of tedious bookkeeping.
|
|
|
|
The job of the cig package is to automate all of this---given
|
|
|
|
a single declarative form specifying the C procedure being called,
|
|
|
|
cig can automatically generate C and {\Scheme} stubs to perform all of the
|
|
|
|
bookkeeping listed above.
|
|
|
|
|
|
|
|
Cig works by having a special top-level form, \ex{define-foreign},
|
|
|
|
that is processed twice, once for {\Scheme}, and once for C.
|
|
|
|
On the {\Scheme} side, \ex{define-foreign} is a macro that expands
|
|
|
|
into a stub procedure performing the {\Scheme} side of the foreign-function
|
|
|
|
call.
|
|
|
|
The {\Scheme} stub doesn't call the actual foreign function directly.
|
|
|
|
Instead, it calls a C stub, which is also generated by cig.
|
|
|
|
This stub manages the C side of things: argument checking and rep
|
|
|
|
conversion, then calling the actual procedure.
|
|
|
|
The cig system has a stub compiler which scans a {\Scheme} file looking
|
|
|
|
for \ex{define-foreign} forms.
|
|
|
|
These forms are converted into the C stubs, which are written into
|
|
|
|
a file which can be compiled and linked in with the {\scm} v.m.
|
|
|
|
|
|
|
|
\subsection{An example}
|
|
|
|
A complete example is given in figure \ref{fig:openexamp}.
|
|
|
|
%%%
|
|
|
|
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
|
|
|
\newcommand{\subcaption}[1]
|
|
|
|
{\unskip\vspace{-2mm}\begin{center}\unskip\em#1\end{center}}
|
|
|
|
|
|
|
|
\begin{figure}
|
|
|
|
\newenvironment{subfig}[1]%
|
|
|
|
{\begin{boxedminipage}{\textwidth}\vskip 1ex\def\subcap{#1}%
|
|
|
|
\bgroup\small%
|
|
|
|
\begin{codeaux}[=0pt]{\topsep 0pt \leftmargin 1.5em}}
|
|
|
|
{\end{codeaux}\egroup\subcaption{\subcap}\vspace{0.1ex}\end{boxedminipage}}
|
|
|
|
\begin{subfig}{(a) Original {\Scheme} source code}
|
|
|
|
(foreign-source "#include <unistd.h>")
|
|
|
|
|
|
|
|
(define-foreign unix-open ; Scheme proc
|
|
|
|
(open (string fname) ; C fun & arg 1
|
|
|
|
(integer flags) ; arg 2
|
|
|
|
(integer mode)) ; arg 3
|
|
|
|
integer) ; ret value\end{subfig}
|
|
|
|
%%
|
|
|
|
%%
|
|
|
|
\vskip 1em
|
|
|
|
\begin{subfig}{(b) {\Scheme} stub, expanded by {\Scheme} macros}
|
|
|
|
(define unix-open
|
|
|
|
(let ((f (get-external "df_open")))
|
|
|
|
(lambda (fname flags mode)
|
|
|
|
(external-call f (check-arg string? fname unix-open)
|
|
|
|
(check-arg integer? flags unix-open)
|
|
|
|
(check-arg integer? mode unix-open)))))\end{subfig}
|
|
|
|
%%
|
|
|
|
%%
|
|
|
|
\vskip 1em
|
|
|
|
\begin{subfig}{(c) C stub, produced by the cig translator}
|
|
|
|
/* This is an Scheme48/C interface file,
|
|
|
|
** automatically generated by cig.
|
|
|
|
*/
|
|
|
|
|
|
|
|
#include <stdio.h>
|
|
|
|
#include <stdlib.h> /* For malloc. */
|
|
|
|
#include "libcig.h"
|
|
|
|
|
|
|
|
#include <unistd.h>
|
|
|
|
scheme_value df_open(long nargs, scheme_value *args)
|
|
|
|
{
|
|
|
|
extern int open(const char *, int , int );
|
|
|
|
scheme_value ret1;
|
|
|
|
int r1;
|
|
|
|
|
|
|
|
cig_check_nargs(3, nargs, "open");
|
|
|
|
r1 = open(cig_string_body(args[2]), EXTRACT_FIXNUM(args[1]), EXTRACT_FIXNUM(args[0]));
|
|
|
|
ret1 = ENTER_FIXNUM(r1);
|
|
|
|
return ret1;
|
|
|
|
}\end{subfig}
|
|
|
|
|
|
|
|
\caption{A \protect\ex{define-foreign} form,
|
|
|
|
and the {\Scheme} and C stubs produced from it.}
|
|
|
|
\label{fig:openexamp}
|
|
|
|
\end{figure}
|
|
|
|
|
|
|
|
Suppose we wish to provide access to the Unix system call \ex{open()}
|
|
|
|
from {\Scheme}.
|
|
|
|
We would write the code shown in part (a) of figure \ref{fig:openexamp},
|
|
|
|
which would be macro-expanded to the code shown in part (b).
|
|
|
|
This defines a procedure \ex{unix-open} that takes three arguments,
|
|
|
|
\ex{fname}, \ex{flags}, and \ex{mode}.
|
|
|
|
After the arguments are type-checked
|
|
|
|
(with the \ex{check-arg} auxiliary routine),
|
|
|
|
they are passed off to the C stub routine, \verb|df_open|, using the
|
|
|
|
primitive {\scm} two-argument \ex{nargs}/\ex{argv} linkage.
|
|
|
|
Cig generates the name of the stub by prepending ``\verb|df_|'' to the name
|
|
|
|
of the actual C procedure (the ``df'' stands for ``Define Foreign'').
|
|
|
|
|
|
|
|
The code for the \verb|df_open| procedure is generated by the
|
|
|
|
cig C-stub generator.
|
|
|
|
When cig detects the \ex{define-foreign} form,
|
|
|
|
it writes the C code of figure \ref{fig:openexamp}(c)
|
|
|
|
into an auxiliary source file.
|
|
|
|
%%
|
|
|
|
The \verb|df_open| procedure that is produced is responsible for interfacing
|
|
|
|
to the actual \ex{open} procedure on the C side of the call.
|
|
|
|
The \verb|EXTRACT_FIXNUM| and \verb|ENTER_FIXNUM| macros shift between
{\Scheme} and C representations for integers.
The \verb|cig_string_body| operation gives the stub a C \ex{char *} for the
{\Scheme} string argument.
These auxiliary functions and macros are made available by the \ex{libcig.h}
file, which is included at the top of every file generated by cig.
|
|
|
|
|
|
|
|
\subsection{Generating an interface}
|
|
|
|
|
|
|
|
So cig is used to generate {\Scheme}/C procedure interfaces by following these
|
|
|
|
steps:
|
|
|
|
\begin{enumerate}
|
|
|
|
\item Write a description of the C function being called using a
|
|
|
|
cig \ex{define-foreign} form.
|
|
|
|
\item Process the {\Scheme} source with the cig stub generator.
|
|
|
|
\item Compile the resulting C file with cc.
|
|
|
|
\item Link the original library of foreign code and the compiled
|
|
|
|
stub in with the {\scm} vm, either statically or with
|
|
|
|
a dynamic loader.
|
|
|
|
\item Load the \ex{define-foreign} forms into {\scm}, thus defining
|
|
|
|
the {\Scheme}-side interface.
|
|
|
|
\end{enumerate}
|
|
|
|
Steps 2--4 can be automated with appropriate Unix Makefile rules, which
|
|
|
|
we'll go into later.
|
|
|
|
|
|
|
|
\subsection{Disadvantages}
|
|
|
|
Cig has some disadvantages.
|
|
|
|
\begin{itemize}
|
|
|
|
\item Cig's \ex{define-foreign} forms must occur at top-level,
|
|
|
|
where they can be seen and processed by the C stub generator.
|
|
|
|
They cannot, for example, be produced by other macros.
|
|
|
|
\item Cig does not handle general C datatypes, such as structs,
|
|
|
|
and is not extensible.
|
|
|
|
\item ...other disadvantages?
|
|
|
|
\end{itemize}
|
|
|
|
|
|
|
|
\section{The \protect\ex{define-foreign} form}
|
|
|
|
The \ex{define-foreign} form has the syntax:
|
|
|
|
\begin{tightcode}\cdmath
|
|
|
|
(define-foreign \var{scheme-proc} (\var{c-proc} $\var{arg}_1$ \ldots)
|
|
|
|
$\var{ret-rep}_1$
|
|
|
|
\vdots
|
|
|
|
)\end{tightcode}
|
|
|
|
%
|
|
|
|
\begin{description}
|
|
|
|
\item[\var{scheme-proc}:]
|
|
|
|
The name of the defined {\Scheme} procedure; a symbol.
|
|
|
|
\item[\var{c-proc}:]
|
|
|
|
The name of the C procedure being called; a string or symbol.
|
|
|
|
A symbol is converted to a string by down-casing its printname.
|
|
|
|
\item[$\var{arg}_i$:]
|
|
|
|
The arguments to the procedure.
|
|
|
|
Each one is of the form
|
|
|
|
\codex{(\var{rep} \var{[param]})}
|
|
|
|
where \var{rep} gives the type and representation for the argument
|
|
|
|
being passed, and the optional \var{param} gives the name of the
|
|
|
|
parameter.
|
|
|
|
Representation forms are discussed in the following subsections.
|
|
|
|
The \var{param} field is purely for documentation purposes;
|
|
|
|
it is not required.
|
|
|
|
\item[$\var{ret-rep}_i$]
|
|
|
|
The type and representation of the return values from the procedure.
|
|
|
|
Multiple \var{ret-rep} forms cause multiple values to be returned
|
|
|
|
from the procedure.
|
|
|
|
The syntax of a \var{ret-rep} form is similar to \var{arg} forms.
|
|
|
|
\end{description}
|
|
|
|
|
|
|
|
\subsection{Simple representations}
|
|
|
|
Most of the complexity of the \ex{define-foreign} form is contained
|
|
|
|
in the specifications for the various data representations used to
|
|
|
|
pass data to and from the C routine.
|
|
|
|
However, the basic representations are fairly simple to use.
|
|
|
|
Table \ref{table:simple-reps} shows the forms used for the simplest
|
|
|
|
representations.
|
|
|
|
%
|
|
|
|
\begin{table}[tbh]
|
|
|
|
\begin{center}
|
|
|
|
\begin{tabular}{|lll|}
|
|
|
|
\hline
|
|
|
|
Rep & {\Scheme} value & C type \\
|
|
|
|
\hline\hline
|
|
|
|
\ex{char} & character & \ex{char} \\
|
|
|
|
\ex{bool} & any value & \ex{int} \\
|
|
|
|
\ex{integer} & integer & \ex{int} \\
|
|
|
|
\ex{string} & string & \ex{const char *} \\
|
|
|
|
&&\\
|
|
|
|
\ex{desc} & any value & \verb|scheme_value| \\
|
|
|
|
\ex{char-desc} & character & \verb|scheme_value| \\
|
|
|
|
\ex{integer-desc} & integer & \verb|scheme_value| \\
|
|
|
|
\ex{vector-desc} & vector & \verb|scheme_value| \\
|
|
|
|
\ex{string-desc} & string & \verb|scheme_value| \\
|
|
|
|
\ex{pair-desc} & pair & \verb|scheme_value| \\
|
|
|
|
&&\\
|
|
|
|
\verb|short_u| & integer & \verb|short_u| \\
|
|
|
|
\verb|long| & integer & \verb|long| \\
|
|
|
|
&&\\
|
|
|
|
\verb|size_t| & integer & \verb|size_t| \\
|
|
|
|
\verb|mode_t| & integer & \verb|mode_t| \\
|
|
|
|
\verb|gid_t| & integer & \verb|gid_t| \\
|
|
|
|
\verb|uid_t| & integer & \verb|uid_t| \\
|
|
|
|
\verb|off_t| & integer & \verb|off_t| \\
|
|
|
|
\verb|pid_t| & integer & \verb|pid_t| \\
|
|
|
|
\verb|uint_t| & integer & \verb|unsigned int| \\
|
|
|
|
\hline
|
|
|
|
\end{tabular}
|
|
|
|
\end{center}
|
|
|
|
\caption{Simple reps}
\label{table:simple-reps}
|
|
|
|
\end{table}
|
|
|
|
%
|
|
|
|
\begin{itemize}
|
|
|
|
\item The \ex{-desc} reps, such as \ex{char-desc} and \ex{integer-desc},
|
|
|
|
pass the actual {\scm} descriptor for the value, with no conversion.
|
|
|
|
|
|
|
|
\item The \verb|short_u| and \ex{long} reps are for integers of various sizes.
|
|
|
|
|
|
|
|
\item The \verb|size_t|, \verb|mode_t|, {\etc} reps are for POSIX code.
|
|
|
|
|
|
|
|
\item As a parameter rep, \ex{string} specifies a {\Scheme} string that
|
|
|
|
will not be modified by the C routine. The C stub is permitted
|
|
|
|
to either copy the {\Scheme} string to a freshly malloc'd C string,
|
|
|
|
or to use a pointer directly into the {\Scheme} string itself. (If
|
|
|
|
Cig chooses to use malloc'd storage for the C string, the storage
|
|
|
|
is freed after the C routine returns. In no case should the C routine
|
|
|
|
retain a pointer to the string it is passed.)
|
|
|
|
|
|
|
|
As a return-value rep, \ex{string} copies the C string to a freshly
|
|
|
|
allocated {\Scheme} string. The C string passed back must be
|
|
|
|
taken from the malloc pool; it is freed after the rep-conversion.
|
|
|
|
As a boundary case, the C ``string'' \ex{(char*)NULL} is returned
|
|
|
|
to {\Scheme} as \cd{#f}.
|
|
|
|
|
|
|
|
\item Note that there is no size checking performed when rep-converting between
|
|
|
|
      {\Scheme} integers and C \ex{int}s.
|
|
|
|
An extended-precision {\Scheme} integer will not be correctly converted
|
|
|
|
to a C \ex{int}, and C \ex{int}s that are too large for {\scm}'s unboxed
|
|
|
|
integers will not be correctly converted.
|
|
|
|
This may be fixed in a future version; for now, {\em caveat emptor.}
|
|
|
|
\end{itemize}
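
A stub author who needs the missing size check could add one by hand.
The following sketch assumes that unboxed {\scm} integers occupy a C
\ex{long} with two tag bits, so that a value $n$ is representable when $4n$
does not overflow; the function name is hypothetical.

```c
#include <limits.h>

/* Assumption: an unboxed integer n is stored as 4*n in a long.
 * Return nonzero when n can be tagged without overflow. */
int sk_fits_fixnum(long n)
{
    return n >= LONG_MIN / 4 && n <= LONG_MAX / 4;
}
```

A C routine could call such a check before handing a result back, and
report an error instead of returning a silently truncated value.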
|
|
|
|
|
|
|
|
{\em Some things are missing: the decl form, the no-decl form.}
|
|
|
|
|
|
|
|
\section{Complex Representations}
|
|
|
|
The full mechanism for specifying data representations in cig is fairly
|
|
|
|
complex.
|
|
|
|
|
|
|
|
\subsection{Simple reps and the rep table}
|
|
|
|
Most reps are given by their names: \ex{integer}, \ex{char}, and so forth.
|
|
|
|
These names are actually entries in two internal tables of Cig's that
specify the data-passing protocol used by the stub functions.
|
|
|
|
One table is for representations used as procedure arguments;
|
|
|
|
the other table is for representations used as return forms.
|
|
|
|
These tables parameterise the value-passing protocol used to
|
|
|
|
cross the {\Scheme}/C boundary.
|
|
|
|
|
|
|
|
Each entry in the argument table has four fields:
|
|
|
|
\begin{description}
|
|
|
|
\newcommand{\itm}[1]{\item[\ex{#1}:]}
|
|
|
|
\itm{Scheme-pred}
|
|
|
|
A symbol; the name of the {\Scheme} predicate used to type-check the argument
|
|
|
|
being passed to the procedure.
|
|
|
|
\ex{\#f} means no type-check.
|
|
|
|
|
|
|
|
\itm{C-decl}
|
|
|
|
A C declaration for the value in its C representation---the
|
|
|
|
type of the value actually passed to the foreign function.
|
|
|
|
This is given as a {\Scheme} \ex{format} string;
|
|
|
|
the \verb|~a| is where the C variable goes.
|
|
|
|
For example, the string \verb|"char *~a"| would be used to declare a string.
|
|
|
|
The expression \verb|(format #f c-decl "")| is used to compute a pure type---\eg,
|
|
|
|
for casts.
|
|
|
|
|
|
|
|
\itm{C-cvtr}
|
|
|
|
The {\Scheme}$\rightarrow$C rep-converter, which is applied
|
|
|
|
as a C function/macro by the C stub to the value passed in
|
|
|
|
from {\Scheme}.
|
|
|
|
The resulting value is passed to the C routine.
|
|
|
|
The \ex{C-cvtr} is specified as a string;
|
|
|
|
the empty string means the null coercion.
|
|
|
|
|
|
|
|
\itm{Post-C}
|
|
|
|
This is an optional form.
|
|
|
|
If specified, the C stub applies it to the C value {\em after} the
|
|
|
|
C routine returns, before returning to {\Scheme}.
|
|
|
|
Typically, it is used to free up a malloc'd block of storage
|
|
|
|
that was allocated during the argument's rep conversion.
|
|
|
|
It is specified as a string; \ex{\#f} means no post-C processing.
|
|
|
|
\end{description}
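
The role of the \verb|~a| slot in a \ex{C-decl} string can be mimicked in C.
This is only a sketch of the substitution that {\Scheme}'s \ex{format}
performs for cig; the function \verb|sk_expand_decl| is hypothetical and
handles a single \verb|~a| directive.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical illustration: replace the first "~a" in tmpl with var,
 * writing the result into out (capacity outsz bytes).
 * Returns 0 on success, -1 if tmpl has no "~a" or the result won't fit. */
int sk_expand_decl(const char *tmpl, const char *var, char *out, size_t outsz)
{
    const char *slot = strstr(tmpl, "~a");
    int n;

    if (slot == NULL)
        return -1;
    /* Text before the slot, then the variable, then text after it. */
    n = snprintf(out, outsz, "%.*s%s%s",
                 (int)(slot - tmpl), tmpl, var, slot + 2);
    return (n >= 0 && (size_t)n < outsz) ? 0 : -1;
}
```

Expanding \verb|"char *~a"| with the name \ex{arg1} gives the declaration
\ex{char *arg1}; expanding it with the empty string gives the bare type
\ex{char *}, as used for casts.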
|
|
|
|
|
|
|
|
The rep table for return forms also has four fields:
|
|
|
|
\begin{description}
|
|
|
|
\newcommand{\itm}[1]{\item[\ex{#1}:]}
|
|
|
|
\itm{C-decl}
|
|
|
|
This is identical to the argument table's \ex{C-decl} field.
|
|
|
|
It specifies the type of the value returned by the C routine.
|
|
|
|
The protocol used for multiple-value returns is discussed in
|
|
|
|
section ??.
|
|
|
|
|
|
|
|
\itm{C-cvtrs/safe}
|
|
|
|
This specifies a list of C function/macros that are applied
|
|
|
|
by the C stub to the value returned by the C routine.
|
|
|
|
Each resulting value must be a legal {\Scheme} descriptor.
|
|
|
|
An example might be a C macro that takes a C string and
|
|
|
|
produces its length {\em as a {\scm} fixnum}.
|
|
|
|
The list of safe converters is given as a list of strings,
|
|
|
|
\eg, \ex{("strlen")}.
|
|
|
|
|
|
|
|
\itm{C-cvtrs/alien}
|
|
|
|
This is a similar list of C function/macros, but these
|
|
|
|
converters are allowed to produce values that are not
|
|
|
|
legal {\Scheme} descriptors.
|
|
|
|
The C stub will pass these results back to {\Scheme} inside
|
|
|
|
special {\scm} alien structures to keep the GC from being
|
|
|
|
confused.
|
|
|
|
|
|
|
|
\itm{S-cvtr}
|
|
|
|
The C stub collects all the values produced by the \ex{c-cvtrs/safe}
|
|
|
|
and \ex{c-cvtrs/alien} converters, and passes them back to {\Scheme}.
|
|
|
|
The {\Scheme} stub applies the \ex{s-cvtr} form to these values;
|
|
|
|
it is responsible for producing the actual value returned to {\Scheme}.
|
|
|
|
The \ex{s-cvtr}'s arguments are the safe values, followed by
|
|
|
|
the alien values, in order.
|
|
|
|
Each alien value is packaged up inside a {\scm} alien structure.
|
|
|
|
For example, the C stub rep-converts a C string into its length
|
|
|
|
(as a {\Scheme} fixnum) and a \ex{char *} pointer to a malloc'd copy
|
|
|
|
of the string.
|
|
|
|
The \ex{s-cvtr}'s job is to take this length and the C string,
|
|
|
|
allocate a {\Scheme} string of the proper size, and pass the two
|
|
|
|
strings back into C, where the C string can be copied into the
|
|
|
|
{\Scheme} string and then freed.
|
|
|
|
|
|
|
|
The \ex{s-cvtr} is specified as a {\Scheme} expression; typically a
|
|
|
|
symbol naming a procedure.
|
|
|
|
As a special case, if \ex{s-cvtr} is \ex{\#f}, then exactly one safe rep
|
|
|
|
is being passed back from C, and this value is returned from the
|
|
|
|
{\Scheme} stub unchanged.
|
|
|
|
\end{description}
|
|
|
|
|
|
|
|
\subsection{Custom and compound representations}
|
|
|
|
Here is the full story on value reps.
|
|
|
|
Besides simply naming an existing simple rep, the Cig user can specify his own
|
|
|
|
representation, modify existing entries,
|
|
|
|
or build up compound representations out of existing simple ones.
|
|
|
|
|
|
|
|
The full syntax for an argument rep is:
|
|
|
|
\begin{code}\cdmath
|
|
|
|
\var{arg-rep} ::= \var{simple-rep}
|
|
|
|
| (rep \var{scheme-pred} \var{c-decl} \var{c-cvtr} [\var{post-c}])
|
|
|
|
| (C \var{c-decl})
|
|
|
|
\end{code}
|
|
|
|
The \ex{(rep \ldots)} form specifies a custom rep; it simply gives
|
|
|
|
the value for each field used by the argument rep-conversion protocol
|
|
|
|
(see section ???).
|
|
|
|
The \ex{(C \var{c-decl})} form specifies a C datum passed inside a
|
|
|
|
{\scm} \ex{alien} structure.
|
|
|
|
The C datum is extracted by the C stub and passed to the C routine.
|
|
|
|
|
|
|
|
The full syntax for return reps is:
|
|
|
|
\begin{code}\cdmath
|
|
|
|
\var{ret-rep} ::= \var{simple-rep}
|
|
|
|
| (multi-rep $\var{rep}_1$ \ldots $\var{rep}_n$)
|
|
|
|
| (rep \var{c-decl} ->scheme) ???
|
|
|
|
| (to-scheme \var{rep} \var{c-cvtr/safe})
|
|
|
|
| (C \var{c-decl})
|
|
|
|
            | (ignore [\var{c-decl}])
|
|
|
|
\end{code}
|
|
|
|
The \ex{multi-rep} form allows the user to rep-convert a return value
|
|
|
|
multiple ways, each resulting in a distinct return value from {\Scheme}.
|
|
|
|
The \ex{to-scheme} form allows the user to override \ex{rep}'s
|
|
|
|
\ex{c-cvtrs/safe} and \ex{c-cvtrs/alien} lists with a single
|
|
|
|
safe converter.
|
|
|
|
The \ex{C} form returns an arbitrary C value, bundled up inside a {\Scheme}
|
|
|
|
\ex{alien} value.
|
|
|
|
The \ex{ignore} form causes the C stub to throw away the value returned
|
|
|
|
by the C routine.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
%;;; A return-value rep is:
|
|
|
|
%;;; - A simple rep, as above.
|
|
|
|
%;;; - (MULTI-REP rep1 ... repn)
|
|
|
|
%;;; The single value returned from the C function is rep-converted
|
|
|
|
%;;; n ways, each resulting in a distinct return value from Scheme.
|
|
|
|
%;;; - (TO-SCHEME rep c->scheme)
|
|
|
|
%;;; Identical to REP, but use the single C->SCHEME form for the return
|
|
|
|
%;;; rep-conversion. There is no POST-SCHEME processing. This allows you
|
|
|
|
%;;; to use a special rep-converter on the C side, but otherwise use all
|
|
|
|
%;;; the properties of some standard rep. C->SCHEME is a string (or symbol).
|
|
|
|
%;;; - (C type)
|
|
|
|
%;;; Returns a raw C type. No rep-conversion. TYPE is a C type, represented
|
|
|
|
%;;; as a string (or a symbol).
|
|
|
|
%
|
|
|
|
%;;; Parameter reps:
|
|
|
|
%;;; - A simple rep is simply the name of a record in the rep table.
|
|
|
|
%;;; e.g., integer, string
|
|
|
|
%;;; - (REP scheme-pred c-decl to-c [free?])
|
|
|
|
%;;; A detailed spec, as outlined above. SCHEME-PRED is a procedure or #f.
|
|
|
|
%;;; C-DECL is a format string (or a symbol). TO-C is a format string
|
|
|
|
%;;; (or a symbol).
|
|
|
|
%;;; - (C type)
|
|
|
|
%;;; The argument is a C value, passed with no type-checking
|
|
|
|
%;;; or rep-conversion. TYPE is a format string (or a symbol).
|
|
|
|
|
|
|
|
%Parameter rep syntax is:
|
|
|
|
% simple-type
|
|
|
|
% (REP scheme-pred c-decl ->c [free])
|
|
|
|
% (C c-type) ==> (REP #f c-type "")
|
|
|
|
%
|
|
|
|
%Return rep syntax is:
|
|
|
|
% simple-type
|
|
|
|
% (REP c-decl ->scheme)
|
|
|
|
% (->SCHEME rep ->scheme) ; Override the ->scheme part.
|
|
|
|
% (C c-type) ==> (REP c-decl #f)
|
|
|
|
|
|
|
|
\section{Cig and make files}
|
|
|
|
The Cig stub generator can be invoked automatically by a Unix make(1) file.
|
|
|
|
The following implicit rule will cause Cig to scan a \ex{.scm} file
|
|
|
|
for top-level \ex{define-foreign} and \ex{foreign-source} forms, producing
|
|
|
|
a \ex{.c} file that can be compiled and linked in with your C code
|
|
|
|
and the {\scm} v.m.:
|
|
|
|
\begin{inset}
|
|
|
|
\begin{verbatim}
|
|
|
|
.scm.c:
|
|
|
|
cig < $< > $@
|
|
|
|
\end{verbatim}
|
|
|
|
\end{inset}
|
|
|
|
This rule uses the standalone stub generator program \ex{cig}, which reads
|
|
|
|
{\Scheme} forms from its standard input, and writes C stubs on its standard
|
|
|
|
output.
|
|
|
|
|
|
|
|
To generate the \ex{cig} program from its {\Scheme} source, \ex{cig.scm},
|
|
|
|
use something like the following Makefile rule:
|
|
|
|
\begin{inset}\small
|
|
|
|
\begin{verbatim}
|
|
|
|
cig: cig/cig.scm
|
|
|
|
gensym=$$$$ \
|
|
|
|
# First, build an image to do cig processing, \
|
|
|
|
# and put it in /usr/tmp: \
|
|
|
|
(echo ",batch"; \
|
|
|
|
echo ",flush"; \
|
|
|
|
echo ",flush maps source names files table"; \
|
|
|
|
echo ",load-config packages-plus.scm"; \
|
|
|
|
echo ",load-config cig/cig.scm"; \
|
|
|
|
echo ",load-package cig-standalone"; \
|
|
|
|
echo ",flush"; echo ",flush maps source names files table"; \
|
|
|
|
echo ",in cig-standalone"; \
|
|
|
|
echo ",build cig-standalone-toplevel /usr/tmp/cig.$$gensym") \
|
|
|
|
| $(LIB)/$(VM).simple \
|
|
|
|
\
|
|
|
|
#Insert a #! trigger: \
|
|
|
|
bin/image2script $(LIB)/$(VM).simple </usr/tmp/cig.$$gensym >cig/cig \
|
|
|
|
\
|
|
|
|
# Make it executable: \
|
|
|
|
chmod +x cig/cig \
|
|
|
|
\
|
|
|
|
# Flush the temp file: \
|
|
|
|
rm -f /usr/tmp/cig.$$gensym
|
|
|
|
\end{verbatim}
|
|
|
|
\end{inset}
|
|
|
|
|
|
|
|
\section{To Do}
|
|
|
|
\begin{verbatim}
|
|
|
|
Doc: 4 modules defined
|
|
|
|
\end{verbatim}
|
|
|
|
|
|
|
|
\end{document}
|