Add wget mirror
wget --mirror --no-parent --no-host-directories --cut-dirs 3 \
    https://www.ccs.neu.edu/home/lth/ffigen/

The following files are omitted from this commit:

960213.tar.gz
ffigen.tar.gz
lcc-3.4b.tar.gz
robots.txt
This commit is contained in:
parent
cce0f69fad
commit
61fc3061a1
|
@ -0,0 +1,90 @@
|
||||||
|
; -*- scheme -*-
|
||||||
|
;
|
||||||
|
; Suggestions for policy mechanisms in the FFIGEN back-end for Chez Scheme.
|
||||||
|
; These are currently *not* implemented, and are only intended as examples.
|
||||||
|
;
|
||||||
|
; The mechanisms fall into three categories: exclusion, overriding, and
|
||||||
|
; adaptation.
|
||||||
|
|
||||||
|
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
|
||||||
|
;
|
||||||
|
; Exclusion:
|
||||||
|
; At the outset, everything in the .ffi file is marked as referenced.
|
||||||
|
; The mechanisms for excluding stuff are based on an item's name.
|
||||||
|
|
||||||
|
; exclude-file takes a file name or list of file names and excludes every
|
||||||
|
; item defined in that file and files included by it.
|
||||||
|
|
||||||
|
(exclude-file '())
|
||||||
|
|
||||||
|
; exclude-structure takes a structure name (i.e. either "struct FOO"
|
||||||
|
; or "union FOO" or "FOO") or list of names and inhibits generation of
|
||||||
|
; constructors, destructors, accessors, and mutators for it and all
|
||||||
|
; typedefs derived from it. If the name is a typedef name and the
|
||||||
|
; structure named has a compiler-generated tag, then the structure
|
||||||
|
; named by this typedef is also excluded.
|
||||||
|
|
||||||
|
(exclude-structure "FILE")
|
||||||
|
|
||||||
|
; exclude-function excludes the named function(s).
|
||||||
|
|
||||||
|
(exclude-function "select")
|
||||||
|
|
||||||
|
; exclude-global excludes the named global variable.
|
||||||
|
|
||||||
|
(exclude-global "__iob")
|
||||||
|
|
||||||
|
|
||||||
|
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
|
||||||
|
;
|
||||||
|
; Overriding
|
||||||
|
|
||||||
|
; Override-prototype gives the named function a new prototype.
|
||||||
|
|
||||||
|
(override-prototype "fgets"
|
||||||
|
`(function (,(primitive-type 'string)
|
||||||
|
,(primitive-type 'int)
|
||||||
|
,(pointer-type (struct-type "FILE")))
|
||||||
|
,(pointer-type (primitive-type 'char))))
|
||||||
|
|
||||||
|
|
||||||
|
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
|
||||||
|
;
|
||||||
|
; Adaptation
|
||||||
|
|
||||||
|
; Short-policy says something about how to handle shorts. Three values are
|
||||||
|
; possible: warning, use-integer, and use-proxy. If use-integer is the
|
||||||
|
; value, then an integer-32 FFI argument will be used on the assumption that
|
||||||
|
; this is meaningful in the native API. If use-proxy is the value then a
|
||||||
|
; proxy function is generated which takes an integer argument and calls
|
||||||
|
; the real function with the argument cast to short.
|
||||||
|
|
||||||
|
(short-policy 'use-integer)
|
||||||
|
|
||||||
|
; Struct-param-policy says something about how to handle structure parameters.
|
||||||
|
; Values are: warning and use-proxy. If use-proxy is the value, then
|
||||||
|
; an FF will be generated which takes structure pointers and which names a
|
||||||
|
; proxy function (this is transparent to Scheme code), and the real function
|
||||||
|
; will be called by the proxy.
|
||||||
|
|
||||||
|
(struct-param-policy 'warning)
|
||||||
|
|
||||||
|
; Struct-return-policy says something about how to handle structure return
|
||||||
|
; values. Values are: warning, alloc-new, and pass-placeholder. If alloc-new
|
||||||
|
; is the value, then a proxy function will be generated which receives
|
||||||
|
; the return value, allocates an object on the heap for it, copies the value
|
||||||
|
; into the allocated memory, and returns a pointer to the memory.
|
||||||
|
; If pass-placeholder is the value, then an FF and proxy will be generated
|
||||||
|
; that take an extra argument (the first); that argument must be a pointer
|
||||||
|
; to a structure in which to place the value.
|
||||||
|
|
||||||
|
(struct-return-policy 'warning)
|
||||||
|
|
||||||
|
; Variadic-policy says something about how to handle variadic procedures.
|
||||||
|
; Values are: warning and exclude. If the value is exclude, a warning will
|
||||||
|
; be given and no FFI code will be generated; if the value is warning, then
|
||||||
|
; invalid FFI code will be generated.
|
||||||
|
|
||||||
|
(variadic-policy 'warning)
|
||||||
|
|
||||||
|
; eof
|
Binary file not shown.
|
@ -0,0 +1,135 @@
|
||||||
|
<HTML>
|
||||||
|
<HEAD>
|
||||||
|
<TITLE>FFIGEN Home Page</TITLE>
|
||||||
|
<LINK REV="made" HREF="mailto:lth@acm.org">
|
||||||
|
</HEAD>
|
||||||
|
<BODY>
|
||||||
|
<H1>FFIGEN</H1>
|
||||||
|
|
||||||
|
"A good foreign function interface is 25% code and 75% policy."
|
||||||
|
<HR>
|
||||||
|
<P>
|
||||||
|
|
||||||
|
FFIGEN (Foreign Function Interface GENerator) is a program suite that
|
||||||
|
facilitates the writing of translators from C header files to foreign
|
||||||
|
function interfaces for particular language implementations.
|
||||||
|
|
||||||
|
<P>
|
||||||
|
|
||||||
|
<img src="../ball.red.gif" alt="*">
|
||||||
|
FFIGEN Manifesto and Overview
|
||||||
|
<A href="manifesto.html">(HTML)</a> <A href="manifesto.ps.gz">(ps.gz, 26 KB)</A>
|
||||||
|
<BR>
|
||||||
|
<img src="../ball.red.gif" alt="*">
|
||||||
|
FFIGEN User's Manual
|
||||||
|
<a href="userman.html">(HTML)</a> <A href="userman.ps.gz">(ps.gz, 30 KB)</A>
|
||||||
|
<BR>
|
||||||
|
<img src="../ball.red.gif" alt="*">
|
||||||
|
FFIGEN Back-end for Chez Scheme Version 5
|
||||||
|
<A href="chez.ps.gz">(ps.gz, 52 KB)</A>
|
||||||
|
|
||||||
|
<P>
|
||||||
|
There are three motivating observations behind FFIGEN. The first is
|
||||||
|
that C header files are hard to parse because of the preprocessor,
|
||||||
|
general syntactic grunge, and the problem of getting the data layouts
|
||||||
|
right. The second is that foreign function interfaces differ widely and
|
||||||
|
that translations to different FFIs can't be the same, yet should share
|
||||||
|
work as much as possible. The third is that not all translations are
|
||||||
|
suitable for all purposes; there may be multiple valid translations -
|
||||||
|
each of which serves a different need - for any given language's FFI.
|
||||||
|
|
||||||
|
<P>
|
||||||
|
|
||||||
|
For these reasons, a translator from C header syntax to an FFI should
|
||||||
|
have two parts: one target-independent front-end that translates from
|
||||||
|
the header file into a rational intermediate form and which can be used
|
||||||
|
with all translators, and a target-dependent back-end that translates
|
||||||
|
from the intermediate form to an FFI for the target system, using a
|
||||||
|
translation policy to guide the translation. This design nicely
|
||||||
|
facilitates writing back-ends for multiple languages, multiple FFIs per
|
||||||
|
language, and multiple policies per FFI.
|
||||||
|
|
||||||
|
<P>
|
||||||
|
|
||||||
|
FFIGEN is a system that implements the split-translation philosophy.
|
||||||
|
|
||||||
|
<UL>
|
||||||
|
|
||||||
|
<LI>The FFIGEN front-end is based on the front-end of the freely
|
||||||
|
available, production quality, ANSI C compiler <em>lcc</em>. Using
|
||||||
|
<em>lcc</em> makes
|
||||||
|
the FFIGEN front end portable, complete, and extensible for special
|
||||||
|
purposes.
|
||||||
|
|
||||||
|
<P>
|
||||||
|
|
||||||
|
<LI>The FFIGEN back-ends can be small (a back-end for Chez Scheme that
|
||||||
|
handles nearly all of C is 350 lines of Scheme code, for example), and
|
||||||
|
can be written in any language. Scheme is the preferred language for
|
||||||
|
back-ends right now, because the output syntax of the front-end,
|
||||||
|
although easily changeable, is that of S-expressions, and because a
|
||||||
|
back-end written in Scheme is already available for new back-ends to
|
||||||
|
build on.
|
||||||
|
|
||||||
|
</UL>
|
||||||
|
|
||||||
|
<P>
|
||||||
|
|
||||||
|
The current version of FFIGEN is available as a set of modifications to
|
||||||
|
<em>lcc</em> version 3.4b; you also need to get the <em>lcc</em> sources.
|
||||||
|
The FFIGEN
|
||||||
|
distribution includes documentation on how to write back-ends and a
|
||||||
|
documented example back-end for the FFI of Chez Scheme version 5.
|
||||||
|
|
||||||
|
<P>
|
||||||
|
<B> This is a preliminary release of FFIGEN. It works, but
|
||||||
|
is neither complete nor polished.</B>
|
||||||
|
|
||||||
|
<P>
|
||||||
|
<img src="../ball.red.gif" alt="*">
|
||||||
|
Click <A href="ffigen.tar.gz">here</A> to download the
|
||||||
|
full FFIGEN distribution. (148 KB)
|
||||||
|
<BR>
|
||||||
|
This archive has not been updated with the fixes in the bug fix file (below).
|
||||||
|
<P>
|
||||||
|
<img src="../ball.red.gif" alt="*">
|
||||||
|
Click <A href="960213.tar.gz">here</A> to download bug fixes up to February 13, 1996. (29 KB) <BR>
|
||||||
|
Fixes to chez.sch to handle structs/unions that are declared but not
|
||||||
|
defined; function pointers; and unsigned shorts (a typo). Also a minor fix
|
||||||
|
to policy.sch to remove gratuitous non-standard-ness (use of reverse! rather
|
||||||
|
than reverse). Also included are the generated standard libraries for the Chez Scheme
|
||||||
|
back-end (unknowingly left out of distribution). Unpack in <em>lcc</em> main
|
||||||
|
directory.
|
||||||
|
<P>
|
||||||
|
<img src="../ball.red.gif" alt="*">
|
||||||
|
Click <A href="chez-policy.sch">here</A> to download an example of a Chez
|
||||||
|
Scheme policy file, left out of distribution.
|
||||||
|
<P>
|
||||||
|
<img src="../ball.red.gif" alt="*">
|
||||||
|
Click <A href="lcc-3.4b.tar.gz">here</A> to download the <em>lcc</em>
|
||||||
|
3.4b distribution.
|
||||||
|
(965 KB)
|
||||||
|
|
||||||
|
<P>
|
||||||
|
|
||||||
|
<HR>
|
||||||
|
<P>
|
||||||
|
|
||||||
|
Related systems:
|
||||||
|
|
||||||
|
<UL>
|
||||||
|
<LI> Kenneth Russell's <A href="http://www-white.media.mit.edu/~kbrussel/Header2Scheme">Header2Scheme</A>.
|
||||||
|
<LI> David Beazley's <A href="http://www.cs.utah.edu/~beazley/SWIG/">SWIG</A> system.
|
||||||
|
</UL>
|
||||||
|
|
||||||
|
<P>
|
||||||
|
<HR>
|
||||||
|
<P>
|
||||||
|
|
||||||
|
The <A HREF="todo.html">FFIGEN to-do list</A>.
|
||||||
|
|
||||||
|
<P>
|
||||||
|
<A HREF="mailto:lth@acm.org"><I>lth@acm.org</I></A><BR>
|
||||||
|
24 May 2000
|
||||||
|
</BODY>
|
||||||
|
</HTML>
|
|
@ -0,0 +1,241 @@
|
||||||
|
<!-- -*- mode: html; mode: font-lock -*-
|
||||||
|
|
||||||
|
Hand-translated from LaTeX to HTML by lth on 2000-05-16,
|
||||||
|
converted footnotes to in-line text, and inserted hyperlinks.
|
||||||
|
No other changes. -->
|
||||||
|
|
||||||
|
<html>
|
||||||
|
<head>
|
||||||
|
<title>FFIGEN Manifesto and Overview</title>
|
||||||
|
</head>
|
||||||
|
|
||||||
|
<body>
|
||||||
|
|
||||||
|
<center>
|
||||||
|
<h1>FFIGEN Manifesto and Overview</h1><br>
|
||||||
|
Lars Thomas Hansen <br>
|
||||||
|
<tt>lth@cs.uoregon.edu</tt><br>
|
||||||
|
February 6, 1996
|
||||||
|
</center>
|
||||||
|
|
||||||
|
<blockquote>
|
||||||
|
<p>FFIGEN (Foreign Function Interface GENerator) is a program suite which
|
||||||
|
facilitates the writing of translators from C header files to foreign
|
||||||
|
function interfaces for particular language implementations.</p>
|
||||||
|
|
||||||
|
<p>On a more general level, FFIGEN is a statement about how such
|
||||||
|
translators should be structured for maximum usability, namely as a
|
||||||
|
single translator from C to a rational intermediate language and as
|
||||||
|
multiple translators from the intermediate language to separate FFI
|
||||||
|
translations. In the present document I motivate this two-level
|
||||||
|
structure by arguing that the many policy questions inherent in choosing
|
||||||
|
a mapping from one language to another cannot be accommodated in a single
|
||||||
|
translator, and that the two-level structure promotes significant code
|
||||||
|
reuse. Companion documents present the program suite itself.</p>
|
||||||
|
</blockquote>
|
||||||
|
|
||||||
|
<h2>1. Manifesto</h2>
|
||||||
|
|
||||||
|
<p>Many language implementations have mechanisms which provide support for
|
||||||
|
call-outs to other, typically more primitive, languages. In particular,
|
||||||
|
implementations of very-high-level languages like Scheme, Common Lisp,
|
||||||
|
Standard ML, and Haskell support call-outs to system-level languages,
|
||||||
|
typically C. Other examples include the support for call-outs to C and
|
||||||
|
assembly language in C++, the EXTRINSIC directive in HPF, and the
|
||||||
|
<tt><*EXTERNAL*></tt> pragma in DEC SRC Modula-3. Mechanisms to call out
|
||||||
|
to other languages are typically called <em>foreign function
|
||||||
|
interfaces</em> (FFIs). The purpose of an FFI is often to gain access to
|
||||||
|
functionality which is not (efficiently) expressible in the language
|
||||||
|
itself; other times the FFI is used to allow the program to interface to
|
||||||
|
existing libraries.</p>
|
||||||
|
|
||||||
|
<p>FFIs are only rarely part of the language definition; the only examples
|
||||||
|
I can think of are the support for C and assembly in C++ and the
|
||||||
|
EXTRINSIC directive in HPF. More typically, each language
|
||||||
|
implementation has its own idiosyncratic and often ad-hoc mechanism for
|
||||||
|
supporting foreign data types, functions, and variables. The mechanisms
|
||||||
|
are not standardized probably because they depend to a large extent on
|
||||||
|
the calling conventions of the procedure being called, the operating
|
||||||
|
system on which the program is running, the architecture of the machine,
|
||||||
|
the data types of the language being called, the version of the
|
||||||
|
compilers for the host and foreign languages, and so on. (In the
|
||||||
|
following I will refer to a point in the space made from the product of
|
||||||
|
the preceding attributes as a <em>target</em>.) Since the system
|
||||||
|
dependencies are considerable, it is unlikely that a fully general and
|
||||||
|
portable FFI can be defined for a language, and in addition, an
|
||||||
|
interface that works with all targets is likely to be neither functional
|
||||||
|
nor convenient. The chances for any portable, standardized language to
|
||||||
|
adopt a non-trivial FFI therefore seem slight. This is not to say that
|
||||||
|
an adequate job can't be done in many cases--for example, Franz Allegro
|
||||||
|
Common Lisp sports a sophisticated FFI which supports C and Fortran
|
||||||
|
seemingly very well--only that no <em>standard</em> and <em>general</em>
|
||||||
|
solution is likely to emerge.</p>
|
||||||
|
|
||||||
|
<p>Based on these observations, an approach to inter-language calling would
|
||||||
|
be to accept the fact that FFIs are implementation-dependent and instead
|
||||||
|
concentrate our effort on a higher level of abstraction: that of the
|
||||||
|
library interface. Even if the FFI is target-dependent, most of the
|
||||||
|
time the interface to a library is not (which is the beauty of an
|
||||||
|
interface in the first place). If, for each library, there existed a
|
||||||
|
reasonable definition of its interface, then a program could take that
|
||||||
|
definition and generate FFI code for the library for a given target.
|
||||||
|
This is the approach advocated by the creators of the ILU system (see
|
||||||
|
section 3).</p>
|
||||||
|
|
||||||
|
<p>However, manufacturers of libraries are <em>not</em> distributing
|
||||||
|
reasonable definitions of the interfaces to their libraries. All you
|
||||||
|
usually get is a C or C++ header file. A header file is not a
|
||||||
|
reasonable definition of the interface because of the baggage it
|
||||||
|
carries: nested include files, preprocessor macros, conditional
|
||||||
|
compilation, syntactic peculiarities, implementation language target
|
||||||
|
dependencies, and so on. In the best of all worlds, the manufacturer
|
||||||
|
would distribute the interfaces in an interface definition language like
|
||||||
|
the Object Management Group's IDL or ILU's ISL, and maybe one day that
|
||||||
|
will be common. In the meantime, we must fend for ourselves.</p>
|
||||||
|
|
||||||
|
<p>What we must do is to provide a translator which takes as its input not
|
||||||
|
a reasonable definition but instead a C or C++ header file or set of
|
||||||
|
header files, and produces as its output the FFI code for the library
|
||||||
|
for a given target. However, such a program is likely to be complicated
|
||||||
|
and there will be one version for each target. Maintaining all these
|
||||||
|
translators will be an unpleasant task. We could of course have one
|
||||||
|
translator, to IDL or ISL, and translators from the interface language
|
||||||
|
to the FFI, and as we will see, this is a variation on the mechanism
|
||||||
|
implemented by FFIGEN.</p>
|
||||||
|
|
||||||
|
<p>An additional important problem is that there is not one but several
|
||||||
|
translations for every target. A given interface can be translated to
|
||||||
|
any of several FFIs depending on the desired <em>policy</em> for the
|
||||||
|
translation. For example, consider a function
|
||||||
|
|
||||||
|
<pre>
|
||||||
|
char *fgets(char*, int, FILE*).
|
||||||
|
</pre>
|
||||||
|
|
||||||
|
What does <tt>char*</tt> translate to? Consider the FFI provided by Chez
|
||||||
|
Scheme version 5. It has a <tt>string</tt> type which in a parameter
|
||||||
|
position causes the address of the first character of the string
|
||||||
|
argument to be passed to the function, but which in the return position
|
||||||
|
causes the characters to be copied from the storage pointed to by the
|
||||||
|
return value (if not <tt>NULL</tt>) into a fresh Scheme string. So if we
|
||||||
|
translate <tt>char*</tt> as <tt>string</tt>, we end up with (since
|
||||||
|
<tt>FILE*</tt> is translated as an <tt>unsigned int</tt>)
|
||||||
|
|
||||||
|
<pre>
|
||||||
|
(define fgets
|
||||||
|
(foreign-function "fgets"
|
||||||
|
(string integer-32 unsigned-32)
|
||||||
|
string))
|
||||||
|
</pre>
|
||||||
|
|
||||||
|
which is expensive because the string is (needlessly) copied on return.
|
||||||
|
On the other hand, we can treat a <tt>char*</tt> as "just a pointer" and
|
||||||
|
translate as:
|
||||||
|
|
||||||
|
<pre>
|
||||||
|
(define fgets
|
||||||
|
(foreign-function "fgets"
|
||||||
|
(unsigned-32 integer-32 unsigned-32)
|
||||||
|
unsigned-32))
|
||||||
|
</pre>
|
||||||
|
|
||||||
|
but this does not let us access the characters in the buffer using
|
||||||
|
Scheme's string functions, since the buffer is not a string. In the
|
||||||
|
end, it appears that no fixed translation for <tt>char*</tt> is possible;
|
||||||
|
even if a fixed translation (and then: which one of them?) is adequate
|
||||||
|
in most situations, there will be special cases. (Arguably, it
|
||||||
|
would have been better for <tt>fgets()</tt> to return a truth value or the
|
||||||
|
number of characters read.)</p>
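<p>A minimal C sketch of the point made above, under stated assumptions: on
success <tt>fgets()</tt> returns the same pointer it was given as its first
argument, so a <tt>string</tt> return type copies data the caller already
holds. The wrapper name <tt>read_line_len</tt> is hypothetical and is not part
of FFIGEN or the C library; it only illustrates the alternative of returning
the number of characters read.</p>

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical wrapper (not part of FFIGEN or the C library): reads one
 * line with fgets() and returns the number of characters stored in buf
 * before the terminating NUL, or -1 on end-of-file or error. */
int read_line_len(char *buf, int size, FILE *fp)
{
    assert(buf != NULL && size > 0 && fp != NULL);
    char *ret = fgets(buf, size, fp);
    if (ret == NULL)
        return -1;
    /* On success fgets() returns its first argument, so the return
     * value carries no information beyond "not NULL". */
    assert(ret == buf);
    return (int)strlen(buf);
}
```

<p>A back-end could expose such a wrapper with an integer return type and
leave the representation of the buffer to the translation policy.</p>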
|
||||||
|
|
||||||
|
<p>The bottom line is, there is a lot of policy that goes into a
|
||||||
|
translation into a specific FFI. Hence we have a slogan (the core of
|
||||||
|
the Manifesto):</p>
|
||||||
|
|
||||||
|
<blockquote>
|
||||||
|
A good foreign function interface is 25% code and 75% policy.
|
||||||
|
</blockquote>
|
||||||
|
|
||||||
|
<p>It should be a goal, then, to separate the arduous task of parsing and
|
||||||
|
type-checking C headers and translating them into a rational
|
||||||
|
intermediate form, from the task of translating the intermediate form
|
||||||
|
into an FFI specification for a given target and translation policy.</p>
|
||||||
|
|
||||||
|
<h2>2. The FFIGEN System</h2>
|
||||||
|
|
||||||
|
<p>I have written a program, which I call <em>ffigen</em>, which takes
|
||||||
|
as its input a C header file and produces as its output a rational
|
||||||
|
translation of the interface defined by the header file. A rational
|
||||||
|
translation is one in which unnecessary or redundant syntax has been
|
||||||
|
removed, preprocessor macros have been expanded, and preprocessor
|
||||||
|
conditionals have been resolved so that definitions have been included
|
||||||
|
or excluded correspondingly. The exact format of the intermediate code
|
||||||
|
is described in a companion document, the <a href="userman.html">FFIGEN
|
||||||
|
User's Manual</a>. <em>ffigen</em> functions as the <em>front-end</em>
|
||||||
|
of a system which translates C headers into foreign function
|
||||||
|
interfaces.</p>
|
||||||
|
|
||||||
|
<p>Each target system will have one or more specific <em>back-ends</em> which
|
||||||
|
take the intermediate form and produce translations for particular
|
||||||
|
targets and translation policies. Substantial parts of the back-end
|
||||||
|
code are largely target-independent and can therefore be shared by
|
||||||
|
multiple back-ends.</p>
|
||||||
|
|
||||||
|
<p>I have written one back-end to serve as a sample; it produces FFI code
|
||||||
|
for Chez Scheme version 5. It is documented in a companion document,
|
||||||
|
<em>FFIGEN Back-end for Chez Scheme Version 5</em>.</p>
|
||||||
|
|
||||||
|
|
||||||
|
<h2>3. Related Work</h2>
|
||||||
|
|
||||||
|
<p>Kenneth B. Russell of MIT has implemented a system called Header2Scheme
|
||||||
|
which translates C++ to the FFI of the SCM Scheme system. FFIGEN and
|
||||||
|
Header2Scheme are fairly different at this point. My goal with FFIGEN
|
||||||
|
was to cover all of ANSI C including the preprocessor in a reasonable
|
||||||
|
way; this is doable because ANSI C is a small, fixed, and fairly simple
|
||||||
|
language. C++, on the other hand, is a very large, changing, and
|
||||||
|
complex language, and Header2Scheme therefore handles only part of it at
|
||||||
|
this time (as of version 1.2, it does not handle preprocessor macros,
|
||||||
|
typedefs, or enums). In addition, my emphasis was on not fixing policy
|
||||||
|
at all, which gives great freedom (and more work) to back-end writers,
|
||||||
|
whereas Russell has mostly fixed the policy. On the other hand,
|
||||||
|
Header2Scheme allows some policy decisions to be expressed in auxiliary
|
||||||
|
files given to the translator, and I have yet to experiment with these
|
||||||
|
mechanisms in FFIGEN. Header2Scheme is available from URL
|
||||||
|
<pre>
|
||||||
|
http://www-white.media.mit.edu/~kbrussel/Header2Scheme
|
||||||
|
</pre>
|
||||||
|
</p>
|
||||||
|
|
||||||
|
<p>A message (<tt><1996Jan17.121933.25825@chemabs.uucp></tt>) posted
|
||||||
|
to the Usenet group <tt>comp.lang.scheme</tt> (among others) alleged that
|
||||||
|
Apple has a translator for their Dylan implementation which will take a
|
||||||
|
C header file and generate Dylan FFI glue for it. I know nothing else
|
||||||
|
about this system (but would appreciate hearing about it from anyone who
|
||||||
|
knows).</p>
|
||||||
|
|
||||||
|
<p>The ILU (Inter-Language Unification) system from Xerox PARC provides
|
||||||
|
cross-language calling functionality for modules which have interfaces
|
||||||
|
specified in ISL, the ILU interface definition language. ILU will take
|
||||||
|
the interfaces and produce stubs (glue, as it were) for the languages so
|
||||||
|
that they can call each other. The ISL file specifies the interface
|
||||||
|
somewhat abstractly in terms of data types which are meaningful in ISL
|
||||||
|
but which have various mappings in the target languages; again, one
|
||||||
|
mapping is assumed to fit all.</p>
|
||||||
|
|
||||||
|
<h2>4. Acknowledgements</h2>
|
||||||
|
|
||||||
|
<p>FFIGEN is based on the <em>lcc</em> ANSI C compiler. See the <a
|
||||||
|
href="userman.html">FFIGEN User's Manual</a> for full acknowledgements
|
||||||
|
and a copyright notice.</p>
|
||||||
|
|
||||||
|
<p>This work has been supported by ARPA under U.S. Army grant
|
||||||
|
No. DABT63-94-C-0029, "Programming Environments, Compiler Technology
|
||||||
|
and Runtime Systems for Object Oriented Parallel Processing".</p>
|
||||||
|
|
||||||
|
<hr>
|
||||||
|
<address>
|
||||||
|
<A HREF="mailto:lth@acm.org">lth@acm.org</A>
|
||||||
|
</address>
|
||||||
|
<em>24 May 2000</em>
|
||||||
|
</body>
|
||||||
|
</html>
|
Binary file not shown.
|
@ -0,0 +1,92 @@
|
||||||
|
<HTML>
|
||||||
|
|
||||||
|
<HEAD>
|
||||||
|
<TITLE>FFIGEN To-do list</TITLE>
|
||||||
|
<LINK REV="made" HREF="mailto:lth@acm.org">
|
||||||
|
</HEAD>
|
||||||
|
|
||||||
|
<BODY>
|
||||||
|
<H2>FFIGEN To-do list</H2>
|
||||||
|
|
||||||
|
Updated 14 June 2000.
|
||||||
|
|
||||||
|
<H3>Intermediate format features</H3>
|
||||||
|
|
||||||
|
<UL>
|
||||||
|
<LI> Full ANSI C support:
|
||||||
|
<UL>
|
||||||
|
<LI> [done] Bitfields.
|
||||||
|
<LI> General support for type qualifiers.
|
||||||
|
</UL>
|
||||||
|
<LI> Output a machine description.
|
||||||
|
<LI> Output struct/union sizes.
|
||||||
|
<LI> Output structure field offsets.
|
||||||
|
<LI> Output line and column information.
|
||||||
|
<LI> Retain and output comments.
|
||||||
|
<LI> Output source file information (the name of the input file to
|
||||||
|
<code>lcc -ffigen</code>); this can
|
||||||
|
be useful since the back end can generate C files which
|
||||||
|
<code>#include</code> the source header file.
|
||||||
|
<LI> Support certain extensions: Microsoft __huge, __near, __far, __based,
|
||||||
|
__cdecl, __pascal; GNU __inline; others?
|
||||||
|
</UL>
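<p>A rough C sketch of how the size and field-offset items in the list above
might be computed, using the standard <code>sizeof</code> and
<code>offsetof</code> facilities. The record format and the name
<code>describe_field</code> are assumptions made for illustration; they are
not the FFIGEN intermediate format.</p>

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Example structure whose layout is to be reported. */
struct sample { char tag; double value; };

/* Hypothetical record format, not FFIGEN's: writes ("name" offset size)
 * into out. Returns 0 on success, -1 if out is too small. */
int describe_field(char *out, size_t cap, const char *name,
                   size_t offset, size_t size)
{
    assert(out != NULL && name != NULL);
    int n = snprintf(out, cap, "(\"%s\" %zu %zu)", name, offset, size);
    return (n >= 0 && (size_t)n < cap) ? 0 : -1;
}
```

<p>A caller would pass <code>offsetof(struct sample, value)</code> and
<code>sizeof(double)</code> for the second field; the actual numbers depend on
the target's alignment rules, which is why they belong in the output of the
front-end rather than in each back-end.</p>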
|
||||||
|
|
||||||
|
<H3>Processing (both front-end and back-end)</H3>
|
||||||
|
|
||||||
|
<UL>
|
||||||
|
<LI> Some general support for a policy file?
|
||||||
|
<LI> More intelligent macro-expansion support: macros should be expanded
|
||||||
|
as far as possible, and extraneous cruft should be removed so that the
|
||||||
|
back end can produce better translations.
|
||||||
|
<LI> Support for some form of tokenized macros to support certain regular
|
||||||
|
and nice rewrites? C libraries like Open Inventor use macros
|
||||||
|
heavily in a virtual-function like style:
|
||||||
|
<pre>
|
||||||
|
#define SoSphSetOverride(_this, state) \
|
||||||
|
SoNodeSetOverride((SoNode *)_this, state)
|
||||||
|
</pre>
|
||||||
|
and it would be nice to provide some support for such cases in the form
|
||||||
|
of already-tokenized output.
|
||||||
|
</UL>
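<p>The macro style mentioned in the last item above can be sketched in C as
follows. Only the macro itself is taken from the text; the definitions of
<code>SoNode</code>, <code>SoSph</code>, and the body of
<code>SoNodeSetOverride</code> are hypothetical stand-ins, not the real Open
Inventor declarations.</p>

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the library's base node type. */
typedef struct SoNode { int override_flag; } SoNode;

/* "Derived" type: the base node is the first member, so a pointer to an
 * SoSph may be converted to a pointer to its SoNode. */
typedef struct SoSph { SoNode base; float radius; } SoSph;

/* Base-class operation that every node type shares. */
void SoNodeSetOverride(SoNode *node, int state)
{
    assert(node != NULL);
    node->override_flag = state;
}

/* The macro from the to-do item: forwards to the base-class function. */
#define SoSphSetOverride(_this, state) \
    SoNodeSetOverride((SoNode *)_this, state)
```

<p>Because the macro body is a single call with a cast, a tokenized form of
it would let a back-end recognize the pattern and emit a direct binding to
<code>SoNodeSetOverride</code>.</p>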
|
||||||
|
|
||||||
|
<H3>Implementation features</H3>
|
||||||
|
<UL>
|
||||||
|
<LI> Move to lcc 4.1, and ASDL.
|
||||||
|
<LI> [done] Move to lcc 3.6.
|
||||||
|
<LI> [done] Proper integration with lcc. Currently, it uses the lcc driver but
|
||||||
|
it generates code, performs assembly, and produces file.o (which it need
|
||||||
|
not do). In addition, the output file is called SYMBOLS but should
|
||||||
|
rather be called filename.ffi.
|
||||||
|
</UL>
|
||||||
|
|
||||||
|
<H3>Known bugs</H3>
|
||||||
|
<UL>
|
||||||
|
<LI> [done] Currently the rhs of a macro is output without any whitespace. This
|
||||||
|
is not correct if there are two adjacent identifiers or reserved words,
|
||||||
|
which happens in declarations (consider "const int blah"). [Harold]
|
||||||
|
</UL>
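<p>A minimal sketch of one way to avoid the whitespace bug described above:
when re-emitting a macro body, insert a space only between two adjacent
tokens that would otherwise merge into one word. The helper name
<code>join_tokens</code> is hypothetical, and this is not necessarily how the
fix was made in FFIGEN; the sketch covers identifiers, keywords, and numbers
only, not adjacent operators.</p>

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* True if c can be part of an identifier, keyword, or number. */
static int is_word_char(int c)
{
    return isalnum((unsigned char)c) || c == '_';
}

/* Hypothetical helper: concatenates n tokens into out (capacity cap),
 * inserting one space only where two word-like tokens would otherwise
 * run together. Returns 0 on success, -1 if out is too small. */
int join_tokens(const char *const toks[], size_t n, char *out, size_t cap)
{
    size_t len = 0;
    assert(out != NULL && cap > 0);
    out[0] = '\0';
    for (size_t i = 0; i < n; i++) {
        size_t tlen = strlen(toks[i]);
        int need_space = len > 0 && tlen > 0 &&
                         is_word_char(out[len - 1]) &&
                         is_word_char(toks[i][0]);
        if (len + (size_t)need_space + tlen + 1 > cap)
            return -1;
        if (need_space)
            out[len++] = ' ';
        memcpy(out + len, toks[i], tlen + 1); /* copies the NUL too */
        len += tlen;
    }
    return 0;
}
```

<p>With the tokens <code>const</code>, <code>int</code>, <code>blah</code>
this yields <code>const int blah</code>, while punctuation is still emitted
without extra spaces.</p>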
|
||||||
|
|
||||||
|
<H3>Back-ends</H3>
|
||||||
|
<UL>
|
||||||
|
<LI> Back-end for Scheme-to-C.
|
||||||
|
<LI> Back-end for Gambit-C (Harold's got one working, it also does
|
||||||
|
interesting things with Open Inventor macros (see above)).
|
||||||
|
<LI> Back-end for Tcl/Tk?
|
||||||
|
<LI> Back-ends for ILU and Modula-3.
|
||||||
|
<LI> Back-end for STk.
|
||||||
|
<LI> Improvements to Chez Scheme back-end.
|
||||||
|
</UL>
|
||||||
|
|
||||||
|
<H3>Miscellaneous</H3>
|
||||||
|
<UL>
|
||||||
|
<LI> Advertise on lcc mailing list.
|
||||||
|
</UL>
|
||||||
|
|
||||||
|
<HR>
|
||||||
|
<P>
|
||||||
|
Press <A HREF="index.html">here</A> to go to the FFIGEN home page.
|
||||||
|
|
||||||
|
<hr>
|
||||||
|
<address><a href="mailto:lth@acm.org">lth@acm.org</a></address>
|
||||||
|
</BODY>
|
||||||
|
</HTML>
|
|
@ -0,0 +1,554 @@
|
||||||
|
<!-- -*- mode: html; mode: font-lock -*-
|
||||||
|
|
||||||
|
Hand-translated from LaTeX to HTML by lth on 2000-05-16, and
|
||||||
|
converted footnotes to in-line text. Fixed a small number of
|
||||||
|
typos. No other changes. -->
|
||||||
|
|
||||||
|
<html>
|
||||||
|
<head>
|
||||||
|
<title>FFIGEN User's Manual</title>
|
||||||
|
</head>
|
||||||
|
|
||||||
|
<body>
|
||||||
|
|
||||||
|
<center>
|
||||||
|
<h1>FFIGEN User's Manual</h1><br>
|
||||||
|
(Preliminary)<br>
|
||||||
|
Lars Thomas Hansen<br>
|
||||||
|
<tt>lth@cs.uoregon.edu</tt><br>
|
||||||
|
February 6, 1996
|
||||||
|
</center>
|
||||||
|
|
||||||
|
<h2>1. Introduction</h2>
|
||||||
|
|
||||||
|
<p>FFIGEN is a program system which facilitates the writing of
|
||||||
|
translators from C header files to foreign function interfaces for
|
||||||
|
particular programming language implementations. This document
|
||||||
|
describes its structure and use. The discussion is aimed at translator
|
||||||
|
writers; everyone else should confine themselves to section 3. A
|
||||||
|
companion document, <a href="manifesto.html">FFIGEN Manifesto and
|
||||||
|
Overview</a>, motivates the work, and other companion documents describe
|
||||||
|
specific translator implementations. In particular, the document
|
||||||
|
<em>FFIGEN Back-end for Chez Scheme Version 5</em> describes one
|
||||||
|
translator in detail.</p>
|
||||||
|
|
||||||
|
<p>FFIGEN is based on the <em>lcc</em> C compiler, which is copyrighted
|
||||||
|
software. See Section 10 for a full copyright notice.</p>
|
||||||
|
|
||||||
|
<h2>2. Writing Translators</h2>
|
||||||
|
|
||||||
|
<p>To generate a translation of a header file you run the <em>ffigen</em>
|
||||||
|
command to generate an intermediate form of the C header files you want
|
||||||
|
to translate, and then run the back-end on the resulting files to
|
||||||
|
generate the foreign function interface for the library.</p>
|
||||||
|
|
||||||
|
<p>Your task, should you choose to accept it, is to implement the
|
||||||
|
target-specific parts of the back-end for your particular target (which
|
||||||
|
is to say, combination of host language implementation, operating
|
||||||
|
system, architecture, foreign language implementation, and translation
|
||||||
|
policy). You should be able to use the FFIGEN front-end and the
|
||||||
|
target-independent parts of the back-end pretty much as they are.</p>
|
||||||
|
|
||||||
|
<p>How to implement the target-specific parts of the back-end is
|
||||||
|
discussed in Section 6. Use of the front end is described in Section 3.
|
||||||
|
The intermediate format is described in Section 4, and the
|
||||||
|
target-independent parts of the back-end and their interface to the
|
||||||
|
target-dependent part are described in Section 5. Finally, Section 7
|
||||||
|
covers some issues which need to be tackled in the future.</p>
|
||||||
|
|
||||||
|
<h2>3. Running FFIGEN</h2>
|
||||||
|
|
||||||
|
<p>The command <em>ffigen</em> is run on a set of header files with
|
||||||
|
preprocessor and include-file options. Arguments are processed
|
||||||
|
in order. For each header file (type <tt>.h</tt>) and all the files it
|
||||||
|
includes, a single preprocessor file (type <tt>.ffi</tt>) is
|
||||||
|
produced.</p>
|
||||||
|
|
||||||
|
<p>The options are:
|
||||||
|
<dl>
|
||||||
|
<dt><tt>-Dname[=value]</tt>
|
||||||
|
<dd>Define preprocessor macro.
|
||||||
|
<dt><tt>-Uname</tt>
|
||||||
|
<dd>Undefine preprocessor macro.
|
||||||
|
<dt><tt>-Idirectory</tt>
|
||||||
|
<dd>Add directory to the <em>beginning</em> of the list
|
||||||
|
of include directories. Standard directories include the <em>lcc</em> include
|
||||||
|
directory, <tt>/usr/include</tt>, and the current directory (in that order).
|
||||||
|
See the release notes for information about how to change the defaults.
|
||||||
|
</dl>
|
||||||
|
|
||||||
|
<p><em>ffigen</em> performs full syntax and type checks on its input.</p>
|
||||||
|
|
||||||
|
<p>The back-end is run by starting your favorite Scheme system and then
loading first the target-independent file <tt>process.sch</tt> and second
the target-dependent part of the translator; in the case of the Chez
Scheme back-end the file is called <tt>chez.sch</tt>. You then call the
procedure <tt>process</tt> with the name of the <tt>.ffi</tt> file to
process, as discussed in Section 5.</p>
|
||||||
|
|
||||||
|
<h2>4. Intermediate Format</h2>
|
||||||
|
|
||||||
|
<p>The intermediate format consists of s-expressions following this grammar:
|
||||||
|
|
||||||
|
<pre>
|
||||||
|
<file> -> <record> ...
|
||||||
|
<record> -> (function <filename> <name> <type> <attrs>)
|
||||||
|
| (var <filename> <name> <type> <attrs>)
|
||||||
|
| (type <filename> <name> <type>)
|
||||||
|
| (struct <filename> <name> ((<name> <type>) ...))
|
||||||
|
| (union <filename> <name> ((<name> <type>) ...))
|
||||||
|
| (enum <filename> <name> ((<name> <value>) ...))
|
||||||
|
| (enum-ident <filename> <name> <value>)
|
||||||
|
| (macro <filename> <name+args> <body>)
|
||||||
|
<type> -> (<primitive> <attrs>)
|
||||||
|
| (struct-ref <tag>)
|
||||||
|
| (union-ref <tag>)
|
||||||
|
| (enum-ref <tag>)
|
||||||
|
| (function (<type> ...) <type>)
|
||||||
|
| (pointer <type>)
|
||||||
|
| (array <value> <type>)
|
||||||
|
<attrs> -> (<attr> ...)
|
||||||
|
<attr> -> static | extern | const | volatile
|
||||||
|
<primitive> -> char | signed-char | unsigned-char | short
|
||||||
|
| unsigned-short | int | unsigned | long
|
||||||
|
| unsigned-long | float | double | void
|
||||||
|
<value> -> <integer>
|
||||||
|
<filename> -> <string>
|
||||||
|
<name> -> <string>
|
||||||
|
<body> -> <string>
|
||||||
|
<name+args> -> <string>
|
||||||
|
<tag> -> <string>
|
||||||
|
</pre>
|
||||||
|
|
||||||
|
Notes relating to the grammar:</p>
|
||||||
|
|
||||||
|
<ul>
|
||||||
|
<li> <tt>...</tt> means "zero or more of" the preceding item.
|
||||||
|
|
||||||
|
<li> The grammar is a little more general than the actual output
|
||||||
|
language. All structs, unions, and enums in parameter lists, return
|
||||||
|
types, and variable declarations are encoded as <tt>struct-ref</tt>,
|
||||||
|
<tt>union-ref</tt>, and <tt>enum-ref</tt>, respectively; structure, union,
|
||||||
|
and enum type definitions occur only in <tt>struct</tt>, <tt>union</tt>,
|
||||||
|
and <tt>enum</tt> records.
|
||||||
|
|
||||||
|
<li> The <tt><tag></tt> field in structs/unions/enums (and their
|
||||||
|
<tt>-ref</tt> forms) is the tag. If one of these types
|
||||||
|
has a user-defined tag, then that tag is used in the <tt>struct-ref</tt>
|
||||||
|
item for the type; if the structure had no user-defined tag then a tag has been
|
||||||
|
generated by <em>lcc</em>. Generated tags have the syntax of positive
|
||||||
|
integers; in particular they start with a digit. There is one namespace
|
||||||
|
each for structs, unions, and enums.
|
||||||
|
|
||||||
|
<li>
|
||||||
|
<tt>typedef</tt> names are not used anywhere: they occur in <tt>type</tt>
|
||||||
|
records only.
|
||||||
|
|
||||||
|
<li>
|
||||||
|
The attributes on primitive types are <tt>const</tt> or <tt>volatile</tt>; the
|
||||||
|
attributes <tt>static</tt> and <tt>extern</tt> are used only on functions and
|
||||||
|
global variables.
|
||||||
|
|
||||||
|
<li>
|
||||||
|
Functions which are known to take no parameters (<em>i.e.</em>, <tt>t f(void)</tt>) have
|
||||||
|
one parameter, of type <tt>(void ())</tt>. The void type appears in a
|
||||||
|
parameter list only as the last element.
|
||||||
|
|
||||||
|
<li>
|
||||||
|
Functions which take a variable number of arguments have at least one
|
||||||
|
defined non-void parameter and a last parameter of type <tt>(void ())</tt>.
|
||||||
|
|
||||||
|
<li>
|
||||||
|
Functions for which no parameters were defined (<em>i.e.</em>, <tt>t f()</tt>) have
|
||||||
|
no parameters.
|
||||||
|
|
||||||
|
<li>
|
||||||
|
The ordering of records in the input has no relation to the
|
||||||
|
relative ordering of declarations in the original source.
|
||||||
|
|
||||||
|
<li>
|
||||||
|
The <tt><value></tt> field in the array is its size. If the size is not
|
||||||
|
known, it is 0.
|
||||||
|
|
||||||
|
<li>
|
||||||
|
Multidimensional arrays are represented as nested array types with the
|
||||||
|
leftmost dimension outermost in the expected way; i.e., it looks like
|
||||||
|
an array of arrays.
|
||||||
|
|
||||||
|
<li>
|
||||||
|
Arrays are not valid return types.
|
||||||
|
|
||||||
|
<li>
|
||||||
|
Array parameters lose some semantic information in the translation in
|
||||||
|
the current system. An array parameter <tt>t a[n]</tt> is always
|
||||||
|
converted to a pointer: <tt>(pointer t)</tt> regardless of whether
|
||||||
|
<tt>n</tt> is known or not. As expected, then, something like
|
||||||
|
<tt>t a[n][m][o]</tt> gets the parameter type
|
||||||
|
<tt>(pointer (array m (array o t)))</tt>. Note that this only pertains to
|
||||||
|
parameter types; variables of array type are not converted in this manner.
|
||||||
|
(The semantic information claimed lost is the size of the leftmost
|
||||||
|
dimension. This lossage may make it impossible to perform array conversion
|
||||||
|
at call boundaries, for example.)
|
||||||
|
|
||||||
|
<li>
|
||||||
|
The grammar describes the current format, which will change: line number
|
||||||
|
and column information will be incorporated. You should always use the
|
||||||
|
accessor functions defined in the target-independent part of the
|
||||||
|
back-end; see section 5. The grammar does not allow
|
||||||
|
for bit fields or qualifications on anything but primitive
|
||||||
|
types, but these will be accommodated eventually.
|
||||||
|
|
||||||
|
</ul>
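<p>To make the grammar and the notes concrete, the sketch below writes out
by hand the records one might expect for a small header. The header name
<tt>example.h</tt>, the declarations in the comment, and the helper
procedures <tt>count-kind</tt> and <tt>array-parameter-type</tt> are all
assumptions made for illustration; the records follow the grammar as
written above and are not actual <em>ffigen</em> output.</p>

```scheme
;; Hand-written records following the grammar above for a hypothetical
;; header "example.h". They are illustrative, not actual ffigen output.
;;
;;   struct point { int x; int y; };
;;   typedef struct point point_t;
;;   extern int npoints;
;;   extern double norm(struct point *p);
;;   extern int tick(void);
;;   extern int sum(int a[4][3]);
(define example-records
  '((struct "example.h" "point" (("x" (int ())) ("y" (int ()))))
    (type "example.h" "point_t" (struct-ref "point"))
    (var "example.h" "npoints" (int ()) (extern))
    (function "example.h" "norm"
              (function ((pointer (struct-ref "point"))) (double ()))
              (extern))
    (function "example.h" "tick"
              (function ((void ())) (int ()))
              (extern))
    (function "example.h" "sum"
              (function ((pointer (array 3 (int ())))) (int ()))
              (extern))))

;; Count the records of a given kind (the leading symbol of each record).
(define (count-kind kind records)
  (let loop ((rs records) (n 0))
    (cond ((null? rs) n)
          ((eq? (car (car rs)) kind) (loop (cdr rs) (+ n 1)))
          (else (loop (cdr rs) n)))))

;; Parameter conversion described in the notes: an array parameter
;; becomes a pointer to its element type, dropping the leftmost dimension.
(define (array-parameter-type t)
  (if (and (pair? t) (eq? (car t) 'array))
      (list 'pointer (caddr t))
      t))
```

<p>Here <tt>(count-kind 'function example-records)</tt> counts the three
function records, and applying <tt>array-parameter-type</tt> to
<tt>(array 4 (array 3 (int ())))</tt> gives the parameter type used in the
<tt>sum</tt> record. Real back-ends should go through the accessor
procedures of Section 5 rather than taking records apart with
<tt>car</tt> and <tt>cdr</tt> as this sketch does.</p>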
|
||||||
|
|
||||||
|
|
||||||
|
<h2>5. The Target-Independent Back-End</h2>
|
||||||
|
|
||||||
|
<p>The target-independent back-end is a Scheme program called
|
||||||
|
<tt>process</tt> which reads the intermediate form into memory and
|
||||||
|
performs some initial processing. It exports some global variables and
|
||||||
|
a number of procedures which are used to access the structures in the
|
||||||
|
database of intermediate records, and imports two target-dependent
|
||||||
|
functions from the target-dependent back-end. This section describes
|
||||||
|
the interfaces.</p>
|
||||||
|
|
||||||
|
<p>The global variables which hold the database are:
|
||||||
|
|
||||||
|
<pre>
|
||||||
|
(define functions '()) ; list of function records
|
||||||
|
(define vars '()) ; list of var records
|
||||||
|
(define types '()) ; list of type records
|
||||||
|
(define structs '()) ; list of struct records
|
||||||
|
(define unions '()) ; list of union records
|
||||||
|
(define macros '()) ; list of macro records
|
||||||
|
(define enums '()) ; list of enum records
|
||||||
|
(define enum-idents '()) ; list of enum-ident records
|
||||||
|
</pre>
|
||||||
|
|
||||||
|
Each of these contains a list of all the records of the type indicated
|
||||||
|
by their names. Note that records may look different internally than
|
||||||
|
in the defined intermediate form, so accessor functions (see below) should
|
||||||
|
always be used.</p>
|
||||||
|
|
||||||
|
<p>In addition, there are two globals which are set but not used by
|
||||||
|
the target-independent back-end:
|
||||||
|
|
||||||
|
<pre>
|
||||||
|
(define source-file #f) ; name of the input file itself
|
||||||
|
(define filenames '()) ; names of all files in the input
|
||||||
|
</pre>
|
||||||
|
</p>
|
||||||
|
|
||||||
|
<p>The main entry point to the back end is the procedure <tt>process</tt>,
|
||||||
|
which takes a single file name as an argument. It
|
||||||
|
initializes globals, reads the file, and processes the records.
|
||||||
|
|
||||||
|
<pre>
|
||||||
|
(define (process filename) ...)
|
||||||
|
</pre></p>
|
||||||
|
|
||||||
|
<p>Record processing consists of some general analysis and target-specific
|
||||||
|
code generation. First, the target-specific procedure
|
||||||
|
<tt>select-functions</tt> is called; it must set or reset the
|
||||||
|
"referenced" bit in each record depending on whether the function is
|
||||||
|
interesting to the back-end or not. After computing reachability of
|
||||||
|
structured types and setting the referenced bits of those types which
|
||||||
|
are reachable, a translation is generated by a call to the back-end
|
||||||
|
function <tt>generate-translation</tt>, which takes no arguments.
|
||||||
|
|
||||||
|
<pre>
|
||||||
|
(define (select-functions) ...)
|
||||||
|
(define (generate-translation) ...)
|
||||||
|
</pre></p>
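<p>As a rough illustration of the two target-dependent procedures, the
sketch below implements a very simple policy: functions whose names begin
with an underscore are marked unreferenced, and the translation merely
prints the names of the remaining functions. So that the example runs on
its own, it defines stand-ins for <tt>functions</tt>, <tt>name</tt>,
<tt>referenced?</tt>, <tt>referenced!</tt>, and <tt>unreferenced!</tt>;
the vector representation, the helper <tt>make-function-record</tt>, the
predicate <tt>internal-name?</tt>, and the sample function names are
assumptions, and a real back-end would use the definitions loaded from
<tt>process.sch</tt> instead.</p>

```scheme
;; --- Stand-ins for what process.sch provides (hypothetical representation) ---
(define (make-function-record name) (vector 'function name #t))
(define (name r) (vector-ref r 1))
(define (referenced? r) (vector-ref r 2))
(define (referenced! r) (vector-set! r 2 #t))
(define (unreferenced! r) (vector-set! r 2 #f))

(define functions
  (list (make-function-record "fopen")
        (make-function-record "__filbuf")
        (make-function-record "fclose")))

;; --- Target-dependent part: the two procedures the back-end must supply ---

;; Policy helper (hypothetical): treat names starting with "_" as internal.
(define (internal-name? s)
  (and (> (string-length s) 0)
       (char=? (string-ref s 0) #\_)))

;; Set or reset the referenced bit of every function record.
(define (select-functions)
  (for-each (lambda (r)
              (if (internal-name? (name r))
                  (unreferenced! r)
                  (referenced! r)))
            functions))

;; Emit the "translation": here, just the names of referenced functions.
(define (generate-translation)
  (for-each (lambda (r)
              (if (referenced? r)
                  (begin (display (name r)) (newline))))
            functions))
```

<p>Calling <tt>(select-functions)</tt> followed by
<tt>(generate-translation)</tt> prints <tt>fopen</tt> and <tt>fclose</tt>
and skips <tt>__filbuf</tt>. In the real system both procedures are called
by <tt>process</tt>, not by the user.</p>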
|
||||||
|
|
||||||
|
<p>A number of data structure accessors and mutators are also available.
|
||||||
|
These are generic procedures which work on all of the record types.
|
||||||
|
|
||||||
|
<pre>
|
||||||
|
(define (file r) ...) ; file name of record
|
||||||
|
(define (name r) ...) ; name in records which have one
|
||||||
|
(define (type r) ...) ; type in records which have one
|
||||||
|
(define (attrs r) ...) ; attrs in records which have one
|
||||||
|
(define (fields r) ...) ; fields in struct/union record
|
||||||
|
(define (value r) ...) ; value of enum-ident record
|
||||||
|
(define (tag r) ...)           ; tag in struct/union/enum-ref record
|
||||||
|
|
||||||
|
(define (referenced? r) ...) ; is record referenced?
|
||||||
|
(define (referenced! r) ...) ; set referenced bit
|
||||||
|
(define (unreferenced! r) ...) ; reset referenced bit
|
||||||
|
</pre>
|
||||||
|
|
||||||
|
Arguably the <tt>tag</tt> accessor should go away and <tt>name</tt>
|
||||||
|
should simply be used in its place. As it is, <tt>name</tt> is not
|
||||||
|
defined on <tt>struct-ref</tt>, <tt>union-ref</tt>, and
|
||||||
|
<tt>enum-ref</tt> records.</p>
|
||||||
|
|
||||||
|
<p>The procedure <tt>record-tag</tt> returns the tag of the record passed
to it. It can also be applied to types.
|
||||||
|
|
||||||
|
<pre>
|
||||||
|
(define (record-tag r) ...) ; get record tag
|
||||||
|
</pre></p>
|
||||||
|
|
||||||
|
<p>All records can have back-end specific values attached to them; usually
|
||||||
|
these are cached names for operations on structured values, so for now
|
||||||
|
the procedures which manipulate the back-end specific data are called
|
||||||
|
<tt>cache-name</tt> to remember a value and <tt>cached-names</tt> to return
|
||||||
|
the list of remembered values:
|
||||||
|
|
||||||
|
<pre>
|
||||||
|
(define (cache-name r v) ...) ; remember value in record
|
||||||
|
(define (cached-names r) ...) ; retrieve remembered values
|
||||||
|
</pre>
|
||||||
|
|
||||||
|
We should probably replace this with a more general property-list-like
|
||||||
|
mechanism.</p>
|
||||||
|
|
||||||
|
<p>In addition, two procedures extract parts of function types:
|
||||||
|
|
||||||
|
<pre>
|
||||||
|
(define (arglist r) ...) ; function argument types
|
||||||
|
(define (rett r) ...) ; function return type
|
||||||
|
</pre></p>
|
||||||
|
|
||||||
|
<p>Some utilities to deal with file names are also provided:
|
||||||
|
|
||||||
|
<pre>
|
||||||
|
(define (strip-extension fn) ...)
|
||||||
|
(define (strip-path fn) ...)
|
||||||
|
(define (get-path fn) ...)
|
||||||
|
</pre></p>
|
||||||
|
|
||||||
|
<p>A string macro expander makes it easier to generate C code, for the back
|
||||||
|
ends that need it. The macro expander is called <tt>instantiate</tt> and
|
||||||
|
is called with a string template and a vector of arguments (which are
|
||||||
|
also strings). The template contains patterns of the form <tt>@n</tt>
|
||||||
|
where <tt>n</tt> is a single digit; when such a pattern is seen it is
|
||||||
|
replaced with the corresponding value from the argument vector.
|
||||||
|
|
||||||
|
<pre>
|
||||||
|
(define (instantiate template arguments) ...)
|
||||||
|
</pre></p>
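<p>The behavior described above can be sketched as follows. This is not
the FFIGEN implementation: the name <tt>instantiate-sketch</tt> is
hypothetical, and the sketch assumes that <tt>@0</tt> refers to the first
element of the argument vector (zero-based indexing) and that an
<tt>@</tt> not followed by a digit is copied through unchanged; neither
detail is specified in the text.</p>

```scheme
;; Expand patterns of the form @n (n a single digit) in TEMPLATE with the
;; n-th string of the vector ARGUMENTS. Assumes zero-based indexing.
(define (instantiate-sketch template arguments)
  (let ((len (string-length template)))
    (let loop ((i 0) (acc '()))
      (cond ((>= i len)
             ;; Pieces were collected in reverse order.
             (apply string-append (reverse acc)))
            ((and (char=? (string-ref template i) #\@)
                  (> len (+ i 1))
                  (char-numeric? (string-ref template (+ i 1))))
             (let ((n (- (char->integer (string-ref template (+ i 1)))
                         (char->integer #\0))))
               (loop (+ i 2) (cons (vector-ref arguments n) acc))))
            (else
             ;; Ordinary character: copy it through.
             (loop (+ i 1)
                   (cons (string (string-ref template i)) acc)))))))
```

<p>For example, <tt>(instantiate-sketch "@0 @1(void);" (vector "int" "f"))</tt>
yields the string <tt>"int f(void);"</tt>.</p>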
|
||||||
|
|
||||||
|
<p>Two procedures, <tt>struct-names</tt> and <tt>union-names</tt>, take a
|
||||||
|
structure (or union) and return a list of all the typedef names which
|
||||||
|
reference the structure directly.
|
||||||
|
|
||||||
|
<pre>
|
||||||
|
(define (struct-names struct) ...)
|
||||||
|
(define (union-names union) ...)
|
||||||
|
</pre></p>
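<p>The idea behind these two procedures can be sketched directly on the
intermediate form of Section 4. The procedure below is an assumption made
for illustration, not the FFIGEN code: unlike <tt>struct-names</tt>, the
hypothetical <tt>struct-names-sketch</tt> takes a tag and an explicit list
of <tt>type</tt> records written as in the grammar, whereas the real
procedure takes a structure record and works on the internal
representation.</p>

```scheme
;; Return the typedef names whose type is exactly (struct-ref TAG).
;; TYPE-RECORDS is a list of records of the form
;;   (type filename name type)
;; as given by the grammar of the intermediate format.
(define (struct-names-sketch tag type-records)
  (let loop ((rs type-records) (names '()))
    (cond ((null? rs) (reverse names))
          ((equal? (cadddr (car rs)) (list 'struct-ref tag))
           (loop (cdr rs) (cons (caddr (car rs)) names)))
          (else (loop (cdr rs) names)))))
```

<p>A typedef for a pointer to the structure is not reported, since it does
not reference the structure directly.</p>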
|
||||||
|
|
||||||
|
<p>An association function which searches one of the record lists for a
|
||||||
|
given record by the <tt>name</tt> field is also available:
|
||||||
|
|
||||||
|
<pre>
|
||||||
|
(define (lookup key items) ...)
|
||||||
|
</pre></p>
|
||||||
|
|
||||||
|
<p>The procedure <tt>user-defined-tag?</tt> determines whether a tag was
|
||||||
|
defined by the user or generated by the system:
|
||||||
|
|
||||||
|
<pre>
|
||||||
|
(define (user-defined-tag? x) ...)
|
||||||
|
</pre></p>
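<p>Section 4 notes that generated tags have the syntax of positive
integers and therefore start with a digit. Under that rule, a test of this
kind could be sketched as below; the name
<tt>user-defined-tag-sketch?</tt> is hypothetical and the actual FFIGEN
definition may differ.</p>

```scheme
;; A tag counts as user-defined unless it starts with a digit, since
;; compiler-generated tags look like positive integers.
(define (user-defined-tag-sketch? tag)
  (not (and (> (string-length tag) 0)
            (char-numeric? (string-ref tag 0)))))
```

<p>With this definition, <tt>"point"</tt> is classified as user-defined
and <tt>"17"</tt> as generated.</p>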
|
||||||
|
|
||||||
|
<p>The procedure <tt>warn</tt> takes some arbitrary arguments and generates
|
||||||
|
a warning message on standard output:
|
||||||
|
|
||||||
|
<pre>
|
||||||
|
(define (warn msg . rest) ...)
|
||||||
|
</pre></p>
|
||||||
|
|
||||||
|
<p>Some standard predicates take a type and test its kind:
|
||||||
|
<tt>primitive-type?</tt> is true if the argument is of a primitive type as
|
||||||
|
outlined in the grammar above; <tt>basic-type?</tt> is true if the
|
||||||
|
argument is a primitive type or a pointer type; <tt>array-type?</tt> is
|
||||||
|
true if the argument is an array type, and finally,
|
||||||
|
<tt>structured-type?</tt> is true if the argument is a <tt>struct-ref</tt>
|
||||||
|
or <tt>union-ref</tt> type:
|
||||||
|
|
||||||
|
<pre>
|
||||||
|
(define (primitive-type? t) ...)
|
||||||
|
(define (basic-type? t) ...)
|
||||||
|
(define (array-type? t) ...)
|
||||||
|
(define (structured-type? t) ...)
|
||||||
|
</pre></p>
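<p>Read against the grammar of Section 4, the four predicates amount to
simple tests on the leading symbol of a type. The sketch below assumes
that types are represented exactly as in the grammar; the
<tt>-sketch?</tt> names and the list <tt>primitive-names</tt> are
hypothetical and are used so as not to suggest that this is the FFIGEN
source.</p>

```scheme
;; The primitive type names listed in the grammar.
(define primitive-names
  '(char signed-char unsigned-char short unsigned-short int unsigned
    long unsigned-long float double void))

;; True if type T is a list whose leading symbol is HEAD.
(define (type-head? t head)
  (and (pair? t) (eq? (car t) head)))

(define (primitive-type-sketch? t)
  (and (pair? t) (memq (car t) primitive-names) #t))

;; A basic type is a primitive type or a pointer type.
(define (basic-type-sketch? t)
  (or (primitive-type-sketch? t)
      (type-head? t 'pointer)))

(define (array-type-sketch? t)
  (type-head? t 'array))

;; Structured types are references to structs or unions.
(define (structured-type-sketch? t)
  (or (type-head? t 'struct-ref)
      (type-head? t 'union-ref)))
```

<p>Note that under this reading an <tt>enum-ref</tt> type satisfies none
of the four predicates.</p>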
|
||||||
|
|
||||||
|
<h2>6. Writing a Target-Dependent Back-End</h2>
|
||||||
|
|
||||||
|
<p>To write the target-dependent back-end, you must decide on the policy
|
||||||
|
for the translation and then implement the translation. The policy
|
||||||
|
covers such issues as: which constructs in C are or are not handled; the
|
||||||
|
translation for each handled construct; how non-handled constructs are
|
||||||
|
dealt with (ignored, detected with warnings, detected with errors); how
|
||||||
|
to deal with exceptional cases (consider the <tt>fgets</tt> example from
|
||||||
|
the <a href="manifesto.html">Manifesto</a>).</p>
|
||||||
|
|
||||||
|
<p>For a concrete example, see the companion document <em>FFIGEN Backend
|
||||||
|
for Chez Scheme Version 5</em>, which addresses many of the choices to be
|
||||||
|
made and their possible solutions.</p>
|
||||||
|
|
||||||
|
<h2>7. Future Work</h2>
|
||||||
|
|
||||||
|
<p>A number of features <em>will</em> be supported in the future:</p>
|
||||||
|
|
||||||
|
<ul>
|
||||||
|
<li> There will be a line and a column field in each record, giving the
|
||||||
|
source line on which the identifier was defined.
|
||||||
|
|
||||||
|
<li> Bitfields will be supported.
|
||||||
|
|
||||||
|
<li> Qualifiers (what's now called attributes, that is, const and
|
||||||
|
volatile) will be supported on all types, not just on primitive
|
||||||
|
non-pointer types, as at present.
|
||||||
|
|
||||||
|
<li> The intermediate representation will include the name of the original
|
||||||
|
input file, and its path.
|
||||||
|
|
||||||
|
<li> The intermediate representation will include a representation of
|
||||||
|
the include file hierarchy which was traversed to produce the
|
||||||
|
intermediate representation.
|
||||||
|
|
||||||
|
</ul>
|
||||||
|
|
||||||
|
<p>A number of features will most likely be supported, but need
|
||||||
|
to be investigated:</p>
|
||||||
|
|
||||||
|
<ul>
|
||||||
|
<li> It would be nice to retain comments.
|
||||||
|
|
||||||
|
<li> Various popular extensions to C are not currently supported by
|
||||||
|
<em>lcc</em>, but would be extremely useful: <tt>long long</tt> is used
|
||||||
|
extensively in Unix header files, and header files for compilers on PCs
|
||||||
|
often use the common Microsoft extensions <tt>__huge</tt>, <tt>__far</tt>,
|
||||||
|
and <tt>__near</tt> (and their non-underscore equivalents). Some C compilers
|
||||||
|
support <tt>__inline</tt> declarations, and although we can't generate
|
||||||
|
code for in-line procedures we can at least parse them if the compiler
|
||||||
|
can cope with <tt>__inline</tt>. (<tt>__inline</tt> is the easiest, since it
|
||||||
|
can be ignored. The others must show up as type qualifiers or new types.)
|
||||||
|
|
||||||
|
<li> The current shell-program driver will probably be replaced by
|
||||||
|
something based on the lcc driver.
|
||||||
|
|
||||||
|
<li> I'm going to experiment with partial macro application in the
|
||||||
|
front end so that back-ends can have simple support for macro
|
||||||
|
definitions. Currently, for example, even something as simple as the
|
||||||
|
<tt>EOF</tt> macro will be ignored by the Chez Scheme back-end because its
|
||||||
|
form is <tt>"(-1)"</tt> rather than simply <tt>"-1"</tt>.
|
||||||
|
|
||||||
|
<li> Information about the layout of fields within structured types
|
||||||
|
should possibly be emitted; this information would be useful to
|
||||||
|
low-level FFIs which need byte offset and size to access the field of a
|
||||||
|
structure.
|
||||||
|
|
||||||
|
</ul>
|
||||||
|
|
||||||
|
<p>In addition, there are some issues to investigate in a larger
|
||||||
|
perspective:</p>
|
||||||
|
|
||||||
|
<ul>
|
||||||
|
<li> General (target-independent) support for useful policy mechanisms.
|
||||||
|
|
||||||
|
<li> How well can the intermediate language support other front-ends?
|
||||||
|
I don't want to fall into the UNCOL pit, but it would be interesting to
|
||||||
|
see how languages which resemble C in their parameter passing mechanisms
|
||||||
|
(Pascal, Modula, Oberon) could be mapped onto the intermediate language.
|
||||||
|
This is not high priority with me, however. If I embark on supporting
|
||||||
|
another front-end language it will probably be (sigh) C++.
|
||||||
|
|
||||||
|
</ul>
|
||||||
|
|
||||||
|
<h2>8. Please Contribute!</h2>
|
||||||
|
|
||||||
|
<p>My goal is to support as many target languages as is reasonable, but I
|
||||||
|
can't write all the translators myself (I lack the time and, in many
|
||||||
|
cases, the knowledge). Targets that I will take care of include STk,
|
||||||
|
and, if no-one beats me to it, Scsh, both Scheme systems. Someone has
|
||||||
|
already volunteered to write the ILU back-end. Others are interested
|
||||||
|
in back-ends for Modula-3 and Mercury.</p>
|
||||||
|
|
||||||
|
<p>Volunteers for any translator back-end are welcome to e-mail me and
|
||||||
|
volunteer their help. I will coach, coordinate, and help out as much as
|
||||||
|
possible.</p>
|
||||||
|
|
||||||
|
<h2>9. Credits</h2>
|
||||||
|
|
||||||
|
<p>FFIGEN is based on the freely available <em>lcc</em> ANSI C compiler,
|
||||||
|
implemented by Christopher Fraser (of AT&T Bell Labs) and David Hanson
|
||||||
|
(of Princeton University).</p>
|
||||||
|
|
||||||
|
<p>I would like to thank Fraser and Hanson for producing such an excellent
|
||||||
|
system; <em>lcc</em> has been a joy to work with, and their book, <em>A
|
||||||
|
Retargetable C Compiler: Design and Implementation</em>, made it possible
to implement the FFIGEN front end in roughly a single work day. Would
that all software were this clean!</p>
|
||||||
|
|
||||||
|
<p>The development of FFIGEN was supported by ARPA
|
||||||
|
under U.S. Army grant No. DABT63-94-C-0029,
|
||||||
|
``Programming Environments, Compiler Technology and Runtime Systems
|
||||||
|
for Object Oriented Parallel Processing''.</p>
|
||||||
|
|
||||||
|
<h2>10. Copyrights</h2>
|
||||||
|
|
||||||
|
<em>lcc</em> is covered by the following Copyright notice:
|
||||||
|
|
||||||
|
<blockquote>
|
||||||
|
<p>The authors of this software are Christopher W. Fraser and
|
||||||
|
David R. Hanson.</p>
|
||||||
|
|
||||||
|
<p>Copyright (c) 1991,1992,1993,1994,1995 by AT&T, Christopher W. Fraser,
|
||||||
|
and David R. Hanson. All Rights Reserved.</p>
|
||||||
|
|
||||||
|
<p>Permission to use, copy, modify, and distribute this software for any
|
||||||
|
purpose, subject to the provisions described below, without fee is
|
||||||
|
hereby granted, provided that this entire notice is included in all
|
||||||
|
copies of any software that is or includes a copy or modification of
|
||||||
|
this software and in all copies of the supporting documentation for
|
||||||
|
such software.</p>
|
||||||
|
|
||||||
|
<p>THIS SOFTWARE IS BEING PROVIDED "AS IS", WITHOUT ANY EXPRESS OR IMPLIED
|
||||||
|
WARRANTY. IN PARTICULAR, NEITHER THE AUTHORS NOR AT&T MAKE ANY
|
||||||
|
REPRESENTATION OR WARRANTY OF ANY KIND CONCERNING THE MERCHANTABILITY
|
||||||
|
OF THIS SOFTWARE OR ITS FITNESS FOR ANY PARTICULAR PURPOSE.</p>
|
||||||
|
|
||||||
|
<p>lcc is not public-domain software, shareware, and it is not protected
|
||||||
|
by a `copyleft' agreement, like the code from the Free Software
|
||||||
|
Foundation.</p>
|
||||||
|
|
||||||
|
<p>lcc is available free for your personal research and instructional use
|
||||||
|
under the `fair use' provisions of the copyright law. You may,
|
||||||
|
however, redistribute the lcc in whole or in part provided you
|
||||||
|
acknowledge its source and include this COPYRIGHT file.</P>
|
||||||
|
|
||||||
|
<p>You may not sell lcc or any product derived from it in which it is a
|
||||||
|
significant part of the value of the product. Using the lcc front end
|
||||||
|
to build a C syntax checker is an example of this kind of product.</p>
|
||||||
|
|
||||||
|
<p>You may use parts of lcc in products as long as you charge for only
|
||||||
|
those components that are entirely your own and you acknowledge the use
|
||||||
|
of lcc clearly in all product documentation and distribution media. You
|
||||||
|
must state clearly that your product uses or is based on parts of lcc
|
||||||
|
and that lcc is available free of charge. You must also request that
|
||||||
|
bug reports on your product be reported to you. Using the lcc front
|
||||||
|
end to build a C compiler for the Motorola 88000 chip and charging for
|
||||||
|
and distributing only the 88000 code generator is an example of this
|
||||||
|
kind of product.</p>
|
||||||
|
|
||||||
|
<p>Using parts of lcc in other products is more problematic. For example,
|
||||||
|
using parts of lcc in a C++ compiler could save substantial time and
|
||||||
|
effort and therefore contribute significantly to the profitability of
|
||||||
|
the product. This kind of use, or any use where others stand to make a
|
||||||
|
profit from what is primarily our work, is subject to negotiation.</p>
|
||||||
|
|
||||||
|
<p>Chris Fraser / cwf@research.att.com <br>
|
||||||
|
David Hanson / drh@cs.princeton.edu<br>
|
||||||
|
Fri Jun 17 11:57:07 EDT 1994</p>
|
||||||
|
</blockquote>
|
||||||
|
|
||||||
|
<hr>
|
||||||
|
<address>
|
||||||
|
<A HREF="mailto:lth@acm.org">lth@acm.org</A>
|
||||||
|
</address>
|
||||||
|
<em>24 May 2000</em>
|
||||||
|
</body>
|
||||||
|
</html>
|