% -*- scheme -*-
%
% This is a nuweb document.
% nuweb is available by anon. ftp from cs.rice.edu in /public/preston.
%
% Author: Lars Thomas Hansen (lth@@cs.uoregon.edu)
%
% Copyright (C) 1996 The University of Oregon. All rights reserved.
%
% This file may be freely redistributed in its entirety with or without
% modification provided that this copyright notice is not removed. It
% may not be sold for profit or incorporated in commercial software
% products without the prior written permission of the copyright holder.

\documentstyle{article}
\oddsidemargin 0in
\evensidemargin 0in
\textwidth 6.5in
\topmargin -0.5in
\textheight 9in

\newcommand{\fixme}[1]{{\Large\bf FIXME: #1}}

\title{FFIGEN Back-end for Chez Scheme Version 5\footnote{This work
has been supported by ARPA under U.S.~Army grant No.~DABT63-94-C-0029,
``Programming Environments, Compiler Technology and Runtime Systems
for Object Oriented Parallel Processing''.}}
\author{Lars Thomas Hansen \\ {\tt lth@@cs.uoregon.edu}}
\date{February 7, 1996}


\begin{document}
\maketitle

\section{Introduction}

This document describes the FFIGEN back-end for Chez Scheme version 5.
The exposition is provided in the hope that it will make it reasonably
clear how back-ends for FFIGEN are constructed; many back-ends will
probably have a structure similar to that of the present program.

If you haven't read the {\em FFIGEN Manifesto and Overview}, now is a
good time to do so.

If you haven't read the {\em FFIGEN User's Manual,\/} which documents
the target-independent code for FFIGEN back-ends and the format for the
intermediate code, now is a good time to do so.

This document is organized in the following manner. Section
\ref{overall} documents the overall structure of the translation for
Chez Scheme. Section \ref{thecode} presents the policy choices made for
data types and most of the code implementing the policy; section
\ref{utils} presents several utility procedures used by the
implementation. Section \ref{discussion} discusses some of the
implementation choices, particularly in terms of performance, and
section \ref{future} summarizes future work on the Chez Scheme back-end.
Section \ref{stdlib} presents the implementation of the support
libraries.

This document is also the program: the \LaTeX\ source for the document,
the Scheme source code for the program, and the standard library sources
have been derived from the same file using the \verb|nuweb| system for
literate programming.\footnote{\verb|nuweb| was implemented by Preston
Briggs of Rice University; you may retrieve it by anonymous ftp from
\verb|cs.rice.edu| in the directory \verb|/public/preston|.} The
document presents the program in an order which I believe is conducive
to understanding its workings. Here's how you read it: each snippet of
code (called a {\em macro\/} or {\em scrap\/}) has a title which includes
the page number on which the scrap is defined and a letter following the
page number which shows the relative place on that page where you can
find the scrap (for example, 9a is the first scrap on page 9 and 6c is
the third scrap on page 6). Scraps reference other scraps; these
references are the titles, so any cross-reference immediately gives you
the page number of the referenced code. In addition, an index of global
identifiers can be found in the appendix.


\section{Overall Structure}
\label{overall}

The FFIGEN back-end for any FFI consists of two parts: a
target-independent part which reads the output of the front end into
global in-memory data structures, and a target-dependent part which does
the actual translation. The target-dependent part is implemented by two
procedures, \verb|select-functions| and \verb|generate-translation|.

\subsection{The {\tt select-functions} procedure}

The procedure \verb|select-functions| is called during initial
processing to mark as selected those functions which are interesting
for the translation. This makes it possible to exclude some functions
from the translation. For example, if you are translating the interface
to a library whose header file includes \verb|stdio.h|, you may wish to
exclude all function prototypes included from the latter file. Since
each function record in the intermediate form has a field naming the
file the record came from, you can easily exclude functions from
\verb|stdio.h|.

The implementation of \verb|select-functions| for Chez Scheme simply
marks all functions as selected, by turning on the ``referenced'' bit
in the record.

@d the select-functions interface procedure
@{(define (select-functions)
  (do ((functions functions (cdr functions)))
      ((null? functions))
    (referenced! (car functions))))
@| select-functions @}

\subsection{The {\tt generate-translation} procedure}

The procedure \verb|generate-translation| is called when the
target-independent part is done processing the data, and it must
implement all aspects of the target-dependent translation. It takes no
arguments. My version sets up the output files, initializes the files,
calls the generating functions, and cleans up.

@d the generate-translation interface procedure
@{(define (generate-translation)
  @<initialize output files@>
  @<generate code for all constructs@>
  @<finalize output files@>
  #t)
@| generate-translation @}

\verb|generate-translation| calls on specialized procedures to
generate code for the various intermediate language records:

@d generate code for all constructs
@{(dump-structs)
(dump-unions)
(dump-functions)
(dump-variables)
(dump-enums)
(dump-macros)
@}

\subsection{Structure of the Scheme source file}

The final file is called \verb|chez.sch| and is laid out in the
following way:

@o chez.sch
@{@<back-end global variables@>
@<the select-functions interface...@>
@<the generate-translation interface...@>
@<Chez FFI names for primitive types@>
@<dump structs and unions@>
@<dump global variable accessors@>
@<dump function definitions@>
@<dump enum definitions@>
@<dump macro definitions@>
@<utility functions@>
@}

\subsection{The standard library}

To facilitate the translation, a small standard library has been created
ahead of time (see section \ref{stdlib} for the implementation). The
library implements primitive dereferencing functions and a memory copy
procedure. There is a C file and a Scheme file. The C file must be
compiled and loaded into the Scheme system using the FFI facilities for
this (which happen to be operating-system dependent in Chez Scheme).
The Scheme FFI code for the standard library must be loaded into the
Scheme system at run time.

\section{Policy in the Chez Scheme Back-end}
\label{thecode}

Chez Scheme version 5 has a fairly simple native FFI. It supports most
primitive C data types for both parameters and return values, and
performs data conversions on call and return. A string is passed as a
\verb|char*| pointing to the first character (in Chez Scheme, strings are
0-terminated for C compatibility), but is returned by copying the data
into a fresh Scheme string. There is no direct support for the
\verb|short| datatype, variadic procedures, call by reference, calling
functions through a function pointer, struct/union arguments, or
struct/union return values, nor is there support for passing general
Scheme objects like pairs, arrays, or functions to foreign functions.

In this (first) version of the translation I have decided not to
implement any mechanisms for handling special cases like the
\verb|fgets()| example outlined in the {\em Manifesto}; that will have
to come later. Such a mechanism might take the form of a file
containing rules for how to override certain function signatures, for
example. See section \ref{future}.

\subsection{The Output}
\label{the-output}

Two output files are produced: a C file and a Scheme file. The Scheme
file will hold the FFI code which interfaces to the library and its data
types. The C file will hold supporting (``glue'') code generated along
the way: accessors, mutators, and so on. The need for supporting code
will become clear as you read.

\verb|c-output| and \verb|sch-output| are global variables which hold
output ports for generated C code and Scheme code, respectively.

@d back-end global variables
@{(define c-output #f)
(define sch-output #f)
@| c-output sch-output @}

The files are initialized in \verb|generate-translation| by computing
file names and assigning opened ports to the global variables; for
simplicity I use fixed names for the time being. The use of
\verb|delete-file| is necessary in Chez Scheme, lest it complain when
opening an existing file for output.

@d initialize output files
@{(delete-file "C-OUTPUT")
(delete-file "SCH-OUTPUT")
(set! c-output (open-output-file "C-OUTPUT"))
(set! sch-output (open-output-file "SCH-OUTPUT"))
@}

Once the files are opened they are initialized with standard
definitions. The C file \verb|#include|s the FFIGEN standard library
header and the ANSI C standard library header; there is nothing to be
done in the Scheme case. We should probably also emit a line here
which includes the original header file, but the name of that file
is not currently available in the intermediate format.

@d initialize output files
@{(display "#include \"chez-stdlib.h\"" c-output) (newline c-output)
(display "#include \"stdlib.h\"" c-output) (newline c-output)
@}

When processing is done, the output files must be closed:

@d finalize output files
@{(close-output-port c-output)
(close-output-port sch-output)
@}


\subsection{Names}

Since Scheme is case-insensitive and C is not, names which do not clash
in C may do so in Scheme. This can be handled fairly simply by keeping
track of all generated Scheme names and warning about conflicts. In
this version I have not implemented this, but it will be implemented
in the future.

In general, parameter names in generated C procedures are prefixed with
underscores to prevent name clashes with \verb|typedef|s: names starting
with a single underscore are reserved for libraries and will in
principle not appear in user code. However, this scheme is not
foolproof, considering that FFIGEN is used to translate library code.
We might need to come up with something better.

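The clash check described at the start of this subsection is not part of
the back-end yet. Purely as an illustration of the idea, the following C
sketch folds each name to lower case and reports the first name that is
spelled differently from an earlier name in C but becomes identical to
it after folding. The functions \verb|fold_case_equal| and
\verb|find_case_clash| are hypothetical and do not appear in FFIGEN.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: nonzero if a and b are equal once case is ignored. */
int fold_case_equal(const char *a, const char *b)
{
    while (*a != '\0' && *b != '\0') {
        if (tolower((unsigned char)*a) != tolower((unsigned char)*b))
            return 0;
        a++;
        b++;
    }
    return *a == *b;   /* equal only if both strings ended together */
}

/* Hypothetical helper: return the index of the first name that clashes
   with an earlier, differently spelled name after case folding, or -1 if
   there is no such name. The scan is quadratic, which is adequate for a
   sketch; a real implementation would more likely use a hash table. */
int find_case_clash(const char *const names[], size_t n)
{
    size_t i, j;
    for (i = 1; i < n; i++)
        for (j = 0; j < i; j++)
            if (strcmp(names[i], names[j]) != 0 &&
                fold_case_equal(names[i], names[j]))
                return (int)i;
    return -1;
}
```

A back-end could run such a check over all generated Scheme names before
writing the output and emit a warning for each clash instead of silently
redefining a name.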
\subsection{Primitive Types}

The procedure \verb|chez-type| takes a primitive type structure and
returns the name of the Chez Scheme parameter or return type. Already
we have to make hard choices: what do we do with pointers, in particular
character pointers? In the current implementation I let a \verb|char*|
be a \verb|char*|; programs which wish to pass strings will have to
first copy the string data into an allocated character buffer and then
pass the address of the buffer. A function which performs the copy
operation is part of the standard library.

Other decisions I have made are to represent pointers as unsigned
32-bit ints (hence all pointers are the same size), and to represent
\verb|char|, \verb|signed char|, and \verb|unsigned char| all as the
same \verb|char| FFI
type. If the type is a \verb|short| variant, a warning is generated.%
\footnote{Chez Scheme does not support a \verb|short| FFI type. Since
the ANSI C standard (or at least K\&R 2nd edition) seems to say that
a \verb|short| can be passed to a function with a prototype in scope
without being widened to \verb|int|, we can't simply use \verb|integer-32|
as the FFI argument type for a \verb|short|. On many architectures using
a 32-bit integer would
work because the APIs don't support parameters smaller than 32 bits. We
could also have used proxy functions; see section \ref{struct-vals}.}
Enums are assumed to be 32-bit integers.

@d Chez FFI names for primitive types
@{(define (chez-type type)
  (case (record-tag type)
    ((pointer) 'unsigned-32)
    ((int long enum) 'integer-32)
    ((unsigned unsigned-long) 'unsigned-32)
    ((char unsigned-char signed-char) 'char)
    ((void) 'void)
    ((double) 'double-float)
    ((float) 'single-float)
    ((***invalid***) '***invalid***)
    (else
     (warn "Cannot translate this type: " type)
     (string->symbol (string-append (symbol->string '***invalid:)
                                    (symbol->string (record-tag type))
                                    "***")))))
@| chez-type @}

The special type \verb|***invalid***| and its variant
\verb|***invalid:tag***| are used to signal errors detected during
translation; by using these types in the output, no file that was
generated with errors can actually be used.

In addition, we have to consider how to dereference pointers to
primitive types (pointers to structured types are handled later). For
example, if I have an \verb|int*| and I want to access the integer it
points to, how do I do this? For this purpose, dereferencing functions
are provided in the standard library. Each function takes a pointer
\verb|p| and an offset \verb|k| and fetches the element \verb|p[k]|.
There are also functions which store values through pointers to
primitive types. See section \ref{stdlib} for the signatures and the
actual implementation.

By treating pointers to pointers as generic pointer types, we handle the
general case of recursive dereferencing. Since pointers and integers
are the same size and type checking is lax, we can use the unsigned
integer access functions to follow pointer chains. For example, if we
have an \verb|int** p| and want the integer, the code to access it would
be (in C) \verb|_ref_int(_ref_uint(p,0),0)|.

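To make the pointer-chasing example concrete, here is a minimal C sketch
of two dereferencing functions and of following an \verb|int**|. It is
not the standard library implementation, which is given in section
\ref{stdlib} and works with 32-bit \verb|unsigned| values. The sketch
assumes \verb|uintptr_t| as the pointer-sized integer so that it also
runs on hosts with 64-bit pointers, and the names \verb|ref_int_sketch|,
\verb|ref_ptr_sketch|, and \verb|deref_int_ptr_ptr| are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Fetch element k of the int array whose address is held in the integer p. */
int ref_int_sketch(uintptr_t p, int k)
{
    return ((int *)p)[k];
}

/* Fetch element k of an array of pointers and return it as an integer.
   This plays the role of the unsigned-integer accessor in the text.
   Like the text, it assumes that all object pointers have the same size
   and representation; the bytes are copied with memcpy so that the
   pointer can be read regardless of its declared type. */
uintptr_t ref_ptr_sketch(uintptr_t p, int k)
{
    void *q;
    memcpy(&q, (char *)p + (size_t)k * sizeof q, sizeof q);
    return (uintptr_t)q;
}

/* Follow an int** held as an integer: the analogue of the expression
   _ref_int(_ref_uint(p,0),0) from the text. */
int deref_int_ptr_ptr(uintptr_t pp)
{
    return ref_int_sketch(ref_ptr_sketch(pp, 0), 0);
}
```

On the Scheme side the same chain would be two foreign calls, the first
returning an integer that is passed straight back in as the pointer
argument of the second.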
\subsection{Structured Types}

The translations for structs and unions are similar. I'll talk about
structures, but the discussion pertains to unions as well. The
procedures \verb|dump-structs| and \verb|dump-unions| are almost
identical, and call the same function to do their work:

@d dump structs and unions
@{(define (dump-structs)
  (dump-struct/union structs struct-names "struct"))

(define (dump-unions)
  (dump-struct/union unions union-names "union"))
@| dump-structs dump-unions @}

\verb|struct-names| and \verb|union-names| are procedures which return a
list of all \verb|typedef|s which directly name a struct or union; see
the {\em FFIGEN User's Manual} for a discussion.

In the following, we will use this structure as an example:

\begin{flushleft} \small
\begin{minipage}{\linewidth}
\begin{list}{}{} \item
\mbox{}\verb@@struct X {@@\\
\mbox{}\verb@@  int i;@@\\
\mbox{}\verb@@  char c[5];@@\\
\mbox{}\verb@@  union {@@\\
\mbox{}\verb@@    char *s;@@\\
\mbox{}\verb@@    int *q;@@\\
\mbox{}\verb@@  } u;@@\\
\mbox{}\verb@@};@@\\
\end{list}
\vspace{-1ex}
\end{minipage}\\[4ex]
\end{flushleft}
\vspace{-3ex}

Each structure in the intermediate form has a tag (\verb|X| in the
example), although some of these tags were generated by the compiler.
Whether the tag was programmer-defined or not is important for how code
is generated (see below), and the procedure \verb|dump-struct/union|
takes it into account when generating FFI code for each referenced
structure.

Only structures which have programmer-defined tags have operations
generated for them with the name of the tag. Structures which are
referred to by a \verb|typedef| must be handled specially. There are
two cases. If the \verb|typedef|'d structure also has a
programmer-defined tag, then Scheme definitions are emitted which define
operations using the \verb|typedef| name to be the same as the
operations using the structure tag. If the type does not have a
programmer-defined tag, then operations are defined on the structure
using the \verb|typedef| name only; you get names like \verb|_get_X_i|.

@d dump structs and unions
@{(define (dump-struct/union records typedef-name-getter qualifier)
  (for-each
   (lambda (structure)
     (if (referenced? structure)
         (begin
           (if (user-defined-tag? (tag structure))
               (dump-struct/union-def structure qualifier (tag structure)))
           (for-each (lambda (n)
                       (if (user-defined-tag? (tag structure))
                           (generate-reference-to-structure structure n qualifier)
                           (dump-struct/union-def structure "" n)))
                     (typedef-name-getter structure)))))
   records))
@| dump-struct/union @}

The procedure \verb|generate-reference-to-structure| takes a structure
which has a cached list of defined constructor, destructor, accessor,
and mutator functions, a \verb|typedef|-name, and a qualifier (\verb|struct|,
\verb|union|, or the empty string) and generates Scheme definitions
which use the \verb|typedef|-name but which refer to the already-defined
functions using the qualified name and structure tag. For example, if
there is already a foreign function called \verb|_get_struct_X_F| and
the \verb|typedef| name is \verb|Y|, then a variable called \verb|_get_Y_F|
will be bound to the value of the existing foreign function.

@d dump structs and unions
@{(define (generate-reference-to-structure structure typedef-name qualifier)
  (for-each (lambda (n)
              (let ((newname (compute-newname n typedef-name (tag structure) qualifier)))
                (display `(define ,newname ,n) sch-output)
                (newline sch-output)))
            (cached-names structure)))
@| generate-reference-to-structure @}

The procedure \verb|compute-newname| takes an already emitted name and
generates a name which uses the \verb|typedef| name.

@d dump structs and unions
@{(define (compute-newname oldname typedef-name tag qualifier)
  (let ((q (string-append qualifier "_" tag)))
    (let ((get (string-append "_get_" q))
          (set (string-append "_set_" q))
          (alloc (string-append "_alloc_" q))
          (free (string-append "_free_" q)))
      (cond ((string-prefix=? oldname get)
             (string-append "_get_" typedef-name (substring oldname (string-length get)
                                                            (string-length oldname))))
            ((string-prefix=? oldname set)
             (string-append "_set_" typedef-name (substring oldname (string-length set)
                                                            (string-length oldname))))
            ((string-prefix=? oldname alloc) (string-append "_alloc_" typedef-name))
            ((string-prefix=? oldname free) (string-append "_free_" typedef-name))
            (else (error "compute-newname: can't handle: " oldname))))))
@| compute-newname @}

The procedure \verb|dump-struct/union-def| takes a structure type, a
qualifier (a string, either \verb|struct|, \verb|union|, or the empty
string) and the C name of the structure (its tag or \verb|typedef|
name), and generates constructors, destructors, accessors, and mutators
for the structure.

@d dump structs and unions
@{(define (dump-struct/union-def structure qualifier name)
  (if (not (null? (fields structure)))
      (let* ((funcname (if (string=? qualifier "")
                           name
                           (string-append qualifier "_" name)))
             (cast (if (string=? qualifier "")
                       name
                       (string-append qualifier " " name))))
        (generate-constructor-and-destructor structure funcname cast)
        (generate-accessors-and-mutators structure funcname cast ""))))
@| dump-struct/union-def @}

Constructors, destructors, accessors and mutators are generated only if
the field list for the structure is non-empty, as an empty field list
implies that the structure has merely been declared.

\subsubsection{Constructors and destructors}

The procedure \verb|generate-constructor-and-destructor| generates
constructor and destructor procedures for its argument. The constructor
procedure is called \verb|_alloc_struct_X|. This procedure takes no
parameters and allocates memory for the given type, returning a pointer
(cast to an unsigned integer) or the literal integer 0 (which may be
different from the null pointer).

The destructor procedure is called \verb|_free_struct_X|; it takes a
value returned from the constructor and frees the allocated memory.
Generating a destructor for every type may be excessive; a single
interface to \verb|free()| is probably enough. The destructors may go
away in the future.

The choice of value for the null pointer warrants discussion. I use the
integer 0 to make it possible to test for a null pointer by comparing
with 0 on the Scheme side; it is a decision which probably simplifies
the Scheme code. Using an unadulterated null pointer would make
comparison with 0 nonportable, since a null pointer is not guaranteed to
be all-bits-zero.

An alternative would be to generate foreign procedures which return null
pointers, one procedure for each pointer type in the program, and these
return values could be used for pointer comparisons.\footnote{There's
one null pointer for every pointer type.}

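The alternative mentioned above could look roughly like the following C
sketch for the example \verb|struct X|. Nothing like it is generated by
the current back-end; the names \verb|null_struct_X| and
\verb|is_null_struct_X| are hypothetical, and \verb|uintptr_t| stands in
for the 32-bit \verb|unsigned| used elsewhere in this document so that
the sketch also runs on hosts with 64-bit pointers.

```c
#include <stdint.h>

struct X;   /* only pointers are used here, so a declaration is enough */

/* Return the null pointer of type struct X*, converted to an integer in
   the same way as every other pointer that crosses the FFI. */
uintptr_t null_struct_X(void)
{
    return (uintptr_t)(struct X *)0;
}

/* Do the comparison on the C side, so that the Scheme side never needs
   to know which integer a null pointer converts to. */
int is_null_struct_X(uintptr_t p)
{
    return (struct X *)p == (struct X *)0;
}
```

The price of this design is one or two extra foreign procedures per
pointer type, which is why comparing with the integer 0 is the simpler
choice on the Scheme side.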
@d dump structs and unions
@{(define (generate-constructor-and-destructor structure funcname cast)
  (function-pair constructor-template
                 (vector funcname cast)
                 (string-append "_alloc_" funcname)
                 '((void ()))
                 `(pointer ,(struct/union-ref structure)))
  (function-pair destructor-template
                 (vector funcname cast)
                 (string-append "_free_" funcname)
                 `((pointer ,(struct/union-ref structure)))
                 '(void ()))
  (cache-name structure (string-append "_alloc_" funcname))
  (cache-name structure (string-append "_free_" funcname)))
@| generate-constructor-and-destructor @}

In the templates for constructors and destructors, \verb|@@0| is the
name of the structure as we want it to be generated: \verb|struct_X|,
\verb|union_X|, or \verb|X| (for a \verb|typedef|); and \verb|@@1| is the
cast expression for the type: \verb|struct X|, \verb|union X|, or
\verb|X|.

@d dump structs and unions
@{(define constructor-template
  "unsigned _alloc_@@0(void) {
  @@1 *_p = (@@1 *)malloc(sizeof(@@1)); return (_p == 0 ? 0 : (unsigned)_p);
}")

(define destructor-template
  "void _free_@@0(unsigned _p) { if (_p == 0) abort(); free((@@1 *)_p); }")
@| constructor-template destructor-template @}


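For the example \verb|struct X|, substituting \verb|struct_X| and
\verb|struct X| into the two templates gives C code along the following
lines. This is a sketch of the expected expansion, not captured output
of the back-end, and it departs from the templates in one respect:
\verb|uintptr_t| is used instead of \verb|unsigned|, an assumption made
here so that the pointer survives the round trip on hosts with 64-bit
pointers.

```c
#include <stdint.h>
#include <stdlib.h>

struct X {
    int i;
    char c[5];
    union {
        char *s;
        int *q;
    } u;
};

/* Expansion of the constructor template: allocate one struct X and
   return its address as an integer, or the literal 0 if malloc fails. */
uintptr_t _alloc_struct_X(void)
{
    struct X *_p = (struct X *)malloc(sizeof(struct X));
    return (_p == 0 ? 0 : (uintptr_t)_p);
}

/* Expansion of the destructor template: 0 is never a valid argument,
   so the function aborts instead of calling free. */
void _free_struct_X(uintptr_t _p)
{
    if (_p == 0) abort();
    free((struct X *)_p);
}
```

A Scheme program would call the constructor, compare the result with 0
to detect allocation failure, and pass the same integer to the
destructor when it is done with the structure.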
\subsubsection{Accessors and Mutators}

For each field \verb|F| of basic or array type there will be an accessor
function \verb|_get_struct_X_F|, and for each field of basic type there
will be a mutator function \verb|_set_struct_X_F|. The accessor takes a
pointer and returns the field value; the mutator takes a pointer and a
value and updates the field.

If the structure has nested structures, functions are generated which
will access fields in the nested structures in the expected way; the
naming scheme is to replace the ``.'' in the C syntax with an underscore
``\_''. The generated C code for the accessors and mutators will look
like this:

\begin{flushleft} \small
\begin{minipage}{\linewidth}
\begin{list}{}{} \item
\mbox{}\verb@@int _get_struct_X_i(unsigned _p) { return ((struct X*)_p)->i; }@@\\
\mbox{}\verb@@void _set_struct_X_i(unsigned _p, int _v) { ((struct X*)_p)->i = _v; }@@\\
\mbox{}\verb@@unsigned _get_struct_X_u_s(unsigned _p) { return (unsigned)((struct X*)_p)->u.s; }@@\\
\mbox{}\verb@@void _set_struct_X_u_s(unsigned _p, unsigned _v) { ((struct X*)_p)->u.s = (char*)_v; }@@\\
\end{list}
\vspace{-1ex}
\end{minipage}\\[4ex]
\end{flushleft}
\vspace{-3ex}

Procedures are not generated which directly dereference pointer fields;
the program must first fetch the pointer field and then dereference it.
Arrays in structures are handled like pointer fields, so
\verb|_get_struct_X_c| will be generated and will return a pointer to
the first element of \verb|c|, which can then be dereferenced.

The procedure \verb|generate-accessors-and-mutators| takes a structure
type, a name to be used for the function (initially just
\verb|struct_X|, \verb|union_X|, or \verb|X| for a typedef name), a cast
expression (\verb|struct X|, \verb|union X|, or just \verb|X|), and a field
selector expression (initially the empty string). As the program
descends into the structure the function name and the selector
expression will be updated to reflect the field which is being accessed.
Appended to the function name will be an underscore and the field name,
so we'll see \verb|struct_X_i| and \verb|struct_X_u_s|, for example.
The selector will be the corresponding access expression for C, so
\verb|i| and \verb|u.s|.

When generating accessors and mutators, there are three cases: fields of
some basic type (primitive types and pointers), fields of array type,
and fields of structured type.

@d dump structs and unions
@{(define (generate-accessors-and-mutators structure funcname cast selector)
  (for-each
   (lambda (field)
     (let ((funcname (string-append funcname "_" (canonical-name (name field))))
           (selector (string-append selector (if (string=? selector "") "" ".") (name field))))
       (cond ((basic-type? (type field))
              (getset-basic-type structure funcname cast selector field))
             ((array-type? (type field))
              (getset-array-type structure funcname cast selector field))
             ((structured-type? (type field))
              (getset-structured-type structure funcname cast selector field))
             (else (error 'generate-accessors-and-mutators "Unknown: " field)))))
   (fields structure)))
@| generate-accessors-and-mutators @}

The use of \verb|canonical-name| is important. This procedure
transforms an identifier from its C syntax into a syntax acceptable to
Scheme by transforming the letters to the canonical case of the
representation. (For example, if the symbol \verb|a| prints as \verb|A|
then the canonical case of the implementation is upper case.)

Fields of basic types get both accessors and mutators.\footnote{Fields
which are \verb|const| should not have a mutator; when we get around to
implementing general qualifiers on types, this must be fixed.}

@d dump structs and unions
@{(define (getset-basic-type struct funcname cast selector field)
  (let* ((typename (basic-type-name (type field)))
         (fieldtype (c-cast-expression (type field))))
    (function-pair accessor-template
                   (vector typename funcname cast selector)
                   (string-append "_get_" funcname)
                   `((pointer ,(struct/union-ref struct)))
                   (type field))
    (function-pair mutator-template
                   (vector typename funcname cast selector fieldtype)
                   (string-append "_set_" funcname)
                   `((pointer ,(struct/union-ref struct)) ,(type field))
                   `(void ()))
    (cache-name struct (string-append "_get_" funcname))
    (cache-name struct (string-append "_set_" funcname))))
@| getset-basic-type @}

In the accessor and mutator templates, \verb|@@0| is the type of the
field value as it is returned by the accessor or passed to the mutator
(pointers are always ``unsigned''), \verb|@@1| is the
function name, \verb|@@2| is a cast expression for the structure pointer
type, \verb|@@3| is the C field selector expression, and \verb|@@4| is
the cast expression from the parameter type to the field type (used in
mutators only).

@d dump structs and unions
@{(define accessor-template
  "@@0 _get_@@1( unsigned _p ) { return (@@0)((@@2*)_p)->@@3; }")

(define mutator-template
  "void _set_@@1( unsigned _p, @@0 _v ) { ((@@2*)_p)->@@3 = (@@4)_v; }")
@| accessor-template mutator-template @}

Array types are just like basic types, but there is no mutator because
an array cannot be assigned to as a whole. The name of the array denotes
a pointer to the first element, so that's what we return.

@d dump structs and unions
@{(define (getset-array-type structure funcname cast selector field)
  (function-pair array-accessor-template
                 (vector funcname cast selector)
                 (string-append "_get_" funcname)
                 `((pointer ,(struct/union-ref structure)))
                 '(unsigned))
  (cache-name structure (string-append "_get_" funcname)))

(define array-accessor-template
  "unsigned _get_@@0( unsigned _p ) { return (unsigned)(((@@1*)_p)->@@2); }")
@| getset-array-type array-accessor-template @}

Structured types are handled by adding the field name to the selector
and recurring; we ignore the distinction between struct and union here
as it doesn't matter.

@d dump structs and unions
@{(define (getset-structured-type structure funcname cast selector field)
  (let (;(selector (string-append selector "." (name field)))
        ;(funcname (string-append funcname "_" (canonical-name (name field))))
        (struct (if (eq? (record-tag (type field)) 'struct-ref)
                    (lookup (tag (type field)) structs)
                    (lookup (tag (type field)) unions))))
    (generate-accessors-and-mutators struct funcname cast selector)))
@| getset-structured-type @}

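Putting the three templates together for the example \verb|struct X|,
the generated glue code should look approximately like the sketch below,
which also shows the two-step access needed for an array field. It is an
illustration and not captured output of the back-end: \verb|uintptr_t|
replaces \verb|unsigned| for pointer-valued parameters and results (an
assumption made so that the code runs on hosts with 64-bit pointers),
and \verb|get_c_element| is a hypothetical stand-in for a standard
library dereferencing function.

```c
#include <stdint.h>

struct X {
    int i;
    char c[5];
    union {
        char *s;
        int *q;
    } u;
};

/* Basic field: an accessor and a mutator. */
int _get_struct_X_i(uintptr_t _p) { return (int)((struct X *)_p)->i; }
void _set_struct_X_i(uintptr_t _p, int _v) { ((struct X *)_p)->i = (int)_v; }

/* Pointer field in the nested union: the selector is u.s and the
   function name carries u_s. */
uintptr_t _get_struct_X_u_s(uintptr_t _p)
{
    return (uintptr_t)((struct X *)_p)->u.s;
}
void _set_struct_X_u_s(uintptr_t _p, uintptr_t _v)
{
    ((struct X *)_p)->u.s = (char *)_v;
}

/* Array field: accessor only, returning the address of the first element. */
uintptr_t _get_struct_X_c(uintptr_t _p)
{
    return (uintptr_t)(((struct X *)_p)->c);
}

/* Hypothetical dereferencing helper: fetch element k of a char array. */
char get_c_element(uintptr_t _a, int k) { return ((char *)_a)[k]; }
```

Reading \verb|c[2]| therefore takes two calls: \verb|_get_struct_X_c| to
obtain the address of the array, and a dereferencing function to fetch
the element.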
\subsubsection{Structures as values}
\label{struct-vals}

As mentioned above, there are no provisions for using structures or
unions as values in the program, not even when they have the size of a
primitive type. The reason for this is that it is not natural for
Scheme to assign objects which are larger than a single location (and
special-casing small structures because they are small seems silly). For
example, the following function takes a structure value as an argument
(by copy, that is, by assignment):

\begin{flushleft} \small
\begin{minipage}{\linewidth}
\begin{list}{}{} \item
\mbox{}\verb@@void f( struct X x ) { ... }@@\\
\end{list}
\vspace{-1ex}
\end{minipage}\\[4ex]
\end{flushleft}
\vspace{-3ex}

There is no way to call this function directly in the generated FFI for
Chez Scheme. We could fudge our way around it by generating a proxy
function which acts as a level of indirection:

\begin{flushleft} \small
\begin{minipage}{\linewidth}
\begin{list}{}{} \item
\mbox{}\verb@@void _proxy_f( struct X *x ) { f( *x ); }@@\\
\end{list}
\vspace{-1ex}
\end{minipage}\\[4ex]
\end{flushleft}
\vspace{-3ex}

\noindent but I have not implemented this. If it becomes a problem it
needs to be fixed. Libraries which manipulate complex numbers
represented as structures may need it.

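For the complex-number case, a hand-written proxy could look like the
sketch below. It is an illustration under stated assumptions and not
output of the back-end: the structure \verb|struct cplx|, the by-value
function \verb|cplx_norm2|, and the proxy \verb|_proxy_cplx_norm2| are
all hypothetical, and the proxy receives the address of the structure as
a \verb|uintptr_t|, following the convention of passing pointers as
integers.

```c
#include <stdint.h>

/* Hypothetical library type: a complex number passed around by value. */
struct cplx {
    double re;
    double im;
};

/* Hypothetical library function which takes its argument by value and
   so cannot be called directly through the generated FFI. */
double cplx_norm2(struct cplx z)
{
    return z.re * z.re + z.im * z.im;
}

/* Proxy: receives the address of a struct cplx as an integer and
   forwards a copy of the structure to the real function. */
double _proxy_cplx_norm2(uintptr_t _z)
{
    return cplx_norm2(*(struct cplx *)_z);
}
```

The Scheme program would allocate the structure with its constructor,
fill in the fields with the mutators, and pass the resulting integer to
the proxy.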
Another upshot is that there is no way to ask for the field \verb|u| of