<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN"
"http://www.w3.org/TR/html4/loose.dtd">
<html>
<head>
<meta http-equiv="Content-Type" content="text/html;charset=utf-8" >
<title>femtoLisp</title>
</head>
<body bgcolor="#fcfcfc"> <!-- "#fcfcc8" -->
<img src="flbanner.jpg">

<table border=0 width="100%" cellpadding=0 cellspacing=0>
<tr><td bgcolor="#2d3f5f" height=4></table>

<h1>0. Argument</h1>
This Lisp has the following characteristics and goals:

<ul>
<li>Lisp-1 evaluation rule (a la Scheme)
<li>Self-evaluating lambda (i.e. <tt>'(lambda (x) x)</tt> is callable)
<li>Full Common Lisp-style macros
<li>Dotted lambda lists for rest arguments (a la Scheme)
<li>Symbols have one binding
<li>Builtin functions are constants
<li><em>All</em> values are printable and readable
<li>Case-sensitive symbol names
<li>Only the minimal core is built in (i.e. written in C), but
enough to provide a practical level of performance
<li>Very short (but not necessarily simple...) implementation
<li>Generally uses Common Lisp operator names
<li>Nothing excessively weird or fancy
</ul>

<h1>1. Syntax</h1>
<h2>1.1. Symbols</h2>
Any character string can be a symbol name, including the empty string. In
general, text between whitespace is read as a symbol except in the following
cases:
<ul>
<li>The text begins with <tt>#</tt>
<li>The text consists of a single period <tt>.</tt>
<li>The text contains one of the special characters <tt>()[]';`,\|</tt>
<li>The text is a valid number
<li>The text is empty
</ul>
In these cases the symbol can be written by surrounding it with <tt>| |</tt>
characters, or by escaping individual characters within the symbol using
backslash <tt>\</tt>. Note that <tt>|</tt> and <tt>\</tt> must always be
preceded by a backslash when writing a symbol name.

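<p>
The rules above can be sketched as a small C predicate. This is only an
illustration, not femtoLisp's reader: the name
<tt>symbol_needs_quoting</tt> is hypothetical, the text is assumed to
contain no whitespace, and "valid number" is approximated by whatever
<tt>strtol</tt> accepts in base 0.

```c
#include <stdlib.h>
#include <string.h>

/* Returns 1 if `text` cannot be written as a bare symbol and so needs
 * | | quoting or backslash escapes, 0 otherwise.
 * Hypothetical helper; assumes `text` contains no whitespace. */
int symbol_needs_quoting(const char *text)
{
    char *end;

    if (text[0] == '\0') return 1;                  /* empty name */
    if (text[0] == '#') return 1;                   /* begins with # */
    if (strcmp(text, ".") == 0) return 1;           /* single period */
    if (strpbrk(text, "()[]';`,\\|")) return 1;     /* special character */

    /* Approximation: the text counts as a number if strtol in base 0
     * (optional sign; decimal, 0x hex, or leading-0 octal) consumes
     * every character of it. */
    (void)strtol(text, &end, 0);
    if (end != text && *end == '\0') return 1;

    return 0;
}
```

A printer could call such a predicate to decide whether a symbol name can be
emitted as-is or must be wrapped in <tt>| |</tt>.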
<h2>1.2. Numbers</h2>

A number consists of an optional + or - sign followed by one of the following
sequences:
<ul>
<li><tt>NNN...</tt> where N is a decimal digit
<li><tt>0xNNN...</tt> where N is a hexadecimal digit
<li><tt>0NNN...</tt> where N is an octal digit
</ul>
femtoLisp provides 30-bit integers, and it is an error to write a constant
less than -2<sup>29</sup> or greater than 2<sup>29</sup>-1.

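<p>
As a rough illustration of the number syntax and the 30-bit range check,
here is a sketch in C. It is not the actual reader; the name
<tt>parse_fixnum</tt> and its return codes are invented for this example,
and it leans on <tt>strtol</tt> with base 0, which happens to accept the
same sign and prefix conventions (it would also skip leading whitespace,
which a reader token never has).

```c
#include <errno.h>
#include <stdlib.h>

#define FIXNUM_MIN (-(1L << 29))      /* -2^29 */
#define FIXNUM_MAX ((1L << 29) - 1)   /* 2^29 - 1 */

/* Parses `text` as a number literal (hypothetical helper).
 * Returns  1 and stores the value in *out on success,
 *          0 if the text is not a number at all,
 *         -1 if it is a number outside the 30-bit range. */
int parse_fixnum(const char *text, long *out)
{
    char *end;
    long v;

    errno = 0;
    v = strtol(text, &end, 0);   /* base 0: decimal, 0x hex, 0 octal */
    if (end == text || *end != '\0') return 0;
    if (errno == ERANGE || v < FIXNUM_MIN || v > FIXNUM_MAX) return -1;
    *out = v;
    return 1;
}
```

Text such as <tt>08</tt> is rejected as a number here because 8 is not an
octal digit, so under the rules of section 1.1 it would be read as a symbol
instead.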
<h2>1.3. Conses and vectors</h2>

The text <tt>(a b c)</tt> parses to the structure
<tt>(cons a (cons b (cons c nil)))</tt> where a, b, and c are arbitrary
expressions.
<p>
The text <tt>(a . b)</tt> parses to the structure
<tt>(cons a b)</tt> where a and b are arbitrary expressions.
<p>
The text <tt>()</tt> reads as the symbol <tt>nil</tt>.
<p>
The text <tt>[a b c]</tt> parses to a vector of expressions a, b, and c.
The syntax <tt>#(a b c)</tt> has the same meaning.


<h2>1.4. Comments</h2>

Text between a semicolon <tt>;</tt> and the next end-of-line is skipped.
Text between <tt>#|</tt> and <tt>|#</tt> is also skipped.

<h2>1.5. Prefix tokens</h2>

There are five special prefix tokens which parse as follows:<p>
<tt>'a</tt> is equivalent to <tt>(quote a)</tt>.<br>
<tt>`a</tt> is equivalent to <tt>(backquote a)</tt>.<br>
<tt>,a</tt> is equivalent to <tt>(*comma* a)</tt>.<br>
<tt>,@a</tt> is equivalent to <tt>(*comma-at* a)</tt>.<br>
<tt>,.a</tt> is equivalent to <tt>(*comma-dot* a)</tt>.


<h2>1.6. Other read macros</h2>

femtoLisp provides a few "read macros" that let you accomplish interesting
tricks for textually representing data structures.

<table border=1>
<tr>
<td>sequence<td>meaning
<tr>
<td><tt>#.e</tt><td>evaluate expression <tt>e</tt> and behave as if e's
value had been written in place of e
<tr>
<td><tt>#\c</tt><td><tt>c</tt> is a character; read as its Unicode value
<tr>
<td><tt>#n=e</tt><td>read <tt>e</tt> and label it as <tt>n</tt>, where n
is a decimal number
<tr>
<td><tt>#n#</tt><td>read as the identical value previously labeled
<tt>n</tt>
<tr>
<td><tt>#:gNNN</tt> or <tt>#:NNN</tt><td>read a gensym. NNN is a hexadecimal
constant. Future occurrences of the same <tt>#:</tt> sequence will read to
the identical gensym
<tr>
<td><tt>#sym(...)</tt><td>reads to the result of evaluating
<tt>(apply sym '(...))</tt>
<tr>
<td><tt>#&lt;</tt><td>triggers an error
<tr>
<td><tt>#'</tt><td>ignored; provided for compatibility
<tr>
<td><tt>#!</tt><td>single-line comment, for script execution support
<tr>
<td><tt>"str"</tt><td>UTF-8 character string; may contain newlines.
<tt>\</tt> is the escape character. All C escape sequences are supported, plus
<tt>\u</tt> and <tt>\U</tt> for Unicode values.
</table>
When a read macro involves persistent state (e.g. label assignments), that
state is valid only within the closest enclosing call to <tt>read</tt>.


<h2>1.7. Builtins</h2>

Builtin functions are represented as opaque constants. Every builtin
function is the value of some constant symbol, so the builtin <tt>eq</tt>,
for example, can be written as <tt>#.eq</tt> ("the value of symbol eq").
Note that <tt>eq</tt> itself is still an ordinary symbol, except that its
value cannot be changed.
<p>

<table border=0 width="100%" cellpadding=0 cellspacing=0>
<tr><td bgcolor="#2d3f5f" height=4></table>


<h1>2. Data and execution models</h1>




<table border=0 width="100%" cellpadding=0 cellspacing=0>
<tr><td bgcolor="#2d3f5f" height=4></table>


<h1>3. Primitive functions</h1>


<tt>eq atom not set prog1 progn</tt><br>
<tt>symbolp numberp builtinp consp vectorp boundp</tt><br>
<tt>+ - * / &lt;</tt><br>
<tt>apply eval</tt>


<table border=0 width="100%" cellpadding=0 cellspacing=0>
<tr><td bgcolor="#2d3f5f" height=4></table>

<h1>4. Special forms</h1>

<tt>quote if lambda macro while label cond and or</tt>


<table border=0 width="100%" cellpadding=0 cellspacing=0>
<tr><td bgcolor="#2d3f5f" height=4></table>

<h1>5. Data structures</h1>

<tt>cons car cdr rplaca rplacd list</tt><br>
<tt>alloc vector aref aset length</tt>


<table border=0 width="100%" cellpadding=0 cellspacing=0>
<tr><td bgcolor="#2d3f5f" height=4></table>

<h1>6. Other functions</h1>

<tt>read print princ load exit</tt><br>
<tt>equal compare</tt><br>
<tt>gensym</tt>


<table border=0 width="100%" cellpadding=0 cellspacing=0>
<tr><td bgcolor="#2d3f5f" height=4></table>

<h1>7. Exceptions</h1>

<tt>trycatch raise</tt>


<table border=0 width="100%" cellpadding=0 cellspacing=0>
<tr><td bgcolor="#2d3f5f" height=4></table>

<h1>8. Cvalues</h1>

<h2>8.1. Introduction</h2>

femtoLisp allows you to use the full range of C data types on
dynamically-typed Lisp values. The motivation for this feature is that
useful interpreters must provide a large library of routines in C for dealing
with "real world" data like text and packed numeric arrays, and I would
rather not write yet another such library. Instead, all the
required data representations and primitives are provided so that such
features could be implemented in, or at least described in, Lisp.
<p>
The cvalues capability makes it easier to call C from Lisp by providing
ways to construct whatever arguments your C routines might require, and ways
to decipher whatever values your C routines might return. Here are some
things you can do with cvalues:
<ul>
<li>Call native C functions from Lisp without wrappers
<li>Wrap C functions in pure Lisp, automatically inheriting some degree
of type safety
<li>Use Lisp functions as callbacks from C code
<li>Use the Lisp garbage collector to reclaim malloc'd storage
<li>Annotate C pointers with size information for bounds checking or
serialization
<li>Attach symbolic type information to a C data structure, allowing it to
inherit Lisp services such as printing a readable representation
<li>Add datatypes like strings to Lisp
<li>Use more efficient representations for your Lisp programs' data
</ul>
<p>
femtoLisp's "cvalues" is inspired in part by Python's "ctypes" package.
Lisp doesn't really have first-class types the way Python does, but it does
have values, hence my version is called "cvalues".

<h2>8.2. Type representations</h2>

The core of cvalues is a language for describing C data types as
symbolic expressions:

<ul>
<li>Primitive types are symbols <tt>int8, uint8, int16, uint16, int32, uint32,
int64, uint64, char, wchar, long, ulong, float, double, void</tt>
<li>Arrays <tt>(array TYPE SIZE)</tt>, where TYPE is another C type and
SIZE is either a Lisp number or a C ulong. SIZE can be omitted to
represent incomplete C array types like "int a[]". As in C, the size may
only be omitted for the top level of a nested array; all array
<em>element</em> types
must have explicit sizes. Examples:
<ul>
<tt>int a[][2][3]</tt> is <tt>(array (array (array int32 3) 2))</tt><br>
<tt>int a[4][]</tt> would be <tt>(array (array int32) 4)</tt>, but this is
invalid.
</ul>
<li>Pointer <tt>(pointer TYPE)</tt>
<li>Struct <tt>(struct ((NAME TYPE) (NAME TYPE) ...))</tt>
<li>Union <tt>(union ((NAME TYPE) (NAME TYPE) ...))</tt>
<li>Enum <tt>(enum (NAME NAME ...))</tt>
<li>Function <tt>(c-function RET-TYPE (ARG-TYPE ARG-TYPE ...))</tt>
</ul>

A cvalue can be constructed using <tt>(c-value TYPE arg)</tt>, where
<tt>arg</tt> is some Lisp value. The system will try to convert the Lisp
value to the specified type. In many cases this will work better if some
components of the provided Lisp value are themselves cvalues.

<p>
Note the function type is called "c-function" to avoid confusion, since
functions are such a prevalent concept in Lisp.

<p>
The function <tt>sizeof</tt> returns the size (in bytes) of a cvalue or a
C type. Every cvalue has a size, but incomplete types will cause
<tt>sizeof</tt> to raise an error. The function <tt>typeof</tt> returns
the type of a cvalue.

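<p>
To make the rules for array types and <tt>sizeof</tt> concrete, here is a
sketch of how such type expressions might be modeled in C. Everything in it
(<tt>struct ctype</tt>, <tt>ctype_sizeof</tt>, <tt>ctype_valid</tt>) is
invented for illustration and covers only primitives and arrays; where the
Lisp <tt>sizeof</tt> raises an error, this version returns -1.

```c
#include <stddef.h>

enum ckind { C_PRIM, C_ARRAY };

/* Hypothetical descriptor for a C type expression. */
struct ctype {
    enum ckind kind;
    size_t prim_size;          /* C_PRIM: size in bytes */
    const struct ctype *elt;   /* C_ARRAY: element type */
    long count;                /* C_ARRAY: element count, -1 if omitted */
};

/* Size in bytes, or -1 if the type is incomplete. */
long ctype_sizeof(const struct ctype *t)
{
    long elsz;

    if (t->kind == C_PRIM) return (long)t->prim_size;
    if (t->count < 0) return -1;          /* like "int a[]" */
    elsz = ctype_sizeof(t->elt);
    return elsz < 0 ? -1 : elsz * t->count;
}

/* An array type is well formed only if its element type is complete,
 * so a size may be omitted at the top level and nowhere else. */
int ctype_valid(const struct ctype *t)
{
    if (t->kind == C_PRIM) return 1;
    return ctype_sizeof(t->elt) >= 0;
}
```

With this model, <tt>(array (array (array int32 3) 2))</tt> is valid but has
no size, while <tt>(array (array int32) 4)</tt> is rejected because its
element type is incomplete.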
<p>
You are probably wondering how 32- and 64-bit integers are constructed from
femtoLisp's 30-bit integers. The answer is that larger integers are
constructed from multiple Lisp numbers 16 bits at a time, in big-endian
fashion. In fact, the larger numeric types are the only cvalue
types whose constructors accept multiple arguments. Examples:
<ul>
<pre>
(c-value 'int32 0xdead 0xbeef)         ; make 0xdeadbeef
(c-value 'uint64 0x1001 0x8000 0xffff) ; make 0x000010018000ffff
</pre>
</ul>
As you can see, missing zeros are padded in from the left.


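<p>
The arithmetic behind this is simple enough to sketch in C. The function
name <tt>join16</tt> is made up for this example and is not part of
femtoLisp; it only shows how 16-bit pieces given most significant first
combine into one wide value.

```c
#include <stddef.h>
#include <stdint.h>

/* Combines up to four 16-bit pieces, most significant first, into one
 * 64-bit value. Pieces that are not supplied act as leading zeros.
 * Hypothetical helper, for illustration only. */
uint64_t join16(const uint16_t *pieces, size_t n)
{
    uint64_t v = 0;
    size_t i;

    for (i = 0; i < n; i++)
        v = (v << 16) | pieces[i];   /* shift earlier pieces up 16 bits */
    return v;
}
```

Passing the pieces <tt>0x1001 0x8000 0xffff</tt> yields
<tt>0x000010018000ffff</tt>, matching the second example above.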
<h2>8.3. Constructors</h2>

For convenience, a specialized constructor is provided for each
class of C type (primitives, pointer, array, struct, union, enum,
and c-function).
For example:
<ul>
<pre>
(uint32 0xcafe 0xd00d)
(int32 -4)
(char #\w)
(array 'int8 [1 1 2 3 5 8])
</pre>
</ul>

These forms can be slightly less efficient than <tt>(c-value ...)</tt>
because in many cases they will allocate a new type for the new value.
For example, the fourth expression must create the type
<tt>(array int8 6)</tt>.

<p>
Notice that calls to these constructors strongly resemble
the types of the values they create. This relationship can be expressed
formally as follows:

<pre>
(define (c-allocate type)
  (if (atom type)
      (apply (eval type) ())
      (apply (eval (car type)) (cdr type))))
</pre>

This function produces an instance of the given type by
invoking the appropriate constructor. Primitive types (whose representations
are symbols) can be constructed with zero arguments. For other types,
the only required arguments are those present in the type representation.
Any arguments after those are initializers. Using
<tt>(cdr type)</tt> as the argument list provides only required arguments,
so the value you get will not be initialized.

<p>
The builtin <tt>c-value</tt> function is similar to this one, except that it
lets you pass initializers.

<p>
Cvalue constructors are generally permissive; they do the best they
can with whatever you pass in. For example:

<ul>
<pre>
(c-value '(array int8 1))      ; ok, full type provided
(c-value '(array int8))        ; error, no size information
(c-value '(array int8) [0 1])  ; ok, size implied by initializer
</pre>
</ul>

<p>
<tt>ccopy, c2lisp</tt>

<h2>8.4. Pointers, arrays, and strings</h2>

Pointer types are provided for completeness and C interoperability, but
they should not generally be used from Lisp. femtoLisp doesn't know
anything about a pointer except the raw address and the (alleged) type of the
value it points to. Arrays are much more useful. They behave like references
as in C, but femtoLisp tracks their sizes and performs bounds checking.

<p>
Arrays are used to allocate strings. All strings share
the incomplete array type <tt>(array char)</tt>:

<pre>
> (c-value '(array char) [#\h #\e #\l #\l #\o])
"hello"

> (sizeof that)
5
</pre>

<tt>sizeof</tt> reveals that the size is known even though it is not
reflected in the type (as is always the case with incomplete array types).

<p>
Since femtoLisp tracks the sizes of all values, there is no need for NUL
terminators. Strings are just arrays of bytes, and may contain zero bytes
throughout. However, C functions require zero-terminated strings. To
solve this problem, femtoLisp allocates magic strings that actually have
space for one more byte than they appear to. The hidden extra byte is
always zero. This guarantees that a C function operating on the string
will never overrun its allocated space.

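<p>
The idea of a hidden terminator can be sketched in a few lines of C. This is
not femtoLisp's allocator; <tt>struct sized_string</tt> and
<tt>sized_string_init</tt> are hypothetical names used only to show the
layout: the recorded length excludes one extra byte that is always zero.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical sized string: `len` is the visible length in bytes. */
struct sized_string {
    size_t len;
    char *data;   /* len bytes of content plus one hidden zero byte */
};

/* Copies `len` bytes (which may include zero bytes) and appends the
 * hidden terminator. Returns 0 on success, -1 if allocation fails. */
int sized_string_init(struct sized_string *s, const char *bytes, size_t len)
{
    s->data = malloc(len + 1);
    if (s->data == NULL) return -1;
    memcpy(s->data, bytes, len);
    s->data[len] = '\0';   /* not counted in s->len */
    s->len = len;
    return 0;
}
```

A C routine such as <tt>strlen</tt> may stop early at an embedded zero byte,
but it cannot run past the end of the allocation.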
<p>
Such magic strings are produced by double-quoted string literals, and by
any explicit string-constructing function (such as <tt>string</tt>).

<p>
Unfortunately you still need to be careful, because it is possible to
allocate a non-magic character array with no terminator. The "hello"
string above is an example of this, since it was constructed from an
explicit vector of characters.
Such an array would cause problems if passed to a function expecting a
C string.

<p>
<tt>deref</tt>

<h2>8.5. Access</h2>

<tt>cref, cset, byteref, byteset, ccopy</tt>

<h2>8.6. Memory management concerns</h2>

<tt>autorelease</tt>


<h2>8.7. Guest functions</h2>

Functions written in C but designed to operate on Lisp values are
known here as "guest functions". Although they are foreign, they live in
Lisp's house and so live by its rules. Guest functions are what you
use to write interpreter extensions, for example to implement a function
like <tt>assoc</tt> in C for performance.

<p>
Guest functions must have a particular signature:
<pre>
value_t func(value_t *args, uint32_t nargs);
</pre>
Guest functions must also be aware of the femtoLisp API and garbage
collector.


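<p>
For a feel of what a guest function looks like, here is a self-contained
sketch. The definitions of <tt>value_t</tt>, <tt>make_fixnum</tt>, and
<tt>fixnum_value</tt> below are stand-ins invented for this example, not the
femtoLisp API; they assume small integers are stored in a machine word with
two low tag bits left clear. Only the function signature comes from the text
above.

```c
#include <stdint.h>

/* Stand-in definitions, NOT the femtoLisp API: a value is a machine
 * word, and a small integer n is stored as n * 4 so that the two low
 * bits stay free for a type tag (an assumption of this sketch). */
typedef intptr_t value_t;

static value_t make_fixnum(intptr_t n) { return n * 4; }
static intptr_t fixnum_value(value_t v) { return v / 4; }

/* A guest function with the required signature. It sums its arguments,
 * which this sketch assumes are all small integers; a real one would
 * check argument types and cooperate with the garbage collector. */
value_t guest_sum(value_t *args, uint32_t nargs)
{
    intptr_t total = 0;
    uint32_t i;

    for (i = 0; i < nargs; i++)
        total += fixnum_value(args[i]);
    return make_fixnum(total);
}
```

The interpreter would pass the evaluated arguments in <tt>args</tt> and
receive a single Lisp value back.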
<h2>8.8. Native functions</h2>

</body>
</html>