<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN"
 | 
						|
   "http://www.w3.org/TR/html4/loose.dtd">
 | 
						|
<html>
 | 
						|
<head>
 | 
						|
<meta http-equiv="Content-Type" content="text/html;charset=utf-8" >
 | 
						|
<title>femtoLisp</title>
 | 
						|
</head>
 | 
						|
<body bgcolor="#fcfcfc">    <!-- "#fcfcc8" -->
 | 
						|
<img src="flbanner.jpg">
 | 
						|
 | 
						|
<table border=0 width="100%" cellpadding=0 cellspacing=0>
 | 
						|
<tr><td bgcolor="#2d3f5f" height=4></table>
 | 
						|
 | 
						|
<h1>0. Argument</h1>
 | 
						|
This Lisp has the following characteristics and goals:
 | 
						|
 | 
						|
<ul>
 | 
						|
<li>Lisp-1 evaluation rule (à la Scheme)
 | 
						|
<li>Self-evaluating lambda (i.e. <tt>'(lambda (x) x)</tt> is callable)
 | 
						|
<li>Full Common Lisp-style macros
 | 
						|
<li>Dotted lambda lists for rest arguments (à la Scheme)
 | 
						|
<li>Symbols have one binding
 | 
						|
<li>Builtin functions are constants
 | 
						|
<li><em>All</em> values are printable and readable
 | 
						|
<li>Case-sensitive symbol names
 | 
						|
<li>Only a minimal core is built in (i.e. written in C), but
    enough to provide a practical level of performance
 | 
						|
<li>Very short (but not necessarily simple...) implementation
 | 
						|
<li>Generally use Common Lisp operator names
 | 
						|
<li>Nothing excessively weird or fancy
 | 
						|
</ul>
 | 
						|
 | 
						|
<h1>1. Syntax</h1>
 | 
						|
<h2>1.1. Symbols</h2>
 | 
						|
Any character string can be a symbol name, including the empty string. In
 | 
						|
general, text between whitespace is read as a symbol except in the following
 | 
						|
cases:
 | 
						|
<ul>
 | 
						|
<li>The text begins with <tt>#</tt>
 | 
						|
<li>The text consists of a single period <tt>.</tt>
 | 
						|
<li>The text contains one of the special characters <tt>()[]';`,\|</tt>
 | 
						|
<li>The text is a valid number
 | 
						|
<li>The text is empty
 | 
						|
</ul>
 | 
						|
In these cases the symbol can be written by surrounding it with <tt>| |</tt>
 | 
						|
characters, or by escaping individual characters within the symbol using
 | 
						|
backslash <tt>\</tt>. Note that <tt>|</tt> and <tt>\</tt> must always be
preceded by a backslash when writing a symbol name.
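<p>
The quoting rules above can be sketched as a small printer. This is a
minimal illustration in Python, not the femtoLisp implementation: the names
<tt>needs_quoting</tt> and <tt>write_symbol</tt> are hypothetical, the number
pattern only approximates section 1.2, and treating whitespace as requiring
quoting is an assumption.
<pre>
```python
import re

# Characters that section 1.1 lists as special inside a symbol name.
SPECIAL_CHARS = set("()[]';`,\\|")

# Rough approximation of section 1.2: optional sign, then hex or digits.
NUMBER_RE = re.compile(r"[+-]?(0x[0-9a-fA-F]+|[0-9]+)")


def needs_quoting(name: str) -> bool:
    """Return True if `name` would not read back as a symbol when written bare."""
    return (
        name == ""                                  # empty text
        or name == "."                              # a single period
        or name.startswith("#")                     # read macro prefix
        or any(c in SPECIAL_CHARS for c in name)    # special characters
        or any(c.isspace() for c in name)           # assumption: whitespace ends a token
        or NUMBER_RE.fullmatch(name) is not None    # looks like a number
    )


def write_symbol(name: str) -> str:
    """Hypothetical printer: wrap the name in | | only when required."""
    if not needs_quoting(name):
        return name
    # | and \ are always written with a preceding backslash.
    escaped = name.replace("\\", "\\\\").replace("|", "\\|")
    return "|" + escaped + "|"
```
</pre>
Under these assumptions the empty symbol is written as <tt>||</tt> and a
symbol named 42 as <tt>|42|</tt>, while an ordinary name such as
<tt>a.b</tt> is written unchanged.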
 | 
						|
 | 
						|
<h2>1.2. Numbers</h2>
 | 
						|
 | 
						|
A number consists of an optional + or - sign followed by one of the following
 | 
						|
sequences:
 | 
						|
<ul>
 | 
						|
<li><tt>NNN...</tt> where N is a decimal digit
 | 
						|
<li><tt>0xNNN...</tt> where N is a hexadecimal digit
 | 
						|
<li><tt>0NNN...</tt> where N is an octal digit
 | 
						|
</ul>
 | 
						|
femtoLisp provides 30-bit integers, and it is an error to write a constant
 | 
						|
less than -2<sup>29</sup> or greater than 2<sup>29</sup>-1.
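<p>
As a rough sketch of the number syntax and the 30-bit range check, here is
a Python model of a reader for numeric tokens. The function name
<tt>read_fixnum</tt> is hypothetical, and the choice to treat every token
with a leading zero as octal (so that <tt>09</tt> is rejected) is an
assumption, since the text does not settle that case.
<pre>
```python
FIXNUM_MIN = -(2 ** 29)      # smallest 30-bit signed integer
FIXNUM_MAX = 2 ** 29 - 1     # largest 30-bit signed integer


def read_fixnum(text: str) -> int:
    """Parse a numeric token per section 1.2; raise ValueError otherwise."""
    sign, body = 1, text
    if body[:1] in ("+", "-"):
        sign = -1 if body[0] == "-" else 1
        body = body[1:]
    if body.startswith("0x"):
        digits, base = body[2:], 16
    elif body.startswith("0") and len(body) > 1:
        digits, base = body[1:], 8      # assumption: leading zero means octal
    else:
        digits, base = body, 10
    allowed = "0123456789abcdef"[:base]
    if digits == "" or any(d not in allowed for d in digits.lower()):
        raise ValueError("not a number: " + repr(text))
    value = sign * int(digits, base)
    if value > FIXNUM_MAX or FIXNUM_MIN > value:
        raise ValueError("constant out of 30-bit range: " + repr(text))
    return value
```
</pre>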
 | 
						|
 | 
						|
<h2>1.3. Conses and vectors</h2>
 | 
						|
 | 
						|
The text <tt>(a b c)</tt> parses to the structure
 | 
						|
<tt>(cons a (cons b (cons c nil)))</tt> where a, b, and c are arbitrary
 | 
						|
expressions.
 | 
						|
<p>
 | 
						|
The text <tt>(a . b)</tt> parses to the structure
 | 
						|
<tt>(cons a b)</tt> where a and b are arbitrary expressions.
 | 
						|
<p>
 | 
						|
The text <tt>()</tt> reads as the symbol <tt>nil</tt>.
 | 
						|
<p>
 | 
						|
The text <tt>[a b c]</tt> parses to a vector of expressions a, b, and c.
 | 
						|
The syntax <tt>#(a b c)</tt> has the same meaning.
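<p>
The list and dotted-pair notations above can be modeled with nested pairs.
This Python sketch represents a cons cell as a 2-tuple; <tt>NIL</tt>,
<tt>cons</tt> and <tt>list_to_conses</tt> are illustrative names, not part
of femtoLisp.
<pre>
```python
NIL = "nil"  # stand-in for the symbol nil


def cons(a, b):
    """Model a cons cell as a 2-tuple (car, cdr)."""
    return (a, b)


def list_to_conses(items, tail=NIL):
    """Build (cons a (cons b ... tail)) from a Python sequence."""
    result = tail
    for item in reversed(items):
        result = cons(item, result)
    return result
```
</pre>
With this model, <tt>(a b c)</tt> corresponds to
<tt>list_to_conses(["a", "b", "c"])</tt>, and <tt>(a . b)</tt> to
<tt>list_to_conses(["a"], "b")</tt>.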
 | 
						|
 | 
						|
 | 
						|
<h2>1.4. Comments</h2>
 | 
						|
 | 
						|
Text between a semicolon <tt>;</tt> and the next end-of-line is skipped.
 | 
						|
Text between <tt>#|</tt> and <tt>|#</tt> is also skipped.
 | 
						|
 | 
						|
<h2>1.5. Prefix tokens</h2>
 | 
						|
 | 
						|
There are five special prefix tokens which parse as follows:<p>
 | 
						|
<tt>'a</tt> is equivalent to <tt>(quote a)</tt>.<br>
 | 
						|
<tt>`a</tt> is equivalent to <tt>(backquote a)</tt>.<br>
 | 
						|
<tt>,a</tt> is equivalent to <tt>(*comma* a)</tt>.<br>
 | 
						|
<tt>,@a</tt> is equivalent to <tt>(*comma-at* a)</tt>.<br>
 | 
						|
<tt>,.a</tt> is equivalent to <tt>(*comma-dot* a)</tt>.
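<p>
A minimal sketch of this rewriting, in Python and operating on plain
strings rather than a real token stream. The table and the function
<tt>expand_prefix</tt> are hypothetical; the only point illustrated is that
the two-character prefixes must be tried before the bare comma.
<pre>
```python
# Longest prefixes first, so ",@" and ",." win over ",".
PREFIX_FORMS = [
    (",@", "*comma-at*"),
    (",.", "*comma-dot*"),
    ("'", "quote"),
    ("`", "backquote"),
    (",", "*comma*"),
]


def expand_prefix(text: str):
    """Rewrite leading prefix tokens as nested [head, rest] lists."""
    for prefix, head in PREFIX_FORMS:
        if text.startswith(prefix):
            return [head, expand_prefix(text[len(prefix):])]
    return text  # no prefix: leave the token untouched
```
</pre>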
 | 
						|
 | 
						|
 | 
						|
<h2>1.6. Other read macros</h2>
 | 
						|
 | 
						|
femtoLisp provides a few "read macros" that let you accomplish interesting
 | 
						|
tricks for textually representing data structures.
 | 
						|
 | 
						|
<table border=1>
 | 
						|
<tr>
 | 
						|
<td>sequence<td>meaning
 | 
						|
<tr>
 | 
						|
<td><tt>#.e</tt><td>evaluate expression <tt>e</tt> and behave as if e's
 | 
						|
  value had been written in place of e
 | 
						|
<tr>
 | 
						|
<td><tt>#\c</tt><td><tt>c</tt> is a character; read as its Unicode value
 | 
						|
<tr>
 | 
						|
<td><tt>#n=e</tt><td>read <tt>e</tt> and label it as <tt>n</tt>, where n
 | 
						|
  is a decimal number
 | 
						|
<tr>
 | 
						|
<td><tt>#n#</tt><td>read as the identically-same value previously labeled
 | 
						|
  <tt>n</tt>
 | 
						|
<tr>
 | 
						|
<td><tt>#:gNNN or #:NNN</tt><td>read a gensym. NNN is a hexadecimal
  constant. Future occurrences of the same <tt>#:</tt> sequence will read as
  the identically-same gensym
 | 
						|
<tr>
 | 
						|
<td><tt>#sym(...)</tt><td>reads to the result of evaluating
 | 
						|
  <tt>(apply sym '(...))</tt>
 | 
						|
<tr>
 | 
						|
<td><tt>#&lt;</tt><td>triggers an error
 | 
						|
<tr>
 | 
						|
<td><tt>#'</tt><td>ignored; provided for compatibility
 | 
						|
<tr>
 | 
						|
<td><tt>#!</tt><td>single-line comment, for script execution support
 | 
						|
<tr>
 | 
						|
<td><tt>"str"</tt><td>UTF-8 character string; may contain newlines.
 | 
						|
  <tt>\</tt> is the escape character. All C escape sequences are supported, plus
 | 
						|
  <tt>\u</tt> and <tt>\U</tt> for Unicode values.
 | 
						|
</table>
 | 
						|
When a read macro involves persistent state (e.g. label assignments), that
 | 
						|
state is valid only within the closest enclosing call to <tt>read</tt>.
 | 
						|
 | 
						|
 | 
						|
<h2>1.7. Builtins</h2>
 | 
						|
 | 
						|
Builtin functions are represented as opaque constants. Every builtin
 | 
						|
function is the value of some constant symbol, so the builtin <tt>eq</tt>,
 | 
						|
for example, can be written as <tt>#.eq</tt> ("the value of symbol eq").
 | 
						|
Note that <tt>eq</tt> itself is still an ordinary symbol, except that its
 | 
						|
value cannot be changed.
 | 
						|
<p>
 | 
						|
 | 
						|
<table border=0 width="100%" cellpadding=0 cellspacing=0>
 | 
						|
<tr><td bgcolor="#2d3f5f" height=4></table>
 | 
						|
 | 
						|
 | 
						|
<h1>2. Data and execution models</h1>
 | 
						|
 | 
						|
 | 
						|
 | 
						|
 | 
						|
<table border=0 width="100%" cellpadding=0 cellspacing=0>
 | 
						|
<tr><td bgcolor="#2d3f5f" height=4></table>
 | 
						|
 | 
						|
 | 
						|
<h1>3. Primitive functions</h1>
 | 
						|
 | 
						|
 | 
						|
eq atom not set prog1 progn
 | 
						|
symbolp numberp builtinp consp vectorp boundp
 | 
						|
+ - * / &lt;
 | 
						|
apply eval
 | 
						|
 | 
						|
 | 
						|
<table border=0 width="100%" cellpadding=0 cellspacing=0>
 | 
						|
<tr><td bgcolor="#2d3f5f" height=4></table>
 | 
						|
 | 
						|
<h1>4. Special forms</h1>
 | 
						|
 | 
						|
quote if lambda macro while label cond and or
 | 
						|
 | 
						|
 | 
						|
<table border=0 width="100%" cellpadding=0 cellspacing=0>
 | 
						|
<tr><td bgcolor="#2d3f5f" height=4></table>
 | 
						|
 | 
						|
<h1>5. Data structures</h1>
 | 
						|
 | 
						|
cons car cdr rplaca rplacd list
 | 
						|
alloc vector aref aset length
 | 
						|
 | 
						|
 | 
						|
<table border=0 width="100%" cellpadding=0 cellspacing=0>
 | 
						|
<tr><td bgcolor="#2d3f5f" height=4></table>
 | 
						|
 | 
						|
<h1>6. Other functions</h1>
 | 
						|
 | 
						|
read print princ load exit
 | 
						|
equal compare
 | 
						|
gensym
 | 
						|
 | 
						|
 | 
						|
<table border=0 width="100%" cellpadding=0 cellspacing=0>
 | 
						|
<tr><td bgcolor="#2d3f5f" height=4></table>
 | 
						|
 | 
						|
<h1>7. Exceptions</h1>
 | 
						|
 | 
						|
trycatch raise
 | 
						|
 | 
						|
 | 
						|
<table border=0 width="100%" cellpadding=0 cellspacing=0>
 | 
						|
<tr><td bgcolor="#2d3f5f" height=4></table>
 | 
						|
 | 
						|
<h1>8. Cvalues</h1>
 | 
						|
 | 
						|
<h2>8.1. Introduction</h2>
 | 
						|
 | 
						|
femtoLisp allows you to use the full range of C data types as
dynamically-typed Lisp values. The motivation for this feature is that useful
interpreters must provide a large library of routines in C for dealing
with "real world" data like text and packed numeric arrays, and I would
rather not write yet another such library. Instead, all the
required data representations and primitives are provided so that such
features could be implemented in, or at least described in, Lisp.
 | 
						|
<p>
 | 
						|
The cvalues capability makes it easier to call C from Lisp by providing
 | 
						|
ways to construct whatever arguments your C routines might require, and ways
 | 
						|
to decipher whatever values your C routines might return. Here are some
 | 
						|
things you can do with cvalues:
 | 
						|
<ul>
 | 
						|
<li>Call native C functions from Lisp without wrappers
 | 
						|
<li>Wrap C functions in pure Lisp, automatically inheriting some degree
 | 
						|
  of type safety
 | 
						|
<li>Use Lisp functions as callbacks from C code
 | 
						|
<li>Use the Lisp garbage collector to reclaim malloc'd storage
 | 
						|
<li>Annotate C pointers with size information for bounds checking or
 | 
						|
  serialization
 | 
						|
<li>Attach symbolic type information to a C data structure, allowing it to
 | 
						|
  inherit Lisp services such as printing a readable representation
 | 
						|
<li>Add datatypes like strings to Lisp
 | 
						|
<li>Use more efficient representations for your Lisp programs' data
 | 
						|
</ul>
 | 
						|
<p>
 | 
						|
femtoLisp's "cvalues" is inspired in part by Python's "ctypes" package.
 | 
						|
Lisp doesn't really have first-class types the way Python does, but it does
 | 
						|
have values, hence my version is called "cvalues".
 | 
						|
 | 
						|
<h2>8.2. Type representations</h2>
 | 
						|
 | 
						|
The core of cvalues is a language for describing C data types as
 | 
						|
symbolic expressions:
 | 
						|
 | 
						|
<ul>
 | 
						|
<li>Primitive types are symbols <tt>int8, uint8, int16, uint16, int32, uint32,
 | 
						|
int64, uint64, char, wchar, long, ulong, float, double, void</tt>
 | 
						|
<li>Arrays <tt>(array TYPE SIZE)</tt>, where TYPE is another C type and
 | 
						|
SIZE is either a Lisp number or a C ulong. SIZE can be omitted to
 | 
						|
represent incomplete C array types like "int a[]". As in C, the size may
 | 
						|
only be omitted for the top level of a nested array; all array
 | 
						|
<em>element</em> types
 | 
						|
must have explicit sizes. Examples:
 | 
						|
<ul>
 | 
						|
  <tt>int a[][2][3]</tt> is <tt>(array (array (array int32 3) 2))</tt><br>
 | 
						|
  <tt>int a[4][]</tt> would be <tt>(array (array int32) 4)</tt>, but this is
 | 
						|
  invalid.
 | 
						|
</ul>
 | 
						|
<li>Pointer <tt>(pointer TYPE)</tt>
 | 
						|
<li>Struct <tt>(struct ((NAME TYPE) (NAME TYPE) ...))</tt>
 | 
						|
<li>Union <tt>(union ((NAME TYPE) (NAME TYPE) ...))</tt>
 | 
						|
<li>Enum <tt>(enum (NAME NAME ...))</tt>
 | 
						|
<li>Function <tt>(c-function RET-TYPE (ARG-TYPE ARG-TYPE ...))</tt>
 | 
						|
</ul>
 | 
						|
 | 
						|
A cvalue can be constructed using <tt>(c-value TYPE arg)</tt>, where
 | 
						|
<tt>arg</tt> is some Lisp value. The system will try to convert the Lisp
 | 
						|
value to the specified type. In many cases this will work better if some
 | 
						|
components of the provided Lisp value are themselves cvalues.
 | 
						|
 | 
						|
<p>
 | 
						|
Note that the function type is called "c-function" to avoid confusion, since
functions are such a prevalent concept in Lisp.
 | 
						|
 | 
						|
<p>
 | 
						|
The function <tt>sizeof</tt> returns the size (in bytes) of a cvalue or a
C type. Every cvalue has a size, but incomplete types will cause
<tt>sizeof</tt> to raise an error. The function <tt>typeof</tt> returns
the type of a cvalue.
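<p>
To make the behavior of <tt>sizeof</tt> on type expressions concrete, here
is a sketch in Python that models type expressions as strings and tuples.
It covers only fixed-width primitives and arrays; the byte sizes in
<tt>PRIMITIVE_SIZES</tt> and the function name <tt>sizeof_type</tt> are
assumptions made for illustration, not femtoLisp's actual tables.
<pre>
```python
# Assumed byte sizes for the fixed-width primitives; long, ulong and wchar
# are platform-dependent and deliberately left out of this sketch.
PRIMITIVE_SIZES = {
    "int8": 1, "uint8": 1, "char": 1,
    "int16": 2, "uint16": 2,
    "int32": 4, "uint32": 4, "float": 4,
    "int64": 8, "uint64": 8, "double": 8,
}


def sizeof_type(ctype) -> int:
    """Size in bytes of a type expression; ValueError if it is incomplete."""
    if isinstance(ctype, str):
        if ctype not in PRIMITIVE_SIZES:
            raise ValueError("unsupported primitive: " + ctype)
        return PRIMITIVE_SIZES[ctype]
    if ctype[0] == "array":
        if len(ctype) == 2:          # (array TYPE) with SIZE omitted
            raise ValueError("incomplete array type has no size")
        return sizeof_type(ctype[1]) * ctype[2]
    raise ValueError("type class not covered by this sketch")
```
</pre>
In this model <tt>(array (array int32 3) 2)</tt> occupies 24 bytes, while
both <tt>(array (array (array int32 3) 2))</tt> and the invalid
<tt>(array (array int32) 4)</tt> raise an error because some level lacks a
size.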
 | 
						|
 | 
						|
<p>
 | 
						|
You are probably wondering how 32- and 64-bit integers are constructed from
 | 
						|
femtoLisp's 30-bit integers. The answer is that larger integers are
 | 
						|
constructed from multiple Lisp numbers 16 bits at a time, in big-endian
 | 
						|
fashion. In fact, the larger numeric types are the only cvalue
types whose constructors accept multiple arguments. Examples:
 | 
						|
<ul>
 | 
						|
<pre>
 | 
						|
(c-value 'int32 0xdead 0xbeef)         ; make 0xdeadbeef
 | 
						|
(c-value 'uint64 0x1001 0x8000 0xffff) ; make 0x000010018000ffff
 | 
						|
</pre>
 | 
						|
</ul>
 | 
						|
As you can see, missing zeros are padded in from the left.
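<p>
The 16-bits-at-a-time rule can be checked with a few lines of Python. The
helper <tt>combine16</tt> is hypothetical and computes only the resulting
bit pattern; it does not model signedness or the width of the target type.
<pre>
```python
def combine16(*chunks: int) -> int:
    """Join 16-bit chunks, most significant first, into one integer."""
    value = 0
    for chunk in chunks:
        if chunk > 0xFFFF or 0 > chunk:
            raise ValueError("each chunk must fit in 16 bits")
        value = value * 0x10000 + chunk   # shift left by 16 bits, then add
    return value
```
</pre>
Because the first chunk is the most significant, supplying fewer chunks
than the type can hold leaves the high-order bits zero, which is the
left-padding described above.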
 | 
						|
 | 
						|
 | 
						|
<h2>8.3. Constructors</h2>
 | 
						|
 | 
						|
For convenience, a specialized constructor is provided for each
 | 
						|
class of C type (primitives, pointer, array, struct, union, enum,
 | 
						|
and c-function).
 | 
						|
For example:
 | 
						|
<ul>
 | 
						|
<pre>
 | 
						|
(uint32 0xcafe 0xd00d)
 | 
						|
(int32 -4)
 | 
						|
(char #\w)
 | 
						|
(array 'int8 [1 1 2 3 5 8])
 | 
						|
</pre>
 | 
						|
</ul>
 | 
						|
 | 
						|
These forms can be slightly less efficient than <tt>(c-value ...)</tt>
 | 
						|
because in many cases they will allocate a new type for the new value.
 | 
						|
For example, the fourth expression must create the type
 | 
						|
<tt>(array int8 6)</tt>.
 | 
						|
 | 
						|
<p>
 | 
						|
Notice that calls to these constructors strongly resemble
 | 
						|
the types of the values they create. This relationship can be expressed
 | 
						|
formally as follows:
 | 
						|
 | 
						|
<pre>
 | 
						|
(define (c-allocate type)
 | 
						|
  (if (atom type)
 | 
						|
      (apply (eval type) ())
 | 
						|
      (apply (eval (car type)) (cdr type))))
 | 
						|
</pre>
 | 
						|
 | 
						|
This function produces an instance of the given type by
 | 
						|
invoking the appropriate constructor. Primitive types (whose representations
 | 
						|
are symbols) can be constructed with zero arguments. For other types,
 | 
						|
the only required arguments are those present in the type representation.
 | 
						|
Any arguments after those are initializers. Using
 | 
						|
<tt>(cdr type)</tt> as the argument list provides only required arguments,
 | 
						|
so the value you get will not be initialized.
 | 
						|
 | 
						|
<p>
 | 
						|
The builtin <tt>c-value</tt> function is similar to this one, except that it
 | 
						|
lets you pass initializers.
 | 
						|
 | 
						|
<p>
 | 
						|
Cvalue constructors are generally permissive; they do the best they
 | 
						|
can with whatever you pass in. For example:
 | 
						|
 | 
						|
<ul>
 | 
						|
<pre>
 | 
						|
(c-value '(array int8 1))      ; ok, full type provided
 | 
						|
(c-value '(array int8))        ; error, no size information
 | 
						|
(c-value '(array int8) [0 1])  ; ok, size implied by initializer
 | 
						|
</pre>
 | 
						|
</ul>
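<p>
Python's ctypes, which inspired cvalues, requires a complete array type
before allocation, so the same three cases can be imitated with a small
wrapper. This is an analogy rather than femtoLisp code; the function
<tt>make_int8_array</tt> is a hypothetical helper built on the real
<tt>ctypes.c_int8</tt> type.
<pre>
```python
import ctypes


def make_int8_array(size=None, init=None):
    """Build an int8 array; the size may come from the initializer."""
    if size is None:
        if init is None:
            raise TypeError("no size information")
        size = len(init)                   # size implied by initializer
    array_type = ctypes.c_int8 * size      # complete array type
    return array_type(*(init or []))
```
</pre>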
 | 
						|
 | 
						|
<p>
 | 
						|
ccopy, c2lisp
 | 
						|
 | 
						|
<h2>8.4. Pointers, arrays, and strings</h2>
 | 
						|
 | 
						|
Pointer types are provided for completeness and C interoperability, but
 | 
						|
they should not generally be used from Lisp. femtoLisp doesn't know
 | 
						|
anything about a pointer except the raw address and the (alleged) type of the
 | 
						|
value it points to. Arrays are much more useful. They behave like references
 | 
						|
as in C, but femtoLisp tracks their sizes and performs bounds checking.
 | 
						|
 | 
						|
<p>
 | 
						|
Arrays are used to allocate strings. All strings share
 | 
						|
the incomplete array type <tt>(array char)</tt>:
 | 
						|
 | 
						|
<pre>
 | 
						|
> (c-value '(array char) [#\h #\e #\l #\l #\o])
 | 
						|
"hello"
 | 
						|
 | 
						|
> (sizeof that)
 | 
						|
5
 | 
						|
</pre>
 | 
						|
 | 
						|
<tt>sizeof</tt> reveals that the size is known even though it is not
 | 
						|
reflected in the type (as is always the case with incomplete array types).
 | 
						|
 | 
						|
<p>
 | 
						|
Since femtoLisp tracks the sizes of all values, there is no need for NUL
terminators. Strings are just arrays of bytes, and may contain zero bytes
throughout. However, C functions require zero-terminated strings. To
solve this problem, femtoLisp allocates magic strings that actually have
space for one more byte than they appear to. The hidden extra byte is
always zero. This guarantees that a C function reading the string
will find a terminator before it runs past the allocated space.
 | 
						|
 | 
						|
<p>
 | 
						|
Such magic strings are produced by double-quoted string literals, and by
 | 
						|
any explicit string-constructing function (such as <tt>string</tt>).
 | 
						|
 | 
						|
<p>
 | 
						|
Unfortunately you still need to be careful, because it is possible to
 | 
						|
allocate a non-magic character array with no terminator. The "hello"
 | 
						|
string above is an example of this, since it was constructed from an
 | 
						|
explicit vector of characters.
 | 
						|
Such an array would cause problems if passed to a function expecting a
 | 
						|
C string.
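<p>
The difference between a terminated string and a bare character array can
be seen with Python's ctypes, used here only as an analogy. Note one
difference: ctypes counts the terminator in the reported size, whereas
femtoLisp is described above as hiding it. The helper
<tt>has_hidden_terminator</tt> is hypothetical.
<pre>
```python
import ctypes

# create_string_buffer reserves one extra byte for a trailing NUL,
# much like the "magic" strings described above.
magic = ctypes.create_string_buffer(b"hello")

# A plain char array sized exactly to its contents has no terminator.
plain = (ctypes.c_char * 5).from_buffer_copy(b"hello")


def has_hidden_terminator(buf, text: bytes) -> bool:
    """True if buf holds text followed by at least one zero byte."""
    raw = bytes(buf)
    return raw[:len(text)] == text and raw[len(text):len(text) + 1] == b"\x00"
```
</pre>
Only the first buffer is safe to hand to a C routine that scans for a zero
byte; the second would be read past its end.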
 | 
						|
 | 
						|
<p>
 | 
						|
deref
 | 
						|
 | 
						|
<h2>8.5. Access</h2>
 | 
						|
 | 
						|
cref,cset,byteref,byteset,ccopy
 | 
						|
 | 
						|
<h2>8.6. Memory management concerns</h2>
 | 
						|
 | 
						|
autorelease
 | 
						|
 | 
						|
 | 
						|
<h2>8.7. Guest functions</h2>
 | 
						|
 | 
						|
Functions written in C but designed to operate on Lisp values are
 | 
						|
known here as "guest functions". Although they are foreign, they live in
 | 
						|
Lisp's house and so live by its rules. Guest functions are what you
 | 
						|
use to write interpreter extensions, for example to implement a function
 | 
						|
like <tt>assoc</tt> in C for performance.
 | 
						|
 | 
						|
<p>
 | 
						|
Guest functions must have a particular signature:
 | 
						|
<pre>
 | 
						|
value_t func(value_t *args, uint32_t nargs);
 | 
						|
</pre>
 | 
						|
Guest functions must also be aware of the femtoLisp API and garbage
 | 
						|
collector.
 | 
						|
 | 
						|
 | 
						|
<h2>8.8. Native functions</h2>
 | 
						|
 | 
						|
</body>
 | 
						|
</html>
 |