429 lines
		
	
	
		
			14 KiB
		
	
	
	
		
			HTML
		
	
	
	
			
		
		
	
	
| <!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN"
 | |
|    "http://www.w3.org/TR/html4/loose.dtd">
 | |
| <html>
 | |
| <head>
 | |
| <meta http-equiv="Content-Type" content="text/html;charset=utf-8" >
 | |
| <title>femtoLisp</title>
 | |
| </head>
 | |
| <body bgcolor="#fcfcfc">    <!-- "#fcfcc8" -->
 | |
| <img src="flbanner.jpg">
 | |
| 
 | |
| <table border=0 width="100%" cellpadding=0 cellspacing=0>
 | |
| <tr><td bgcolor="#2d3f5f" height=4></table>
 | |
| 
 | |
| <h1>0. Argument</h1>
 | |
| This Lisp has the following characteristics and goals:
 | |
| 
 | |
| <ul>
 | |
| <li>Lisp-1 evaluation rule (à la Scheme)
 | |
| <li>Self-evaluating lambda (i.e. <tt>'(lambda (x) x)</tt> is callable)
 | |
| <li>Full Common Lisp-style macros
 | |
| <li>Dotted lambda lists for rest arguments (à la Scheme)
 | |
| <li>Symbols have one binding
 | |
| <li>Builtin functions are constants
 | |
| <li><em>All</em> values are printable and readable
 | |
| <li>Case-sensitive symbol names
 | |
| <li>Only the minimal core built-in (i.e. written in C), but
 | |
|     enough to provide a practical level of performance
 | |
| <li>Very short (but not necessarily simple...) implementation
 | |
| <li>Generally use Common Lisp operator names
 | |
| <li>Nothing excessively weird or fancy
 | |
| </ul>
 | |
| 
 | |
| <h1>1. Syntax</h1>
 | |
| <h2>1.1. Symbols</h2>
 | |
| Any character string can be a symbol name, including the empty string. In
 | |
| general, text between whitespace is read as a symbol except in the following
 | |
| cases:
 | |
| <ul>
 | |
| <li>The text begins with <tt>#</tt>
 | |
| <li>The text consists of a single period <tt>.</tt>
 | |
| <li>The text contains one of the special characters <tt>()[]';`,\|</tt>
 | |
| <li>The text is a valid number
 | |
| <li>The text is empty
 | |
| </ul>
 | |
| In these cases the symbol can be written by surrounding it with <tt>| |</tt>
 | |
| characters, or by escaping individual characters within the symbol using
 | |
| backslash <tt>\</tt>. Note that <tt>|</tt> and <tt>\</tt> must always be
 | |
| preceded by a backslash when writing a symbol name.
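<p>
The rules above can be sketched in C as a predicate that tells a printer
whether a symbol name needs <tt>| |</tt> quoting or backslash escapes. This
is only an illustration: the names <tt>symbol_needs_quoting</tt> and
<tt>looks_like_number</tt> are hypothetical and not part of femtoLisp, and
treating embedded whitespace as requiring quoting is an inference from "text
between whitespace", not something stated explicitly.

```c
#include <stdlib.h>
#include <string.h>

/* Returns 1 if the whole token reads as a number literal: an optional
   sign followed by decimal, 0x-hexadecimal, or leading-0 octal digits. */
static int looks_like_number(const char *s)
{
    const char *p = s;
    char *end;

    if (*p == '+' || *p == '-')
        p++;
    /* Require a digit right after the optional sign, so strtol never
       gets to skip whitespace or accept a second sign. */
    if (*p < '0' || *p > '9')
        return 0;
    (void)strtol(p, &end, 0);   /* base 0 selects 0x / 0 / decimal */
    return *end == '\0';        /* number only if every char was used */
}

/* Hypothetical helper: 1 if the name cannot be written as bare text. */
int symbol_needs_quoting(const char *name)
{
    if (name[0] == '\0')
        return 1;                           /* the text is empty */
    if (name[0] == '#')
        return 1;                           /* begins with # */
    if (strcmp(name, ".") == 0)
        return 1;                           /* a single period */
    /* Special characters from the list above; whitespace is added as
       an assumption, since it would otherwise end the token. */
    if (strpbrk(name, "()[]';`,\\| \t\n") != NULL)
        return 1;
    return looks_like_number(name);         /* a valid number */
}
```

Under these assumptions, <tt>foo</tt>, <tt>a.b</tt>, and <tt>+</tt> can be
written bare, while <tt>#t</tt>, <tt>.</tt>, <tt>a|b</tt>, and
<tt>-0x1F</tt> would need quoting to be read back as symbols.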
 | |
| 
 | |
| <h2>1.2. Numbers</h2>
 | |
| 
 | |
| A number consists of an optional + or - sign followed by one of the following
 | |
| sequences:
 | |
| <ul>
 | |
| <li><tt>NNN...</tt> where N is a decimal digit
 | |
| <li><tt>0xNNN...</tt> where N is a hexadecimal digit
 | |
| <li><tt>0NNN...</tt> where N is an octal digit
 | |
| </ul>
 | |
| femtoLisp provides 30-bit integers, and it is an error to write a constant
 | |
| less than -2<sup>29</sup> or greater than 2<sup>29</sup>-1.
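<p>
As a rough sketch of the number syntax and range check described above, the
following C function accepts the three literal forms and rejects values
outside the 30-bit range. The name <tt>parse_fixnum</tt>, its return codes,
and the <tt>FIXNUM_MIN</tt>/<tt>FIXNUM_MAX</tt> macros are assumptions made
for this example; the actual reader may be organized quite differently.

```c
#include <errno.h>
#include <stdlib.h>

#define FIXNUM_MIN (-(1L << 29))     /* -2^29     */
#define FIXNUM_MAX ((1L << 29) - 1)  /*  2^29 - 1 */

/* Hypothetical helper. Returns 1 and stores the value on success,
   0 if the token is not a number at all, and -1 if it is a number
   that does not fit in 30 bits. */
int parse_fixnum(const char *tok, long *out)
{
    const char *p = tok;
    char *end;
    long v;

    if (*p == '+' || *p == '-')
        p++;
    if (*p < '0' || *p > '9')
        return 0;                 /* must start with a digit */

    errno = 0;
    v = strtol(tok, &end, 0);     /* base 0: decimal, 0x hex, 0 octal */
    if (*end != '\0')
        return 0;                 /* trailing text: not a number */
    if (errno == ERANGE || v < FIXNUM_MIN || v > FIXNUM_MAX)
        return -1;                /* outside the 30-bit range */

    *out = v;
    return 1;
}
```

With this sketch, <tt>0x1f</tt> yields 31 and <tt>-017</tt> yields -15,
while <tt>536870912</tt> (2<sup>29</sup>) is reported as out of range.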
 | |
| 
 | |
| <h2>1.3. Conses and vectors</h2>
 | |
| 
 | |
| The text <tt>(a b c)</tt> parses to the structure
 | |
| <tt>(cons a (cons b (cons c nil)))</tt> where a, b, and c are arbitrary
 | |
| expressions.
 | |
| <p>
 | |
| The text <tt>(a . b)</tt> parses to the structure
 | |
| <tt>(cons a b)</tt> where a and b are arbitrary expressions.
 | |
| <p>
 | |
| The text <tt>()</tt> reads as the symbol <tt>nil</tt>.
 | |
| <p>
 | |
| The text <tt>[a b c]</tt> parses to a vector of expressions a, b, and c.
 | |
| The syntax <tt>#(a b c)</tt> has the same meaning.
 | |
| 
 | |
| 
 | |
| <h2>1.4. Comments</h2>
 | |
| 
 | |
| Text between a semicolon <tt>;</tt> and the next end-of-line is skipped.
 | |
| Text between <tt>#|</tt> and <tt>|#</tt> is also skipped.
 | |
| 
 | |
| <h2>1.5. Prefix tokens</h2>
 | |
| 
 | |
| There are five special prefix tokens which parse as follows:<p>
 | |
| <tt>'a</tt> is equivalent to <tt>(quote a)</tt>.<br>
 | |
| <tt>`a</tt> is equivalent to <tt>(backquote a)</tt>.<br>
 | |
| <tt>,a</tt> is equivalent to <tt>(*comma* a)</tt>.<br>
 | |
| <tt>,@a</tt> is equivalent to <tt>(*comma-at* a)</tt>.<br>
 | |
| <tt>,.a</tt> is equivalent to <tt>(*comma-dot* a)</tt>.
 | |
| 
 | |
| 
 | |
| <h2>1.6. Other read macros</h2>
 | |
| 
 | |
| femtoLisp provides a few "read macros" that let you accomplish interesting
 | |
| tricks for textually representing data structures.
 | |
| 
 | |
| <table border=1>
 | |
| <tr>
 | |
| <td>sequence<td>meaning
 | |
| <tr>
 | |
| <td><tt>#.e</tt><td>evaluate expression <tt>e</tt> and behave as if e's
 | |
|   value had been written in place of e
 | |
| <tr>
 | |
| <td><tt>#\c</tt><td><tt>c</tt> is a character; read as its Unicode value
 | |
| <tr>
 | |
| <td><tt>#n=e</tt><td>read <tt>e</tt> and label it as <tt>n</tt>, where n
 | |
|   is a decimal number
 | |
| <tr>
 | |
| <td><tt>#n#</tt><td>read as the identically-same value previously labeled
 | |
|   <tt>n</tt>
 | |
| <tr>
 | |
| <td><tt>#:gNNN or #:NNN</tt><td>read a gensym. NNN is a hexadecimal
 | |
|   constant. Future occurrences of the same <tt>#:</tt> sequence will read to
 | |
|   the identically-same gensym
 | |
| <tr>
 | |
| <td><tt>#sym(...)</tt><td>reads to the result of evaluating
 | |
|   <tt>(apply sym '(...))</tt>
 | |
| <tr>
 | |
| <td><tt>#&lt;</tt><td>triggers an error
 | |
| <tr>
 | |
| <td><tt>#'</tt><td>ignored; provided for compatibility
 | |
| <tr>
 | |
| <td><tt>#!</tt><td>single-line comment, for script execution support
 | |
| <tr>
 | |
| <td><tt>"str"</tt><td>UTF-8 character string; may contain newlines.
 | |
|   <tt>\</tt> is the escape character. All C escape sequences are supported, plus
 | |
|   <tt>\u</tt> and <tt>\U</tt> for Unicode values.
 | |
| </table>
 | |
| When a read macro involves persistent state (e.g. label assignments), that
 | |
| state is valid only within the closest enclosing call to <tt>read</tt>.
 | |
| 
 | |
| 
 | |
| <h2>1.7. Builtins</h2>
 | |
| 
 | |
| Builtin functions are represented as opaque constants. Every builtin
 | |
| function is the value of some constant symbol, so the builtin <tt>eq</tt>,
 | |
| for example, can be written as <tt>#.eq</tt> ("the value of symbol eq").
 | |
| Note that <tt>eq</tt> itself is still an ordinary symbol, except that its
 | |
| value cannot be changed.
 | |
| <p>
 | |
| 
 | |
| <table border=0 width="100%" cellpadding=0 cellspacing=0>
 | |
| <tr><td bgcolor="#2d3f5f" height=4></table>
 | |
| 
 | |
| 
 | |
| <h1>2. Data and execution models</h1>
 | |
| 
 | |
| 
 | |
| 
 | |
| 
 | |
| <table border=0 width="100%" cellpadding=0 cellspacing=0>
 | |
| <tr><td bgcolor="#2d3f5f" height=4></table>
 | |
| 
 | |
| 
 | |
| <h1>3. Primitive functions</h1>
 | |
| 
 | |
| 
 | |
| eq atom not set prog1 progn
 | |
| symbolp numberp builtinp consp vectorp boundp
 | |
| + - * / &lt;
 | |
| apply eval
 | |
| 
 | |
| 
 | |
| <table border=0 width="100%" cellpadding=0 cellspacing=0>
 | |
| <tr><td bgcolor="#2d3f5f" height=4></table>
 | |
| 
 | |
| <h1>4. Special forms</h1>
 | |
| 
 | |
| quote if lambda macro while label cond and or
 | |
| 
 | |
| 
 | |
| <table border=0 width="100%" cellpadding=0 cellspacing=0>
 | |
| <tr><td bgcolor="#2d3f5f" height=4></table>
 | |
| 
 | |
| <h1>5. Data structures</h1>
 | |
| 
 | |
| cons car cdr rplaca rplacd list
 | |
| alloc vector aref aset length
 | |
| 
 | |
| 
 | |
| <table border=0 width="100%" cellpadding=0 cellspacing=0>
 | |
| <tr><td bgcolor="#2d3f5f" height=4></table>
 | |
| 
 | |
| <h1>6. Other functions</h1>
 | |
| 
 | |
| read print princ load exit
 | |
| equal compare
 | |
| gensym
 | |
| 
 | |
| 
 | |
| <table border=0 width="100%" cellpadding=0 cellspacing=0>
 | |
| <tr><td bgcolor="#2d3f5f" height=4></table>
 | |
| 
 | |
| <h1>7. Exceptions</h1>
 | |
| 
 | |
| trycatch raise
 | |
| 
 | |
| 
 | |
| <table border=0 width="100%" cellpadding=0 cellspacing=0>
 | |
| <tr><td bgcolor="#2d3f5f" height=4></table>
 | |
| 
 | |
| <h1>8. Cvalues</h1>
 | |
| 
 | |
| <h2>8.1. Introduction</h2>
 | |
| 
 | |
| femtoLisp allows you to use the full range of C data types on
 | |
| dynamically-typed Lisp values. The motivation for this feature is that
 | |
| useful
 | |
| interpreters must provide a large library of routines in C for dealing
 | |
| with "real world" data like text and packed numeric arrays, and I would
 | |
| rather not write yet another such library. Instead, all the
 | |
| required data representations and primitives are provided so that such
 | |
| features could be implemented in, or at least described in, Lisp.
 | |
| <p>
 | |
| The cvalues capability makes it easier to call C from Lisp by providing
 | |
| ways to construct whatever arguments your C routines might require, and ways
 | |
| to decipher whatever values your C routines might return. Here are some
 | |
| things you can do with cvalues:
 | |
| <ul>
 | |
| <li>Call native C functions from Lisp without wrappers
 | |
| <li>Wrap C functions in pure Lisp, automatically inheriting some degree
 | |
|   of type safety
 | |
| <li>Use Lisp functions as callbacks from C code
 | |
| <li>Use the Lisp garbage collector to reclaim malloc'd storage
 | |
| <li>Annotate C pointers with size information for bounds checking or
 | |
|   serialization
 | |
| <li>Attach symbolic type information to a C data structure, allowing it to
 | |
|   inherit Lisp services such as printing a readable representation
 | |
| <li>Add datatypes like strings to Lisp
 | |
| <li>Use more efficient representations for your Lisp programs' data
 | |
| </ul>
 | |
| <p>
 | |
| femtoLisp's "cvalues" is inspired in part by Python's "ctypes" package.
 | |
| Lisp doesn't really have first-class types the way Python does, but it does
 | |
| have values, hence my version is called "cvalues".
 | |
| 
 | |
| <h2>8.2. Type representations</h2>
 | |
| 
 | |
| The core of cvalues is a language for describing C data types as
 | |
| symbolic expressions:
 | |
| 
 | |
| <ul>
 | |
| <li>Primitive types are symbols <tt>int8, uint8, int16, uint16, int32, uint32,
 | |
| int64, uint64, char, wchar, long, ulong, float, double, void</tt>
 | |
| <li>Arrays <tt>(array TYPE SIZE)</tt>, where TYPE is another C type and
 | |
| SIZE is either a Lisp number or a C ulong. SIZE can be omitted to
 | |
| represent incomplete C array types like "int a[]". As in C, the size may
 | |
| only be omitted for the top level of a nested array; all array
 | |
| <em>element</em> types
 | |
| must have explicit sizes. Examples:
 | |
| <ul>
 | |
|   <tt>int a[][2][3]</tt> is <tt>(array (array (array int32 3) 2))</tt><br>
 | |
|   <tt>int a[4][]</tt> would be <tt>(array (array int32) 4)</tt>, but this is
 | |
|   invalid.
 | |
| </ul>
 | |
| <li>Pointer <tt>(pointer TYPE)</tt>
 | |
| <li>Struct <tt>(struct ((NAME TYPE) (NAME TYPE) ...))</tt>
 | |
| <li>Union <tt>(union ((NAME TYPE) (NAME TYPE) ...))</tt>
 | |
| <li>Enum <tt>(enum (NAME NAME ...))</tt>
 | |
| <li>Function <tt>(c-function RET-TYPE (ARG-TYPE ARG-TYPE ...))</tt>
 | |
| </ul>
 | |
| 
 | |
| A cvalue can be constructed using <tt>(c-value TYPE arg)</tt>, where
 | |
| <tt>arg</tt> is some Lisp value. The system will try to convert the Lisp
 | |
| value to the specified type. In many cases this will work better if some
 | |
| components of the provided Lisp value are themselves cvalues.
 | |
| 
 | |
| <p>
 | |
| Note that the function type is called "c-function" to avoid confusion, since
 | |
| functions are such a prevalent concept in Lisp.
 | |
| 
 | |
| <p>
 | |
| The function <tt>sizeof</tt> returns the size (in bytes) of a cvalue or a
 | |
| c type. Every cvalue has a size, but incomplete types will cause
 | |
| <tt>sizeof</tt> to raise an error. The function <tt>typeof</tt> returns
 | |
| the type of a cvalue.
 | |
| 
 | |
| <p>
 | |
| You are probably wondering how 32- and 64-bit integers are constructed from
 | |
| femtoLisp's 30-bit integers. The answer is that larger integers are
 | |
| constructed from multiple Lisp numbers 16 bits at a time, in big-endian
 | |
| fashion. In fact, the larger numeric types are the only cvalue
 | |
| types whose constructors accept multiple arguments. Examples:
 | |
| <ul>
 | |
| <pre>
 | |
| (c-value 'int32 0xdead 0xbeef)         ; make 0xdeadbeef
 | |
| (c-value 'uint64 0x1001 0x8000 0xffff) ; make 0x000010018000ffff
 | |
| </pre>
 | |
| </ul>
 | |
| As the second example shows, missing high-order zeros are padded in from the left.
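<p>
The big-endian assembly rule can be illustrated with a few lines of C. This
is a minimal sketch, not femtoLisp's actual constructor code; the function
name <tt>assemble_chunks</tt> is hypothetical, and the caller is assumed to
pass at most four 16-bit groups and to truncate the result to the width of
the target type.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: combine n 16-bit groups, most significant
   first, into one integer. Groups that are not supplied end up as
   zeros on the left, because the accumulator starts at zero. */
uint64_t assemble_chunks(const uint16_t *chunks, size_t n)
{
    uint64_t v = 0;
    size_t i;

    for (i = 0; i < n; i++)
        v = (v << 16) | chunks[i];   /* shift earlier groups up */
    return v;
}
```

Passing the groups <tt>0xdead 0xbeef</tt> gives <tt>0xdeadbeef</tt>, and
<tt>0x1001 0x8000 0xffff</tt> gives <tt>0x000010018000ffff</tt>, matching
the two examples above.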
 | |
| 
 | |
| 
 | |
| <h2>8.3. Constructors</h2>
 | |
| 
 | |
| For convenience, a specialized constructor is provided for each
 | |
| class of C type (primitives, pointer, array, struct, union, enum,
 | |
| and c-function).
 | |
| For example:
 | |
| <ul>
 | |
| <pre>
 | |
| (uint32 0xcafe 0xd00d)
 | |
| (int32 -4)
 | |
| (char #\w)
 | |
| (array 'int8 [1 1 2 3 5 8])
 | |
| </pre>
 | |
| </ul>
 | |
| 
 | |
| These forms can be slightly less efficient than <tt>(c-value ...)</tt>
 | |
| because in many cases they will allocate a new type for the new value.
 | |
| For example, the fourth expression must create the type
 | |
| <tt>(array int8 6)</tt>.
 | |
| 
 | |
| <p>
 | |
| Notice that calls to these constructors strongly resemble
 | |
| the types of the values they create. This relationship can be expressed
 | |
| formally as follows:
 | |
| 
 | |
| <pre>
 | |
| (define (c-allocate type)
 | |
|   (if (atom type)
 | |
|       (apply (eval type) ())
 | |
|       (apply (eval (car type)) (cdr type))))
 | |
| </pre>
 | |
| 
 | |
| This function produces an instance of the given type by
 | |
| invoking the appropriate constructor. Primitive types (whose representations
 | |
| are symbols) can be constructed with zero arguments. For other types,
 | |
| the only required arguments are those present in the type representation.
 | |
| Any arguments after those are initializers. Using
 | |
| <tt>(cdr type)</tt> as the argument list provides only required arguments,
 | |
| so the value you get will not be initialized.
 | |
| 
 | |
| <p>
 | |
| The builtin <tt>c-value</tt> function is similar to this one, except that it
 | |
| lets you pass initializers.
 | |
| 
 | |
| <p>
 | |
| Cvalue constructors are generally permissive; they do the best they
 | |
| can with whatever you pass in. For example:
 | |
| 
 | |
| <ul>
 | |
| <pre>
 | |
| (c-value '(array int8 1))      ; ok, full type provided
 | |
| (c-value '(array int8))        ; error, no size information
 | |
| (c-value '(array int8) [0 1])  ; ok, size implied by initializer
 | |
| </pre>
 | |
| </ul>
 | |
| 
 | |
| <p>
 | |
| ccopy, c2lisp
 | |
| 
 | |
| <h2>8.4. Pointers, arrays, and strings</h2>
 | |
| 
 | |
| Pointer types are provided for completeness and C interoperability, but
 | |
| they should not generally be used from Lisp. femtoLisp doesn't know
 | |
| anything about a pointer except the raw address and the (alleged) type of the
 | |
| value it points to. Arrays are much more useful. They behave like references
 | |
| as in C, but femtoLisp tracks their sizes and performs bounds checking.
 | |
| 
 | |
| <p>
 | |
| Arrays are used to allocate strings. All strings share
 | |
| the incomplete array type <tt>(array char)</tt>:
 | |
| 
 | |
| <pre>
 | |
| > (c-value '(array char) [#\h #\e #\l #\l #\o])
 | |
| "hello"
 | |
| 
 | |
| > (sizeof that)
 | |
| 5
 | |
| </pre>
 | |
| 
 | |
| <tt>sizeof</tt> reveals that the size is known even though it is not
 | |
| reflected in the type (as is always the case with incomplete array types).
 | |
| 
 | |
| <p>
 | |
| Since femtoLisp tracks the sizes of all values, there is no need for NUL
 | |
| terminators. Strings are just arrays of bytes, and may contain zero bytes
 | |
| throughout. However, C functions require zero-terminated strings. To
 | |
| solve this problem, femtoLisp allocates magic strings that actually have
 | |
| space for one more byte than they appear to. The hidden extra byte is
 | |
| always zero. This guarantees that a C function operating on the string
 | |
| will never overrun its allocated space.
 | |
| 
 | |
| <p>
 | |
| Such magic strings are produced by double-quoted string literals, and by
 | |
| any explicit string-constructing function (such as <tt>string</tt>).
 | |
| 
 | |
| <p>
 | |
| Unfortunately you still need to be careful, because it is possible to
 | |
| allocate a non-magic character array with no terminator. The "hello"
 | |
| string above is an example of this, since it was constructed from an
 | |
| explicit vector of characters.
 | |
| Such an array would cause problems if passed to a function expecting a
 | |
| C string.
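<p>
The "magic string" idea can be sketched in C as a length-tracked buffer
that is always allocated one byte larger than its visible size. The
<tt>magic_string</tt> type and its two functions are hypothetical names
invented for this illustration; they are not femtoLisp's internal
representation.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical layout: the visible length is stored explicitly, and
   the buffer holds len + 1 bytes so that data[len] is always 0. */
typedef struct {
    size_t len;
    char  *data;
} magic_string;

/* Copies len bytes (which may include zero bytes) and appends the
   hidden terminator. Returns 0 on success, -1 if allocation fails. */
int magic_string_init(magic_string *s, const char *bytes, size_t len)
{
    s->data = malloc(len + 1);       /* one extra, hidden byte */
    if (s->data == NULL)
        return -1;
    if (len > 0)
        memcpy(s->data, bytes, len);
    s->data[len] = '\0';             /* not counted in s->len */
    s->len = len;
    return 0;
}

void magic_string_free(magic_string *s)
{
    free(s->data);
    s->data = NULL;
    s->len = 0;
}
```

A C function such as <tt>strlen</tt> may stop early at an embedded zero
byte, but it cannot run past the end of the buffer. A plain character array
allocated with exactly <tt>len</tt> bytes has no such guarantee.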
 | |
| 
 | |
| <p>
 | |
| deref
 | |
| 
 | |
| <h2>8.5. Access</h2>
 | |
| 
 | |
| cref,cset,byteref,byteset,ccopy
 | |
| 
 | |
| <h2>8.6. Memory management concerns</h2>
 | |
| 
 | |
| autorelease
 | |
| 
 | |
| 
 | |
| <h2>8.7. Guest functions</h2>
 | |
| 
 | |
| Functions written in C but designed to operate on Lisp values are
 | |
| known here as "guest functions". Although they are foreign, they live in
 | |
| Lisp's house and so live by its rules. Guest functions are what you
 | |
| use to write interpreter extensions, for example to implement a function
 | |
| like <tt>assoc</tt> in C for performance.
 | |
| 
 | |
| <p>
 | |
| Guest functions must have a particular signature:
 | |
| <pre>
 | |
| value_t func(value_t *args, uint32_t nargs);
 | |
| </pre>
 | |
| Guest functions must also be aware of the femtoLisp API and garbage
 | |
| collector.
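<p>
To make the signature concrete, here is a self-contained sketch of what a
guest function that sums its integer arguments might look like. Everything
except the shape of the signature is an assumption: the <tt>value_t</tt>
typedef, the two-bit tag layout, and the helpers <tt>make_fixnum</tt>,
<tt>is_fixnum</tt>, and <tt>fixnum_value</tt> are stand-ins invented for
this example and should not be mistaken for the real femtoLisp API.

```c
#include <stdint.h>

/* Stand-in definitions, NOT the femtoLisp API. A 30-bit integer is
   assumed to live in the upper bits of a 32-bit word, with the two
   low bits used as a type tag (00 = integer in this sketch). */
typedef uint32_t value_t;
#define TAG_MASK 0x3u
#define TAG_NUM  0x0u

static value_t make_fixnum(int32_t n)
{
    return ((uint32_t)n << 2) | TAG_NUM;
}

static int is_fixnum(value_t v)
{
    return (v & TAG_MASK) == TAG_NUM;
}

static int32_t fixnum_value(value_t v)
{
    /* Tagged integers are multiples of 4, so the division is exact.
       The unsigned-to-signed cast assumes two's complement wrapping. */
    return (int32_t)v / 4;
}

/* A guest function with the required signature. It allocates no
   Lisp objects, so it does not interact with the collector. */
value_t guest_sum(value_t *args, uint32_t nargs)
{
    int32_t total = 0;
    uint32_t i;

    for (i = 0; i < nargs; i++) {
        if (!is_fixnum(args[i]))
            return make_fixnum(0);  /* real code would raise an error */
        total += fixnum_value(args[i]);
    }
    return make_fixnum(total);
}
```

A real guest function would use the interpreter's own accessors and error
reporting instead of these stand-ins, and would need to protect any values
it holds across an allocation from the garbage collector.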
 | |
| 
 | |
| 
 | |
| <h2>8.8. Native functions</h2>
 | |
| 
 | |
| </body>
 | |
| </html>
 |