vx-scheme-website/www/index.html

589 lines
23 KiB
HTML

<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">
<html>
<head>
<meta name="generator" content=
"HTML Tidy for FreeBSD (vers 1st March 2002), see www.w3.org">
<title>Vx-Scheme</title>
<meta name="Author"
content="Colin Smith">
<meta name="description"
content="An implementation of Scheme for VxWorks.">
<meta name="keywords"
content="Scheme, Colin Smith, Google, Lisp, VxWorks, Real-Time">
<link rel="stylesheet" type="text/css" href="css.css">
</head>
<body>
<table class="doctable" border="0" cellspacing="0" summary=
"this table is just for layout.">
<tr class="toprow">
<td class="leftcol"></td>
<td class="maintitle">
<h2>vx-scheme</h2>
<h4>A Scheme interpreter for VxWorks.</h4>
<p align="right">[ <a href="download.html">Download</a> ]</p>
</td>
</tr>
<tr>
<td class="leftedge"></td>
<td class="body">
<h3>Introduction</h3>
<p>Vx-scheme is a compact, fairly efficient implementation of the
Scheme programming language. It has some special features designed
to allow it to integrate with the VxWorks real-time operating
system shell. It is very nearly compliant to the <a href=
"http://sicp.ai.mit.edu/Fall-2002/manuals/r4rs/r4rs_toc.html">R4RS</a>
language standard: in particular, it supports continuations with
infinite lifetimes and all the procedures mentioned in the
specification, including the optional ones (such as
<code>force</code> and <code>delay</code>). It can be used under
the terms of the <a href="LICENSE">Artistic License</a>.</p>
<p>[<a href="download.html">download</a>]</p>
<h3>Uses</h3> VX-scheme was briefly used in a humanoid robotics
project. See <a
href="http://mirriwinni.it.jcu.edu.au/~cgaskett/thewiki/RoboticsLearningAndVision">Chris
Gaskett's page</a> at James Cook University in Cairns, Australia.
<h3>History</h3>
<p>Scheme is sort of a hobby of mine, as I'm a fan of Abelson, Sussman
and Sussman's seminal text <a href=
"http://www-mitpress.mit.edu/sicp/">Structure and Interpretation of
Computer Programs</a> <b>[SICP]</b>. This implementation was born as I
tried to follow the Scheme implementation experiments in chapters <a
href=
"http://mitpress.mit.edu/sicp/full-text/book/book-Z-H-25.html#%_chap_4">
4</a> and <a href=
"http://mitpress.mit.edu/sicp/full-text/book/book-Z-H-30.html#%_chap_5">
5</a> of that book using C as the implementation language. Once that
was working well enough, I then undertook to implement the complete <a
href=
"http://sicp.ai.mit.edu/Fall-2002/manuals/r4rs/r4rs_toc.html">R4</a>
standard of the language, thinking it looked pretty easy! The R4
standard is a marvel of concision, but still conceals within it about
200 procedures and special forms. I also considered how the Scheme
language might be integrated with
<a href="http://www.bluedonkey.org/cgi-bin/twiki/bin/view/Books/VxWorksCookBook">VxWorks</a>, an RTOS made by my former employer,
<a href="http://www.windriver.com">Wind River</a> (today I work at
<a href="http://www.google.com/"><font color="#0000ff"><b>G</font><font color="#ff0000">o</font><font color="#cccc00">o</font><font color="#0000ff">g</font><font color="#009900">l</font><font color="#ff0000">e</font></a></b>).
That RTOS has a simple control shell based on C syntax, that's integrated
with the runtime symbol table. Could Scheme do that as well? I wanted
to find out.</p>
<h3>Design Goals</h3>
<p>First of all I wanted to learn the concepts in Scheme language
implementation first-hand, so that I could grow as a programmer. To
that end, I restrained myself from peeking at other implementations
of Scheme on the net "to see how it should be done." That
temptation was very hard to resist with <a href=
"http://www.swiss.ai.mit.edu/~jaffer/SCM.html">Aubrey Jaffer's
SCM</a>, which is considerably faster than my implementation and
almost as small. I did use SCM as a "reference implementation" to
find out the "right behavior" in a number of tricky circumstances,
and I made extensive use of his R4 test suite during the
project.</p>
<p>(My first version model had the special forms implemented as C
functions. But this made complete support of
<code>call-with-current-continuation</code> very difficult, as the
continuation, at any moment, was "shared" between the C and Scheme
stacks. This caused me to move to a strict register-machine model
as described in SICP ch. 5. This was a lot of work just to get one
procedure working right!)</p>
<p>VxWorks programmers prize compact implementations, and this was
perhaps the one constraint I adhered to most strongly. The entire
executable code is less than 64Kb of code/data on VxWorks and
FreeBSD. The system allocates an initial budget of only 10000 cons
cells, and garbage collects when that budget is exhausted
(interestingly, most of my benchmarks can actually survive on that
small ration!) Whenever the code bloated over 64Kb, I would stop
work and wring out the fat. (While the interpreter is written in
C++ for notational convenience's sake, my tight size budget forbade
the use of STL classes, or indeed any sort of template. The program
uses a rather strict subset of C++'s features). Not having STL at
hand meant I had to implement an efficient symbol store for the
program (I like Knuth's implementation of AVL for jobs like this. A
hash table might have been better, but AVL never runs out of gas.
Knuth's implementation, which eschews recursion, is just about as
fast as a balanced tree can get, IMO.)</p>
<blockquote><p><b>Note:</b>While version 0.4 was 64144 bytes (text +
data + bss) when compiled for VxSim on Tornado 2.2, later versions
(and versions for other operatings systems) are not quite that svelte.
The current size on Linux is 85.6K.</p></blockquote>
<p>I also didn't want any time-consuming implementation shortcuts.
Each and every procedure mentioned in the <a href=
"http://sicp.ai.mit.edu/Fall-2002/manuals/r4rs/r4rs_toc.html">R4</a>
spec has its own natural, efficient C implementation. (For example,
we don't ever do syntactic transformation, e.g., transforming
<code>let*</code> into nested <code>let</code>s or
<code>cond</code>s into nested <code>if</code>s, etc.)</p>
<p>That said, it's still an interpreter, and doesn't attempt any
compilation (or even the kinds of syntactic transformation that
could save time rather than lose it). The "register machine" at the
heart of the evaluator is a fairly faithful implementation of the
design given in SICP.</p>
<h3>VxWorks Shell Integration</h3>
<p>In the VxWorks "C" shell, you can work with integer variables
and call functions with a mix of integer and string parameters.
VxScheme can do these things too. In the event that a symbol has no
value in any of the current execution context's enclosing
environments, vx-scheme will "fall through" to the VxWorks symbol
table. If it finds a data symbol, it returns the current value of
the variable as the value of the expression. If it finds a text
symbol, vx-scheme constructs a procedure that will marshal its
arguments into integer form and invoke the underlying function.
It's meant to seem simple on the user side:</p>
<blockquote>
<pre>
<code>=&gt; i
<font class="output">#&lt;lambda args ...&gt;</font>
</code>
</pre>
</blockquote>
<p>...of course, in Scheme syntax, the expression " i "
<i>would</i> have the value of a procedure object, right? To
actally call the procedure, we surround it with parens as usual
(yes, this takes getting used to).</p>
<blockquote>
<pre>
<code>=&gt; (i)
<font class=
"output">NAME ENTRY TID PRI STATUS PC SP ERRNO DELAY
---------- ------------ -------- --- ---------- -------- -------- ------- -----
tExcTask _excTask 1d38d10 0 PEND 43a2d0 1d38c10 0 0
tLogTask _logTask 1d331e0 0 PEND 43a2d0 1d330e0 0 0
tShell _shell 1d28c68 1 READY 470d6c 1d27e74 1c0001 0
tWdbTask _wdbTask 1d2e028 3 PEND 43a2d0 1d2decc 0 0</font>
</code>
</pre>
</blockquote>
<p>VxWorks variables shine right through:</p>
<blockquote>
<pre>
<code>=&gt; vxTicks
<font class="output">42354</font>
=&gt; (/ vxTicks (sysClkRateGet))
<font class="output">714.39999999999998</font>
</code>
</pre>
</blockquote>
<p>Vx-scheme converts string arguments by passing the address to
the underlying C function (just as the standard target shell would
do). This has the interesting side effect of letting us use
<code>printf</code> from Scheme: and a very useful addition to the
language it is! (if not in the proper spirit).</p>
<blockquote>
<pre>
<code>=&gt; (define buf (malloc 0x100))
=&gt; (sprintf buf "str=%s int=%d\n" "hello" 99)
<font class="output">17</font>
=&gt; (printf "buf:%s\n" buf)
<font class="output">buf:str=hello int=99
22</font>
</code>
</pre>
</blockquote>
<p>Of course one thing we get out of this whole exercise is looping
and conditionals for the VxWorks shell, things it doesn't have under
its current C-expression syntax. Scheme also does a pretty good job of
working with files. A good example of these things is the code for the
<a href="../testcases/vx-test.scm">test suite driver</a>. Here's
another where we spawn a handful of tasks with different delays.</p>
<blockquote>
<pre>
<code>=&gt; (for-each (lambda (d) (sp 'taskDelay d)) '(1000 2000 3000))
<font class="output">task spawned: id = 0x1cefaf0, name = t1
task spawned: id = 0x1ce7978, name = t2
task spawned: id = 0x1cdf800, name = t3</font>
</code>
</pre>
</blockquote>
<p>That quote-mark in front of taskDelay is to prevent the
evaluator from substituting a lambda value in place of the
variable. We don't want that this time; instead we want the address
of taskDelay to be passed to the 'sp' function! The quote allows
these two cases to be treated correctly (all in the same
S-expression).</p>
<p>This little bit of code creates three tasks and pends them all
on the same semaphore.</p>
<blockquote>
<pre>
<code>=&gt; (let ((s (semBCreate 0 0)))
(do ((i 0 (+ i 1))) ((= i 3) 'ok) (sp 'semTake s -1)))
<font class="output">task spawned: id = 0x1cefaf8, name = t4
task spawned: id = 0x1ce7980, name = t5
task spawned: id = 0x1cdf808, name = t6
ok</font>
</code>
</pre>
</blockquote>
<p>This would be more interesting if we had a means of spawning a
task to compute a Scheme subexpression, which could then deliver
its value to a waiting continuation in the spawning environment.
But that's a story for another day...</p>
<h3>"Supported" Platforms</h3>
<p>These are the platforms that I habitually run the code on.
There's a shell script, "runtest", in the root of the distribution
that will run a modest test suite on FreeBSD and Cygwin. This
includes Jaffer's R4 compliance test as well as some things I
picked up off the net and from SICP.</p>
<ul>
<li><b><a href=
"http://www.windriver.com/products/tornado2/tornado22_info.html">Tornado
2.2 (VxWorks 5.5)</a></b><br>
<p class="item">A Tornado project file is provided that will build
vx-scheme for the Tornado 2.2 simulator. It should be easy to adapt
for the architecture of your choice.</p>
</li>
<li><b><a href="http://fedora.redhat.com/">Fedora</a></b><br>
<p class="item">I've recently installed Fedora Core 2 and I'm
pretty happy with it. VxScheme works just fine on it.</p></li>
<li><b><a href="http://www.cygwin.com/">Cygwin</a></b><br>
<p class="item">Similarly, just saying "make" in the
<code>unix</code> directory on Cygwin should produce a working
executable.</p>
</li>
<li><b><a href="http://msdn.microsoft.com/">Win32</a></b><br>
<p class="item">Visual Studio C++ .NET project files are provided. </p>
</li>
</ul>
<h3>Memory and Garbage Collection</h3>
<p>This implementation uses a fairly standard cons cell structure.
The fundamental cell is a struct of two machine pointers, car and
cdr. They are required to contain 8-byte aligned addresses, leaving
the lower three bits free for tagging. The LSB is the "atom" tag,
and the other two are for GC marking/freelist management.</p>
<p>Integers that will fit in 24 bits are stored directly "in the
pointer" with no heap allocation. No special syntax is required
to access this feature and spillover to 32-bit arithmetic is
automatic (though such numbers come from cell storage).</p>
<p>Garbage collection is straight mark &amp; sweep. We start with a
budget of 10000 conses. When an allocation fails, GC is attempted
at that moment. If we succeed in leaving at least 20% of the slab
free, we're happy. Otherwise, we note that our GC attempt wasn't
very successful, and arrange to allocate another slab at the next
allocation failure.</p>
<h3>Deficiencies, and Deviations from the R4 standard</h3>
<p>The R4 standard insists that hardware integer overflow must be
handled "correctly": either by spilling over into a
multiple-precision format, or raising an error. Vx-scheme does the
one thing that is forbidden: silently returning the wrong answer!
Integer arithmetic in vx-scheme is essentially a "front-end" for
the integer arithmetic provided by the C compiler and the hardware,
and no apologies are made for it.</p>
<p>Similarly, Scheme insists that when an object is written out, this
should be done so that when it is read back in, the exact same object
is produced. This is tricky for real numbers. In vx-scheme, again, the
floating point I/O is a "front end" for that provided by
"<code>strtod</code>" and "<code>printf %.15g</code>" respectively.
These don't always produce perfectly inverse behavior. Jaffer's SCM
takes the trouble to supply a custom FP I/O package that does have
this desirable behavior, but I decided not to bother. (Perhaps not
coincidentally, Guy L. Steele Jr., one of the inventors of Scheme,
published a <a
href="http://portal.acm.org/citation.cfm?doid=93542.93559">paper</a>
on stable floating point I/O.)</p>
<p>While the R4 standard doesn't require any better, perhaps now is
the time to point out that we only support 32-bit integers and
64-bit reals as numeric data types, declining to implement support
for rationals, arbitrary-precision integers, and complex numbers.
If Scheme itself seems an odd thing to run atop an RTOS, these
features would make it even more so!</p>
<p>Error handling in vx-scheme is weak. There's no backtrace
facility or debugger. When error conditions arise (typically
finding the wrong type of atom in an argument list) we print a
message indicating the type of atom expected and longjmp back to
the top of the read-eval-print loop.</p>
<p>Vx-Scheme is lazy about checking argument lists. If a call to a
standard procedure supplies extra arguments, these are typically
ignored. Other examples of slothfulness (like letting cond's have
multiple else's, etc.) abound. Vx-Scheme is of course interested in
giving the correct result for correctly-formed expressions, but is
amazingly indulgent of ill-formed ones.</p>
<p>Vx-Scheme is not quite reentrant. While there are very few
global variables (and most of those have invariant values that
could be shared among multiple threads), some aspects of vx-scheme
(such as memory management) have not yet been made MT-safe. Part of
the reason for this: how to implement a multi-threaded scheme on an
RTOS is actually an interesting question! One way to do it would be
to have separate Scheme name/evaluation spaces running in different
VxWorks tasks. But another way to do it would share the namespace
and allow the "spawning" of threads to evaluate Scheme
subexpressions and return the resulting values to the waiting
tasks. I haven't decided which (if any) of these models to
attempt.</p>
<h3>Extra Goodies</h3>
<p>If I were willing to learn about and implement R5's
syntax-extension facility, this implementation would be within
striking distance of R5 compliance. R4 requires no support for
defining new special forms whatsoever. But I decided to add support
for defmacro, a very handy tool. It's strong enough to implement
the stream processing features detailed in SICP's <a href=
"http://mitpress.mit.edu/sicp/full-text/book/book-Z-H-24.html#%_sec_3.5">
&sect;3.5</a>. Here's a more pedestrian example, the
<code>while</code> special form:</p>
<blockquote>
<pre>
<code>(defmacro (while test . body)
`(let loop ()
(if ,test
(begin ,@body (loop))
'ok)))
(define i 0)
(while (&lt; i 5)
(display i)
(set! i (+ i 1)))
<font class="output">--&gt; 01234ok</font>
</code>
</pre>
</blockquote>
<p>(I include this example because I think Scheme's <code>do</code>
iteration construct is actually a bit of sly humor directed at
procedural programming partisans: a kitchen-sink iteration model
that is very difficult to read!)</p>
<p>Benchmarking is facilitated by the <code>time</code> special
form, which evaluates its argument(s) (as if it were
<code>begin</code>), but returns a pair whose car is the elapsed
time in microseconds and whose cdr is the value of the last
expression in the sequence (just as <code>begin</code> would have
returned).</p>
<blockquote>
<pre>
<code>(define (count n)
(let loop ((i 1))
(if (= i n) #t (loop (+ i 1)))))
(time (count 100000))
<font class="output">--&gt; (628424 . #t)</font>
</code>
</pre>
</blockquote>
<p>The <code>gc</code> procedure can be used to force a garbage
collection.</p>
<h3>SLIB integration</h3>
<p>Chris Gaskett generously contributed an init file for Aubrey
Jaffer's SLIB. This is integrated into the test suite (on Linux
and Cygwin).</p>
<h3>Magic Cells</h3>
<p>This scheme interpreter has a facility wherein the evaluator
will call out to an OS layer just before it is about report that a
symbol has no binding. This gives the OS a chance to supply a
binding using its own resources (in the case of VxWorks, the C
symbol table).</p>
<p>How this works is the OS returns a <i>magic cell</i>, which is a
kind of atom that has two function pointers in it, <code>set</code>
and <code>get</code>. The Scheme system stores the magic cell in
the binding table. When VxWorks finds that an unbound Scheme symbol
matches a C variable, it returns one of these cells, which allows
the current value of the variable to be probed whenever Scheme code
calls for the variable to be evaulated.</p>
<p>VxWorks functions are a little more complicated: they actually
return a lambda expression which delegates to a special function
called "vx-invoke" whose job is to dispatch a call to the VxWorks
function after converting the arguments.</p>
<p>But I think magic cells are a neat hack: kind of like java bean
or COM &ldquo;properties.&rdquo; I'm sure it's been done before, of
course, but it's interesting to speculate on other uses for this
trick.</p>
<h3>Debugging Hacks</h3>
<p>There's an interface to supply a &ldquo;flag word&rdquo; to the
Scheme process. On UNIX/Cygwin, the environment variable
&lsquo;T&rsquo; can be set to an integer which sets any of the bits
in the following table. On VxWorks, the global variable
vxSchemeDebug can be set to the desired value using the C shell.
The bit values are:</p>
<ul>
<li><code>0x1:TRACE_EVAL.</code><br>
<p class="item">This shows every step of the register machine in
evaluating its input. This was very useful debugging my
implementation of SICP's register machine. For trenchant problems,
this can create a <i>lot</i> of output.</p>
</li>
<li><code>0x2: TRACE_GC.</code><br>
<p class="item">This one prints out a message when garbage is
collected. The first line shows the state of memory at the start in
the form m/n (where m is the number of cells in use and n is the
number of cells allocated). After GC is finished, these statistics
are printed again.</p>
</li>
<li><code>0x4: DEBUG_NO_INLINE_GC.</code><br>
<p class="item">Garbage collection is a tricky thing. Everything
depends on every useful cell being "reachable" from a "root set" of
machine registers. This implementation of Scheme takes pains to
organize all the computation around a set of abstract machine
registers (unlike other implementations of Scheme which allow the C
stack to contain references to Scheme objects). I know of no GC
bugs at the moment, but setting this flag is helpful to help
implicate/exculpate GC in a debug session: when set, the flag
prevents GC from occurring at memory exhaustion time (it is still
allowed when the top-level expression is complete).</p>
</li>
<li><code>0x8: DEBUG_MEMSTATS_AT_EXIT.</code><br>
<p class="item">The Scheme standard insists on a strict
implementation of tail recursion. Setting this flag causes a
one-line printout of the number of cells in-use/allocated at the
end of execution. This can be used to verify that tail-recursion
and garbage collection are working correctly.</p>
</li>
<li><code>0x10: DEBUG_PRINT_PROCEDURES</code><br>
<p class="item">Without this flag, when printing a value of
procedure type, the printer suppresses the body of the procedure,
but does print the argument list. Usually this is enough to remind
me what procedure is involved. The output looks like this:
<code>#&lt;lambda (a b) ...&gt;</code>. With this flag set, the
body of the procedure is printed as well, instead of an
ellipsis.</p></li>
<li><code>0x20: TRACE_GC_ALL</code><br>
<p class="item">Trace all marking and sweeping activity. This is
only useful if you are tracking down a garbage collection bug. Having
done this a few times, I can only offer my wish that this fate never
befalls you. As of this writing (10 May 2003), I am not aware of
any garbage collection bugs in vx-scheme.</p></li>
</ul>
Example use:
<ul>
<li>
<p class="item">UNIX: <code>T=1 ./vx-scheme &lt;
../testcases/pi.scm</code></p>
</li>
<li>
<p class="item">VxWorks: <code>vxSchemeDebug = 1;</code></p>
</li>
</ul>
<h3>VxWorks Links</h3>
<p>Check out the <a
href="https://web.archive.org/web/20070220163544/http://www.bluedonkey.org/cgi-bin/twiki/bin/view/Books/VxWorksCookBook">VxWorks
Cookbook</a> at John Gordon's site, bluedonkey.org. It's an
excellent collection of VxWorks (and related) embedded technique
assembled by one of the masters.</p>
<h3>Scheme Links</h3>
<ul>
<li><a href="//mit.scheme.org/">The MIT Scheme Homepage</a><br>
<p class="item"><i>The MIT Scheme interpretation compiles directly
to x86 machine code, and is consequently very fast, and has an
integrated debugger and graphics support. Very professional. Runs
on Linux, FreeBSD and Win32.</i></p>
</li>
<li><a href="http://www-mitpress.mit.edu/sicp/">Online text for
SICP</a><br>
<p class="item"><i>Worth the effort to study in detail. The sly
postponement of the discussion of the humble assignment statement
to page 220 of the book is a pedagogic</i>
tour-de-force<i>unequaled in my experience.</i></p>
</li>
<li><a href="http://www.schemers.org">Schemers.org</a><br>
<p class="item"><i>A clearinghouse of net Scheme resources.
Recommended.</i></p>
</li>
</ul>
<h3>Acknowledgments</h3>
<p>Permit me to thank the following people who have helped with
this research:</p>
<ul>
<li>
<p class="item">Benjamin S. Skrainka</p>
</li>
<li>
<p class="item">George V. Neville-Neil</p>
</li>
<li>
<p class="item">Brendan Smith</p>
</li>
<li>
<p class="item">...and my lovely wife Julia, who often wondered what
the point of having a computer day job was if I was just going to
spend the rest of my time hacking in the garage.</p> </li> </ul>
<p class="quiet">Copyright &copy; 2002-2003
Colin Smith.</p>
</td>
</tr>
</table>
</body>
</html>