<html>
<title>Notes on the Design of the 'dream' Scheme Interpreter</title>
<h1>Notes on the Design of the 'dream' Scheme Interpreter</h1>
Download here for Linux on x86 only: <a href="http://www.stripedgazelle.org/joey/src/dream.tar.gz">dream.tar.gz</a>
<p>
The design for the 'dream' Scheme interpreter began with the design given in Abelson and Sussman's <u>Structure and Interpretation of Computer Programs</u>.
</p>
<h2>Scheme Object Storage and Garbage Collection</h2>
<p>
Two areas of memory of equal size are used for the storage of scheme objects.
Both are aligned on an 8-byte boundary.
Only one of the two is used at a time by the scheme interpreter; when it becomes full, the garbage collector copies all scheme objects in use to the other memory area (which then becomes the active one).
Dynamically allocated scheme objects other than symbols are represented by a whole number of quad-words (8 bytes each), which are allocated consecutively within the active memory area.
The simplest of these objects is the scheme pair, which consists of two double-words (4 bytes each), each of which addresses a scheme object.
</p>
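<p>
The two-area scheme above can be sketched in C. This is a minimal illustration, not the interpreter's actual x86 code: it models addresses as byte offsets into one array, handles pairs only, and borrows the "broken heart" forwarding marker from <u>Structure and Interpretation of Computer Programs</u>. The names <code>SPACE_PAIRS</code>, <code>NIL</code>, <code>BROKEN_HEART</code>, <code>root</code>, and <code>tmp</code> are assumptions made for the example.
</p>

```c
#include <assert.h>
#include <stdint.h>

enum { SPACE_PAIRS = 4 };               /* hypothetical size, tiny on purpose */

typedef uint32_t Ref;                   /* a double-word "address": byte offset into heap[] */
typedef struct { Ref car, cdr; } Pair;  /* one quad-word: two double-words */
_Static_assert(sizeof(Pair) == 8, "a pair must occupy exactly one quad-word");

#define NIL          0xFFFFFFFEu        /* assumed stand-in for an object outside the heap */
#define BROKEN_HEART 0xFFFFFFFFu        /* assumed forwarding marker (odd, so never an address) */

static _Alignas(8) Pair heap[2 * SPACE_PAIRS];  /* both areas, back to back */
static uint32_t base = 0;               /* index of the first pair of the active area */
static uint32_t free_idx = 0;           /* next unused pair in the active area */
static Ref root = NIL;                  /* everything reachable from here survives */
static Ref tmp[2] = { NIL, NIL };       /* keeps cons arguments alive across a collection */

static Ref ref_of(uint32_t idx) { return idx * 8u; }
static Pair *pair_at(Ref r)     { return &heap[r / 8u]; }
static int in_heap(Ref r)       { return r < 2u * SPACE_PAIRS * 8u; }

/* Copy one object to the new area unless it has already been moved. */
static Ref forward(Ref r, uint32_t *new_free) {
    if (!in_heap(r)) return r;                 /* not collected: leave as is */
    Pair *old = pair_at(r);
    if (old->car == BROKEN_HEART) return old->cdr;
    Ref moved = ref_of(*new_free);
    heap[(*new_free)++] = *old;
    old->car = BROKEN_HEART;                   /* leave a forwarding address behind */
    old->cdr = moved;
    return moved;
}

/* Stop-and-copy: move everything reachable into the idle area, then flip. */
static void collect(void) {
    uint32_t to_base = base ? 0u : (uint32_t)SPACE_PAIRS;
    uint32_t new_free = to_base, scan = to_base;
    root   = forward(root, &new_free);
    tmp[0] = forward(tmp[0], &new_free);
    tmp[1] = forward(tmp[1], &new_free);
    while (scan < new_free) {                  /* scan chases the copied objects */
        heap[scan].car = forward(heap[scan].car, &new_free);
        heap[scan].cdr = forward(heap[scan].cdr, &new_free);
        scan++;
    }
    base = to_base;
    free_idx = new_free;
}

/* Allocate one quad-word by bumping free_idx; collect when the area is full. */
static Ref cons(Ref car, Ref cdr) {
    if (free_idx == base + SPACE_PAIRS) {
        tmp[0] = car; tmp[1] = cdr;
        collect();
        car = tmp[0]; cdr = tmp[1];
        tmp[0] = tmp[1] = NIL;
        assert(free_idx < base + SPACE_PAIRS); /* otherwise memory is exhausted */
    }
    heap[free_idx].car = car;
    heap[free_idx].cdr = cdr;
    return ref_of(free_idx++);
}
```

<p>
Because every pair is exactly 8 bytes and the array is 8-byte aligned, each offset produced by <code>ref_of</code> is a multiple of 8, which matches the alignment described above.
</p>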
<p>
Symbols and statically allocated objects are not garbage-collected, but they must begin on a 2-byte boundary so that the addresses stored in pairs are always divisible by 2.
Because all scheme objects begin on a 2-byte boundary, every address stored in a pair is even, so scheme objects other than scheme pairs are differentiated from scheme pairs by storing a double-word which is NOT divisible by 2 in the first half of the quad-word.
This double-word represents the type of scheme object.
The low byte of this type represents the major type classification used by the procedures boolean?, pair?, procedure?, char?, number?, symbol?, string?, vector?, input-port?, and output-port?.
Statically allocated objects are given a type which is negative so that the garbage collector can easily ignore them.
All 256 ASCII chars, #t and #f, and the end-of-file object are statically allocated.
Only one double-word is necessary for these statically allocated objects.
In the case of chars and booleans, the value is stored in the high byte of the low word.
</p>
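<p>
The type double-word described above can be decoded with a few bit tests. The following C sketch is illustrative only: the numeric codes <code>TYPE_BOOLEAN</code> and <code>TYPE_CHAR</code> are hypothetical (the text only requires that a type be odd), and "negative" is modelled as the sign bit of a 32-bit value.
</p>

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical major-type codes; the only constraint taken from the text
   is that a type double-word is odd. */
enum { TYPE_BOOLEAN = 0x01, TYPE_CHAR = 0x03 };

#define STATIC_BIT 0x80000000u   /* sign bit: a "negative" type */

/* An even first double-word is an address, so the quad-word is a pair. */
static int is_pair(uint32_t first_dword) { return (first_dword & 1u) == 0; }

/* Negative types mark objects the garbage collector ignores. */
static int is_static(uint32_t type) { return (type & STATIC_BIT) != 0; }

/* Low byte: the major type classification. */
static uint8_t major_type(uint32_t type) { return (uint8_t)(type & 0xFFu); }

/* High byte of the low word: the value of a char or boolean. */
static uint8_t immediate_value(uint32_t type) {
    return (uint8_t)((type >> 8) & 0xFFu);
}

/* Build the single double-word that represents a statically allocated char. */
static uint32_t make_static_char(uint8_t c) {
    uint32_t type = STATIC_BIT | ((uint32_t)c << 8) | TYPE_CHAR;
    assert(!is_pair(type));
    return type;
}
```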
<p>
Symbols are also given a negative type, since they are not garbage collected, but they are dynamically created in a separate memory area devoted to them.
Each symbol begins with the double-word type header (on a 2-byte boundary), which is followed by the bytes of ASCII code that form the name of the symbol.
A null (0) byte marks the end of the symbol name.
The address of every symbol is stored in an array of double-words.
</p>
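<p>
A symbol area of this shape might look as follows in C. This is a sketch under stated assumptions: the sizes, the <code>SYMBOL_TYPE</code> value, and the linear search in <code>intern</code> are all hypothetical, since the text does not say how the array of symbol addresses is searched. Addresses are modelled as byte offsets into the symbol area.
</p>

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

enum { SYMBOL_AREA_BYTES = 4096, MAX_SYMBOLS = 256 };  /* hypothetical sizes */
#define SYMBOL_TYPE 0x80000005u   /* hypothetical header: negative and odd */

static _Alignas(4) uint8_t symbol_area[SYMBOL_AREA_BYTES];
static uint32_t symbol_free = 0;              /* next free byte in the area */
static uint32_t symbol_table[MAX_SYMBOLS];    /* "address" of every symbol */
static uint32_t symbol_count = 0;

/* The name starts right after the 4-byte type header. */
static const char *symbol_name(uint32_t sym) {
    return (const char *)&symbol_area[sym + 4];
}

/* Return the existing symbol with this name, or create a new one. */
static uint32_t intern(const char *name) {
    for (uint32_t i = 0; i < symbol_count; i++)
        if (strcmp(symbol_name(symbol_table[i]), name) == 0)
            return symbol_table[i];

    size_t len = strlen(name) + 1;            /* include the null byte */
    assert(symbol_count < MAX_SYMBOLS);
    assert(symbol_free + 4 + len <= SYMBOL_AREA_BYTES);

    uint32_t sym = symbol_free;
    uint32_t type = SYMBOL_TYPE;
    memcpy(&symbol_area[sym], &type, sizeof type);   /* type header */
    memcpy(&symbol_area[sym + 4], name, len);        /* name + null byte */

    /* Round up so the next symbol also begins on a 2-byte boundary. */
    symbol_free = (uint32_t)((sym + 4 + len + 1) & ~(size_t)1);
    symbol_table[symbol_count++] = sym;
    return sym;
}
```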
<p>
Strings, unlike symbols, are stored along with the other dynamically allocated objects, and use the second double-word to store the address of their string of ASCII byte codes (ending with a null byte).
Another pair of memory areas of equal size is used to store these strings of ascii byte codes.
When the active string storage area becomes full, the garbage collector copies in-use string data to the other string storage area (which then becomes the active one).
Otherwise the garbage collector leaves these string storage areas untouched.
</p>
<p>
Vectors are stored as consecutive pairs, except that the first half of the first pair holds the vector type header.
All other objects which require more than a quad-word of storage simply store the address of a scheme pair in the second double-word and then use scheme pair and list structure to store everything they need.
These types must set the low bit in the high byte of the low word of their type in order to indicate to the garbage collector that this address in the second double-word must be followed just as if it were the cdr of a pair.
</p>
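<p>
From the collector's point of view, the rules so far reduce to a small decision on the first double-word of each object it reaches. The C sketch below is one way to express that decision; the names <code>Trace</code>, <code>trace_plan</code>, <code>STATIC_BIT</code>, and <code>FOLLOW_BIT</code> are invented for the example, and the special handling that vectors need is left out.
</p>

```c
#include <assert.h>
#include <stdint.h>

#define STATIC_BIT 0x80000000u   /* negative type: not garbage-collected */
#define FOLLOW_BIT 0x00000100u   /* low bit of the high byte of the low word */

typedef enum { TRACE_NONE, TRACE_SECOND, TRACE_BOTH } Trace;

/* Decide which double-words of an object the collector must follow,
   given the object's first double-word. Vectors are not handled here. */
static Trace trace_plan(uint32_t first_dword) {
    if ((first_dword & 1u) == 0) return TRACE_BOTH;    /* a pair: car and cdr */
    if (first_dword & STATIC_BIT) return TRACE_NONE;   /* static object or symbol */
    if (first_dword & FOLLOW_BIT) return TRACE_SECOND; /* treat like the cdr of a pair */
    return TRACE_NONE;                                 /* second double-word is not an object */
}
```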
<p>
Special forms and built-in procedures store an address to JMP to in the second double-word.
</p>
<p>
The scheme object stack is maintained as a scheme list (dynamically allocated as pairs).
The garbage collector, when it runs, begins at the root of this scheme list.
Hence, when garbage collection commences, only the registers need be pushed onto this scheme object stack and popped off afterwards to ensure that all reachable objects are retained throughout the garbage collection process.
The x86 native stack is used only for the flow of continuation control.
Consequently, when call-with-current-continuation is invoked, the native stack is copied to a scheme list with each address represented as a special form.
</p>
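<p>
A stack kept as a list of pairs needs only a push that conses onto the front and a pop that takes the car and moves to the cdr. The C sketch below shows this, along with saving and restoring a set of registers around a collection. It is an illustration under assumptions: the pool size, the <code>NIL</code> value, and the choice of five registers are hypothetical, and no collector is included, so popped pairs are simply left behind as garbage.
</p>

```c
#include <assert.h>
#include <stdint.h>

enum { POOL_PAIRS = 64 };               /* hypothetical pool size */
enum { REG_EXP, REG_ENV, REG_UNEV, REG_ARGL, REG_VAL, REG_COUNT };

typedef uint32_t Ref;                   /* byte offset of a pair in pool[] */
typedef struct { Ref car, cdr; } Pair;
#define NIL 0xFFFFFFFEu                 /* assumed marker for the empty list */

static Pair pool[POOL_PAIRS];
static uint32_t free_idx = 0;           /* next unused pair */
static Ref stack = NIL;                 /* root of the scheme object stack */
static Ref reg[REG_COUNT];              /* stand-ins for the scheme registers */

/* Push: allocate a pair whose car is the value and whose cdr is the old stack. */
static void push(Ref value) {
    assert(free_idx < POOL_PAIRS);
    pool[free_idx].car = value;
    pool[free_idx].cdr = stack;
    stack = free_idx++ * 8u;
}

/* Pop: return the car of the top pair and make its cdr the new stack. */
static Ref pop(void) {
    assert(stack != NIL);
    Pair *top = &pool[stack / 8u];
    stack = top->cdr;
    return top->car;
}

/* Before a collection: make every register reachable from the stack root. */
static void save_registers(void) {
    for (int i = 0; i < REG_COUNT; i++) push(reg[i]);
}

/* After a collection: pop in reverse order to restore each register. */
static void restore_registers(void) {
    for (int i = REG_COUNT - 1; i >= 0; i--) reg[i] = pop();
}
```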
<h2>Scheme Registers</h2>
<p>
The registers denoted by exp, env, unev, argl, val, and free in <u>Structure and Interpretation of Computer Programs</u> are implemented by the x86 registers edx, ebp, esi, edi, eax, and ebx respectively.
The x86 register ecx is left free to be used as temporary storage and need not point to a valid scheme object.
All other registers except esp must point to a valid scheme object (or null) when the garbage collector is invoked.
The stop-and-copy garbage collector registers old, new, and scan in <u>Structure and Interpretation of Computer Programs</u> are implemented by the x86 registers esi, edi, and eax respectively.
</p>
<hr>
<a href="http://www.stripedgazelle.org/joey/index.html">Home</a>
</html>