Files in doc and doc/html added :).

2001-05-20 19:01:37 +00:00 · 2001-05-20 19:01:37 +00:00 · 13b5e5e2d7
parent 812784c6bf
commit 13b5e5e2d7
7 changed files with 6091 additions and 0 deletions
--- a/doc/html/index.html
+++ b/doc/html/index.html
@@ -0,0 +1,85 @@
+<HTML>
+<HEAD>
+<TITLE>The Scheme Underground Network Package</TITLE>
+</HEAD>
+
+<BODY>
+<H1>The Scheme Underground Network Package</H1>
+I have written a set of libraries for doing Net hacking from Scheme/scsh.
+It includes:
+<DL>
+<DT> An smtp client library.
+<DD> Forge mail from the comfort of your own Scheme process.
+
+<DT> rfc822 header library
+<DD> Read email-style headers. Useful in several contexts (smtp, http, etc.)
+
+<DT> Simple structured HTML output library
+<DD> Balanced delimiters, etc.
+
+<DT> The SU Web server
+<DD> This is a complete implementation of an HTTP 1.0 server in Scheme.
+     The server contains other standalone packages that may separately be of 
+     use:
+     <UL>
+     <LI> URI and URL parsers and unparsers.
+     <LI> A library to help writing CGI scripts in Scheme.
+     <LI> Server extensions for interfacing to CGI scripts.
+     <LI> Server extensions for uploading Scheme code.
+     </UL>
+    The server has three main design goals:
+    <DL>
+    <DT> Extensibility
+    <DD> The server is in fact nothing but extensions, using a mechanism
+	 called "path handlers" to define URL-specific services. It has a toolkit
+	 of services that can be used as-is, extended or built upon.
+	 User extensions have exactly the same status as the base services.
+
+	<P>
+	The extension mechanism allows for easy implementation of new services
+	without the overhead of the CGI interface. Since the server is written
+	on top of the Scheme shell, the full set of Unix system calls and
+	program tools is available to the implementor.
+
+    <DT> Mobile code
+    <DD> The server allows Scheme code to be uploaded for direct execution
+	 inside the server. The server has complete control over the code,
+	 and can safely execute it in restricted environments that do not
+	 provide access to potentially dangerous primitives (such as the
+	 "delete file" procedure.)
+
+
+    <DT> Clarity
+    <DD> I wrote this server to help myself understand the Web. It is voluminously
+	 commented, and I hope it will prove to be an aid in understanding the
+	 low-level details of the Web protocols.
+    </DL>
+
+    <P>
+    The S.U. server has the ability to upload code from Web clients and 
+    execute that code on behalf of the client in a protected environment.
+
+    <P>
+    Some <A HREF="su-httpd.html">simple documentation</A> on the server
+    is available.
+
+</DL>
+
+<H2>Obtaining the system</H2>
+The network code is available by
+<A HREF="ftp://ftp-swiss.ai.mit.edu/pub/scsh/contrib/net/net.tar.gz">ftp</A>.
+To run the server, you need release 0.4 of
+<A HREF="http://www-swiss.ai.mit.edu/scsh/scsh.html">scsh</A>,
+which has just come out.
+
+Beyond actually running the server,
+the separate parser libraries and other utilities may be of use as separate
+modules.
+
+<ADDRESS><A HREF="http://www.ai.mit.edu/people/shivers/">Olin Shivers</A>
+       / <A HREF="plan-file">shivers@ai.mit.edu</A></ADDRESS>
+
+</BODY>
+</HTML>
+
+
--- a/doc/html/su-httpd.html
+++ b/doc/html/su-httpd.html
@@ -0,0 +1,482 @@
+<!-- check for *..* emphasis, etc., i.e., e.g. -->
+<HTML>
+<HEAD>
+<TITLE>The Scheme Underground Web system</TITLE>
+</HEAD>
+
+<BODY>
+<H1>The Scheme Underground Web System</H1>
+
+<ADDRESS><A HREF="http://www.ai.mit.edu/people/shivers/">Olin Shivers</A>
+       / <A HREF="plan-file">shivers@ai.mit.edu</A>
+</ADDRESS>
+July 1995
+
+<BLOCKQUOTE>
+Note: Netscape typesets description lists in a manner that makes the
+procedure descriptions below blur together, even in the absence of the
+HTML COMPACT attribute. You may just wish to print out a simple
+<A HREF="su-httpd.txt">ASCII version</A> of this note, instead.
+</BLOCKQUOTE>
+
+
+
+<!---------------------------------------------------------------------------->
+<H2>Introduction</H2>
+
+The
+<A HREF="http://www.ai.mit.edu/projects/su/su.html">Scheme underground</A>
+Web system is a package of
+<A HREF="http://www-swiss.ai.mit.edu/scheme-home.html">Scheme</A>
+code that provides
+utilities for interacting with the
+<A HREF="http://www.w3.org/">World-Wide Web</A>.
+This includes:
+<UL>
+<LI>  A Web server.
+<LI>  URI and URL parsers and un-parsers.
+<LI>  RFC822-style header parsers.
+<LI>  Code for performing structured html output
+<LI>  Code to assist in writing CGI Scheme programs
+      that can be used by any CGI-compliant HTTP server
+      (such as NCSA's httpd, or the S.U. Web server).
+</UL>
+
+ <P>
+The code can be obtained via
+<A HREF="ftp://ftp-swiss.ai.mit.edu/pub/scsh/contrib/net/net.tar.gz">
+anonymous ftp</A>
+and is implemented in
+<A HREF="http://www-swiss.ai.mit.edu/~jar/s48.html">Scheme 48</A>,
+using the system calls and support procedures of
+<A HREF="http://www-swiss.ai.mit.edu/scsh/scsh.html">scsh</A>,
+the Scheme Shell.
+The code was written to be clear and modifiable --
+it is voluminously commented and all non-R4RS dependencies are
+described at the beginning of each source file.
+
+ <P>
+I do not have the time to write detailed documentation for these packages.
+However, they are very thoroughly commented, and I strongly recommend
+reading the source files; they were written to be read, and the source
+code comments should provide a clear description of the system.
+The remainder of this note gives an overview of the server's basic
+architecture and interfaces.
+
+<H2>The Scheme Underground Web Server</H2>
+
+The server was designed with three principal goals in mind:
+<DL>
+<DT> Extensibility
+<DD> The server is designed to make it easy to extend the basic
+     functionality.  In fact, the server is nothing but extensions.  There is
+     no distinction between the set of basic services provided by the server
+     implementation and user extensions -- they are both implemented in
+     Scheme, and have equal status. The design is "turtles all the way down."
+
+
+<DT> Mobile code
+<DD> Because the server is written in Scheme 48, it is simple to use the
+     Scheme 48 module system to upload programs to the server for safe
+     execution within a protected, server-chosen environment. The server
+     comes with a simple example upload service to demonstrate this
+     capability.
+
+
+<DT> Clarity of implementation
+<DD> Because the server is written in a high-level language, it should make
+     for a clearer exposition of the HTTP protocol and the associated URL
+     and URI notations than one written in a low-level language such as C.
+     This also should help to make the server easy to modify and adapt to
+     different uses.
+</DL>
+
+<!---------------------------------------------------------------------------->
+<H3>Basic server structure</H3>
+
+The Web server is started by calling the <CODE>httpd</CODE> procedure,
+which takes one required and two optional arguments:
+<PRE>
+    (httpd <VAR>path-handler</VAR> [<VAR>port</VAR> <VAR>working-directory</VAR>])
+</PRE>
+
+The server accepts connections from the given port, which defaults to 80.
+The server runs with the working directory set to the given value,
+which defaults to
+<PRE>
+    /usr/local/etc/httpd
+</PRE>
+
+
+ <P>
+The server's basic loop is to wait on the port for a connection from an HTTP
+client. When it receives a connection, it reads in and parses the request into
+a special request data structure. Then the server forks a child process, which
+binds the current I/O ports to the connection socket, and then hands off to
+the top-level path handler (the first argument to <CODE>httpd</CODE>).
+The path-handler procedure is responsible for actually serving the request --
+it can be any arbitrary computation.
+Its output goes directly back to the HTTP client that sent the request.
+
+ <P>
+Before calling the path handler to service the request, the HTTP server
+installs an error handler that fields any uncaught error, sends an
+error reply to the client, and aborts the request transaction. Hence
+any error caused by a path-handler will be handled in a reasonable and
+robust fashion.
+
+ <P>
+The basic server loop and the associated request data structure are the fixed
+architecture of the S.U. Web server; its flexibility lies in the notion of
+path handlers.
+
+
+<!---------------------------------------------------------------------------->
+<H3>Path handlers</H3>
+
+A path handler is a procedure taking two arguments:
+<PRE>
+    (path-handler <VAR>path</VAR> <VAR>req</VAR>)
+</PRE>
+
+
+The <VAR>req</VAR> argument is a request record giving all the details of the
+client's request; it has the following structure:
+<PRE>
+    (define-record request
+      method		; A string such as "GET", "PUT", etc.
+      uri		; The escaped URI string as read from request line.
+      url		; An http URL record (see url.scm).
+      version		; A (major . minor) integer pair.
+      headers		; An rfc822 header alist (see rfc822.scm).
+      socket)		; The socket connected to the client.
+</PRE>
+
+The <VAR>path</VAR> argument is the URL's path,
+parsed and split at slashes into a string list.
+For example, if the Web client dereferences URL
+<PRE>
+    http://clark.lcs.mit.edu:8001/h/shivers/code/web.tar.gz
+</PRE>
+then the server would pass the following path to the top-level handler:
+<PRE>
+    ("h" "shivers" "code" "web.tar.gz")
+</PRE>
+
+ <P>
+The path argument's pre-parsed representation as a string list makes it easy
+for the path handler to implement recursive dispatch on URL paths.
+
+ <P>
+Path handlers can do anything they like to respond to HTTP requests; they have
+the full range of Scheme to implement the desired functionality.  When
+handling HTTP requests that have an associated entity body (such as POST), the
+body should be read from the current input port. Path handlers should in all
+cases write their reply to the current output port. Path handlers should
+<EM>not</EM> perform I/O on the request record's socket.
+Path handlers are frequently called recursively, and doing I/O directly to the
+socket might bypass a filtering or other processing step interposed on the
+current I/O ports by some superior path handler.
+
+<!---------------------------------------------------------------------------->
+<H3>Basic path handlers</H3>
+
+Although the user can write any path-handler he likes, the S.U. server comes
+with a useful toolbox of basic path handlers that can be used and built upon:
+
+<DL>
+
+<DT>
+<CODE>(alist-path-dispatcher <VAR>ph-alist</VAR> <VAR>default-ph</VAR>) -> <VAR>path-handler</VAR>
+</CODE>
+<DD>
+    This procedure takes a string->path-handler alist, and a default
+    path handler, and returns a handler that dispatches on its path argument.
+    When the new path handler is applied to a path
+    <CODE>("foo" "bar" "baz")</CODE>,
+    it uses the first element of the path -- <CODE>"foo"</CODE> -- to
+    index into the alist.
+    If it finds an associated path handler in the alist, it
+    hands the request off to that handler, passing it the tail of the
+    path, <CODE>("bar" "baz")</CODE>.
+    On the other hand, if the path is empty, or the alist search does
+    not yield a hit, we hand off to the default path handler,
+    passing it the entire original path, <CODE>("foo" "bar" "baz")</CODE>.
+
+    <P>
+    This procedure is how you say: "If the first element of the URL's path
+    is `foo', do X; if it's `bar', do Y; otherwise, do Z." If one takes
+    an object-oriented view of the process, an alist path-handler does
+    method lookup on the requested operation, dispatching off to the
+    appropriate method defined for the URL.
+
+    <P>
+    The slash-delimited URI path structure implies an associated
+    tree of names. The path-handler system and the alist dispatcher
+    allow you to procedurally define the server's response to any arbitrary
+    subtree of the path space.
+
+    <P>
+    Example: <br>
+    A typical top-level path handler is
+
+<PRE>
+  (define ph
+    (alist-path-dispatcher
+	`(("h"       . ,(home-dir-handler "public_html"))
+	  ("cgi-bin" . ,(cgi-handler "/usr/local/etc/httpd/cgi-bin"))
+	  ("seval"   . ,seval-handler))
+	(rooted-file-handler "/usr/local/etc/httpd/htdocs")))
+</PRE>
+
+    This means:
+<UL>
+<LI> If the path looks like <CODE>("h" "shivers" "code" "web.tar.gz")</CODE>,
+     pass the path <CODE>("shivers" "code" "web.tar.gz")</CODE> to a
+     home-directory path handler.
+
+
+<LI> If the path looks like <CODE>("cgi-bin" "calendar")</CODE>,
+     pass <CODE>("calendar")</CODE> off to the CGI path handler.
+
+
+<LI> If the path looks like <CODE>("seval" ...)</CODE>,
+     the tail of the path is passed off to the code-uploading seval
+     path handler.
+
+<LI> Otherwise, the whole path is passed to a rooted file handler, which
+     will convert it into a filename, rooted at
+     <CODE>/usr/local/etc/httpd/htdocs</CODE>, and serve that file.
+</UL>
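+
+    <P>
+    The dispatch rule described above is small enough to sketch in a few
+    lines of portable Scheme. This is an illustrative reimplementation, not
+    the server's source: the name <CODE>sketch-alist-path-dispatcher</CODE>
+    is hypothetical, and handlers are assumed to be ordinary two-argument
+    procedures.
+<PRE>
+  ;; Hypothetical sketch of the dispatch rule; not the server's own code.
+  (define (sketch-alist-path-dispatcher ph-alist default-ph)
+    (lambda (path req)
+      ;; Look up the first path element, if the path has one.
+      (let ((entry (and (pair? path) (assoc (car path) ph-alist))))
+        (if entry
+            ((cdr entry) (cdr path) req)   ; hit: hand off the tail of the path
+            (default-ph path req)))))      ; miss or empty path: whole path
+</PRE>
+    Because the result is itself a path handler, such dispatchers can be
+    nested to cover deeper levels of the path tree.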
+
+
+<DT> <CODE>(home-dir-handler <VAR>subdir</VAR>) ->
+           <VAR>path-handler</VAR></CODE>
+<DD>
+    This procedure builds a path handler that does basic file serving
+    out of home directories. If the resulting path handler is passed
+    a path of <CODE>(<VAR>user</VAR> . <VAR>file-path</VAR>)</CODE>,
+    then it serves the file
+<PRE>
+    <VAR>user's-home-directory</VAR>/<VAR>subdir</VAR>/<VAR>file-path</VAR>
+</PRE>
+    The path handler only handles GET requests; the filename is not
+    allowed to contain <CODE>..</CODE> elements.
+
+
+<DT>
+<CODE>(tilde-home-dir-handler <VAR>subdir</VAR> <VAR>default-path-handler</VAR>)
+       -> <VAR>path-handler</VAR>
+</CODE>
+<DD>
+    This path handler examines the car of the path. If it is a string
+    beginning with a tilde, <em>e.g.</em>, "<CODE>~ziggy</CODE>",
+    then the string is taken
+    to mean a home directory, and the request is served similarly to a
+    <CODE>home-dir-handler</CODE> path handler.
+    Otherwise, the request is passed off
+    in its entirety to the default path handler.
+
+    <P>
+    This procedure is useful for implementing servers that provide the
+    semantics of the NCSA httpd server.
+
+
+<DT>
+<CODE>(cgi-handler <VAR>cgi-directory</VAR>) -> <VAR>path-handler</VAR>
+</CODE>
+<DD>
+    This procedure returns a path-handler that passes the request off to some
+    program using the CGI interface. The script name is taken from the
+    car of the path; it is checked for occurrences of <CODE>..</CODE>'s.
+    If the path is
+<PRE>
+    ("my-prog" "foo" "bar")
+</PRE>
+    then the program executed is
+<PRE>
+    <VAR>cgi-directory</VAR>/my-prog
+</PRE>
+    <P>
+    When the CGI path handler builds the process environment for the
+    CGI script, several elements
+    (<em>e.g.</em>, <CODE>$PATH</CODE> and <CODE>$SERVER_SOFTWARE</CODE>)
+    are request-invariant, and can be computed at server start-up time.
+    This can be done by calling
+<PRE>
+    (initialise-request-invariant-cgi-env)
+</PRE>
+    when the server starts up. This is <EM>not</EM> necessary,
+    but will make CGI requests a little faster.
+
+
+<DT>
+<CODE>(rooted-file-handler <VAR>root-dir</VAR>) -> <VAR>path-handler</VAR>
+</CODE>
+<DD>
+    Returns a path handler that serves files from a particular root
+    in the file system. Only the GET operation is provided. The path
+    argument passed to the handler is converted into a filename,
+    and appended to <VAR>root-dir</VAR>.
+    The file name is checked for <CODE>..</CODE> components,
+    and the transaction is aborted if it contains any. Otherwise, the file is
+    served to the client.
+
+<DT>
+<CODE>(null-path-handler <VAR>path</VAR> <VAR>req</VAR>)</CODE>
+<DD>
+    This path handler is useful as a default handler. It handles no requests,
+    always returning a "404 Not found" reply to the client.
+
+</DL>
+
+<!---------------------------------------------------------------------------->
+<H3>HTTP errors</H3>
+
+Authors of path-handlers need to be able to handle errors in a reasonably
+simple fashion. The S.U. Web server provides a set of error conditions that
+correspond to the error replies in the HTTP protocol. These errors can be
+raised with the <CODE>http-error</CODE> procedure.
+When the server runs a path handler,
+it runs it in the context of an error handler that catches these errors,
+sends an error reply to the client, and closes the transaction.
+
+<DL>
+
+<DT>
+<CODE>(http-error <VAR>reply-code</VAR> <VAR>req</VAR> [<VAR>extra</VAR> ...])</CODE>
+<DD>
+    This raises an http error condition. The reply code is one of the
+    numeric HTTP error reply codes, which are bound to the variables
+    <CODE>http-reply/ok</CODE>, <CODE>http-reply/not-found</CODE>,
+    <CODE>http-reply/bad-request</CODE>, and so
+    forth. The <VAR>req</VAR> argument is the request record that caused
+    the error.
+    Any following <VAR>extra</VAR> args are passed along for
+    informational purposes.
+    Different HTTP errors take different types of extra arguments.
+    For example, the "301 moved permanently" and "302 moved temporarily"
+    replies use the first two <VAR>extra</VAR> values as the
+    <CODE>URI:</CODE> and <CODE>Location:</CODE>
+    fields in the reply header, respectively. See the clauses of the
+    <CODE>send-http-error-reply</CODE> procedure for details.
+
+
+<DT>
+<CODE>(send-http-error-reply <VAR>reply-code</VAR> <VAR>request</VAR>
+                             [<VAR>extra</VAR> ...])
+</CODE>
+<DD>
+    This procedure writes an error reply out to the current output
+    port. If an error occurs during this process, it is caught, and
+    the procedure silently returns. The http server's standard error
+    handler passes all http errors raised during path-handler execution
+    to this procedure to generate the error reply before aborting the
+    request transaction.
+</DL>
+
+<!---------------------------------------------------------------------------->
+<H3>Simple directory generation</H3>
+
+Most path-handlers that serve files to clients eventually call an internal
+procedure named <CODE>file-serve</CODE>,
+which implements a simple directory-generation service using the
+following rules:
+<UL>
+<LI> If the filename has the <EM>form</EM> of a directory
+     (<EM>i.e.</EM>, it ends with a slash),
+     then <CODE>file-serve</CODE> actually looks for a
+     file named "<CODE>index.html</CODE>" in that directory.
+
+<LI> If the filename names a directory, but is not in directory form
+      (<EM>i.e.</EM>, it doesn't end in a slash,
+      as in "<CODE>/usr/include</CODE>" or "<CODE>/usr/raj</CODE>"),
+      then <CODE>file-serve</CODE> sends back a "301 moved permanently"
+      message,
+      redirecting the client to a slash-terminated version of the original
+      URL. For example, the URL
+<PRE>
+    http://clark.lcs.mit.edu/~shivers
+</PRE>
+      would be redirected to
+<PRE>
+    http://clark.lcs.mit.edu/~shivers/
+</PRE>
+
+<LI> If the filename names a regular file, it is served to the client.
+</UL>
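+
+ <P>
+The three rules can be summarised as a small decision procedure. The sketch
+below is an illustration in portable Scheme, not the server's
+<CODE>file-serve</CODE>: the name <CODE>sketch-file-serve-action</CODE> and
+the <CODE>directory?</CODE> flag (standing in for a file-system check) are
+hypothetical, and it returns a symbolic action instead of doing any I/O.
+<PRE>
+  ;; Hypothetical sketch; DIRECTORY? says whether FILENAME names a directory.
+  (define (sketch-file-serve-action filename directory?)
+    (let ((len (string-length filename)))
+      (cond ((and (> len 0) (char=? (string-ref filename (- len 1)) #\/))
+             ;; Directory form: serve the index file inside it.
+             (list 'serve (string-append filename "index.html")))
+            (directory?
+             ;; A directory without the trailing slash: redirect.
+             (list 'redirect (string-append filename "/")))
+            (else
+             ;; Anything else is treated as a regular file.
+             (list 'serve filename)))))
+</PRE>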
+
+
+<!---------------------------------------------------------------------------->
+<H3>Support procs</H3>
+
+The source files contain a host of support procedures which will be of utility
+to anyone writing a custom path-handler. Read the files first.
+
+
+<!---------------------------------------------------------------------------->
+<H3>Losing</H3>
+
+Be aware of two Unix problems, which may require workarounds:
+<OL>
+
+<LI>
+   NeXTSTEP's Posix implementation of the <CODE>getpwnam()</CODE> routine
+   will silently tell you that every user has uid 0. This means
+   that if your server, running as root, does a
+<PRE>
+    (set-uid (user->uid "nobody"))
+</PRE>
+   it will essentially do a
+<PRE>
+    (set-uid 0)
+</PRE>
+   and you will thus still be running as root.
+
+   <P>
+   The fix is to manually find out who user nobody is (he's -2 on my
+   system), and to hard-wire this into the server:
+<PRE>
+    (set-uid -2)
+</PRE>
+   This problem is NeXTSTEP-specific. If you are not using NeXTSTEP,
+   no problem.
+
+
+<LI>
+   On NeXTSTEP, the ip-address->host-name translation routine
+   (in C, <CODE>gethostbyaddr()</CODE>; in scsh,
+   <CODE>(host-info addr)</CODE>) does not
+   use the DNS system; it goes through NeXT's proprietary Netinfo
+   system, and may not return a fully-qualified domain name. For
+   example, on my system, I get "amelia-earhart", when I want
+   "amelia-earhart.lcs.mit.edu". Since the server uses this name
+   to construct redirection URLs to be sent back to the Web client,
+   it needs to be an FQDN.
+
+   <P>
+   This problem may occur on other OSes;
+   I cannot determine if <CODE>gethostbyaddr()</CODE>
+   is required to return an FQDN or not. (I would appreciate hearing the
+   answer if you know; my local Internet gurus couldn't tell me.)
+
+   <P>
+   If your system doesn't give you a complete Internet address when
+   you say
+<PRE>
+    (host-info:name (host-info (system-name)))
+</PRE>
+   then you have this problem.
+
+   <P>
+   The server has a workaround. There is a procedure exported from
+   the httpd-core package:
+<PRE>
+    (set-my-fqdn name)
+</PRE>
+   Call this to crow-bar the server's idea of its own Internet host name
+   before running the server, and all will be well.
+</OL>
+
+</BODY>
+</HTML>
--- a/doc/rfc2396.txt
+++ b/doc/rfc2396.txt
--- a/doc/rfc822.scm.doc
+++ b/doc/rfc822.scm.doc
@@ -0,0 +1,161 @@
+This file documents names defined in rfc822.scm:
+
+
+
+
+NOTES
+
+
+
+A note on line-terminators:
+
+Line-terminating sequences are always a drag, because there's no
+agreement on them -- the Net protocols and DOS use cr/lf; Unix uses
+lf; the Mac uses cr. On one hand, you'd like to use the code for all
+of the above; on the other, you'd also like to use the code for strict
+applications that must not recognise bare cr's or lf's
+as terminators.
+
+RFC 822 requires a cr/lf (carriage-return/line-feed) pair to terminate
+lines of text. On the other hand, careful perusal of the text shows up
+some ambiguities (there are maybe three or four of these, and I'm too
+lazy to write them all down). Furthermore, it is an unfortunate fact
+that many Unix apps separate lines of RFC 822 text with simple
+linefeeds (e.g., messages kept in /usr/spool/mail). As a result, this
+code takes a broad-minded view of line-terminators: lines can be
+terminated by either cr/lf or just lf, and either terminating sequence
+is trimmed.
+
+If you need stricter parsing, you can call the lower-level procedures
+%READ-RFC822-FIELD and %READ-RFC822-HEADERS. They take the
+read-line procedure as an extra parameter. This means that you can
+pass in a procedure that recognises only cr/lf's, or only cr's (for a
+Mac app, perhaps), and you can determine whether or not the
+terminators get trimmed. However, your read-line procedure must
+indicate the header-terminating empty line by returning *either* the
+empty string or the two-char string cr/lf (or the EOF object).
+
+
+
+
+DEFINITIONS AND DESCRIPTIONS
+
+
+
+(read-rfc822-field [port])
+(%read-rfc822-field read-line port)
+
+Read one field from the port, and return two values [NAME BODY]:
+
+ - NAME	 Symbol such as 'subject or 'to. The field name is converted
+         to a symbol using the Scheme implementation's preferred
+         case. If the implementation reads symbols in a case-sensitive
+         fashion (e.g., scsh), lowercase is used. This means you can
+         compare these symbols to quoted constants using EQ?. When
+         printing these field names out, it looks best if you capitalise
+         them with (CAPITALIZE-STRING (SYMBOL->STRING FIELD-NAME)).
+
+ - BODY	 List of strings which are the field's body, e.g. 
+         ("shivers@lcs.mit.edu"). Each list element is one line from
+         the field's body, so if the field spreads out over three lines,
+         then the body is a list of three strings. The terminating
+         cr/lf's are trimmed from each string. A leading space or a
+         leading horizontal tab is also trimmed, but one and only one.
+
+When there are no more fields -- EOF or a blank line has terminated
+the header section -- then the procedure returns [#f #f].
+ 
+The %READ-RFC822-FIELD variant allows you to specify your own
+read-line procedure. The one used by READ-RFC822-FIELD terminates
+lines with either cr/lf or just lf, and it trims the terminator from
+the line. Your read-line procedure should trim the terminator of the
+line, so an empty line is returned as an empty string.
+
+The procedures raise an error if the syntax of the read field (the
+line returned by the read-line-function) is illegal (RFC822 illegal).
+
+
+
+read-rfc822-headers [port]
+%read-rfc822-headers read-line port
+
+Read in and parse up a section of text that looks like the header
+portion of an RFC 822 message. Return an alist mapping a field name (a
+symbol such as 'date or 'subject) to a list of field bodies -- one for
+each occurrence of the field in the header. So if there are five
+"Received-by:" fields in the header, the alist maps 'received-by to a
+five element list. Each body is in turn represented by a list of
+strings -- one for each line of the field. So a field spread across
+three lines would produce a three element body.
+
+The %READ-RFC822-HEADERS variant allows you to specify your own
+read-line procedure. See notes (A note on line-terminators) above for
+reasons why.
+
+
+
+rejoin-header-lines alist [separator]
+
+Takes a field alist such as is returned by READ-RFC822-HEADERS and
+returns an equivalent alist. Each body (string list) in the input
+alist is joined into a single string in the output alist. SEPARATOR is
+the string used to join these elements together; it defaults to a
+single space " ", but can usefully be "\n" or "\r\n".
+
+To rejoin a single body list, use scsh's JOIN-STRINGS procedure.
+
+
+
+For the following definitions' examples, let's use this set of
+RFC822 headers:
+     From: shivers
+     To: ziggy,
+       newts
+     To: gjs, tk
+
+
+
+get-header-all headers name
+
+returns all entries or #f, e.g.
+(get-header-all hdrs 'to)   -> ((" ziggy," " newts") (" gjs, tk"))
+
+
+
+get-header-lines headers name
+
+returns all lines of the first entry or #f, e.g.
+(get-header-lines hdrs 'to) -> (" ziggy," " newts")
+
+
+
+get-header headers name [separator]
+
+returns the first entry with the lines joined together by separator
+(newline by default (\n)), e.g.
+(get-header hdrs 'to)       -> "ziggy,\n newts"
+
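+The first two lookups above can be modelled on a literal header alist.
+The following is only a sketch in portable Scheme, not the code in
+rfc822.scm: the SKETCH- names are hypothetical, the alist is written out
+by hand in the shape described for READ-RFC822-HEADERS, and the order of
+its entries is an assumption.
+
+    ;; Hand-written alist for the example headers (entry order assumed).
+    (define hdrs
+      '((from (" shivers"))
+        (to (" ziggy," " newts") (" gjs, tk"))))
+
+    ;; All bodies of field NAME, or #f if the field is absent.
+    (define (sketch-get-header-all headers name)
+      (let ((entry (assq name headers)))
+        (and entry (cdr entry))))
+
+    ;; The lines of the first body of field NAME, or #f.
+    (define (sketch-get-header-lines headers name)
+      (let ((all (sketch-get-header-all headers name)))
+        (and all (pair? all) (car all))))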
+
+
+htab
+
+is the horizontal tab (ascii-code 9)
+
+
+
+string->symbol-pref
+
+is a procedure that takes a string and converts it to a symbol
+using the Scheme implementation's preferred case. The preferred case
+is recognized by doing a symbol->string conversion of 'a.
+
+
+
+
+DESIRABLE FUNCTIONALITY
+
+ - Unfolding long lines.
+ - Lexing structured fields.
+ - Unlexing structured fields into canonical form.
+ - Parsing and unparsing dates.
+ - Parsing and unparsing addresses.
--- a/doc/rfc822.txt
+++ b/doc/rfc822.txt
--- a/doc/uri.scm.doc
+++ b/doc/uri.scm.doc
@@ -0,0 +1,150 @@
+This file documents names specified in uri.scm.
+
+
+
+
+NOTES
+
+URIs have the following syntax:
+
+[scheme] : path [? search ] [# fragmentid]
+
+Parts in [] may be omitted. The last part is usually referred to as
+fragid in this document. 
+
+
+
+DEFINITIONS AND DESCRIPTIONS
+
+
+char-set
+uri-reserved
+
+A set of reserved characters (semicolon, slash, hash, question mark,
+colon and space).
+
+procedure 
+parse-uri uri-string --> (scheme, path, search, frag-id)
+
+Multiple-value return: scheme, path, search, frag-id, in this
+order. scheme, search and frag-id are either #f or a string. path is a
+nonempty list of strings. An empty path is a list containing the empty
+string. parse-uri tries to be tolerant of the various ways people build
+broken URIs out there on the Net (so it does not strictly conform to
+RFC 1630).
+
+
+procedure
+unescape-uri string [start [end]] --> string
+
+Unescapes a string. This procedure should only be used *after* the url
+(!)  was parsed, since unescaping may introduce characters that blow
+up the parse (that's why escape sequences are used in URIs ;).
+Escape sequences have the form %hh, where each h is a hexadecimal
+digit. E.g. %20 is space (ASCII character 32).
+
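+As an illustration of the %hh rule, here is a minimal sketch in portable
+Scheme. It is not the code in uri.scm: the SKETCH- names are
+hypothetical, and a "%" that is not followed by two hexadecimal digits
+is simply copied through unchanged, which is an assumption of this
+sketch.
+
+    ;; Value of hex digit C (0-15), or #f if C is not a hex digit.
+    (define (sketch-hex-value c)
+      (let loop ((i 0))
+        (cond ((= i 16) #f)
+              ((char=? (char-upcase c) (string-ref "0123456789ABCDEF" i)) i)
+              (else (loop (+ i 1))))))
+
+    ;; Replace each well-formed %hh in S by the character it encodes.
+    (define (sketch-unescape s)
+      (let ((len (string-length s)))
+        (let loop ((i 0) (acc '()))
+          (if (>= i len)
+              (list->string (reverse acc))
+              (let* ((c  (string-ref s i))
+                     (ok (and (char=? c #\%) (<= (+ i 3) len)))
+                     (hi (and ok (sketch-hex-value (string-ref s (+ i 1)))))
+                     (lo (and ok (sketch-hex-value (string-ref s (+ i 2))))))
+                (if (and hi lo)
+                    (loop (+ i 3)
+                          (cons (integer->char (+ (* 16 hi) lo)) acc))
+                    (loop (+ i 1) (cons c acc))))))))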
+
+procedure
+hex-digit? character --> boolean
+
+Returns #t if character is a hexadecimal digit (i.e., one of 0-9, a-f,
+A-F), #f otherwise.
+
+
+procedure
+hexchar->int character --> number
+
+Translates the given character to an integer, e.g. (hexchar->int #\a)
+=> 10.
+
+
+procedure
+int->hexchar integer --> character
+
+Translates the given integer from range 0-15 into a hexadecimal
+character (uses uppercase letters), e.g. (int->hexchar 14) => #\E.
+
+
+char-set
+uri-escaped-chars
+
+A set of characters that are escaped in URIs. These are the following
+characters: dollar ($), minus (-), underscore (_), at (@), dot (.),
+ampersand (&), exclamation mark (!), asterisk (*), backslash (\),
+double quote ("), single quote ('), open parenthesis ((), close
+parenthesis ()), comma (,), plus (+) and all other characters that are
+neither letters nor digits (such as space and control characters).
+
+
+procedure
+escape-uri string [escaped-chars] --> string
+
+Escapes characters of string that are given with escaped-chars.
+escaped-chars defaults to uri-escaped-chars. Be careful when applying this
+procedure to chunks of text with syntactically meaningful reserved
+characters (e.g., paths with URI slashes or colons) -- they'll be
+escaped, and lose their special meaning. E.g. it would be a mistake to
+apply escape-uri to "//lcs.mit.edu:8001/foo/bar.html" because the
+slashes and colons would be escaped. Note that escape-uri doesn't
+check for this, as such a check would defeat its purpose.
+
+
+procedure
+resolve-uri cscheme cp scheme p --> (scheme, path)
+
+Sorry, I can't figure out what resolve-uri is intended to do. Perhaps
+I will find out later.
+
+The code seems to have a bug: In the body of receive, there's a
+loop. j should, according to the comment, count sequential /. But j
+counts nothing in the body. Either zero is added ((lp (cdr cp-tail)
+(cons (car cp-tail) rhead) (+ j 0))) or j is set to 1 ((lp (cdr
+cp-tail) (cons (car cp-tail) rhead) 1))). Nevertheless, j is expected
+to reach value numsl that can be larger than one. So what? I am
+confused.
+
+
+procedure
+rev-append list-a list-b --> list
+
+Performs a (append (reverse list-a) list-b). The comment says it
+should be defined in a list package but I am wondering how often this
+will be used.
+
+
+procedure
+split-uri-path uri start end --> list
+
+Splits uri at /'s. Only the substring given with start (inclusive) and
+end (exclusive) is considered. Start and end - 1 have to be within the
+range of the uri-string.  Otherwise an index-out-of-range exception
+will be raised. Example: (split-uri-path "foo/bar/colon" 4 11) ==>
+'("bar" "col")
+
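+The splitting rule can be sketched in portable Scheme as follows. This
+is an illustration only, not the code in uri.scm; the name
+SKETCH-SPLIT-PATH is hypothetical.
+
+    ;; Split the substring of S from START (inclusive) to END
+    ;; (exclusive) at every "/" and return the pieces in order.
+    (define (sketch-split-path s start end)
+      (let loop ((i start) (seg-start start) (acc '()))
+        (cond ((= i end)
+               (reverse (cons (substring s seg-start i) acc)))
+              ((char=? (string-ref s i) #\/)
+               (loop (+ i 1) (+ i 1)
+                     (cons (substring s seg-start i) acc)))
+              (else
+               (loop (+ i 1) seg-start acc)))))
+
+With the example above, (sketch-split-path "foo/bar/colon" 4 11) yields
+("bar" "col"); a leading or trailing "/" yields an empty-string element.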
+
+procedure
+simplify-uri-path path --> list
+
+Removes "." and ".." entries from path. The result is a (maybe empty)
+list representing a path that does not contain any "." or "..". The
+list can only be empty if the path did not start with "/" (for the
+rare occasion someone wants to simplify a relative path). The result
+is #f if the path tries to back up past root, for example by "/.." or
+"/foo/../.." or just "..". "//" may occur somewhere in the path
+referring to root but not being backed up.
+Examples: 
+(simplify-uri-path (split-uri-path "/foo/bar/baz/.." 0 15))
+==> '("" "foo" "bar")  
+
+(simplify-uri-path (split-uri-path "foo/bar/baz/../../.." 0 20))
+==> '()
+
+(simplify-uri-path (split-uri-path "/foo/../.." 0 10))
+==> #f          ; tried to back up past root
+
+(simplify-uri-path (split-uri-path "foo/bar//" 0 9))
+==> '("")       ; "//" refers to root
+
+(simplify-uri-path (split-uri-path "foo/bar/" 0 8))
+==> '("")       ; last "/" also refers to root
+
+(simplify-uri-path (split-uri-path "/foo/bar//baz/../.." 0 19))
+==> #f          ; tries to back up past root
--- a/doc/url.scm.doc
+++ b/doc/url.scm.doc
@@ -0,0 +1,69 @@
+This file documents names defined in url.scm
+
+
+
+
+NOTES
+
+
+
+
+DEFINITIONS AND DESCRIPTIONS
+
+
+userhost                           record
+
+A record containing the fields user, password, host and port. Created
+by parsing a string like //<user>:<password>@<host>:<port>/. The
+record describes path-prefixes of the form
+//<user>:<password>@<host>:<port>/ These are frequently used as the
+initial prefix of URL's describing Internet resources.
+
+
+parse-userhost path default
+
+Parse a URI path (a list representing a path, not a string!) into a
+userhost record. Default values are taken from the userhost record
+DEFAULT except for the host. Returns a userhost record if it wins, and
+#f if it cannot parse the path. It is an error if the specified path
+does not begin with '//..' as noted under userhost.
+
+
+userhost-escaped-chars             list
+
+The union of uri-escaped-chars and the characters '@' and ':'. Used
+for the unparser.
+
+
+userhost->string userhost          procedure
+
+Unparses a userhost record to a string.
+
+
+http-url                           record
+
+Record containing the fields userhost (a userhost record), path (a
+path list), search and frag-id. The PATH slot of this record is the
+URL's path split at slashes, e.g., "foo/bar//baz/" => ("foo" "bar" ""
+"baz" ""). These elements are in raw, unescaped format. To convert
+back to a string, use (uri-path-list->path (map escape-uri pathlist)).
+
+
+parse-http-url path search frag-id       procedure
+
+Returns a http-url record. path, search and frag-id are results of a
+parse-uri call on the initial uri. See there (uri.scm) for further
+details. search and frag-id are stored as they are. This parser
+decodes the path elements. It is an error if the path specifies a
+user or a password, as this is not allowed in http-urls.
+
+
+default-http-userhost                    record
+
+A userhost record that specifies the port as 80 and everything else as
+#f.
+
+
+http-url->string http-url
+
+Unparses the given http-url to a string.