diff --git a/doc/html/index.html b/doc/html/index.html
new file mode 100644
index 0000000..0b8e359
--- /dev/null
+++ b/doc/html/index.html
@@ -0,0 +1,85 @@
+<HTML>
+<HEAD>
+<TITLE>The Scheme Underground Network Package</TITLE>
+</HEAD>
+
+<BODY>
+<H1>The Scheme Underground Network Package</H1>
+I have written a set of libraries for doing Net hacking from Scheme/scsh.
+It includes:
+<DL>
+<DT> An smtp client library.
+<DD> Forge mail from the comfort of your own Scheme process.
+
+<DT> rfc822 header library
+<DD> Read email-style headers. Useful in several contexts (smtp, http, etc.)
+
+<DT> Simple structured HTML output library
+<DD> Balanced delimiters, etc.
+
+<DT> The SU Web server
+<DD> This is a complete implementation of an HTTP 1.0 server in Scheme.
+     The server contains other standalone packages that may separately be of 
+     use:
+     <UL>
+     <LI> URI and URL parsers and unparsers.
+     <LI> A library to help writing CGI scripts in Scheme.
+     <LI> Server extensions for interfacing to CGI scripts.
+     <LI> Server extensions for uploading Scheme code.
+     </UL>
+    The server has three main design goals:
+    <DL>
+    <DT> Extensibility
+    <DD> The server is in fact nothing but extensions, using a mechanism
+	 called "path handlers" to define URL-specific services. It has a toolkit
+	 of services that can be used as-is, extended or built upon.
+	 User extensions have exactly the same status as the base services.
+
+	<P>
+	The extension mechanism allows for easy implementation of new services
+	without the overhead of the CGI interface. Since the server is written
+	on top of the Scheme shell, the full set of Unix system calls and
+	program tools is available to the implementor.
+
+    <DT> Mobile code
+    <DD> The server allows Scheme code to be uploaded for direct execution
+	 inside the server. The server has complete control over the code,
+	 and can safely execute it in restricted environments that do not
+	 provide access to potentially dangerous primitives (such as the
+	 "delete file" procedure.)
+
+
+    <DT> Clarity
+    <DD> I wrote this server to help myself understand the Web. It is voluminously
+	 commented, and I hope it will prove to be an aid in understanding the
+	 low-level details of the Web protocols.
+    </DL>
+
+    <P>
+    The S.U. server has the ability to upload code from Web clients and 
+    execute that code on behalf of the client in a protected environment.
+
+    <P>
+    Some <A HREF="su-httpd.html">simple documentation</A> on the server
+    is available.
+
+</DL>
+
+<H2>Obtaining the system</H2>
+The network code is available by
+<A HREF="ftp://ftp-swiss.ai.mit.edu/pub/scsh/contrib/net/net.tar.gz">ftp</A>.
+To run the server, you need our 0.4 release of 
+<A HREF="http://www-swiss.ai.mit.edu/scsh/scsh.html">scsh</A>
+which has just been released.
+
+Beyond actually running the server,
+the separate parser libraries and other utilities may be of use as separate
+modules.
+
+<ADDRESS><A HREF="http://www.ai.mit.edu/people/shivers/">Olin Shivers</A>
+       / <A HREF="plan-file">shivers@ai.mit.edu</A></ADDRESS>
+
+</BODY>
+</HTML>
+
+
diff --git a/doc/html/su-httpd.html b/doc/html/su-httpd.html
new file mode 100644
index 0000000..356aa37
--- /dev/null
+++ b/doc/html/su-httpd.html
@@ -0,0 +1,482 @@
+<!-- check for *..* emphasis, etc., i.e., e.g. -->
+<HTML>
+<HEAD>
+<TITLE>The Scheme Underground Web System</TITLE>
+</HEAD>
+
+<BODY>
+<H1>The Scheme Underground Web System</H1>
+
+<ADDRESS><A HREF="http://www.ai.mit.edu/people/shivers/">Olin Shivers</A>
+       / <A HREF="plan-file">shivers@ai.mit.edu</A>
+</ADDRESS>
+July 1995
+
+<BLOCKQUOTE>
+Note: Netscape typesets description lists in a manner that makes the
+procedure descriptions below blur together, even in the absence of the
+HTML COMPACT attribute. You may just wish to print out a simple
+<A HREF="su-httpd.txt">ASCII version</A> of this note, instead.
+</BLOCKQUOTE>
+
+
+
+<!---------------------------------------------------------------------------->
+<H2>Introduction</H2>
+
+The
+<A HREF="http://www.ai.mit.edu/projects/su/su.html">Scheme underground</A>
+Web system is a package of
+<A HREF="http://www-swiss.ai.mit.edu/scheme-home.html">Scheme</A>
+code that provides
+utilities for interacting with the
+<A HREF="http://www.w3.org/">World-Wide Web</A>.
+This includes:
+<UL>
+<LI>  A Web server.
+<LI>  URI and URL parsers and un-parsers.
+<LI>  RFC822-style header parsers.
+<LI>  Code for performing structured HTML output.
+<LI>  Code to assist in writing CGI Scheme programs
+      that can be used by any CGI-compliant HTTP server
+      (such as NCSA's httpd, or the S.U. Web server).
+</UL>
+
+ <P>
+The code can be obtained via
+<A HREF="ftp://ftp-swiss.ai.mit.edu/pub/scsh/contrib/net/net.tar.gz">
+anonymous ftp</A>
+and is implemented in
+<A HREF="http://www-swiss.ai.mit.edu/~jar/s48.html">Scheme 48</A>,
+using the system calls and support procedures of
+<A HREF="http://www-swiss.ai.mit.edu/scsh/scsh.html">scsh</A>,
+the Scheme Shell.
+The code was written to be clear and modifiable --
+it is voluminously commented and all non-R4RS dependencies are
+described at the beginning of each source file.
+
+ <P>
+I do not have the time to write detailed documentation for these packages.
+However, they are very thoroughly commented, and I strongly recommend
+reading the source files; they were written to be read, and the source
+code comments should provide a clear description of the system.
+The remainder of this note gives an overview of the server's basic
+architecture and interfaces.
+
+<H2>The Scheme Underground Web Server</H2>
+
+The server was designed with three principal goals in mind:
+<DL>
+<DT> Extensibility
+<DD> The server is designed to make it easy to extend the basic
+     functionality.  In fact, the server is nothing but extensions.  There is
+     no distinction between the set of basic services provided by the server
+     implementation and user extensions -- they are both implemented in
+     Scheme, and have equal status. The design is "turtles all the way down."
+
+
+<DT> Mobile code
+<DD> Because the server is written in Scheme 48, it is simple to use the
+     Scheme 48 module system to upload programs to the server for safe
+     execution within a protected, server-chosen environment. The server
+     comes with a simple example upload service to demonstrate this
+     capability.
+
+
+<DT> Clarity of implementation
+<DD> Because the server is written in a high-level language, it should make
+     for a clearer exposition of the HTTP protocol and the associated URL
+     and URI notations than one written in a low-level language such as C.
+     This also should help to make the server easy to modify and adapt to
+     different uses.
+</DL>
+
+<!---------------------------------------------------------------------------->
+<H3>Basic server structure</H3>
+
+The Web server is started by calling the <CODE>httpd</CODE> procedure,
+which takes one required and two optional arguments:
+<PRE>
+    (httpd <VAR>path-handler</VAR> [<VAR>port</VAR> <VAR>working-directory</VAR>])
+</PRE>
+
+The server accepts connections from the given port, which defaults to 80.
+The server runs with the working directory set to the given value,
+which defaults to
+<PRE>
+    /usr/local/etc/httpd
+</PRE>
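+
+ <P>
+As a minimal sketch of the call above, a server that does nothing but serve
+files could be started as follows. The port number 8001 and the document root
+are arbitrary choices for illustration; <CODE>rooted-file-handler</CODE> is
+one of the basic path handlers described later in this note.
+<PRE>
+    ;; Serve files under htdocs on port 8001, keeping the default
+    ;; working directory.
+    (httpd (rooted-file-handler "/usr/local/etc/httpd/htdocs") 8001)
+</PRE>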
+
+
+ <P>
+The server's basic loop is to wait on the port for a connection from an HTTP
+client. When it receives a connection, it reads in and parses the request into
+a special request data structure. Then the server forks a child process, which
+binds the current I/O ports to the connection socket, and then hands off to
+the top-level path handler (the first argument to <CODE>httpd</CODE>).
+The path-handler procedure is responsible for actually serving the request --
+it can be any arbitrary computation.
+Its output goes directly back to the HTTP client that sent the request.
+
+ <P>
+Before calling the path handler to service the request, the HTTP server
+installs an error handler that fields any uncaught error, sends an
+error reply to the client, and aborts the request transaction. Hence
+any error caused by a path-handler will be handled in a reasonable and
+robust fashion.
+
+ <P>
+The basic server loop and the associated request data structure are the fixed
+architecture of the S.U. Web server; its flexibility lies in the notion of
+path handlers.
+
+
+<!---------------------------------------------------------------------------->
+<H3>Path handlers</H3>
+
+A path handler is a procedure taking two arguments:
+<PRE>
+    (path-handler <VAR>path</VAR> <VAR>req</VAR>)
+</PRE>
+
+
+The <VAR>req</VAR> argument is a request record giving all the details of the
+client's request; it has the following structure:
+<PRE>
+    (define-record request
+      method		; A string such as "GET", "PUT", etc.
+      uri		; The escaped URI string as read from request line.
+      url		; An http URL record (see url.scm).
+      version		; A (major . minor) integer pair.
+      headers		; An rfc822 header alist (see rfc822.scm).
+      socket)		; The socket connected to the client.
+</PRE>
+
+The <VAR>path</VAR> argument is the URL's path,
+parsed and split at slashes into a string list.
+For example, if the Web client dereferences URL
+<PRE>
+    http://clark.lcs.mit.edu:8001/h/shivers/code/web.tar.gz
+</PRE>
+then the server would pass the following path to the top-level handler:
+<PRE>
+    ("h" "shivers" "code" "web.tar.gz")
+</PRE>
+
+ <P>
+The path argument's pre-parsed representation as a string list makes it easy
+for the path handler to dispatch recursively on URL paths.
+
+ <P>
+Path handlers can do anything they like to respond to HTTP requests; they have
+the full range of Scheme to implement the desired functionality.  When
+handling HTTP requests that have an associated entity body (such as POST), the
+body should be read from the current input port. Path handlers should in all
+cases write their reply to the current output port. Path handlers should
+<EM>not</EM> perform I/O on the request record's socket.
+Path handlers are frequently called recursively, and doing I/O directly to the
+socket might bypass a filtering or other processing step interposed on the
+current I/O ports by some superior path handler.
+
+<!---------------------------------------------------------------------------->
+<H3>Basic path handlers</H3>
+
+Although the user can write any path-handler he likes, the S.U. server comes
+with a useful toolbox of basic path handlers that can be used and built upon:
+
+<DL>
+
+<DT>
+<CODE>(alist-path-dispatcher <VAR>ph-alist</VAR> <VAR>default-ph</VAR>) -> <VAR>path-handler</VAR>
+</CODE>
+<DD>
+    This procedure takes a string->path-handler alist, and a default
+    path handler, and returns a handler that dispatches on its path argument.
+    When the new path handler is applied to a path
+    <CODE>("foo" "bar" "baz")</CODE>,
+    it uses the first element of the path -- <CODE>"foo"</CODE> -- to
+    index into the alist.
+    If it finds an associated path handler in the alist, it
+    hands the request off to that handler, passing it the tail of the
+    path, <CODE>("bar" "baz")</CODE>.
+    On the other hand, if the path is empty, or the alist search does
+    not yield a hit, we hand off to the default path handler,
+    passing it the entire original path, <CODE>("foo" "bar" "baz")</CODE>.
+
+    <P>
+    This procedure is how you say: "If the first element of the URL's path
+    is `foo', do X; if it's `bar', do Y; otherwise, do Z." If one takes
+    an object-oriented view of the process, an alist path-handler does
+    method lookup on the requested operation, dispatching off to the
+    appropriate method defined for the URL.
+
+    <P>
+    The slash-delimited URI path structure implies an associated
+    tree of names. The path-handler system and the alist dispatcher
+    allow you to procedurally define the server's response to any arbitrary
+    subtree of the path space.
+
+    <P>
+    Example: <br>
+    A typical top-level path handler is
+
+<PRE>
+  (define ph
+    (alist-path-dispatcher
+	`(("h"       . ,(home-dir-handler "public_html"))
+	  ("cgi-bin" . ,(cgi-handler "/usr/local/etc/httpd/cgi-bin"))
+	  ("seval"   . ,seval-handler))
+	(rooted-file-handler "/usr/local/etc/httpd/htdocs")))
+</PRE>
+
+    This means:
+<UL>
+<LI> If the path looks like <CODE>("h" "shivers" "code" "web.tar.gz")</CODE>,
+     pass the path <CODE>("shivers" "code" "web.tar.gz")</CODE> to a
+     home-directory path handler.
+
+
+<LI> If the path looks like <CODE>("cgi-bin" "calendar")</CODE>,
+     pass <CODE>("calendar")</CODE> off to the CGI path handler.
+
+
+<LI> If the path looks like <CODE>("seval" ...)</CODE>,
+     the tail of the path is passed off to the code-uploading seval
+     path handler.
+
+<LI> Otherwise, the whole path is passed to a rooted file handler, which
+     will convert it into a filename, rooted at
+     <CODE>/usr/local/etc/httpd/htdocs</CODE>, and serve that file.
+</UL>
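+
+    <P>
+    Because a dispatcher is itself a path handler, dispatchers can be nested
+    to carve up a subtree of the path space. The following is only a sketch:
+    the names <CODE>docs-ph</CODE> and <CODE>site-ph</CODE>, the path elements
+    <CODE>"docs"</CODE>, <CODE>"manual"</CODE> and <CODE>"papers"</CODE>,
+    and the directories under <CODE>/usr/local/doc</CODE> are hypothetical,
+    chosen purely for illustration.
+<PRE>
+  ;; Handles the subtree below "docs"; anything unknown there gets a 404.
+  (define docs-ph
+    (alist-path-dispatcher
+      `(("manual" . ,(rooted-file-handler "/usr/local/doc/manual"))
+        ("papers" . ,(rooted-file-handler "/usr/local/doc/papers")))
+      null-path-handler))
+
+  ;; Top-level handler: "docs" paths go to docs-ph, the rest to htdocs.
+  (define site-ph
+    (alist-path-dispatcher
+      `(("docs" . ,docs-ph))
+      (rooted-file-handler "/usr/local/etc/httpd/htdocs")))
+</PRE>
+    With these definitions, the path
+    <CODE>("docs" "manual" "intro.html")</CODE> would reach the first rooted
+    file handler as <CODE>("intro.html")</CODE>, since each dispatcher strips
+    the element it matched.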
+
+
+<DT> <CODE>(home-dir-handler <VAR>subdir</VAR>) ->
+           <VAR>path-handler</VAR></CODE>
+<DD>
+    This procedure builds a path handler that does basic file serving
+    out of home directories. If the resulting path handler is passed
+    a path of <CODE>(<VAR>user</VAR> . <VAR>file-path</VAR>)</CODE>,
+    then it serves the file
+<PRE>
+    <VAR>user's-home-directory</VAR>/<VAR>subdir</VAR>/<VAR>file-path</VAR>
+</PRE>
+    The path handler only handles GET requests; the filename is not
+    allowed to contain <CODE>..</CODE> elements.
+
+
+<DT>
+<CODE>(tilde-home-dir-handler <VAR>subdir</VAR> <VAR>default-path-handler</VAR>)
+       -> <VAR>path-handler</VAR>
+</CODE>
+<DD>
+    This path handler examines the car of the path. If it is a string
+    beginning with a tilde, <em>e.g.</em>, "<CODE>~ziggy</CODE>",
+    then the string is taken
+    to mean a home directory, and the request is served similarly to a
+    <CODE>home-dir-handler</CODE> path handler.
+    Otherwise, the request is passed off
+    in its entirety to the default path handler.
+
+    <P>
+    This procedure is useful for implementing servers that provide the
+    semantics of the NCSA httpd server.
+
+
+<DT>
+<CODE>(cgi-handler <VAR>cgi-directory</VAR>) -> <VAR>path-handler</VAR>
+</CODE>
+<DD>
+    This procedure returns a path-handler that passes the request off to some
+    program using the CGI interface. The script name is taken from the
+    car of the path; it is checked for occurrences of <CODE>..</CODE>'s.
+    If the path is
+<PRE>
+    ("my-prog" "foo" "bar")
+</PRE>
+    then the program executed is
+<PRE>
+    <VAR>cgi-directory</VAR>/my-prog
+</PRE>
+    <P>
+    When the CGI path handler builds the process environment for the
+    CGI script, several elements
+    (<em>e.g.</em>, <CODE>$PATH</CODE> and <CODE>$SERVER_SOFTWARE</CODE>)
+    are request-invariant, and can be computed at server start-up time.
+    This can be done by calling
+<PRE>
+    (initialise-request-invariant-cgi-env)
+</PRE>
+    when the server starts up. This is <EM>not</EM> necessary,
+    but will make CGI requests a little faster.
+
+
+<DT>
+<CODE>(rooted-file-handler <VAR>root-dir</VAR>) -> <VAR>path-handler</VAR>
+</CODE>
+<DD>
+    Returns a path handler that serves files from a particular root
+    in the file system. Only the GET operation is provided. The path
+    argument passed to the handler is converted into a filename,
+    and appended to <VAR>root-dir</VAR>.
+    The file name is checked for <CODE>..</CODE> components,
+    and the transaction is aborted if it contains any. Otherwise, the file is
+    served to the client.
+
+<DT>
+<CODE>(null-path-handler <VAR>path</VAR> <VAR>req</VAR>)</CODE>
+<DD>
+    This path handler is useful as a default handler. It handles no requests,
+    always returning a "404 Not found" reply to the client.
+
+</DL>
+
+<!---------------------------------------------------------------------------->
+<H3>HTTP errors</H3>
+
+Authors of path-handlers need to be able to handle errors in a reasonably
+simple fashion. The S.U. Web server provides a set of error conditions that
+correspond to the error replies in the HTTP protocol. These errors can be
+raised with the <CODE>http-error</CODE> procedure.
+When the server runs a path handler,
+it runs it in the context of an error handler that catches these errors,
+sends an error reply to the client, and closes the transaction.
+
+<DL>
+
+<DT>
+<CODE>(http-error <VAR>reply-code</VAR> <VAR>req</VAR> [<VAR>extra</VAR> ...])</CODE>
+<DD>
+    This raises an http error condition. The reply code is one of the
+    numeric HTTP error reply codes, which are bound to the variables
+    <CODE>http-reply/ok</CODE>, <CODE>http-reply/not-found</CODE>,
+    <CODE>http-reply/bad-request</CODE>, and so
+    forth. The <VAR>req</VAR> argument is the request record that caused
+    the error.
+    Any following <VAR>extra</VAR> args are passed along for
+    informational purposes.
+    Different HTTP errors take different types of extra arguments.
+    For example, the "301 moved permanently" and "302 moved temporarily"
+    replies use the first two <VAR>extra</VAR> values as the
+    <CODE>URI:</CODE> and <CODE>Location:</CODE>
+    fields in the reply header, respectively. See the clauses of the
+    <CODE>send-http-error-reply</CODE> procedure for details.
+
+
+<DT>
+<CODE>(send-http-error-reply <VAR>reply-code</VAR> <VAR>request</VAR>
+                             [<VAR>extra</VAR> ...])
+</CODE>
+<DD>
+    This procedure writes an error reply out to the current output
+    port. If an error occurs during this process, it is caught, and
+    the procedure silently returns. The http server's standard error
+    handler passes all http errors raised during path-handler execution
+    to this procedure to generate the error reply before aborting the
+    request transaction.
+</DL>
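+
+ <P>
+As an illustration of raising an error from a path handler, the sketch below
+wraps an existing handler so that only GET requests reach it. The name
+<CODE>get-only</CODE> is hypothetical, and the accessor
+<CODE>request:method</CODE> assumes that <CODE>define-record</CODE>
+generates field accessors of that form; check the source files before
+relying on either.
+<PRE>
+    ;; Returns a path handler that passes GET requests on to PH and
+    ;; raises a bad-request http error for every other method.
+    (define (get-only ph)
+      (lambda (path req)
+        (if (string=? (request:method req) "GET")
+            (ph path req)
+            ;; The trailing string is an extra, informational argument.
+            (http-error http-reply/bad-request req
+                        "only GET is supported here"))))
+</PRE>
+The error is not handled inside the wrapper itself; it propagates to the
+server's error handler, which sends the reply and closes the transaction.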
+
+<!---------------------------------------------------------------------------->
+<H3>Simple directory generation</H3>
+
+Most path-handlers that serve files to clients eventually call an internal
+procedure named <CODE>file-serve</CODE>,
+which implements a simple directory-generation service using the
+following rules:
+<UL>
+<LI> If the filename has the <EM>form</EM> of a directory
+     (<EM>i.e.</EM>, it ends with a slash),
+     then <CODE>file-serve</CODE> actually looks for a
+     file named "<CODE>index.html</CODE>" in that directory.
+
+<LI> If the filename names a directory, but is not in directory form
+      (<EM>i.e.</EM>, it doesn't end in a slash,
+      as in "<CODE>/usr/include</CODE>" or "<CODE>/usr/raj</CODE>"),
+      then <CODE>file-serve</CODE> sends back a "301 moved permanently"
+      message,
+      redirecting the client to a slash-terminated version of the original
+      URL. For example, the URL
+<PRE>
+    http://clark.lcs.mit.edu/~shivers
+</PRE>
+      would be redirected to
+<PRE>
+    http://clark.lcs.mit.edu/~shivers/
+</PRE>
+
+<LI> If the filename names a regular file, it is served to the client.
+</UL>
+
+
+<!---------------------------------------------------------------------------->
+<H3>Support procs</H3>
+
+The source files contain a host of support procedures which will be of utility
+to anyone writing a custom path-handler. Read the files first.
+
+
+<!---------------------------------------------------------------------------->
+<H3>Losing</H3>
+
+Be aware of two Unix problems, which may require workarounds:
+<OL>
+
+<LI>
+   NeXTSTEP's Posix implementation of the <CODE>getpwnam()</CODE> routine
+   will silently tell you that every user has uid 0. This means
+   that if your server, running as root, does a
+<PRE>
+    (set-uid (user->uid "nobody"))
+</PRE>
+   it will essentially do a
+<PRE>
+    (set-uid 0)
+</PRE>
+   and you will thus still be running as root.
+
+   <P>
+   The fix is to manually find out who user nobody is (he's -2 on my
+   system), and to hard-wire this into the server:
+<PRE>
+    (set-uid -2)
+</PRE>
+   This problem is NeXTSTEP-specific. If you are not using NeXTSTEP,
+   no problem.
+
+
+<LI>
+   On NeXTSTEP, the ip-address->host-name translation routine
+   (in C, <CODE>gethostbyaddr()</CODE>; in scsh,
+   <CODE>(host-info addr)</CODE>) does not
+   use the DNS system; it goes through NeXT's proprietary Netinfo
+   system, and may not return a fully-qualified domain name. For
+   example, on my system, I get "amelia-earhart", when I want
+   "amelia-earhart.lcs.mit.edu". The server uses this name
+   to construct redirection URLs to be sent back to the Web client,
+   so it needs to be an FQDN.
+
+   <P>
+   This problem may occur on other OS's;
+   I cannot determine if <CODE>gethostbyaddr()</CODE>
+   is required to return a FQDN or not. (I would appreciate hearing the
+   answer if you know; my local Internet gurus couldn't tell me.)
+
+   <P>
+   If your system doesn't give you a complete Internet address when
+   you say
+<PRE>
+    (host-info:name (host-info (system-name)))
+</PRE>
+   then you have this problem.
+
+   <P>
+   The server has a workaround. There is a procedure exported from
+   the httpd-core package:
+<PRE>
+    (set-my-fqdn name)
+</PRE>
+   Call this to crow-bar the server's idea of its own Internet host name
+   before running the server, and all will be well.
+</OL>
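+
+ <P>
+Putting the second workaround together with server start-up might look like
+the following sketch. The host name is the example used above, and
+<CODE>ph</CODE> stands for whatever top-level path handler you have defined.
+<PRE>
+    ;; Fix the server's idea of its own host name first, then start serving.
+    (set-my-fqdn "amelia-earhart.lcs.mit.edu")
+    (httpd ph)
+</PRE>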
+
+</BODY>
+</HTML>
diff --git a/doc/rfc2396.txt b/doc/rfc2396.txt
new file mode 100644
index 0000000..5bd5211
--- /dev/null
+++ b/doc/rfc2396.txt
@@ -0,0 +1,2243 @@
+
+
+
+
+
+
+Network Working Group                                     T. Berners-Lee
+Request for Comments: 2396                                       MIT/LCS
+Updates: 1808, 1738                                          R. Fielding
+Category: Standards Track                                    U.C. Irvine
+                                                             L. Masinter
+                                                       Xerox Corporation
+                                                             August 1998
+
+
+           Uniform Resource Identifiers (URI): Generic Syntax
+
+Status of this Memo
+
+   This document specifies an Internet standards track protocol for the
+   Internet community, and requests discussion and suggestions for
+   improvements.  Please refer to the current edition of the "Internet
+   Official Protocol Standards" (STD 1) for the standardization state
+   and status of this protocol.  Distribution of this memo is unlimited.
+
+Copyright Notice
+
+   Copyright (C) The Internet Society (1998).  All Rights Reserved.
+
+IESG Note
+
+   This paper describes a "superset" of operations that can be applied
+   to URI.  It consists of both a grammar and a description of basic
+   functionality for URI.  To understand what is a valid URI, both the
+   grammar and the associated description have to be studied.  Some of
+   the functionality described is not applicable to all URI schemes, and
+   some operations are only possible when certain media types are
+   retrieved using the URI, regardless of the scheme used.
+
+Abstract
+
+   A Uniform Resource Identifier (URI) is a compact string of characters
+   for identifying an abstract or physical resource.  This document
+   defines the generic syntax of URI, including both absolute and
+   relative forms, and guidelines for their use; it revises and replaces
+   the generic definitions in RFC 1738 and RFC 1808.
+
+   This document defines a grammar that is a superset of all valid URI,
+   such that an implementation can parse the common components of a URI
+   reference without knowing the scheme-specific requirements of every
+   possible identifier type.  This document does not define a generative
+   grammar for URI; that task will be performed by the individual
+   specifications of each URI scheme.
+
+
+
+
+Berners-Lee, et. al.        Standards Track                     [Page 1]
+
+RFC 2396                   URI Generic Syntax                August 1998
+
+
+1. Introduction
+
+   Uniform Resource Identifiers (URI) provide a simple and extensible
+   means for identifying a resource.  This specification of URI syntax
+   and semantics is derived from concepts introduced by the World Wide
+   Web global information initiative, whose use of such objects dates
+   from 1990 and is described in "Universal Resource Identifiers in WWW"
+   [RFC1630].  The specification of URI is designed to meet the
+   recommendations laid out in "Functional Recommendations for Internet
+   Resource Locators" [RFC1736] and "Functional Requirements for Uniform
+   Resource Names" [RFC1737].
+
+   This document updates and merges "Uniform Resource Locators"
+   [RFC1738] and "Relative Uniform Resource Locators" [RFC1808] in order
+   to define a single, generic syntax for all URI.  It excludes those
+   portions of RFC 1738 that defined the specific syntax of individual
+   URL schemes; those portions will be updated as separate documents, as
+   will the process for registration of new URI schemes.  This document
+   does not discuss the issues and recommendation for dealing with
+   characters outside of the US-ASCII character set [ASCII]; those
+   recommendations are discussed in a separate document.
+
+   All significant changes from the prior RFCs are noted in Appendix G.
+
+1.1 Overview of URI
+
+   URI are characterized by the following definitions:
+
+      Uniform
+         Uniformity provides several benefits: it allows different types
+         of resource identifiers to be used in the same context, even
+         when the mechanisms used to access those resources may differ;
+         it allows uniform semantic interpretation of common syntactic
+         conventions across different types of resource identifiers; it
+         allows introduction of new types of resource identifiers
+         without interfering with the way that existing identifiers are
+         used; and, it allows the identifiers to be reused in many
+         different contexts, thus permitting new applications or
+         protocols to leverage a pre-existing, large, and widely-used
+         set of resource identifiers.
+
+      Resource
+         A resource can be anything that has identity.  Familiar
+         examples include an electronic document, an image, a service
+         (e.g., "today's weather report for Los Angeles"), and a
+         collection of other resources.  Not all resources are network
+         "retrievable"; e.g., human beings, corporations, and bound
+         books in a library can also be considered resources.
+
+
+
+Berners-Lee, et. al.        Standards Track                     [Page 2]
+
+RFC 2396                   URI Generic Syntax                August 1998
+
+
+         The resource is the conceptual mapping to an entity or set of
+         entities, not necessarily the entity which corresponds to that
+         mapping at any particular instance in time.  Thus, a resource
+         can remain constant even when its content---the entities to
+         which it currently corresponds---changes over time, provided
+         that the conceptual mapping is not changed in the process.
+
+      Identifier
+         An identifier is an object that can act as a reference to
+         something that has identity.  In the case of URI, the object is
+         a sequence of characters with a restricted syntax.
+
+   Having identified a resource, a system may perform a variety of
+   operations on the resource, as might be characterized by such words
+   as `access', `update', `replace', or `find attributes'.
+
+1.2. URI, URL, and URN
+
+   A URI can be further classified as a locator, a name, or both.  The
+   term "Uniform Resource Locator" (URL) refers to the subset of URI
+   that identify resources via a representation of their primary access
+   mechanism (e.g., their network "location"), rather than identifying
+   the resource by name or by some other attribute(s) of that resource.
+   The term "Uniform Resource Name" (URN) refers to the subset of URI
+   that are required to remain globally unique and persistent even when
+   the resource ceases to exist or becomes unavailable.
+
+   The URI scheme (Section 3.1) defines the namespace of the URI, and
+   thus may further restrict the syntax and semantics of identifiers
+   using that scheme.  This specification defines those elements of the
+   URI syntax that are either required of all URI schemes or are common
+   to many URI schemes.  It thus defines the syntax and semantics that
+   are needed to implement a scheme-independent parsing mechanism for
+   URI references, such that the scheme-dependent handling of a URI can
+   be postponed until the scheme-dependent semantics are needed.  We use
+   the term URL below when describing syntax or semantics that only
+   apply to locators.
+
+   Although many URL schemes are named after protocols, this does not
+   imply that the only way to access the URL's resource is via the named
+   protocol.  Gateways, proxies, caches, and name resolution services
+   might be used to access some resources, independent of the protocol
+   of their origin, and the resolution of some URL may require the use
+   of more than one protocol (e.g., both DNS and HTTP are typically used
+   to access an "http" URL's resource when it can't be found in a local
+   cache).
+
+
+
+
+
+Berners-Lee, et. al.        Standards Track                     [Page 3]
+
+RFC 2396                   URI Generic Syntax                August 1998
+
+
+   A URN differs from a URL in that it's primary purpose is persistent
+   labeling of a resource with an identifier.  That identifier is drawn
+   from one of a set of defined namespaces, each of which has its own
+   set name structure and assignment procedures.  The "urn" scheme has
+   been reserved to establish the requirements for a standardized URN
+   namespace, as defined in "URN Syntax" [RFC2141] and its related
+   specifications.
+
+   Most of the examples in this specification demonstrate URL, since
+   they allow the most varied use of the syntax and often have a
+   hierarchical namespace.  A parser of the URI syntax is capable of
+   parsing both URL and URN references as a generic URI; once the scheme
+   is determined, the scheme-specific parsing can be performed on the
+   generic URI components.  In other words, the URI syntax is a superset
+   of the syntax of all URI schemes.
+
+1.3. Example URI
+
+   The following examples illustrate URI that are in common use.
+
+   ftp://ftp.is.co.za/rfc/rfc1808.txt
+      -- ftp scheme for File Transfer Protocol services
+
+   gopher://spinaltap.micro.umn.edu/00/Weather/California/Los%20Angeles
+      -- gopher scheme for Gopher and Gopher+ Protocol services
+
+   http://www.math.uio.no/faq/compression-faq/part1.html
+      -- http scheme for Hypertext Transfer Protocol services
+
+   mailto:mduerst@ifi.unizh.ch
+      -- mailto scheme for electronic mail addresses
+
+   news:comp.infosystems.www.servers.unix
+      -- news scheme for USENET news groups and articles
+
+   telnet://melvyl.ucop.edu/
+      -- telnet scheme for interactive services via the TELNET Protocol
+
+1.4. Hierarchical URI and Relative Forms
+
+   An absolute identifier refers to a resource independent of the
+   context in which the identifier is used.  In contrast, a relative
+   identifier refers to a resource by describing the difference within a
+   hierarchical namespace between the current context and an absolute
+   identifier of the resource.
+
+
+
+
+
+
+Berners-Lee, et. al.        Standards Track                     [Page 4]
+
+RFC 2396                   URI Generic Syntax                August 1998
+
+
+   Some URI schemes support a hierarchical naming system, where the
+   hierarchy of the name is denoted by a "/" delimiter separating the
+   components in the scheme. This document defines a scheme-independent
+   `relative' form of URI reference that can be used in conjunction with
+   a `base' URI (of a hierarchical scheme) to produce another URI. The
+   syntax of hierarchical URI is described in Section 3; the relative
+   URI calculation is described in Section 5.
+
+1.5. URI Transcribability
+
+   The URI syntax was designed with global transcribability as one of
+   its main concerns. A URI is a sequence of characters from a very
+   limited set, i.e., the letters of the basic Latin alphabet, digits,
+   and a few special characters.  A URI may be represented in a variety
+   of ways: e.g., ink on paper, pixels on a screen, or a sequence of
+   octets in a coded character set.  The interpretation of a URI depends
+   only on the characters used and not how those characters are
+   represented in a network protocol.
+
+   The goal of transcribability can be described by a simple scenario.
+   Imagine two colleagues, Sam and Kim, sitting in a pub at an
+   international conference and exchanging research ideas.  Sam asks Kim
+   for a location to get more information, so Kim writes the URI for the
+   research site on a napkin.  Upon returning home, Sam takes out the
+   napkin and types the URI into a computer, which then retrieves the
+   information to which Kim referred.
+
+   There are several design concerns revealed by the scenario:
+
+      o  A URI is a sequence of characters, which is not always
+         represented as a sequence of octets.
+
+      o  A URI may be transcribed from a non-network source, and thus
+         should consist of characters that are most likely to be able to
+         be typed into a computer, within the constraints imposed by
+         keyboards (and related input devices) across languages and
+         locales.
+
+      o  A URI often needs to be remembered by people, and it is easier
+         for people to remember a URI when it consists of meaningful
+         components.
+
+   These design concerns are not always in alignment.  For example, it
+   is often the case that the most meaningful name for a URI component
+   would require characters that cannot be typed into some systems.  The
+   ability to transcribe the resource identifier from one medium to
+   another was considered more important than having its URI consist of
+   the most meaningful of components.  In local and regional contexts
+   and with improving technology, users might benefit from being able to
+   use a wider range of characters; such use is not defined in this
+   document.
+
+1.6. Syntax Notation and Common Elements
+
+   This document uses two conventions to describe and define the syntax
+   for URI.  The first, called the layout form, is a general description
+   of the order of components and component separators, as in
+
+      <first>/<second>;<third>?<fourth>
+
+   The component names are enclosed in angle-brackets and any characters
+   outside angle-brackets are literal separators.  Whitespace should be
+   ignored.  These descriptions are used informally and do not define
+   the syntax requirements.
+
+   The second convention is a BNF-like grammar, used to define the
+   formal URI syntax.  The grammar is that of [RFC822], except that "|"
+   is used to designate alternatives.  Briefly, rules are separated from
+   definitions by an equals sign "=", indentation is used to continue a
+   rule definition over more than one line, literals are quoted with "",
+   parentheses "(" and ")" are used to group elements, optional elements
+   are enclosed in "[" and "]" brackets, and elements may be preceded
+   with <n>* to designate n or more repetitions of the following
+   element; n defaults to 0.
+
+   Unlike many specifications that use a BNF-like grammar to define the
+   bytes (octets) allowed by a protocol, the URI grammar is defined in
+   terms of characters.  Each literal in the grammar corresponds to the
+   character it represents, rather than to the octet encoding of that
+   character in any particular coded character set.  How a URI is
+   represented in terms of bits and bytes on the wire is dependent upon
+   the character encoding of the protocol used to transport it, or the
+   charset of the document which contains it.
+
+   The following definitions are common to many elements:
+
+      alpha    = lowalpha | upalpha
+
+      lowalpha = "a" | "b" | "c" | "d" | "e" | "f" | "g" | "h" | "i" |
+                 "j" | "k" | "l" | "m" | "n" | "o" | "p" | "q" | "r" |
+                 "s" | "t" | "u" | "v" | "w" | "x" | "y" | "z"
+
+      upalpha  = "A" | "B" | "C" | "D" | "E" | "F" | "G" | "H" | "I" |
+                 "J" | "K" | "L" | "M" | "N" | "O" | "P" | "Q" | "R" |
+                 "S" | "T" | "U" | "V" | "W" | "X" | "Y" | "Z"
+
+      digit    = "0" | "1" | "2" | "3" | "4" | "5" | "6" | "7" |
+                 "8" | "9"
+
+      alphanum = alpha | digit
+
+   The complete URI syntax is collected in Appendix A.
+
+2. URI Characters and Escape Sequences
+
+   URI consist of a restricted set of characters, primarily chosen to
+   aid transcribability and usability both in computer systems and in
+   non-computer communications. Characters used conventionally as
+   delimiters around URI were excluded.  The restricted set of
+   characters consists of digits, letters, and a few graphic symbols,
+   chosen from those common to most of the character encodings and
+   input facilities available to Internet users.
+
+      uric          = reserved | unreserved | escaped
+
+   Within a URI, characters are either used as delimiters, or to
+   represent strings of data (octets) within the delimited portions.
+   Octets are either represented directly by a character (using the US-
+   ASCII character for that octet [ASCII]) or by an escape encoding.
+   This representation is elaborated below.
+
+2.1. URI and non-ASCII characters
+
+   The relationship between URI and characters has been a source of
+   confusion for characters that are not part of US-ASCII. To describe
+   the relationship, it is useful to distinguish between a "character"
+   (as a distinguishable semantic entity) and an "octet" (an 8-bit
+   byte). There are two mappings, one from URI characters to octets, and
+   a second from octets to original characters:
+
+   URI character sequence->octet sequence->original character sequence
+
+   A URI is represented as a sequence of characters, not as a sequence
+   of octets, because URI might be "transported" by means other than a
+   computer network, e.g., printed on paper or read over the radio.
+
+   A URI scheme may define a mapping from URI characters to octets;
+   whether this is done depends on the scheme. Commonly, within a
+   delimited component of a URI, a sequence of characters may be used to
+   represent a sequence of octets. For example, the character "a"
+   represents the octet 97 (decimal), while the character sequence "%",
+   "0", "a" represents the octet 10 (decimal).
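+
+   A minimal Python sketch of the first mapping, from URI characters
+   to octets, is given here as an illustration only.  The function
+   name uri_chars_to_octets is hypothetical, and the sketch assumes
+   the input is a single component already separated from the URI.
+
```python
import string

HEX_DIGITS = set(string.hexdigits)


def uri_chars_to_octets(component: str) -> bytes:
    """Map URI characters to octets (hypothetical helper).

    "%" followed by two hex digits yields one octet; any other
    character yields the octet of its US-ASCII code.
    """
    out = bytearray()
    i = 0
    while i < len(component):
        ch = component[i]
        if ch == "%":
            pair = component[i + 1:i + 3]
            if len(pair) != 2 or not set(pair) <= HEX_DIGITS:
                raise ValueError("malformed escape at index %d" % i)
            out.append(int(pair, 16))
            i += 3
        elif ord(ch) < 128:
            out.append(ord(ch))
            i += 1
        else:
            raise ValueError("non-ASCII character at index %d" % i)
    return bytes(out)
```
+
+   With this sketch, "a" yields the octet 97 and "%0a" yields the
+   octet 10, matching the example above.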
+
+   There is a second translation for some resources: the sequence of
+   octets defined by a component of the URI is subsequently used to
+   represent a sequence of characters. A 'charset' defines this mapping.
+   There are many charsets in use in Internet protocols. For example,
+   UTF-8 [UTF-8] defines a mapping from sequences of octets to sequences
+   of characters in the repertoire of ISO 10646.
+
+   In the simplest case, the original character sequence contains only
+   characters that are defined in US-ASCII, and the two levels of
+   mapping are simple and easily invertible: each 'original character'
+   is represented as the octet for the US-ASCII code for it, which is,
+   in turn, represented as either the US-ASCII character, or else the
+   "%" escape sequence for that octet.
+
+   For original character sequences that contain non-ASCII characters,
+   however, the situation is more difficult. Internet protocols that
+   transmit octet sequences intended to represent character sequences
+   are expected to provide some way of identifying the charset used, if
+   there might be more than one [RFC2277].  However, there is currently
+   no provision within the generic URI syntax to accomplish this
+   identification. An individual URI scheme may require a single
+   charset, define a default charset, or provide a way to indicate the
+   charset used.
+
+   It is expected that a systematic treatment of character encoding
+   within URI will be developed as a future modification of this
+   specification.
+
+2.2. Reserved Characters
+
+   Many URI include components consisting of, or delimited by, certain
+   special characters.  These characters are called "reserved", since
+   their usage within the URI component is limited to their reserved
+   purpose.  If the data for a URI component would conflict with the
+   reserved purpose, then the conflicting data must be escaped before
+   forming the URI.
+
+      reserved    = ";" | "/" | "?" | ":" | "@" | "&" | "=" | "+" |
+                    "$" | ","
+
+   The "reserved" syntax class above refers to those characters that are
+   allowed within a URI, but which may not be allowed within a
+   particular component of the generic URI syntax; they are used as
+   delimiters of the components described in Section 3.
+
+   Characters in the "reserved" set are not reserved in all contexts.
+   The set of characters actually reserved within any given URI
+   component is defined by that component. In general, a character is
+   reserved if the semantics of the URI changes if the character is
+   replaced with its escaped US-ASCII encoding.
+
+2.3. Unreserved Characters
+
+   Data characters that are allowed in a URI but do not have a reserved
+   purpose are called unreserved.  These include upper and lower case
+   letters, decimal digits, and a limited set of punctuation marks and
+   symbols.
+
+      unreserved  = alphanum | mark
+
+      mark        = "-" | "_" | "." | "!" | "~" | "*" | "'" | "(" | ")"
+
+   Unreserved characters can be escaped without changing the semantics
+   of the URI, but this should not be done unless the URI is being used
+   in a context that does not allow the unescaped character to appear.
+
+2.4. Escape Sequences
+
+   Data must be escaped if it does not have a representation using an
+   unreserved character; this includes data that does not correspond to
+   a printable character of the US-ASCII coded character set, or that
+   corresponds to any US-ASCII character that is disallowed, as
+   explained below.
+
+2.4.1. Escaped Encoding
+
+   An escaped octet is encoded as a character triplet, consisting of the
+   percent character "%" followed by the two hexadecimal digits
+   representing the octet code. For example, "%20" is the escaped
+   encoding for the US-ASCII space character.
+
+      escaped     = "%" hex hex
+      hex         = digit | "A" | "B" | "C" | "D" | "E" | "F" |
+                            "a" | "b" | "c" | "d" | "e" | "f"
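+
+   The triplet encoding can be sketched in Python as follows.  This
+   is an illustration under stated assumptions, not a definitive
+   implementation: the name escape_octets is hypothetical, and the
+   sketch conservatively escapes every octet that is not an
+   unreserved character, whereas a real component would leave its
+   own delimiters unescaped.
+
```python
import string

# unreserved = alphanum | mark (Section 2.3)
UNRESERVED = set(string.ascii_letters + string.digits + "-_.!~*'()")


def escape_octets(data: bytes) -> str:
    """Encode octets as URI characters (hypothetical helper).

    Unreserved characters are kept; every other octet becomes
    "%" followed by two hexadecimal digits.
    """
    parts = []
    for octet in data:
        ch = chr(octet)
        parts.append(ch if ch in UNRESERVED else "%%%02X" % octet)
    return "".join(parts)
```
+
+   For instance, the single space octet is encoded as "%20".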
+
+2.4.2. When to Escape and Unescape
+
+   A URI is always in an "escaped" form, since escaping or unescaping a
+   completed URI might change its semantics.  Normally, the only time
+   escape encodings can safely be made is when the URI is being created
+   from its component parts; each component may have its own set of
+   characters that are reserved, so only the mechanism responsible for
+   generating or interpreting that component can determine whether or
+   not escaping a character will change its semantics. Likewise, a URI
+   must be separated into its components before the escaped characters
+   within those components can be safely decoded.
+
+   In some cases, data that could be represented by an unreserved
+   character may appear escaped; for example, some of the unreserved
+   "mark" characters are automatically escaped by some systems.  If the
+   given URI scheme defines a canonicalization algorithm, then
+   unreserved characters may be unescaped according to that algorithm.
+   For example, "%7e" is sometimes used instead of "~" in an http URL
+   path, but the two are equivalent for an http URL.
+
+   Because the percent "%" character always has the reserved purpose of
+   being the escape indicator, it must be escaped as "%25" in order to
+   be used as data within a URI.  Implementers should be careful not to
+   escape or unescape the same string more than once, since unescaping
+   an already unescaped string might lead to misinterpreting a percent
+   data character as another escaped character, or vice versa in the
+   case of escaping an already escaped string.
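+
+   The hazard of unescaping twice can be seen in a short Python
+   sketch.  It uses urllib.parse.unquote from the Python standard
+   library as a stand-in for any "%" decoder; the sample string is
+   hypothetical.
+
```python
from urllib.parse import unquote

# The data "%41" (a literal percent sign followed by "41") must be
# escaped as "%2541" when it is placed in a URI.
escaped = "%2541"

once = unquote(escaped)   # correct: recovers the data "%41"
twice = unquote(once)     # wrong: the data is misread as an escape

print(once)
print(twice)
```
+
+   The first call recovers the original data; the second turns the
+   percent data character into the start of another escape and
+   yields "A".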
+
+2.4.3. Excluded US-ASCII Characters
+
+   Although they are disallowed within the URI syntax, we include here a
+   description of those US-ASCII characters that have been excluded and
+   the reasons for their exclusion.
+
+   The control characters in the US-ASCII coded character set are not
+   used within a URI, both because they are non-printable and because
+   they are likely to be misinterpreted by some control mechanisms.
+
+   control     = <US-ASCII coded characters 00-1F and 7F hexadecimal>
+
+   The space character is excluded because significant spaces may
+   disappear and insignificant spaces may be introduced when URI are
+   transcribed or typeset or subjected to the treatment of word-
+   processing programs.  Whitespace is also used to delimit URI in many
+   contexts.
+
+   space       = <US-ASCII coded character 20 hexadecimal>
+
+   The angle-bracket "<" and ">" and double-quote (") characters are
+   excluded because they are often used as the delimiters around URI in
+   text documents and protocol fields.  The character "#" is excluded
+   because it is used to delimit a URI from a fragment identifier in URI
+   references (Section 4). The percent character "%" is excluded because
+   it is used for the encoding of escaped characters.
+
+   delims      = "<" | ">" | "#" | "%" | <">
+
+   Other characters are excluded because gateways and other transport
+   agents are known to sometimes modify such characters, or they are
+   used as delimiters.
+
+   unwise      = "{" | "}" | "|" | "\" | "^" | "[" | "]" | "`"
+
+   Data corresponding to excluded characters must be escaped in order to
+   be properly represented within a URI.
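+
+   The four excluded classes above can be collected into one set, as
+   in this Python sketch.  The names are hypothetical and the helper
+   only reports which characters of a data string would need to be
+   escaped; it performs no escaping itself.
+
```python
# Excluded US-ASCII characters, following the classes defined above.
CONTROL = {chr(c) for c in range(0x00, 0x20)} | {chr(0x7F)}
SPACE = {" "}
DELIMS = set('<>#%"')
UNWISE = set("{}|\\^[]`")
EXCLUDED = CONTROL | SPACE | DELIMS | UNWISE


def excluded_chars(data: str) -> list:
    """Return the excluded characters found in data, in order."""
    return [ch for ch in data if ch in EXCLUDED]
```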
+
+3. URI Syntactic Components
+
+   The URI syntax is dependent upon the scheme.  In general, absolute
+   URI are written as follows:
+
+      <scheme>:<scheme-specific-part>
+
+   An absolute URI contains the name of the scheme being used (<scheme>)
+   followed by a colon (":") and then a string (the <scheme-specific-
+   part>) whose interpretation depends on the scheme.
+
+   The URI syntax does not require that the scheme-specific-part have
+   any general structure or set of semantics which is common among all
+   URI.  However, a subset of URI do share a common syntax for
+   representing hierarchical relationships within the namespace.  This
+   "generic URI" syntax consists of a sequence of four main components:
+
+      <scheme>://<authority><path>?<query>
+
+   each of which, except <scheme>, may be absent from a particular URI.
+   For example, some URI schemes do not allow an <authority> component,
+   and others do not use a <query> component.
+
+      absoluteURI   = scheme ":" ( hier_part | opaque_part )
+
+   URI that are hierarchical in nature use the slash "/" character for
+   separating hierarchical components.  For some file systems, a "/"
+   character (used to denote the hierarchical structure of a URI) is the
+   delimiter used to construct a file name hierarchy, and thus the URI
+   path will look similar to a file pathname.  This does NOT imply that
+   the resource is a file or that the URI maps to an actual filesystem
+   pathname.
+
+      hier_part     = ( net_path | abs_path ) [ "?" query ]
+
+      net_path      = "//" authority [ abs_path ]
+
+      abs_path      = "/"  path_segments
+
+   URI that do not make use of the slash "/" character for separating
+   hierarchical components are considered opaque by the generic URI
+   parser.
+
+      opaque_part   = uric_no_slash *uric
+
+      uric_no_slash = unreserved | escaped | ";" | "?" | ":" | "@" |
+                      "&" | "=" | "+" | "$" | ","
+
+   We use the term <path> to refer to both the <abs_path> and
+   <opaque_part> constructs, since they are mutually exclusive for any
+   given URI and can be parsed as a single component.
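+
+   As an illustration, the choice between <net_path>, <abs_path>, and
+   <opaque_part> can be made from the first characters after the
+   colon.  The Python sketch below uses a hypothetical function name
+   and does not validate the characters of either part.
+
```python
def classify_absolute(uri: str):
    """Return (scheme, kind) for an absolute URI (hypothetical helper).

    kind is "net_path", "abs_path", or "opaque_part", following the
    absoluteURI, hier_part, and opaque_part rules above.
    """
    scheme, colon, rest = uri.partition(":")
    if not colon or not scheme or not rest:
        raise ValueError("not an absolute URI: %r" % uri)
    if rest.startswith("//"):
        return scheme, "net_path"
    if rest.startswith("/"):
        return scheme, "abs_path"
    # Anything else starts with a uric_no_slash character.
    return scheme, "opaque_part"
```
+
+   Under this sketch, "ftp://ftp.is.co.za/rfc/rfc1808.txt" is a
+   net_path URI, while "mailto:mduerst@ifi.unizh.ch" is opaque.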
+
+3.1. Scheme Component
+
+   Just as there are many different methods of access to resources,
+   there are a variety of schemes for identifying such resources.  The
+   URI syntax consists of a sequence of components separated by reserved
+   characters, with the first component defining the semantics for the
+   remainder of the URI string.
+
+   Scheme names consist of a sequence of characters beginning with a
+   lower case letter and followed by any combination of lower case
+   letters, digits, plus ("+"), period ("."), or hyphen ("-").  For
+   resiliency, programs interpreting URI should treat upper case letters
+   as equivalent to lower case in scheme names (e.g., allow "HTTP" as
+   well as "http").
+
+      scheme        = alpha *( alpha | digit | "+" | "-" | "." )
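+
+   The scheme rule translates directly into a regular expression.
+   The Python sketch below is illustrative; the names SCHEME_RE and
+   normalize_scheme are hypothetical.  It accepts either case, as
+   recommended above, and returns the lower case form.
+
```python
import re

# scheme = alpha *( alpha | digit | "+" | "-" | "." )
SCHEME_RE = re.compile(r"[A-Za-z][A-Za-z0-9+.-]*")


def normalize_scheme(name: str) -> str:
    """Validate a scheme name and return it in lower case."""
    if not SCHEME_RE.fullmatch(name):
        raise ValueError("invalid scheme name: %r" % name)
    return name.lower()
```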
+
+   Relative URI references are distinguished from absolute URI in that
+   they do not begin with a scheme name.  Instead, the scheme is
+   inherited from the base URI, as described in Section 5.2.
+
+3.2. Authority Component
+
+   Many URI schemes include a top hierarchical element for a naming
+   authority, such that the namespace defined by the remainder of the
+   URI is governed by that authority.  This authority component is
+   typically defined by an Internet-based server or a scheme-specific
+   registry of naming authorities.
+
+      authority     = server | reg_name
+
+   The authority component is preceded by a double slash "//" and is
+   terminated by the next slash "/", question-mark "?", or by the end of
+   the URI.  Within the authority component, the characters ";", ":",
+   "@", "?", and "/" are reserved.
+
+   An authority component is not required for a URI scheme to make use
+   of relative references.  A base URI without an authority component
+   implies that any relative reference will also be without an authority
+   component.
+
+3.2.1. Registry-based Naming Authority
+
+   The structure of a registry-based naming authority is specific to the
+   URI scheme, but constrained to the allowed characters for an
+   authority component.
+
+      reg_name      = 1*( unreserved | escaped | "$" | "," |
+                          ";" | ":" | "@" | "&" | "=" | "+" )
+
+3.2.2. Server-based Naming Authority
+
+   URL schemes that involve the direct use of an IP-based protocol to a
+   specified server on the Internet use a common syntax for the server
+   component of the URI's scheme-specific data:
+
+      <userinfo>@<host>:<port>
+
+   where <userinfo> may consist of a user name and, optionally, scheme-
+   specific information about how to gain authorization to access the
+   server.  The parts "<userinfo>@" and ":<port>" may be omitted.
+
+      server        = [ [ userinfo "@" ] hostport ]
+
+   The user information, if present, is followed by a commercial at-sign
+   "@".
+
+      userinfo      = *( unreserved | escaped |
+                         ";" | ":" | "&" | "=" | "+" | "$" | "," )
+
+   Some URL schemes use the format "user:password" in the userinfo
+   field. This practice is NOT RECOMMENDED, because the passing of
+   authentication information in clear text (such as URI) has proven to
+   be a security risk in almost every case where it has been used.
+
+   The host is a domain name of a network host, or its IPv4 address as a
+   set of four decimal digit groups separated by ".".  Literal IPv6
+   addresses are not supported.
+
+      hostport      = host [ ":" port ]
+      host          = hostname | IPv4address
+      hostname      = *( domainlabel "." ) toplabel [ "." ]
+      domainlabel   = alphanum | alphanum *( alphanum | "-" ) alphanum
+      toplabel      = alpha | alpha *( alphanum | "-" ) alphanum
+      IPv4address   = 1*digit "." 1*digit "." 1*digit "." 1*digit
+      port          = *digit
+
+   Hostnames take the form described in Section 3 of [RFC1034] and
+   Section 2.1 of [RFC1123]: a sequence of domain labels separated by
+   ".", each domain label starting and ending with an alphanumeric
+   character and possibly also containing "-" characters.  The rightmost
+   domain label of a fully qualified domain name will never start with a
+   digit, thus syntactically distinguishing domain names from IPv4
+   addresses, and may be followed by a single "." if it is necessary to
+   distinguish between the complete domain name and any local domain.
+   To actually be "Uniform" as a resource locator, a URL hostname should
+   be a fully qualified domain name.  In practice, however, the host
+   component may be a local domain literal.
+
+      Note: A suitable representation for including a literal IPv6
+      address as the host part of a URL is desired, but has not yet been
+      determined or implemented in practice.
+
+   The port is the network port number for the server.  Most schemes
+   designate protocols that have a default port number.  Another port
+   number may optionally be supplied, in decimal, separated from the
+   host by a colon.  If the port is omitted, the default port number is
+   assumed.
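+
+   A server-based authority can be split into its parts as sketched
+   below in Python.  The function name and the default_port parameter
+   are hypothetical; the caller is assumed to know the default port
+   of the scheme.  The sketch does not validate <userinfo> or <host>
+   against the grammar.
+
```python
def parse_server(authority: str, default_port=None):
    """Split "<userinfo>@<host>:<port>" (hypothetical helper).

    Returns (userinfo, host, port).  userinfo is None when the
    "<userinfo>@" part is omitted; port falls back to default_port
    when ":<port>" is omitted or empty.
    """
    userinfo, at, hostport = authority.rpartition("@")
    if not at:
        userinfo = None
    host, _, port = hostport.partition(":")
    if any(ch not in "0123456789" for ch in port):
        raise ValueError("port must be decimal digits: %r" % port)
    return userinfo, host, int(port) if port else default_port
```
+
+   For example, parsing "melvyl.ucop.edu" with a default port of 23
+   gives no user information, the host name, and port 23.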
+
+3.3. Path Component
+
+   The path component contains data, specific to the authority (or the
+   scheme if there is no authority component), identifying the resource
+   within the scope of that scheme and authority.
+
+      path          = [ abs_path | opaque_part ]
+
+      path_segments = segment *( "/" segment )
+      segment       = *pchar *( ";" param )
+      param         = *pchar
+
+      pchar         = unreserved | escaped |
+                      ":" | "@" | "&" | "=" | "+" | "$" | ","
+
+   The path may consist of a sequence of path segments separated by a
+   single slash "/" character.  Within a path segment, the characters
+   "/", ";", "=", and "?" are reserved.  Each path segment may include a
+   sequence of parameters, indicated by the semicolon ";" character.
+   The parameters are not significant to the parsing of relative
+   references.
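+
+   The segment and parameter structure can be sketched in Python as
+   follows.  The function name split_path is hypothetical, and the
+   second sample path in the note below is made up for illustration.
+
```python
def split_path(abs_path: str):
    """Split an abs_path into (segment, params) pairs.

    Segments are separated by "/"; within a segment, parameters
    follow the segment name and are separated by ";".
    """
    if not abs_path.startswith("/"):
        raise ValueError("abs_path must begin with '/'")
    result = []
    for segment in abs_path[1:].split("/"):
        name, *params = segment.split(";")
        result.append((name, params))
    return result
```
+
+   Here "/rfc/rfc1808.txt" yields two segments without parameters,
+   and "/a;x=1;y/b" yields segment "a" with parameters "x=1" and "y"
+   followed by segment "b".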
+
+3.4. Query Component
+
+   The query component is a string of information to be interpreted by
+   the resource.
+
+      query         = *uric
+
+   Within a query component, the characters ";", "/", "?", ":", "@",
+   "&", "=", "+", ",", and "$" are reserved.
+
+4. URI References
+
+   The term "URI-reference" is used here to denote the common usage of a
+   resource identifier.  A URI reference may be absolute or relative,
+   and may have additional information attached in the form of a
+   fragment identifier.  However, "the URI" that results from such a
+   reference includes only the absolute URI after the fragment
+   identifier (if any) is removed and after any relative URI is resolved
+   to its absolute form.  Although it is possible to limit the
+   discussion of URI syntax and semantics to that of the absolute
+   result, most usage of URI is within general URI references, and it is
+   impossible to obtain the URI from such a reference without also
+   parsing the fragment and resolving the relative form.
+
+      URI-reference = [ absoluteURI | relativeURI ] [ "#" fragment ]
+
+   The syntax for relative URI is a shortened form of that for absolute
+   URI, where some prefix of the URI is missing and certain path
+   components ("." and "..") have a special meaning when, and only when,
+   interpreting a relative path.  The relative URI syntax is defined in
+   Section 5.
+
+4.1. Fragment Identifier
+
+   When a URI reference is used to perform a retrieval action on the
+   identified resource, the optional fragment identifier, separated from
+   the URI by a crosshatch ("#") character, consists of additional
+   reference information to be interpreted by the user agent after the
+   retrieval action has been successfully completed.  As such, it is not
+   part of a URI, but is often used in conjunction with a URI.
+
+      fragment      = *uric
+
+   The semantics of a fragment identifier is a property of the data
+   resulting from a retrieval action, regardless of the type of URI used
+   in the reference.  Therefore, the format and interpretation of
+   fragment identifiers are dependent on the media type [RFC2046] of the
+   retrieval result.  The character restrictions described in Section 2
+   for URI also apply to the fragment in a URI-reference.  Individual
+   media types may define additional restrictions or structure within
+   the fragment for specifying different types of "partial views" that
+   can be identified within that media type.
+
+   A fragment identifier is only meaningful when a URI reference is
+   intended for retrieval and the result of that retrieval is a document
+   for which the identified fragment is consistently defined.
+
+4.2. Same-document References
+
+   A URI reference that does not contain a URI is a reference to the
+   current document.  In other words, an empty URI reference within a
+   document is interpreted as a reference to the start of that document,
+   and a reference containing only a fragment identifier is a reference
+   to the identified fragment of that document.  Traversal of such a
+   reference should not result in an additional retrieval action.
+   However, if the URI reference occurs in a context that is always
+   intended to result in a new request, as in the case of HTML's FORM
+   element, then an empty URI reference represents the base URI of the
+   current document and should be replaced by that URI when transformed
+   into a request.
+
+4.3. Parsing a URI Reference
+
+   A URI reference is typically parsed according to the four main
+   components and fragment identifier in order to determine what
+   components are present and whether the reference is relative or
+   absolute.  The individual components are then parsed for their
+   subparts and, if not opaque, to verify their validity.
+
+   Although the BNF defines what is allowed in each component, it is
+   ambiguous in terms of differentiating between an authority component
+   and a path component that begins with two slash characters.  The
+   greedy algorithm is used for disambiguation: the left-most matching
+   rule soaks up as much of the URI reference string as it is capable of
+   matching.  In other words, the authority component wins.
+
+   Readers familiar with regular expressions should see Appendix B for a
+   concrete parsing example and test oracle.
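+
+   As a sketch of that approach, the regular expression given in
+   Appendix B can be applied in Python as shown here.  The function
+   name and the dictionary keys are hypothetical; only the expression
+   itself is taken from this specification.
+
```python
import re

# Regular expression from Appendix B.  Groups 2, 4, 5, 7, and 9 hold
# the scheme, authority, path, query, and fragment respectively.
URI_REF = re.compile(
    r"^(([^:/?#]+):)?(//([^/?#]*))?([^?#]*)(\?([^#]*))?(#(.*))?"
)


def parse_reference(ref: str) -> dict:
    """Split a URI reference into its components.

    A component that is absent from the reference is None.
    """
    m = URI_REF.match(ref)  # every part is optional, so this matches
    return {
        "scheme": m.group(2),
        "authority": m.group(4),
        "path": m.group(5),
        "query": m.group(7),
        "fragment": m.group(9),
    }
```
+
+   Because matching is greedy, a reference such as "//a//b" is split
+   into the authority "a" and the path "//b".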
+
+5. Relative URI References
+
+   It is often the case that a group or "tree" of documents has been
+   constructed to serve a common purpose; the vast majority of URI in
+   these documents point to resources within the tree rather than
+   outside of it.  Similarly, documents located at a particular site are
+   much more likely to refer to other resources at that site than to
+   resources at remote sites.
+
+   Relative addressing of URI allows document trees to be partially
+   independent of their location and access scheme.  For instance, it is
+   possible for a single set of hypertext documents to be simultaneously
+   accessible and traversable via each of the "file", "http", and "ftp"
+   schemes if the documents refer to each other using relative URI.
+   Furthermore, such document trees can be moved, as a whole, without
+   changing any of the relative references.  Experience within the WWW
+   has demonstrated that the ability to perform relative referencing is
+   necessary for the long-term usability of embedded URI.
+
+   The syntax for relative URI takes advantage of the <hier_part> syntax
+   of <absoluteURI> (Section 3) in order to express a reference that is
+   relative to the namespace of another hierarchical URI.
+
+      relativeURI   = ( net_path | abs_path | rel_path ) [ "?" query ]
+
+   A relative reference beginning with two slash characters is termed a
+   network-path reference, as defined by <net_path> in Section 3.  Such
+   references are rarely used.
+
+   A relative reference beginning with a single slash character is
+   termed an absolute-path reference, as defined by <abs_path> in
+   Section 3.
+
+   A relative reference that does not begin with a scheme name or a
+   slash character is termed a relative-path reference.
+
+      rel_path      = rel_segment [ abs_path ]
+
+      rel_segment   = 1*( unreserved | escaped |
+                          ";" | "@" | "&" | "=" | "+" | "$" | "," )
+
+   Within a relative-path reference, the complete path segments "." and
+   ".." have special meanings: "the current hierarchy level" and "the
+   level above this hierarchy level", respectively.  Although this is
+   very similar to their use within Unix-based filesystems to indicate
+   directory levels, these path components are only considered special
+   when resolving a relative-path reference to its absolute form
+   (Section 5.2).
+
+   Authors should be aware that a path segment which contains a colon
+   character cannot be used as the first segment of a relative URI path
+   (e.g., "this:that"), because it would be mistaken for a scheme name.
+   It is therefore necessary to precede such segments with other
+   segments (e.g., "./this:that") in order for them to be referenced as
+   a relative path.
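+
+   The distinctions above, including the colon pitfall, can be
+   sketched in Python.  The names are hypothetical, and the sketch
+   assumes a non-empty reference from which any query and fragment
+   have already been removed.
+
```python
import re

# A leading scheme name followed by ":" marks an absolute URI.
SCHEME_PREFIX = re.compile(r"[A-Za-z][A-Za-z0-9+.-]*:")


def classify_reference(ref: str) -> str:
    """Classify a reference by its leading characters."""
    if SCHEME_PREFIX.match(ref):
        return "absolute"
    if ref.startswith("//"):
        return "network-path"
    if ref.startswith("/"):
        return "absolute-path"
    return "relative-path"
```
+
+   With this sketch, "this:that" is classified as absolute because
+   "this" is taken for a scheme name, while "./this:that" is a
+   relative-path reference.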
+
+   It is not necessary for all URI within a given scheme to be
+   restricted to the <hier_part> syntax, since the hierarchical
+   properties of that syntax are only necessary when relative URI are
+   used within a particular document.  Documents can only make use of
+   relative URI when their base URI fits within the <hier_part> syntax.
+   It is assumed that any document which contains a relative reference
+   will also have a base URI that obeys the syntax.  In other words,
+   relative URI cannot be used within a document that has an unsuitable
+   base URI.
+
+   Some URI schemes do not allow a hierarchical syntax matching the
+   <hier_part> syntax, and thus cannot use relative references.
+
+5.1. Establishing a Base URI
+
+   The term "relative URI" implies that there exists some absolute "base
+   URI" against which the relative reference is applied.  Indeed, the
+   base URI is necessary to define the semantics of any relative URI
+   reference; without it, a relative reference is meaningless.  In order
+   for relative URI to be usable within a document, the base URI of that
+   document must be known to the parser.
+
+   The base URI of a document can be established in one of four ways,
+   listed below in order of precedence.  The order of precedence can be
+   thought of in terms of layers, where the innermost defined base URI
+   has the highest precedence.  This can be visualized graphically as:
+
+      .----------------------------------------------------------.
+      |  .----------------------------------------------------.  |
+      |  |  .----------------------------------------------.  |  |
+      |  |  |  .----------------------------------------.  |  |  |
+      |  |  |  |  .----------------------------------.  |  |  |  |
+      |  |  |  |  |       <relative_reference>       |  |  |  |  |
+      |  |  |  |  `----------------------------------'  |  |  |  |
+      |  |  |  | (5.1.1) Base URI embedded in the       |  |  |  |
+      |  |  |  |         document's content             |  |  |  |
+      |  |  |  `----------------------------------------'  |  |  |
+      |  |  | (5.1.2) Base URI of the encapsulating entity |  |  |
+      |  |  |         (message, document, or none).        |  |  |
+      |  |  `----------------------------------------------'  |  |
+      |  | (5.1.3) URI used to retrieve the entity            |  |
+      |  `----------------------------------------------------'  |
+      | (5.1.4) Default Base URI is application-dependent        |
+      `----------------------------------------------------------'
+
+5.1.1. Base URI within Document Content
+
+   Within certain document media types, the base URI of the document can
+   be embedded within the content itself such that it can be readily
+   obtained by a parser.  This can be useful for descriptive documents,
+   such as tables of contents, which may be transmitted to others through
+   protocols other than their usual retrieval context (e.g., E-Mail or
+   USENET news).
+
+   It is beyond the scope of this document to specify how, for each
+   media type, the base URI can be embedded.  It is assumed that user
+   agents manipulating such media types will be able to obtain the
+   appropriate syntax from that media type's specification.  An example
+   of how the base URI can be embedded in the Hypertext Markup Language
+   (HTML) [RFC1866] is provided in Appendix D.
+
+   A mechanism for embedding the base URI within MIME container types
+   (e.g., the message and multipart types) is defined by MHTML
+   [RFC2110].  Protocols that do not use the MIME message header syntax,
+   but which do allow some form of tagged metainformation to be included
+   within messages, may define their own syntax for defining the base
+   URI as part of a message.
+
+5.1.2. Base URI from the Encapsulating Entity
+
+   If no base URI is embedded, the base URI of a document is defined by
+   the document's retrieval context.  For a document that is enclosed
+   within another entity (such as a message or another document), the
+   retrieval context is that entity; thus, the default base URI of the
+   document is the base URI of the entity in which the document is
+   encapsulated.
+
+5.1.3. Base URI from the Retrieval URI
+
+   If no base URI is embedded and the document is not encapsulated
+   within some other entity (e.g., the top level of a composite entity),
+   then, if a URI was used to retrieve the base document, that URI shall
+   be considered the base URI.  Note that if the retrieval was the
+   result of a redirected request, the last URI used (i.e., that which
+   resulted in the actual retrieval of the document) is the base URI.
+
+5.1.4. Default Base URI
+
+   If none of the conditions described in Sections 5.1.1--5.1.3 apply,
+   then the base URI is defined by the context of the application.
+   Since this definition is necessarily application-dependent, failing
+   to define the base URI using one of the other methods may result in
+   the same content being interpreted differently by different types of
+   application.
+
+   It is the responsibility of the distributor(s) of a document
+   containing relative URI to ensure that the base URI for that document
+   can be established.  It must be emphasized that relative URI cannot
+   be used reliably in situations where the document's base URI is not
+   well-defined.
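+
+   The order of precedence in Sections 5.1.1 through 5.1.4 can be
+   sketched as a simple fallback chain.  This Python fragment is an
+   illustration under stated assumptions: the function and parameter
+   names are hypothetical, and None stands for "no base URI available
+   from this source".

```python
from typing import Optional


def establish_base_uri(
    embedded: Optional[str] = None,       # 5.1.1: from document content
    encapsulating: Optional[str] = None,  # 5.1.2: from enclosing entity
    retrieval: Optional[str] = None,      # 5.1.3: URI used for retrieval
    default: Optional[str] = None,        # 5.1.4: application default
) -> Optional[str]:
    """Return the innermost base URI that is defined, or None."""
    for candidate in (embedded, encapsulating, retrieval, default):
        if candidate is not None:
            return candidate
    return None


print(establish_base_uri(retrieval="http://a/b/c/d;p?q"))
# http://a/b/c/d;p?q
```
+
+   A result of None corresponds to the situation described above in
+   which relative URI cannot be used reliably.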
+
+5.2. Resolving Relative References to Absolute Form
+
+   This section describes an example algorithm for resolving URI
+   references that might be relative to a given base URI.
+
+   The base URI is established according to the rules of Section 5.1 and
+   parsed into the four main components as described in Section 3.  Note
+   that only the scheme component is required to be present in the base
+   URI; the other components may be empty or undefined.  A component is
+   undefined if its preceding separator does not appear in the URI
+   reference; the path component is never undefined, though it may be
+   empty.  The base URI's query component is not used by the resolution
+   algorithm and may be discarded.
+
+   For each URI reference, the following steps are performed in order:
+
+   1) The URI reference is parsed into the potential four components and
+      fragment identifier, as described in Section 4.3.
+
+   2) If the path component is empty and the scheme, authority, and
+      query components are undefined, then it is a reference to the
+      current document and we are done.  Otherwise, the reference URI's
+      query and fragment components are defined as found (or not found)
+      within the URI reference and not inherited from the base URI.
+
+   3) If the scheme component is defined, indicating that the reference
+      starts with a scheme name, then the reference is interpreted as an
+      absolute URI and we are done.  Otherwise, the reference URI's
+      scheme is inherited from the base URI's scheme component.
+
+      Due to a loophole in prior specifications [RFC1630], some parsers
+      allow the scheme name to be present in a relative URI if it is the
+      same as the base URI scheme.  Unfortunately, this can conflict
+      with the correct parsing of non-hierarchical URI.  For backwards
+      compatibility, an implementation may work around such references
+      by removing the scheme if it matches that of the base URI and the
+      scheme is known to always use the <hier_part> syntax.  The parser
+      can then continue with the steps below for the remainder of the
+      reference components.  Validating parsers should mark such a
+      malformed relative reference as an error.
+
+   4) If the authority component is defined, then the reference is a
+      network-path and we skip to step 7.  Otherwise, the reference
+      URI's authority is inherited from the base URI's authority
+      component, which will also be undefined if the URI scheme does not
+      use an authority component.
+
+   5) If the path component begins with a slash character ("/"), then
+      the reference is an absolute-path and we skip to step 7.
+
+   6) If this step is reached, then we are resolving a relative-path
+      reference.  The relative path needs to be merged with the base
+      URI's path.  Although there are many ways to do this, we will
+      describe a simple method using a separate string buffer.
+
+      a) All but the last segment of the base URI's path component is
+         copied to the buffer.  In other words, any characters after the
+         last (right-most) slash character, if any, are excluded.
+
+      b) The reference's path component is appended to the buffer
+         string.
+
+      c) All occurrences of "./", where "." is a complete path segment,
+         are removed from the buffer string.
+
+      d) If the buffer string ends with "." as a complete path segment,
+         that "." is removed.
+
+      e) All occurrences of "<segment>/../", where <segment> is a
+         complete path segment not equal to "..", are removed from the
+         buffer string.  Removal of these path segments is performed
+         iteratively, removing the leftmost matching pattern on each
+         iteration, until no matching pattern remains.
+
+      f) If the buffer string ends with "<segment>/..", where <segment>
+         is a complete path segment not equal to "..", that
+         "<segment>/.." is removed.
+
+      g) If the resulting buffer string still begins with one or more
+         complete path segments of "..", then the reference is
+         considered to be in error.  Implementations may handle this
+         error by retaining these components in the resolved path (i.e.,
+         treating them as part of the final URI), by removing them from
+         the resolved path (i.e., discarding relative levels above the
+         root), or by avoiding traversal of the reference.
+
+      h) The remaining buffer string is the reference URI's new path
+         component.
+
+   7) The resulting URI components, including any inherited from the
+      base URI, are recombined to give the absolute form of the URI
+      reference.  Using pseudocode, this would be
+
+         result = ""
+
+         if scheme is defined then
+             append scheme to result
+             append ":" to result
+
+         if authority is defined then
+             append "//" to result
+             append authority to result
+
+         append path to result
+
+         if query is defined then
+             append "?" to result
+             append query to result
+
+         if fragment is defined then
+             append "#" to result
+             append fragment to result
+
+         return result
+
+      Note that we must be careful to preserve the distinction between a
+      component that is undefined, meaning that its separator was not
+      present in the reference, and a component that is empty, meaning
+      that the separator was present and was immediately followed by the
+      next component separator or the end of the reference.
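+
+      The pseudocode of step 7 translates almost line for line into
+      Python.  In this sketch, which is illustrative rather than
+      normative, the hypothetical function recompose uses None for an
+      undefined component and the empty string for an empty one.

```python
from typing import Optional


def recompose(
    scheme: Optional[str],
    authority: Optional[str],
    path: str,
    query: Optional[str],
    fragment: Optional[str],
) -> str:
    """Recombine URI components; None means the component is undefined."""
    result = ""
    if scheme is not None:
        result += scheme + ":"
    if authority is not None:
        result += "//" + authority
    result += path
    if query is not None:
        result += "?" + query
    if fragment is not None:
        result += "#" + fragment
    return result


print(recompose("http", "a", "/b/c/g", "y", "s"))  # http://a/b/c/g?y#s
print(recompose("http", "a", "/b", "", None))      # http://a/b?
```
+
+      The second call shows why the distinction matters: an empty query
+      still produces its "?" separator, whereas an undefined query
+      produces nothing.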
+
+   The above algorithm is intended to provide an example by which the
+   output of implementations can be tested -- implementation of the
+   algorithm itself is not required.  For example, some systems may find
+   it more efficient to implement step 6 as a pair of segment stacks
+   being merged, rather than as a series of string pattern replacements.
+
+      Note: Some WWW client applications will fail to separate the
+      reference's query component from its path component before merging
+      the base and reference paths in step 6 above.  This may result in
+      a loss of information if the query component contains the strings
+      "/../" or "/./".
+
+   Resolution examples are provided in Appendix C.
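+
+   The steps above can be sketched in Python as follows.  This is one
+   possible reading of the algorithm, not a reference implementation.
+   It assumes an absolute base URI whose path is empty or begins with a
+   slash, uses the regular expression of Appendix B for parsing, merges
+   paths with a segment stack, and retains excess ".." segments (the
+   first option in step 6g).  All function names are hypothetical.

```python
import re

# Expression from Appendix B.  Groups 2, 4, 5, 7 and 9 hold the scheme,
# authority, path, query and fragment; an unmatched optional group is
# None, which this sketch uses to mean "undefined".
URI_RE = re.compile(
    r"^(([^:/?#]+):)?(//([^/?#]*))?([^?#]*)(\?([^#]*))?(#(.*))?"
)


def split_reference(ref):
    """Step 1: split a URI reference into its five parts."""
    m = URI_RE.match(ref)
    return m.group(2), m.group(4), m.group(5), m.group(7), m.group(9)


def merge_paths(base_path, ref_path):
    """Step 6, done with a segment stack instead of string patterns."""
    # 6a and 6b: keep the base path up to its last slash, then append.
    buffer = base_path[: base_path.rfind("/") + 1] + ref_path
    segments = buffer.split("/")
    out = []
    for i, seg in enumerate(segments):
        is_last = i == len(segments) - 1
        if seg == ".":
            # 6c and 6d: drop ".", keeping the trailing slash if final.
            if is_last:
                out.append("")
        elif seg == "..":
            at_root = len(out) == 1 and out[0] == ""
            if out and out[-1] != ".." and not at_root:
                # 6e and 6f: ".." cancels the preceding segment.
                out.pop()
                if is_last:
                    out.append("")
            else:
                # 6g: excess ".." segments are retained in the result.
                out.append("..")
        else:
            out.append(seg)
    return "/".join(out)


def resolve(base, ref):
    """Resolve ref against base following steps 1 to 7."""
    b_scheme, b_authority, b_path, _, _ = split_reference(base)
    scheme, authority, path, query, fragment = split_reference(ref)
    if path == "" and scheme is None and authority is None and query is None:
        return ref  # step 2: same-document reference, returned unchanged
    if scheme is not None:
        return ref  # step 3: the reference is already absolute
    scheme = b_scheme
    if authority is None:  # step 4
        authority = b_authority
        if not path.startswith("/"):  # step 5
            path = merge_paths(b_path, path)  # step 6
    # Step 7: recombine, keeping undefined (None) distinct from empty.
    result = ""
    if scheme is not None:
        result += scheme + ":"
    if authority is not None:
        result += "//" + authority
    result += path
    if query is not None:
        result += "?" + query
    if fragment is not None:
        result += "#" + fragment
    return result


print(resolve("http://a/b/c/d;p?q", "../g"))  # http://a/b/g
```
+
+   A same-document reference such as "#s" is returned unchanged by this
+   sketch, leaving it to the caller to associate it with the current
+   document.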
+
+6. URI Normalization and Equivalence
+
+   In many cases, different URI strings may actually identify the
+   identical resource. For example, the host names used in URL are
+   actually case insensitive, and the URL <http://www.XEROX.com> is
+   equivalent to <http://www.xerox.com>. In general, the rules for
+   equivalence and definition of a normal form, if any, are scheme
+   dependent. When a scheme uses elements of the common syntax, it will
+   also use the common syntax equivalence rules, namely that the scheme
+   and hostname are case insensitive and a URL with an explicit ":port",
+   where the port is the default for the scheme, is equivalent to one
+   where the port is elided.
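+
+   The common syntax equivalence rules can be sketched in Python as
+   shown below.  This is a minimal illustration that assumes a
+   server-based authority; the function name normalize and the
+   DEFAULT_PORTS table are hypothetical, and the table lists only two
+   schemes for the sake of the example.

```python
import re

# Expression from Appendix B of this document.
URI_RE = re.compile(
    r"^(([^:/?#]+):)?(//([^/?#]*))?([^?#]*)(\?([^#]*))?(#(.*))?"
)

# Illustrative table only; the default port of a scheme is defined by
# that scheme's own specification.
DEFAULT_PORTS = {"http": "80", "ftp": "21"}


def normalize(uri):
    """Lower-case the scheme and host and drop a default ":port"."""
    m = URI_RE.match(uri)
    scheme, authority, path = m.group(2), m.group(4), m.group(5)
    query, fragment = m.group(7), m.group(9)
    result = ""
    if scheme is not None:
        scheme = scheme.lower()
        result += scheme + ":"
    if authority is not None:
        # Split [userinfo "@"] host [":" port]; userinfo is left as is.
        userinfo, at, hostport = authority.rpartition("@")
        host, colon, port = hostport.partition(":")
        if colon and port == DEFAULT_PORTS.get(scheme):
            colon = port = ""
        result += "//" + userinfo + at + host.lower() + colon + port
    result += path
    if query is not None:
        result += "?" + query
    if fragment is not None:
        result += "#" + fragment
    return result


print(normalize("http://www.XEROX.com"))          # http://www.xerox.com
print(normalize("HTTP://www.xerox.com:80/Path"))  # http://www.xerox.com/Path
```
+
+   The path, query, and fragment are copied unchanged, since any
+   further equivalence rules for them are scheme dependent.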
+
+7. Security Considerations
+
+   A URI does not in itself pose a security threat.  Users should beware
+   that there is no general guarantee that a URL, which at one time
+   located a given resource, will continue to do so.  Nor is there any
+   guarantee that a URL will not locate a different resource at some
+   later point in time, due to the lack of any constraint on how a given
+   authority apportions its namespace.  Such a guarantee can only be
+   obtained from the person(s) controlling that namespace and the
+   resource in question.  A specific URI scheme may include additional
+   semantics, such as name persistence, if those semantics are required
+   of all naming authorities for that scheme.
+
+   It is sometimes possible to construct a URL such that an attempt to
+   perform a seemingly harmless, idempotent operation, such as the
+   retrieval of an entity associated with the resource, will in fact
+   cause a possibly damaging remote operation to occur.  The unsafe URL
+   is typically constructed by specifying a port number other than that
+   reserved for the network protocol in question.  The client
+   unwittingly contacts a site that is in fact running a different
+   protocol.  The content of the URL contains instructions that, when
+   interpreted according to this other protocol, cause an unexpected
+   operation.  An example has been the use of a gopher URL to cause an
+   unintended or impersonating message to be sent via an SMTP server.
+
+   Caution should be used when using any URL that specifies a port
+   number other than the default for the protocol, especially when it is
+   a number within the reserved space.
+
+   Care should be taken when a URL contains escaped delimiters for a
+   given protocol (for example, CR and LF characters for telnet
+   protocols) that these are not unescaped before transmission.  This
+   might violate the protocol, but avoids the potential for such
+   characters to be used to simulate an extra operation or parameter in
+   that protocol, which might cause an unexpected and possibly harmful
+   remote operation to be performed.
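+
+   A client that wishes to flag such URL before use could check for
+   escaped CR and LF characters.  The following Python fragment is a
+   sketch of one such check, not a complete safeguard; the function
+   name and the sample URL are hypothetical.

```python
import re

# "%0D" is an escaped CR and "%0A" an escaped LF; hex digits in an
# escape may be written in either case.
ESCAPED_CRLF = re.compile(r"%0[ad]", re.IGNORECASE)


def contains_escaped_crlf(url: str) -> bool:
    """Report whether the URL carries an escaped CR or LF character."""
    return ESCAPED_CRLF.search(url) is not None


print(contains_escaped_crlf("gopher://host.example:25/1HELO%0D%0Amail"))
# True
print(contains_escaped_crlf("http://www.w3.org/Addressing/"))
# False
```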
+
+   It is clearly unwise to use a URL that contains a password which is
+   intended to be secret. In particular, the use of a password within
+   the 'userinfo' component of a URL is strongly discouraged except
+   in those rare cases where the 'password' parameter is intended to be
+   public.
+
+8. Acknowledgements
+
+   This document was derived from RFC 1738 [RFC1738] and RFC 1808
+   [RFC1808]; the acknowledgements in those specifications still apply.
+   In addition, contributions by Gisle Aas, Martin Beet, Martin Duerst,
+   Jim Gettys, Martijn Koster, Dave Kristol, Daniel LaLiberte, Foteos
+   Macrides, James Marshall, Ryan Moats, Keith Moore, and Lauren Wood
+   are gratefully acknowledged.
+
+9. References
+
+   [RFC2277] Alvestrand, H., "IETF Policy on Character Sets and
+             Languages", BCP 18, RFC 2277, January 1998.
+
+   [RFC1630] Berners-Lee, T., "Universal Resource Identifiers in WWW: A
+             Unifying Syntax for the Expression of Names and Addresses
+             of Objects on the Network as used in the World-Wide Web",
+             RFC 1630, June 1994.
+
+   [RFC1738] Berners-Lee, T., Masinter, L., and M. McCahill, Editors,
+             "Uniform Resource Locators (URL)", RFC 1738, December 1994.
+
+   [RFC1866] Berners-Lee T., and D. Connolly, "HyperText Markup Language
+             Specification -- 2.0", RFC 1866, November 1995.
+
+   [RFC1123] Braden, R., Editor, "Requirements for Internet Hosts --
+             Application and Support", STD 3, RFC 1123, October 1989.
+
+   [RFC822]  Crocker, D., "Standard for the Format of ARPA Internet Text
+             Messages", STD 11, RFC 822, August 1982.
+
+   [RFC1808] Fielding, R., "Relative Uniform Resource Locators", RFC
+             1808, June 1995.
+
+   [RFC2046] Freed, N., and N. Borenstein, "Multipurpose Internet Mail
+             Extensions (MIME) Part Two: Media Types", RFC 2046,
+             November 1996.
+
+   [RFC1736] Kunze, J., "Functional Recommendations for Internet
+             Resource Locators", RFC 1736, February 1995.
+
+   [RFC2141] Moats, R., "URN Syntax", RFC 2141, May 1997.
+
+   [RFC1034] Mockapetris, P., "Domain Names - Concepts and Facilities",
+             STD 13, RFC 1034, November 1987.
+
+   [RFC2110] Palme, J., and A. Hopmann, "MIME E-mail Encapsulation of
+             Aggregate Documents, such as HTML (MHTML)", RFC 2110, March
+             1997.
+
+   [RFC1737] Sollins, K., and L. Masinter, "Functional Requirements for
+             Uniform Resource Names", RFC 1737, December 1994.
+
+   [ASCII]   US-ASCII. "Coded Character Set -- 7-bit American Standard
+             Code for Information Interchange", ANSI X3.4-1986.
+
+   [UTF-8]   Yergeau, F., "UTF-8, a transformation format of ISO 10646",
+             RFC 2279, January 1998.
+
+10. Authors' Addresses
+
+   Tim Berners-Lee
+   World Wide Web Consortium
+   MIT Laboratory for Computer Science, NE43-356
+   545 Technology Square
+   Cambridge, MA 02139
+
+   Fax: +1(617)258-8682
+   EMail: timbl@w3.org
+
+
+   Roy T. Fielding
+   Department of Information and Computer Science
+   University of California, Irvine
+   Irvine, CA  92697-3425
+
+   Fax: +1(949)824-1715
+   EMail: fielding@ics.uci.edu
+
+
+   Larry Masinter
+   Xerox PARC
+   3333 Coyote Hill Road
+   Palo Alto, CA 94034
+
+   Fax: +1(415)812-4333
+   EMail: masinter@parc.xerox.com
+
+A. Collected BNF for URI
+
+      URI-reference = [ absoluteURI | relativeURI ] [ "#" fragment ]
+      absoluteURI   = scheme ":" ( hier_part | opaque_part )
+      relativeURI   = ( net_path | abs_path | rel_path ) [ "?" query ]
+
+      hier_part     = ( net_path | abs_path ) [ "?" query ]
+      opaque_part   = uric_no_slash *uric
+
+      uric_no_slash = unreserved | escaped | ";" | "?" | ":" | "@" |
+                      "&" | "=" | "+" | "$" | ","
+
+      net_path      = "//" authority [ abs_path ]
+      abs_path      = "/"  path_segments
+      rel_path      = rel_segment [ abs_path ]
+
+      rel_segment   = 1*( unreserved | escaped |
+                          ";" | "@" | "&" | "=" | "+" | "$" | "," )
+
+      scheme        = alpha *( alpha | digit | "+" | "-" | "." )
+
+      authority     = server | reg_name
+
+      reg_name      = 1*( unreserved | escaped | "$" | "," |
+                          ";" | ":" | "@" | "&" | "=" | "+" )
+
+      server        = [ [ userinfo "@" ] hostport ]
+      userinfo      = *( unreserved | escaped |
+                         ";" | ":" | "&" | "=" | "+" | "$" | "," )
+
+      hostport      = host [ ":" port ]
+      host          = hostname | IPv4address
+      hostname      = *( domainlabel "." ) toplabel [ "." ]
+      domainlabel   = alphanum | alphanum *( alphanum | "-" ) alphanum
+      toplabel      = alpha | alpha *( alphanum | "-" ) alphanum
+      IPv4address   = 1*digit "." 1*digit "." 1*digit "." 1*digit
+      port          = *digit
+
+      path          = [ abs_path | opaque_part ]
+      path_segments = segment *( "/" segment )
+      segment       = *pchar *( ";" param )
+      param         = *pchar
+      pchar         = unreserved | escaped |
+                      ":" | "@" | "&" | "=" | "+" | "$" | ","
+
+      query         = *uric
+
+      fragment      = *uric
+
+      uric          = reserved | unreserved | escaped
+      reserved      = ";" | "/" | "?" | ":" | "@" | "&" | "=" | "+" |
+                      "$" | ","
+      unreserved    = alphanum | mark
+      mark          = "-" | "_" | "." | "!" | "~" | "*" | "'" |
+                      "(" | ")"
+
+      escaped       = "%" hex hex
+      hex           = digit | "A" | "B" | "C" | "D" | "E" | "F" |
+                              "a" | "b" | "c" | "d" | "e" | "f"
+
+      alphanum      = alpha | digit
+      alpha         = lowalpha | upalpha
+
+      lowalpha = "a" | "b" | "c" | "d" | "e" | "f" | "g" | "h" | "i" |
+                 "j" | "k" | "l" | "m" | "n" | "o" | "p" | "q" | "r" |
+                 "s" | "t" | "u" | "v" | "w" | "x" | "y" | "z"
+      upalpha  = "A" | "B" | "C" | "D" | "E" | "F" | "G" | "H" | "I" |
+                 "J" | "K" | "L" | "M" | "N" | "O" | "P" | "Q" | "R" |
+                 "S" | "T" | "U" | "V" | "W" | "X" | "Y" | "Z"
+      digit    = "0" | "1" | "2" | "3" | "4" | "5" | "6" | "7" |
+                 "8" | "9"
+
+B. Parsing a URI Reference with a Regular Expression
+
+   As described in Section 4.3, the generic URI syntax is not sufficient
+   to disambiguate the components of some forms of URI.  Since the
+   "greedy algorithm" described in that section is identical to the
+   disambiguation method used by POSIX regular expressions, it is
+   natural and commonplace to use a regular expression for parsing the
+   potential four components and fragment identifier of a URI reference.
+
+   The following line is the regular expression for breaking down a URI
+   reference into its components.
+
+      ^(([^:/?#]+):)?(//([^/?#]*))?([^?#]*)(\?([^#]*))?(#(.*))?
+       12            3  4          5       6  7        8 9
+
+   The numbers in the second line above are only to assist readability;
+   they indicate the reference points for each subexpression (i.e., each
+   paired parenthesis).  We refer to the value matched for subexpression
+   <n> as $<n>.  For example, matching the above expression to
+
+      http://www.ics.uci.edu/pub/ietf/uri/#Related
+
+   results in the following subexpression matches:
+
+      $1 = http:
+      $2 = http
+      $3 = //www.ics.uci.edu
+      $4 = www.ics.uci.edu
+      $5 = /pub/ietf/uri/
+      $6 = <undefined>
+      $7 = <undefined>
+      $8 = #Related
+      $9 = Related
+
+   where <undefined> indicates that the component is not present, as is
+   the case for the query component in the above example.  Therefore, we
+   can determine the value of the four components and fragment as
+
+      scheme    = $2
+      authority = $4
+      path      = $5
+      query     = $7
+      fragment  = $9
+
+   and, going in the opposite direction, we can recreate a URI reference
+   from its components using the algorithm in step 7 of Section 5.2.
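+
+   The expression can be used unchanged with Python's re module, as the
+   following sketch shows.  The function name parse_uri_reference and
+   the dictionary layout are choices made for this example; an
+   <undefined> component is reported as None.

```python
import re

# The regular expression given above, unchanged.
URI_RE = re.compile(
    r"^(([^:/?#]+):)?(//([^/?#]*))?([^?#]*)(\?([^#]*))?(#(.*))?"
)


def parse_uri_reference(reference):
    """Return the four components and fragment of a URI reference."""
    m = URI_RE.match(reference)  # every part is optional, so this matches
    return {
        "scheme": m.group(2),
        "authority": m.group(4),
        "path": m.group(5),
        "query": m.group(7),
        "fragment": m.group(9),
    }


parts = parse_uri_reference("http://www.ics.uci.edu/pub/ietf/uri/#Related")
print(parts["scheme"], parts["authority"], parts["path"])
# http www.ics.uci.edu /pub/ietf/uri/
print(parts["query"], parts["fragment"])
# None Related
```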
+
+C. Examples of Resolving Relative URI References
+
+   Within an object with a well-defined base URI of
+
+      http://a/b/c/d;p?q
+
+   the relative URI would be resolved as follows:
+
+C.1.  Normal Examples
+
+      g:h           =  g:h
+      g             =  http://a/b/c/g
+      ./g           =  http://a/b/c/g
+      g/            =  http://a/b/c/g/
+      /g            =  http://a/g
+      //g           =  http://g
+      ?y            =  http://a/b/c/?y
+      g?y           =  http://a/b/c/g?y
+      #s            =  (current document)#s
+      g#s           =  http://a/b/c/g#s
+      g?y#s         =  http://a/b/c/g?y#s
+      ;x            =  http://a/b/c/;x
+      g;x           =  http://a/b/c/g;x
+      g;x?y#s       =  http://a/b/c/g;x?y#s
+      .             =  http://a/b/c/
+      ./            =  http://a/b/c/
+      ..            =  http://a/b/
+      ../           =  http://a/b/
+      ../g          =  http://a/b/g
+      ../..         =  http://a/
+      ../../        =  http://a/
+      ../../g       =  http://a/g
+
+C.2.  Abnormal Examples
+
+   Although the following abnormal examples are unlikely to occur in
+   normal practice, all URI parsers should be capable of resolving them
+   consistently.  Each example uses the same base as above.
+
+   An empty reference refers to the start of the current document.
+
+      <>            =  (current document)
+
+   Parsers must be careful in handling the case where there are more
+   relative path ".." segments than there are hierarchical levels in the
+   base URI's path.  Note that the ".." syntax cannot be used to change
+   the authority component of a URI.
+
+      ../../../g    =  http://a/../g
+      ../../../../g =  http://a/../../g
+
+   In practice, some implementations strip leading relative symbolic
+   elements (".", "..") after applying a relative URI calculation, based
+   on the theory that compensating for obvious author errors is better
+   than allowing the request to fail.  Thus, the above two references
+   will be interpreted as "http://a/g" by some implementations.
+
+   Similarly, parsers must avoid treating "." and ".." as special when
+   they are not complete components of a relative path.
+
+      /./g          =  http://a/./g
+      /../g         =  http://a/../g
+      g.            =  http://a/b/c/g.
+      .g            =  http://a/b/c/.g
+      g..           =  http://a/b/c/g..
+      ..g           =  http://a/b/c/..g
+
+   Less likely are cases where the relative URI uses unnecessary or
+   nonsensical forms of the "." and ".." complete path segments.
+
+      ./../g        =  http://a/b/g
+      ./g/.         =  http://a/b/c/g/
+      g/./h         =  http://a/b/c/g/h
+      g/../h        =  http://a/b/c/h
+      g;x=1/./y     =  http://a/b/c/g;x=1/y
+      g;x=1/../y    =  http://a/b/c/y
+
+   All client applications remove the query component from the base URI
+   before resolving relative URI.  However, some applications fail to
+   separate the reference's query and/or fragment components from a
+   relative path before merging it with the base path.  This error is
+   rarely noticed, since typical usage of a fragment never includes the
+   hierarchy ("/") character, and the query component is not normally
+   used within relative references.
+
+      g?y/./x       =  http://a/b/c/g?y/./x
+      g?y/../x      =  http://a/b/c/g?y/../x
+      g#s/./x       =  http://a/b/c/g#s/./x
+      g#s/../x      =  http://a/b/c/g#s/../x
+
+   Some parsers allow the scheme name to be present in a relative URI if
+   it is the same as the base URI scheme.  This is considered to be a
+   loophole in prior specifications of partial URI [RFC1630]. Its use
+   should be avoided.
+
+      http:g        =  http:g           ; for validating parsers
+                    |  http://a/b/c/g   ; for backwards compatibility
+
+D. Embedding the Base URI in HTML documents
+
+   It is useful to consider an example of how the base URI of a document
+   can be embedded within the document's content.  In this appendix, we
+   describe how documents written in the Hypertext Markup Language
+   (HTML) [RFC1866] can include an embedded base URI.  This appendix
+   does not form a part of the URI specification and should not be
+   considered as anything more than a descriptive example.
+
+   HTML defines a special element "BASE" which, when present in the
+   "HEAD" portion of a document, signals that the parser should use the
+   BASE element's "HREF" attribute as the base URI for resolving any
+   relative URI.  The "HREF" attribute must be an absolute URI.  Note
+   that, in HTML, element and attribute names are case-insensitive.  For
+   example:
+
+      <!doctype html public "-//IETF//DTD HTML//EN">
+      <HTML><HEAD>
+      <TITLE>An example HTML document</TITLE>
+      <BASE href="http://www.ics.uci.edu/Test/a/b/c">
+      </HEAD><BODY>
+      ... <A href="../x">a hypertext anchor</A> ...
+      </BODY></HTML>
+
+   A parser reading the example document should interpret the given
+   relative URI "../x" as representing the absolute URI
+
+      <http://www.ics.uci.edu/Test/a/x>
+
+   regardless of the context in which the example document was obtained.
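+
+   A parser of this kind can be sketched with the Python standard
+   library.  The class and function names below are hypothetical, the
+   retrieval URI in the usage line is made up, and urljoin follows
+   later revisions of the relative resolution rules, which agree with
+   this document for the example shown.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class BaseAndLinks(HTMLParser):
    """Collect the first BASE href and every A href in a document."""

    def __init__(self):
        super().__init__()
        self.base = None
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser lower-cases element and attribute names, matching
        # HTML's case-insensitivity.
        attributes = dict(attrs)
        if tag == "base" and self.base is None:
            self.base = attributes.get("href")
        elif tag == "a" and "href" in attributes:
            self.hrefs.append(attributes["href"])


DOCUMENT = """<!doctype html public "-//IETF//DTD HTML//EN">
<HTML><HEAD>
<TITLE>An example HTML document</TITLE>
<BASE href="http://www.ics.uci.edu/Test/a/b/c">
</HEAD><BODY>
... <A href="../x">a hypertext anchor</A> ...
</BODY></HTML>"""


def resolve_links(document, retrieval_uri):
    """Resolve each anchor against the embedded base, if there is one."""
    parser = BaseAndLinks()
    parser.feed(document)
    parser.close()
    base = parser.base if parser.base is not None else retrieval_uri
    return [urljoin(base, href) for href in parser.hrefs]


print(resolve_links(DOCUMENT, "http://elsewhere.example/doc.html"))
# ['http://www.ics.uci.edu/Test/a/x']
```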
+
+E. Recommendations for Delimiting URI in Context
+
+   URI are often transmitted through formats that do not provide a clear
+   context for their interpretation.  For example, there are many
+   occasions when URI are included in plain text; examples include text
+   sent in electronic mail, USENET news messages, and, most importantly,
+   printed on paper.  In such cases, it is important to be able to
+   delimit the URI from the rest of the text, and in particular from
+   punctuation marks that might be mistaken for part of the URI.
+
+   In practice, URI are delimited in a variety of ways, but usually
+   within double-quotes "http://test.com/", angle brackets
+   <http://test.com/>, or just using whitespace
+
+                             http://test.com/
+
+   These wrappers do not form part of the URI.
+
+   In the case where a fragment identifier is associated with a URI
+   reference, the fragment would be placed within the brackets as well
+   (separated from the URI with a "#" character).
+
+   In some cases, extra whitespace (spaces, linebreaks, tabs, etc.) may
+   need to be added to break long URI across lines. The whitespace
+   should be ignored when extracting the URI.
+
+   No whitespace should be introduced after a hyphen ("-") character.
+   Because some typesetters and printers may (erroneously) introduce a
+   hyphen at the end of a line when breaking a line, the interpreter of a
+   URI containing a line break immediately after a hyphen should ignore
+   all unescaped whitespace around the line break, and should be aware
+   that the hyphen may or may not actually be part of the URI.
+
+   Using <> angle brackets around each URI is especially recommended as
+   a delimiting style for URI that contain whitespace.
+
+   The prefix "URL:" (with or without a trailing space) was recommended
+   as a way to used to help distinguish a URL from other bracketed
+   designators, although this is not common in practice.
+
+   For robustness, software that accepts user-typed URI should attempt
+   to recognize and strip both delimiters and embedded whitespace.
+
+   For example, the text:
+
+
+
+
+
+
+
+Berners-Lee, et. al.        Standards Track                    [Page 34]
+
+RFC 2396                   URI Generic Syntax                August 1998
+
+
+      Yes, Jim, I found it under "http://www.w3.org/Addressing/",
+      but you can probably pick it up from <ftp://ds.internic.
+      net/rfc/>.  Note the warning in <http://www.ics.uci.edu/pub/
+      ietf/uri/historical.html#WARNING>.
+
+   contains the URI references
+
+      http://www.w3.org/Addressing/
+      ftp://ds.internic.net/rfc/
+      http://www.ics.uci.edu/pub/ietf/uri/historical.html#WARNING
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+Berners-Lee, et. al.        Standards Track                    [Page 35]
+
+RFC 2396                   URI Generic Syntax                August 1998
+
+
+F. Abbreviated URLs
+
+   The URL syntax was designed for unambiguous reference to network
+   resources and extensibility via the URL scheme.  However, as URL
+   identification and usage have become commonplace, traditional media
+   (television, radio, newspapers, billboards, etc.) have increasingly
+   used abbreviated URL references.  That is, a reference consisting of
+   only the authority and path portions of the identified resource, such
+   as
+
+      www.w3.org/Addressing/
+
+   or simply the DNS hostname on its own.  Such references are primarily
+   intended for human interpretation rather than machine, with the
+   assumption that context-based heuristics are sufficient to complete
+   the URL (e.g., most hostnames beginning with "www" are likely to have
+   a URL prefix of "http://").  Although there is no standard set of
+   heuristics for disambiguating abbreviated URL references, many client
+   implementations allow them to be entered by the user and
+   heuristically resolved.  It should be noted that such heuristics may
+   change over time, particularly when new URL schemes are introduced.
+
+   Since an abbreviated URL has the same syntax as a relative URL path,
+   abbreviated URL references cannot be used in contexts where relative
+   URLs are expected.  This limits the use of abbreviated URLs to places
+   where there is no defined base URL, such as dialog boxes and off-line
+   advertisements.
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+Berners-Lee, et. al.        Standards Track                    [Page 36]
+
+RFC 2396                   URI Generic Syntax                August 1998
+
+
+G. Summary of Non-editorial Changes
+
+G.1. Additions
+
+   Section 4 (URI References) was added to stem the confusion regarding
+   "what is a URI" and how to describe fragment identifiers given that
+   they are not part of the URI, but are part of the URI syntax and
+   parsing concerns.  In addition, it provides a reference definition
+   for use by other IETF specifications (HTML, HTTP, etc.) that have
+   previously attempted to redefine the URI syntax in order to account
+   for the presence of fragment identifiers in URI references.
+
+   Section 2.4 was rewritten to clarify a number of misinterpretations
+   and to leave room for fully internationalized URI.
+
+   Appendix F on abbreviated URLs was added to describe the shortened
+   references often seen on television and magazine advertisements and
+   explain why they are not used in other contexts.
+
+G.2. Modifications from both RFC 1738 and RFC 1808
+
+   Changed to URI syntax instead of just URL.
+
+   Confusion regarding the terms "character encoding", the URI
+   "character set", and the escaping of characters with %<hex><hex>
+   equivalents has (hopefully) been reduced.  Many of the BNF rule names
+   regarding the character sets have been changed to more accurately
+   describe their purpose and to encompass all "characters" rather than
+   just US-ASCII octets.  Unless otherwise noted here, these
+   modifications do not affect the URI syntax.
+
+   Both RFC 1738 and RFC 1808 refer to the "reserved" set of characters
+   as if URI-interpreting software were limited to a single set of
+   characters with a reserved purpose (i.e., as meaning something other
+   than the data to which the characters correspond), and that this set
+   was fixed by the URI scheme.  However, this has not been true in
+   practice; any character that is interpreted differently when it is
+   escaped is, in effect, reserved.  Furthermore, the interpreting
+   engine on a HTTP server is often dependent on the resource, not just
+   the URI scheme.  The description of reserved characters has been
+   changed accordingly.
+
+   The plus "+", dollar "$", and comma "," characters have been added to
+   those in the "reserved" set, since they are treated as reserved
+   within the query component.
+
+
+
+
+
+
+Berners-Lee, et. al.        Standards Track                    [Page 37]
+
+RFC 2396                   URI Generic Syntax                August 1998
+
+
+   The tilde "~" character was added to those in the "unreserved" set,
+   since it is extensively used on the Internet in spite of the
+   difficulty to transcribe it with some keyboards.
+
+   The syntax for URI scheme has been changed to require that all
+   schemes begin with an alpha character.
+
+   The "user:password" form in the previous BNF was changed to a
+   "userinfo" token, and the possibility that it might be
+   "user:password" made scheme specific. In particular, the use of
+   passwords in the clear is not even suggested by the syntax.
+
+   The question-mark "?" character was removed from the set of allowed
+   characters for the userinfo in the authority component, since testing
+   showed that many applications treat it as reserved for separating the
+   query component from the rest of the URI.
+
+   The semicolon ";" character was added to those stated as being
+   reserved within the authority component, since several new schemes
+   are using it as a separator within userinfo to indicate the type of
+   user authentication.
+
+   RFC 1738 specified that the path was separated from the authority
+   portion of a URI by a slash.  RFC 1808 followed suit, but with a
+   fudge of carrying around the separator as a "prefix" in order to
+   describe the parsing algorithm.  RFC 1630 never had this problem,
+   since it considered the slash to be part of the path.  In writing
+   this specification, it was found to be impossible to accurately
+   describe and retain the difference between the two URI
+      <foo:/bar>   and   <foo:bar>
+   without either considering the slash to be part of the path (as
+   corresponds to actual practice) or creating a separate component just
+   to hold that slash.  We chose the former.
+
+G.3. Modifications from RFC 1738
+
+   The definition of specific URL schemes and their scheme-specific
+   syntax and semantics has been moved to separate documents.
+
+   The URL host was defined as a fully-qualified domain name.  However,
+   many URLs are used without fully-qualified domain names (in contexts
+   for which the full qualification is not necessary), without any host
+   (as in some file URLs), or with a host of "localhost".
+
+   The URL port is now *digit instead of 1*digit, since systems are
+   expected to handle the case where the ":" separator between host and
+   port is supplied without a port.
+
+
+
+
+Berners-Lee, et. al.        Standards Track                    [Page 38]
+
+RFC 2396                   URI Generic Syntax                August 1998
+
+
+   The recommendations for delimiting URI in context (Appendix E) have
+   been adjusted to reflect current practice.
+
+G.4. Modifications from RFC 1808
+
+   RFC 1808 (Section 4) defined an empty URL reference (a reference
+   containing nothing aside from the fragment identifier) as being a
+   reference to the base URL.  Unfortunately, that definition could be
+   interpreted, upon selection of such a reference, as a new retrieval
+   action on that resource.  Since the normal intent of such references
+   is for the user agent to change its view of the current document to
+   the beginning of the specified fragment within that document, not to
+   make an additional request of the resource, a description of how to
+   correctly interpret an empty reference has been added in Section 4.
+
+   The description of the mythical Base header field has been replaced
+   with a reference to the Content-Location header field defined by
+   MHTML [RFC2110].
+
+   RFC 1808 described various schemes as either having or not having the
+   properties of the generic URI syntax.  However, the only requirement
+   is that the particular document containing the relative references
+   have a base URI that abides by the generic URI syntax, regardless of
+   the URI scheme, so the associated description has been updated to
+   reflect that.
+
+   The BNF term <net_loc> has been replaced with <authority>, since the
+   latter more accurately describes its use and purpose.  Likewise, the
+   authority is no longer restricted to the IP server syntax.
+
+   Extensive testing of current client applications demonstrated that
+   the majority of deployed systems do not use the ";" character to
+   indicate trailing parameter information, and that the presence of a
+   semicolon in a path segment does not affect the relative parsing of
+   that segment.  Therefore, parameters have been removed as a separate
+   component and may now appear in any path segment.  Their influence
+   has been removed from the algorithm for resolving a relative URI
+   reference.  The resolution examples in Appendix C have been modified
+   to reflect this change.
+
+   Implementations are now allowed to work around misformed relative
+   references that are prefixed by the same scheme as the base URI, but
+   only for schemes known to use the <hier_part> syntax.
+
+
+
+
+
+
+
+
+Berners-Lee, et. al.        Standards Track                    [Page 39]
+
+RFC 2396                   URI Generic Syntax                August 1998
+
+
+H.  Full Copyright Statement
+
+   Copyright (C) The Internet Society (1998).  All Rights Reserved.
+
+   This document and translations of it may be copied and furnished to
+   others, and derivative works that comment on or otherwise explain it
+   or assist in its implementation may be prepared, copied, published
+   and distributed, in whole or in part, without restriction of any
+   kind, provided that the above copyright notice and this paragraph are
+   included on all such copies and derivative works.  However, this
+   document itself may not be modified in any way, such as by removing
+   the copyright notice or references to the Internet Society or other
+   Internet organizations, except as needed for the purpose of
+   developing Internet standards in which case the procedures for
+   copyrights defined in the Internet Standards process must be
+   followed, or as required to translate it into languages other than
+   English.
+
+   The limited permissions granted above are perpetual and will not be
+   revoked by the Internet Society or its successors or assigns.
+
+   This document and the information contained herein is provided on an
+   "AS IS" basis and THE INTERNET SOCIETY AND THE INTERNET ENGINEERING
+   TASK FORCE DISCLAIMS ALL WARRANTIES, EXPRESS OR IMPLIED, INCLUDING
+   BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE INFORMATION
+   HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED WARRANTIES OF
+   MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+Berners-Lee, et. al.        Standards Track                    [Page 40]
+
diff --git a/doc/rfc822.scm.doc b/doc/rfc822.scm.doc
new file mode 100644
index 0000000..a2e38c7
--- /dev/null
+++ b/doc/rfc822.scm.doc
@@ -0,0 +1,161 @@
+This file documents names defined in rfc822.scm:
+
+
+
+
+NOTES
+
+
+
+A note on line-terminators:
+
+Line-terminating sequences are always a drag, because there's no
+agreement on them -- the Net protocols and DOS use cr/lf; Unix uses
+lf; the Mac uses cr. On one hand, you'd like to use the code for all
+of the above; on the other, you'd also like to use the code for strict
+applications that must definitely not recognise bare cr's or lf's
+as terminators.
+
+RFC 822 requires a cr/lf (carriage-return/line-feed) pair to terminate
+lines of text. On the other hand, careful perusal of the text shows up
+some ambiguities (there are maybe three or four of these, and I'm too
+lazy to write them all down). Furthermore, it is an unfortunate fact
+that many Unix apps separate lines of RFC 822 text with simple
+linefeeds (e.g., messages kept in /usr/spool/mail). As a result, this
+code takes a broad-minded view of line-terminators: lines can be
+terminated by either cr/lf or just lf, and either terminating sequence
+is trimmed.
+
+If you need stricter parsing, you can call the lower-level procedures
+%READ-RFC822-FIELD and %READ-RFC822-HEADERS. They take the
+read-line procedure as an extra parameter. This means that you can
+pass in a procedure that recognises only cr/lf's, or only cr's (for a
+Mac app, perhaps), and you can determine whether or not the
+terminators get trimmed. However, your read-line procedure must
+indicate the header-terminating empty line by returning *either* the
+empty string or the two-char string cr/lf (or the EOF object).
+
+
+
+
+DEFINITIONS AND DESCRIPTIONS
+
+
+
+(read-rfc822-field [port])
+(%read-rfc822-field read-line port)
+
+Read one field from the port, and return two values [NAME BODY]:
+
+ - NAME	 Symbol such as 'subject or 'to. The field name is converted
+         to a symbol using the Scheme implementation's preferred
+         case. If the implementation reads symbols in a case-sensitive
+         fashion (e.g., scsh), lowercase is used. This means you can
+         compare these symbols to quoted constants using EQ?. When
+         printing these field names out, it looks best if you capitalise
+         them with (CAPITALIZE-STRING (SYMBOL->STRING FIELD-NAME)).
+
+ - BODY	 List of strings which are the field's body, e.g. 
+         ("shivers@lcs.mit.edu"). Each list element is one line from
+         the field's body, so if the field spreads out over three lines,
+         then the body is a list of three strings. The terminating
+         cr/lf's are trimmed from each string. A leading space or a
+         leading horizontal tab is also trimmed, but one and only one.
+
+When there are no more fields -- EOF or a blank line has terminated
+the header section -- then the procedure returns [#f #f].
+ 
+The %READ-RFC822-FIELD variant allows you to specify your own
+read-line procedure. The one used by READ-RFC822-FIELD terminates
+lines with either cr/lf or just lf, and it trims the terminator from
+the line. Your read-line procedure should trim the terminator of the
+line, so an empty line is returned as an empty string.
+
+The procedures raise an error if the syntax of the field just read (the
+line returned by the read-line procedure) is illegal under RFC 822.
+
+
+
+read-rfc822-headers [port]
+%read-rfc822-headers read-line port
+
+Read in and parse up a section of text that looks like the header
+portion of an RFC 822 message. Return an alist mapping a field name (a
+symbol such as 'date or 'subject) to a list of field bodies -- one for
+each occurrence of the field in the header. So if there are five
+"Received-by:" fields in the header, the alist maps 'received-by to a
+five element list. Each body is in turn represented by a list of
+strings -- one for each line of the field. So a field spread across
+three lines would produce a three element body.
+
+The %READ-RFC822-HEADERS variant allows you to specify your own
+read-line procedure. See notes (A note on line-terminators) above for
+reasons why.
+
+
+
+rejoin-header-lines alist [separator]
+
+Takes a field alist such as is returned by READ-RFC822-HEADERS and
+returns an equivalent alist. Each body (string list) in the input
+alist is joined into a single string in the output alist. SEPARATOR is
+the string used to join these elements together; it defaults to a
+single space " ", but can usefully be "\n" or "\r\n".
+
+To rejoin a single body list, use scsh's JOIN-STRINGS procedure.
+
+
+
+For the following definitions' examples, let's use this set of
+RFC822 headers:
+     From: shivers
+     To: ziggy,
+       newts
+     To: gjs, tk
+
+
+
+get-header-all headers name
+
+returns all entries or #f, e.g.
+(get-header-all hdrs 'to)   -> ((" ziggy," " newts") (" gjs, tk"))
+
+
+
+get-header-lines headers name
+
+returns all lines of the first entry or #f, e.g.
+(get-header-lines hdrs 'to) -> (" ziggy," " newts")
+
+
+
+get-header headers name [separator]
+
+returns the first entry with the lines joined together by separator
+(newline by default (\n)), e.g.
+(get-header hdrs 'to)       -> "ziggy,\n newts"
+
+
+
+htab
+
+is the horizontal tab (ascii-code 9)
+
+
+
+string->symbol-pref
+
+is a procedure that takes a string and converts it to a symbol
+using the Scheme implementation's preferred case. The preferred case
+is recognized by doing a symbol->string conversion of 'a.
+
+
+
+
+DESIRABLE FUNCTIONALITIES
+
+ - Unfolding long lines.
+ - Lexing structured fields.
+ - Unlexing structured fields into canonical form.
+ - Parsing and unparsing dates.
+ - Parsing and unparsing addresses.
diff --git a/doc/rfc822.txt b/doc/rfc822.txt
new file mode 100644
index 0000000..35b09a3
--- /dev/null
+++ b/doc/rfc822.txt
@@ -0,0 +1,2901 @@
+
+ 
+
+
+
+
+     RFC #  822
+
+     Obsoletes:  RFC #733  (NIC #41952)
+
+
+
+
+
+
+
+
+
+
+
+
+                        STANDARD FOR THE FORMAT OF
+
+                        ARPA INTERNET TEXT MESSAGES
+
+
+
+
+
+
+                              August 13, 1982
+
+
+
+
+
+
+                                Revised by
+
+                             David H. Crocker
+
+
+                      Dept. of Electrical Engineering
+                 University of Delaware, Newark, DE  19711
+                      Network:  DCrocker @ UDel-Relay
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ 
+     Standard for ARPA Internet Text Messages
+
+
+                             TABLE OF CONTENTS
+
+
+     PREFACE ....................................................   ii
+
+     1.  INTRODUCTION ...........................................    1
+
+         1.1.  Scope ............................................    1
+         1.2.  Communication Framework ..........................    2
+
+     2.  NOTATIONAL CONVENTIONS .................................    3
+
+     3.  LEXICAL ANALYSIS OF MESSAGES ...........................    5
+
+         3.1.  General Description ..............................    5
+         3.2.  Header Field Definitions .........................    9
+         3.3.  Lexical Tokens ...................................   10
+         3.4.  Clarifications ...................................   11
+
+     4.  MESSAGE SPECIFICATION ..................................   17
+
+         4.1.  Syntax ...........................................   17
+         4.2.  Forwarding .......................................   19
+         4.3.  Trace Fields .....................................   20
+         4.4.  Originator Fields ................................   21
+         4.5.  Receiver Fields ..................................   23
+         4.6.  Reference Fields .................................   23
+         4.7.  Other Fields .....................................   24
+
+     5.  DATE AND TIME SPECIFICATION ............................   26
+
+         5.1.  Syntax ...........................................   26
+         5.2.  Semantics ........................................   26
+
+     6.  ADDRESS SPECIFICATION ..................................   27
+
+         6.1.  Syntax ...........................................   27
+         6.2.  Semantics ........................................   27
+         6.3.  Reserved Address .................................   33
+
+     7.  BIBLIOGRAPHY ...........................................   34
+
+
+                             APPENDIX
+
+     A.  EXAMPLES ...............................................   36
+     B.  SIMPLE FIELD PARSING ...................................   40
+     C.  DIFFERENCES FROM RFC #733 ..............................   41
+     D.  ALPHABETICAL LISTING OF SYNTAX RULES ...................   44
+
+
+     August 13, 1982               - i -                      RFC #822
+
+
+
+ 
+     Standard for ARPA Internet Text Messages
+
+
+                                  PREFACE
+
+
+          By 1977, the Arpanet employed several informal standards for
+     the  text  messages (mail) sent among its host computers.  It was
+     felt necessary to codify these practices and  provide  for  those
+     features  that  seemed  imminent.   The result of that effort was
+     Request for Comments (RFC) #733, "Standard for the Format of ARPA
+     Network Text Message", by Crocker, Vittal, Pogran, and Henderson.
+     The specification attempted to avoid major  changes  in  existing
+     software, while permitting several new features.
+
+          This document revises the specifications  in  RFC  #733,  in
+     order  to  serve  the  needs  of the larger and more complex ARPA
+     Internet.  Some of RFC #733's features failed  to  gain  adequate
+     acceptance.   In  order to simplify the standard and the software
+     that follows it, these features have been removed.   A  different
+     addressing  scheme  is  used, to handle the case of inter-network
+     mail; and the concept of re-transmission has been introduced.
+
+          This specification is intended for use in the ARPA Internet.
+     However, an attempt has been made to free it of any dependence on
+     that environment, so that it can be applied to other network text
+     message systems.
+
+          The specification of RFC #733 took place over the course  of
+     one  year, using the ARPANET mail environment, itself, to provide
+     an on-going forum for discussing the capabilities to be included.
+     More  than  twenty individuals, from across the country, partici-
+     pated in  the  original  discussion.   The  development  of  this
+     revised specification has, similarly, utilized network mail-based
+     group discussion.  Both specification efforts  greatly  benefited
+     from the comments and ideas of the participants.
+
+          The syntax of the standard,  in  RFC  #733,  was  originally
+     specified  in  the  Backus-Naur Form (BNF) meta-language.  Ken L.
+     Harrenstien, of SRI International, was responsible for  re-coding
+     the  BNF  into  an  augmented  BNF  that makes the representation
+     smaller and easier to understand.
+
+
+
+
+
+
+
+
+
+
+
+
+     August 13, 1982              - ii -                      RFC #822
+
+
+ 
+     Standard for ARPA Internet Text Messages
+
+
+     1.  INTRODUCTION
+
+     1.1.  SCOPE
+
+          This standard specifies a syntax for text messages that  are
+     sent  among  computer  users, within the framework of "electronic
+     mail".  The standard supersedes  the  one  specified  in  ARPANET
+     Request  for Comments #733, "Standard for the Format of ARPA Net-
+     work Text Messages".
+
+          In this context, messages are viewed as having  an  envelope
+     and  contents.   The  envelope  contains  whatever information is
+     needed to accomplish transmission  and  delivery.   The  contents
+     compose  the object to be delivered to the recipient.  This stan-
+     dard applies only to the format and some of the semantics of mes-
+     sage  contents.   It contains no specification of the information
+     in the envelope.
+
+          However, some message systems may use information  from  the
+     contents  to create the envelope.  It is intended that this stan-
+     dard facilitate the acquisition of such information by programs.
+
+          Some message systems may  store  messages  in  formats  that
+     differ  from the one specified in this standard.  This specifica-
+     tion is intended strictly as a definition of what message content
+     format is to be passed BETWEEN hosts.
+
+     Note:  This standard is NOT intended to dictate the internal for-
+            mats  used  by sites, the specific message system features
+            that they are expected to support, or any of  the  charac-
+            teristics  of  user interface programs that create or read
+            messages.
+
+          A distinction should be made between what the  specification
+     REQUIRES  and  what  it ALLOWS.  Messages can be made complex and
+     rich with formally-structured components of information or can be
+     kept small and simple, with a minimum of such information.  Also,
+     the standard simplifies the interpretation  of  differing  visual
+     formats  in  messages;  only  the  visual  aspect of a message is
+     affected and not the interpretation  of  information  within  it.
+     Implementors may choose to retain such visual distinctions.
+
+          The formal definition is divided into four levels.  The bot-
+     tom level describes the meta-notation used in this document.  The
+     second level describes basic lexical analyzers that  feed  tokens
+     to  higher-level  parsers.   Next is an overall specification for
+     messages; it permits distinguishing individual fields.   Finally,
+     there is definition of the contents of several structured fields.
+
+
+
+     August 13, 1982               - 1 -                      RFC #822
+
+
+ 
+     Standard for ARPA Internet Text Messages
+
+
+     1.2.  COMMUNICATION FRAMEWORK
+
+          Messages consist of lines of text.   No  special  provisions
+     are  made for encoding drawings, facsimile, speech, or structured
+     text.  No significant consideration has been given  to  questions
+     of  data  compression  or to transmission and storage efficiency,
+     and the standard tends to be free with the number  of  bits  con-
+     sumed.   For  example,  field  names  are specified as free text,
+     rather than special terse codes.
+
+          A general "memo" framework is used.  That is, a message con-
+     sists of some information in a rigid format, followed by the main
+     part of the message, with a format that is not specified in  this
+     document.   The  syntax of several fields of the rigidly-formated
+     ("headers") section is defined in  this  specification;  some  of
+     these fields must be included in all messages.
+
+          The syntax  that  distinguishes  between  header  fields  is
+     specified  separately  from  the  internal  syntax for particular
+     fields.  This separation is intended to allow simple  parsers  to
+     operate on the general structure of messages, without concern for
+     the detailed structure of individual header fields.   Appendix  B
+     is provided to facilitate construction of these parsers.
+
+          In addition to the fields specified in this document, it  is
+     expected  that  other fields will gain common use.  As necessary,
+     the specifications for these "extension-fields" will be published
+     through  the same mechanism used to publish this document.  Users
+     may also  wish  to  extend  the  set  of  fields  that  they  use
+     privately.  Such "user-defined fields" are permitted.
+
+          The framework severely constrains document tone and  appear-
+     ance and is primarily useful for most intra-organization communi-
+     cations and  well-structured   inter-organization  communication.
+     It  also  can  be used for some types of inter-process communica-
+     tion, such as simple file transfer and remote job entry.  A  more
+     robust  framework might allow for multi-font, multi-color, multi-
+     dimension encoding of information.  A  less  robust  one,  as  is
+     present  in  most  single-machine  message  systems,  would  more
+     severely constrain the ability to add fields and the decision  to
+     include specific fields.  In contrast with paper-based communica-
+     tion, it is interesting to note that the RECEIVER  of  a  message
+     can   exercise  an  extraordinary  amount  of  control  over  the
+     message's appearance.  The amount of actual control available  to
+     message  receivers  is  contingent upon the capabilities of their
+     individual message systems.
+
+
+
+
+
+     August 13, 1982               - 2 -                      RFC #822
+
+
+ 
+     Standard for ARPA Internet Text Messages
+
+
+     2.  NOTATIONAL CONVENTIONS
+
+          This specification uses an augmented Backus-Naur Form  (BNF)
+     notation.  The differences from standard BNF involve naming rules
+     and indicating repetition and "local" alternatives.
+
+     2.1.  RULE NAMING
+
+          Angle brackets ("<", ">") are not  used,  in  general.   The
+     name  of  a rule is simply the name itself, rather than "<name>".
+     Quotation-marks enclose literal text (which may be  upper  and/or
+     lower  case).   Certain  basic  rules  are  in uppercase, such as
+     SPACE, TAB, CRLF, DIGIT, ALPHA, etc.  Angle brackets are used  in
+     rule  definitions,  and  in  the rest of this  document, whenever
+     their presence will facilitate discerning the use of rule names.
+
+     2.2.  RULE1 / RULE2:  ALTERNATIVES
+
+          Elements separated by slash ("/") are alternatives.   There-
+     fore "foo / bar" will accept foo or bar.
+
+     2.3.  (RULE1 RULE2):  LOCAL ALTERNATIVES
+
+          Elements enclosed in parentheses are  treated  as  a  single
+     element.   Thus,  "(elem  (foo  /  bar)  elem)"  allows the token
+     sequences "elem foo elem" and "elem bar elem".
+
+     2.4.  *RULE:  REPETITION
+
+          The character "*" preceding an element indicates repetition.
+     The full form is:
+
+                              <l>*<m>element
+
+     indicating at least <l> and at most <m> occurrences  of  element.
+     Default values are 0 and infinity so that "*(element)" allows any
+     number, including zero; "1*element" requires at  least  one;  and
+     "1*2element" allows one or two.
+
+     2.5.  [RULE]:  OPTIONAL
+
+          Square brackets enclose optional elements; "[foo  bar]"   is
+     equivalent to "*1(foo bar)".
+
+     2.6.  NRULE:  SPECIFIC REPETITION
+
+          "<n>(element)" is equivalent to "<n>*<n>(element)"; that is,
+     exactly  <n>  occurrences  of (element). Thus 2DIGIT is a 2-digit
+     number, and 3ALPHA is a string of three alphabetic characters.
+
+
+     August 13, 1982               - 3 -                      RFC #822
+
+
+ 
+     Standard for ARPA Internet Text Messages
+
+
+     2.7.  #RULE:  LISTS
+
+          A construct "#" is defined, similar to "*", as follows:
+
+                              <l>#<m>element
+
+     indicating at least <l> and at most <m> elements, each  separated
+     by  one  or more commas (","). This makes the usual form of lists
+     very easy; a rule such as '(element *("," element))' can be shown
+     as  "1#element".   Wherever this construct is used, null elements
+     are allowed, but do not  contribute  to  the  count  of  elements
+     present.   That  is,  "(element),,(element)"  is  permitted,  but
+     counts as only two elements.  Therefore, where at least one  ele-
+     ment  is required, at least one non-null element must be present.
+     Default values are 0 and infinity so that "#(element)" allows any
+     number,  including  zero;  "1#element" requires at least one; and
+     "1#2element" allows one or two.
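+
+          The counting rule for null elements can be sketched in Python
+     as follows.  This is an illustration, not part of the standard: the
+     name "split_list" is hypothetical, and the sketch assumes that no
+     element contains a comma inside a quoted-string or a comment.
+
```python
def split_list(text, least=0, most=None):
    """Split a "#" list on commas, dropping null elements before counting."""
    items = [part.strip() for part in text.split(",")]
    items = [item for item in items if item]  # null elements do not count
    if len(items) < least or (most is not None and len(items) > most):
        raise ValueError("list has %d non-null elements" % len(items))
    return items
```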
+
+     2.8.  ; COMMENTS
+
+          A semi-colon, set off some distance to  the  right  of  rule
+     text,  starts  a comment that continues to the end of line.  This
+     is a simple way of including useful notes in  parallel  with  the
+     specifications.
+
+     3.  LEXICAL ANALYSIS OF MESSAGES
+
+     3.1.  GENERAL DESCRIPTION
+
+          A message consists of header fields and, optionally, a body.
+     The  body  is simply a sequence of lines containing ASCII charac-
+     ters.  It is separated from the headers by a null line  (i.e.,  a
+     line with nothing preceding the CRLF).
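+
+          A minimal Python sketch of this separation, assuming the whole
+     message is held in one string with CRLF line endings and has at
+     least one header field.  The function name "split_message" is
+     hypothetical.
+
```python
def split_message(raw):
    """Return (headers, body); body is None when there is no null line."""
    head, null_line, body = raw.partition("\r\n\r\n")
    if not null_line:
        # No null line was found, so the message consists of headers only.
        return raw, None
    # Give the last header field back the CRLF that terminated it.
    return head + "\r\n", body
```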
+
+     3.1.1.  LONG HEADER FIELDS
+
+        Each header field can be viewed as a single, logical  line  of
+        ASCII  characters,  comprising  a field-name and a field-body.
+        For convenience, the field-body  portion  of  this  conceptual
+        entity  can be split into a multiple-line representation; this
+        is called "folding".  The general rule is that wherever  there
+        may  be  linear-white-space  (NOT  simply  LWSP-chars), a CRLF
+        immediately followed by AT LEAST one LWSP-char may instead  be
+        inserted.  Thus, the single line
+
+            To:  "Joe & J. Harvey" <ddd @Org>, JJV @ BBN
+
+        can be represented as:
+
+            To:  "Joe & J. Harvey" <ddd @ Org>,
+                    JJV@BBN
+
+        and
+
+            To:  "Joe & J. Harvey"
+                            <ddd@ Org>, JJV
+             @BBN
+
+        and
+
+            To:  "Joe &
+             J. Harvey" <ddd @ Org>, JJV @ BBN
+
+             The process of moving  from  this  folded   multiple-line
+        representation  of a header field to its single line represen-
+        tation is called "unfolding".  Unfolding  is  accomplished  by
+        regarding   CRLF   immediately  followed  by  a  LWSP-char  as
+        equivalent to the LWSP-char.
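+
+        The unfolding rule can be sketched in Python as shown here.  This
+        is an illustration under the assumption that the field is held
+        in a string with CRLF line endings; the name "unfold" is not
+        defined by the standard.
+
```python
import re

# A CRLF counts as a fold only when a SPACE or HTAB immediately follows it.
_FOLD = re.compile(r"\r\n(?=[ \t])")


def unfold(field):
    """Drop each folding CRLF, keeping the LWSP-char that follows it."""
    return _FOLD.sub("", field)
```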
+
+        Note:  While the standard  permits  folding  wherever  linear-
+               white-space is permitted, it is recommended that struc-
+               tured fields, such as those containing addresses, limit
+               folding  to higher-level syntactic breaks.  For address
+               fields, it  is  recommended  that  such  folding  occur
+               between addresses, after the separating comma.
+
+     3.1.2.  STRUCTURE OF HEADER FIELDS
+
+        Once a field has been unfolded, it may be viewed as being com-
+        posed of a field-name followed by a colon (":"), followed by a
+        field-body, and  terminated  by  a  carriage-return/line-feed.
+        The  field-name must be composed of printable ASCII characters
+        (i.e., characters that  have  values  between  33.  and  126.,
+        decimal, except colon).  The field-body may be composed of any
+        ASCII characters, except CR or LF.  (While CR and/or LF may be
+        present  in the actual text, they are removed by the action of
+        unfolding the field.)
+
+        Certain field-bodies of headers may be  interpreted  according
+        to  an  internal  syntax  that some systems may wish to parse.
+        These  fields  are  called  "structured   fields".    Examples
+        include  fields containing dates and addresses.  Other fields,
+        such as "Subject"  and  "Comments",  are  regarded  simply  as
+        strings of text.
+
+        Note:  Any field which has a field-body  that  is  defined  as
+               other  than  simply <text> is to be treated as a struc-
+               tured field.
+
+               Field-names, unstructured field bodies  and  structured
+               field bodies each are scanned by their own, independent
+               "lexical" analyzers.
+
+     3.1.3.  UNSTRUCTURED FIELD BODIES
+
+        For some fields, such as "Subject" and "Comments",  no  struc-
+        turing  is assumed, and they are treated simply as <text>s, as
+        in the message body.  Rules of folding apply to these  fields,
+        so  that  such  field  bodies  which occupy several lines must
+        therefore have the second and successive lines indented by  at
+        least one LWSP-char.
+
+     3.1.4.  STRUCTURED FIELD BODIES
+
+        To aid in the creation and reading of structured  fields,  the
+        free  insertion   of linear-white-space (which permits folding
+        by inclusion of CRLFs)  is  allowed  between  lexical  tokens.
+        Rather  than  obscuring  the  syntax  specifications for these
+        structured fields with explicit syntax for this  linear-white-
+        space, the existence of another "lexical" analyzer is assumed.
+        This analyzer does not apply  for  unstructured  field  bodies
+        that  are  simply  strings  of  text, as described above.  The
+        analyzer provides  an  interpretation  of  the  unfolded  text
+        composing  the body of the field as a sequence of lexical sym-
+        bols.
+
+        These symbols are:
+
+                     -  individual special characters
+                     -  quoted-strings
+                     -  domain-literals
+                     -  comments
+                     -  atoms
+
+        The first four of these symbols  are  self-delimiting.   Atoms
+        are not; they are delimited by the self-delimiting symbols and
+        by  linear-white-space.   For  the  purposes  of  regenerating
+        sequences  of  atoms  and quoted-strings, exactly one SPACE is
+        assumed to exist, and should be used, between them.  (Also, in
+        the "Clarifications" section on "White Space", below, note the
+        rules about treatment of multiple contiguous LWSP-chars.)
+
+        So, for example, the folded body of an address field
+
+            ":sysmail"@  Some-Group. Some-Org,
+            Muhammed.(I am  the greatest) Ali @(the)Vegas.WBA
+
+        is analyzed into the following lexical symbols and types:
+
+                    :sysmail              quoted string
+                    @                     special
+                    Some-Group            atom
+                    .                     special
+                    Some-Org              atom
+                    ,                     special
+                    Muhammed              atom
+                    .                     special
+                    (I am  the greatest)  comment
+                    Ali                   atom
+                    @                     special
+                    (the)                 comment
+                    Vegas                 atom
+                    .                     special
+                    WBA                   atom
+
+        The canonical representations for the data in these  addresses
+        are the following strings:
+
+                        ":sysmail"@Some-Group.Some-Org
+
+        and
+
+                            Muhammed.Ali@Vegas.WBA
+
+        Note:  For purposes of display, and when passing  such  struc-
+               tured information to other systems, such as mail proto-
+               col  services,  there  must  be  NO  linear-white-space
+               between  <word>s  that are separated by period (".") or
+               at-sign ("@") and exactly one SPACE between  all  other
+               <word>s.  Also, headers should be in a folded form.
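+
+        The lexical analysis described in this section can be sketched
+        in Python.  The sketch is illustrative rather than normative:
+        the function names are hypothetical, each token keeps its
+        delimiters, control characters are skipped like white space,
+        and "@" is reported as a special because it is a member of
+        <specials>.
+
```python
SPECIALS = set('()<>@,;:\\".[]')


def _scan_delimited(body, start, close):
    """Return the index of the first unquoted `close` after `start`."""
    j = start + 1
    while j < len(body) and body[j] != close:
        if body[j] == "\\" and j + 1 < len(body):
            j += 1  # step over the character quoted by the backslash
        j += 1
    if j >= len(body):
        raise ValueError("unterminated token starting at %d" % start)
    return j


def _scan_comment(body, start):
    """Return the index of the ")" that closes the comment at `start`."""
    depth, j = 0, start
    while j < len(body):
        ch = body[j]
        if ch == "\\" and j + 1 < len(body):
            j += 1  # quoted-pair inside the comment
        elif ch == "(":
            depth += 1  # comments nest
        elif ch == ")":
            depth -= 1
            if depth == 0:
                return j
        j += 1
    raise ValueError("unterminated comment starting at %d" % start)


def tokenize(body):
    """Split a structured field body into (text, type) pairs."""
    tokens, i = [], 0
    while i < len(body):
        ch = body[i]
        if ch <= " " or ch == "\x7f":
            i += 1  # white space and control characters only delimit
        elif ch == '"':
            end = _scan_delimited(body, i, '"')
            tokens.append((body[i:end + 1], "quoted-string"))
            i = end + 1
        elif ch == "[":
            end = _scan_delimited(body, i, "]")
            tokens.append((body[i:end + 1], "domain-literal"))
            i = end + 1
        elif ch == "(":
            end = _scan_comment(body, i)
            tokens.append((body[i:end + 1], "comment"))
            i = end + 1
        elif ch in SPECIALS:
            tokens.append((ch, "special"))
            i += 1
        else:
            j = i
            while (j < len(body) and body[j] > " " and body[j] != "\x7f"
                   and body[j] not in SPECIALS):
                j += 1
            tokens.append((body[i:j], "atom"))
            i = j
    return tokens
```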
+
+     3.2.  HEADER FIELD DEFINITIONS
+
+          These rules show a field meta-syntax, without regard for the
+     particular  type  or internal syntax.  Their purpose is to permit
+     detection of fields; also, they present to  higher-level  parsers
+     an image of each field as fitting on one line.
+
+     field       =  field-name ":" [ field-body ] CRLF
+
+     field-name  =  1*<any CHAR, excluding CTLs, SPACE, and ":">
+
+     field-body  =  field-body-contents
+                    [CRLF LWSP-char field-body]
+
+     field-body-contents =
+                   <the ASCII characters making up the field-body, as
+                    defined in the following sections, and consisting
+                    of combinations of atom, quoted-string, and
+                    specials tokens, or else consisting of texts>
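+
+          As a hedged illustration of the <field> rule, the Python
+     sketch below splits one field line into its field-name and
+     field-body.  It assumes the line has already been unfolded, and the
+     name "parse_field" is hypothetical.
+
```python
def parse_field(line):
    """Split one unfolded field line into (field-name, field-body)."""
    name, colon, body = line.partition(":")
    if not colon or not name:
        raise ValueError("not a header field: %r" % line)
    # field-name excludes CTLs, SPACE and ":" (the colon ended the name).
    if any(ch <= " " or ch == "\x7f" for ch in name):
        raise ValueError("illegal character in field-name: %r" % name)
    if body.endswith("\r\n"):
        body = body[:-2]  # the terminating CRLF is not part of the body
    return name, body
```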
+
+     3.3.  LEXICAL TOKENS
+
+          The following rules are used to define an underlying lexical
+     analyzer,  which  feeds  tokens to higher level parsers.  See the
+     ANSI references, in the Bibliography.
+
+                                                 ; (  Octal, Decimal.)
+     CHAR        =  <any ASCII character>        ; (  0-177,  0.-127.)
+     ALPHA       =  <any ASCII alphabetic character>
+                                                 ; (101-132, 65.- 90.)
+                                                 ; (141-172, 97.-122.)
+     DIGIT       =  <any ASCII decimal digit>    ; ( 60- 71, 48.- 57.)
+     CTL         =  <any ASCII control           ; (  0- 37,  0.- 31.)
+                     character and DEL>          ; (    177,     127.)
+     CR          =  <ASCII CR, carriage return>  ; (     15,      13.)
+     LF          =  <ASCII LF, linefeed>         ; (     12,      10.)
+     SPACE       =  <ASCII SP, space>            ; (     40,      32.)
+     HTAB        =  <ASCII HT, horizontal-tab>   ; (     11,       9.)
+     <">         =  <ASCII quote mark>           ; (     42,      34.)
+     CRLF        =  CR LF
+
+     LWSP-char   =  SPACE / HTAB                 ; semantics = SPACE
+
+     linear-white-space =  1*([CRLF] LWSP-char)  ; semantics = SPACE
+                                                 ; CRLF => folding
+
+     specials    =  "(" / ")" / "<" / ">" / "@"  ; Must be in quoted-
+                 /  "," / ";" / ":" / "\" / <">  ;  string, to use
+                 /  "." / "[" / "]"              ;  within a word.
+
+     delimiters  =  specials / linear-white-space / comment
+
+     text        =  <any CHAR, including bare    ; => atoms, specials,
+                     CR & bare LF, but NOT       ;  comments and
+                     including CRLF>             ;  quoted-strings are
+                                                 ;  NOT recognized.
+
+     atom        =  1*<any CHAR except specials, SPACE and CTLs>
+
+     quoted-string = <"> *(qtext/quoted-pair) <">; Regular qtext or
+                                                 ;   quoted chars.
+
+     qtext       =  <any CHAR excepting <">,     ; => may be folded
+                     "\" & CR, and including
+                     linear-white-space>
+
+     domain-literal =  "[" *(dtext / quoted-pair) "]"
+
+     dtext       =  <any CHAR excluding "[",     ; => may be folded
+                     "]", "\" & CR, & including
+                     linear-white-space>
+
+     comment     =  "(" *(ctext / quoted-pair / comment) ")"
+
+     ctext       =  <any CHAR excluding "(",     ; => may be folded
+                     ")", "\" & CR, & including
+                     linear-white-space>
+
+     quoted-pair =  "\" CHAR                     ; may quote any char
+
+     phrase      =  1*word                       ; Sequence of words
+
+     word        =  atom / quoted-string
+
+
+     3.4.  CLARIFICATIONS
+
+     3.4.1.  QUOTING
+
+        Some characters are reserved for special interpretation,  such
+        as  delimiting lexical tokens.  To permit use of these charac-
+        ters as uninterpreted data, a quoting mechanism  is  provided.
+        To quote a character, precede it with a backslash ("\").
+
+        This mechanism is not fully general.  Characters may be quoted
+        only  within  a subset of the lexical constructs.  In particu-
+        lar, quoting is limited to use within:
+
+                             -  quoted-string
+                             -  domain-literal
+                             -  comment
+
+        Within these constructs, quoting is REQUIRED for  CR  and  "\"
+        and for the character(s) that delimit the token (e.g., "(" and
+        ")" for a comment).  However, quoting  is  PERMITTED  for  any
+        character.
+
+        Note:  In particular, quoting is NOT permitted  within  atoms.
+               For  example  when  the local-part of an addr-spec must
+               contain a special character, a quoted  string  must  be
+               used.  Therefore, a specification such as:
+
+                            Full\ Name@Domain
+
+               is not legal and must be specified as:
+
+                            "Full Name"@Domain
+
+     3.4.2.  WHITE SPACE
+
+        Note:  In structured field bodies, multiple linear space ASCII
+               characters  (namely  HTABs  and  SPACEs) are treated as
+               single spaces and may freely surround any  symbol.   In
+               all header fields, the only place in which at least one
+               LWSP-char is REQUIRED is at the beginning of  continua-
+               tion lines in a folded field.
+
+        When passing text to processes  that  do  not  interpret  text
+        according to this standard (e.g., mail protocol servers), then
+        NO linear-white-space characters should occur between a period
+        (".") or at-sign ("@") and a <word>.  Exactly ONE SPACE should
+        be used in place of arbitrary linear-white-space  and  comment
+        sequences.
+
+        Note:  Within systems conforming to this standard, wherever  a
+               member of the list of delimiters is allowed, LWSP-chars
+               may also occur before and/or after it.
+
+        Writers of  mail-sending  (i.e.,  header-generating)  programs
+        should realize that there is no network-wide definition of the
+        effect of ASCII HT (horizontal-tab) characters on the  appear-
+        ance  of  text  at another network host; therefore, the use of
+        tabs in message headers, though permitted, is discouraged.
+
+     3.4.3.  COMMENTS
+
+        A comment is a set of ASCII characters, which is  enclosed  in
+        matching  parentheses  and which is not within a quoted-string.
+        The comment construct permits message originators to add  text
+        which  will  be  useful  for  human readers, but which will be
+        ignored by the formal semantics.  Comments should be  retained
+        while  the  message  is subject to interpretation according to
+        this standard.  However, comments  must  NOT  be  included  in
+        other  cases,  such  as  during  protocol  exchanges with mail
+        servers.
+
+        Comments nest, so that if an unquoted left parenthesis  occurs
+        in  a  comment  string,  there  must  also be a matching right
+        parenthesis.  When a comment acts as the delimiter  between  a
+        sequence of two lexical symbols, such as two atoms, it is lex-
+        ically equivalent with a single SPACE,  for  the  purposes  of
+        regenerating  the  sequence, such as when passing the sequence
+        onto a mail protocol server.  Comments are  detected  as  such
+        only within field-bodies of structured fields.
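+
+        One way to apply these rules when regenerating a structured
+        field body is sketched below in Python.  It is an illustration
+        only: the name "strip_comments" is hypothetical, each outermost
+        comment is replaced by one SPACE, and any white space that
+        surrounded the comment is left for a later step to collapse.
+
```python
def strip_comments(body):
    """Replace each outermost comment in a structured field body by a SPACE."""
    out = []
    depth = 0          # current comment nesting level
    in_quotes = False  # inside a quoted-string, "(" is ordinary data
    i = 0
    while i < len(body):
        ch = body[i]
        if ch == "\\" and i + 1 < len(body) and (depth or in_quotes):
            # quoted-pair: copied inside a quoted-string, dropped in a comment
            if not depth:
                out.append(body[i:i + 2])
            i += 2
            continue
        if in_quotes:
            out.append(ch)
            if ch == '"':
                in_quotes = False
        elif ch == "(":
            depth += 1
        elif ch == ")" and depth:
            depth -= 1
            if depth == 0:
                out.append(" ")
        elif depth == 0:
            if ch == '"':
                in_quotes = True
            out.append(ch)
        i += 1
    if depth:
        raise ValueError("unbalanced parentheses in comment")
    return "".join(out)
```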
+
+        If a comment is to be "folded" onto multiple lines,  then  the
+        syntax  for  folding  must  be  adhered to.  (See the "Lexical
+        Analysis of Messages" section on "Folding Long Header  Fields"
+        above,  and  the  section on "Case Independence" below.)  Note
+        that  the  official  semantics  therefore  do  not  "see"  any
+        unquoted CRLFs that are in comments, although particular pars-
+        ing programs may wish to note their presence.  For these  pro-
+        grams,  it would be reasonable to interpret a "CRLF LWSP-char"
+        as being a CRLF that is part of the comment; i.e., the CRLF is
+        kept  and  the  LWSP-char is discarded.  Quoted CRLFs (i.e., a
+        backslash followed by a CR followed by a  LF)  still  must  be
+        followed by at least one LWSP-char.
+
+     3.4.4.  DELIMITING AND QUOTING CHARACTERS
+
+        The quote character (backslash) and  characters  that  delimit
+        syntactic  units  are not, generally, to be taken as data that
+        are part of the delimited or quoted unit(s).   In  particular,
+        the   quotation-marks   that   define   a  quoted-string,  the
+        parentheses that define  a  comment  and  the  backslash  that
+        quotes  a  following  character  are  NOT  part of the quoted-
+        string, comment or quoted character.  A quotation-mark that is
+        to  be  part  of  a quoted-string, a parenthesis that is to be
+        part of a comment and a backslash that is to be part of either
+        must  each be preceded by the quote-character backslash ("\").
+        Note that the syntax allows any character to be quoted  within
+        a  quoted-string  or  comment; however only certain characters
+        MUST be quoted to be included as data.  These  characters  are
+        the  ones that are not part of the alternate text group (i.e.,
+        ctext or qtext).
+
+        The one exception to this rule  is  that  a  single  SPACE  is
+        assumed  to  exist  between  contiguous words in a phrase, and
+        this interpretation is independent of  the  actual  number  of
+        LWSP-chars  that  the  creator  places  between the words.  To
+        include more than one SPACE, the creator must make  the  LWSP-
+        chars be part of a quoted-string.
+
+        Quotation marks that delimit a quoted string  and  backslashes
+        that  quote  the  following character should NOT accompany the
+        quoted-string when the string is passed to processes  that  do
+        not interpret data according to this specification (e.g., mail
+        protocol servers).
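+
+        A minimal Python sketch of removing the delimiters and quoting
+        backslashes from a quoted-string before handing its data to such
+        a process.  The name "unquote" is hypothetical, and the input is
+        assumed to be a single well-formed quoted-string.
+
```python
def unquote(quoted):
    """Return the data of a quoted-string without quotes or backslashes."""
    if len(quoted) < 2 or quoted[0] != '"' or quoted[-1] != '"':
        raise ValueError("not a quoted-string: %r" % quoted)
    inner = quoted[1:-1]
    out = []
    i = 0
    while i < len(inner):
        if inner[i] == "\\" and i + 1 < len(inner):
            i += 1  # drop the backslash, keep the character it quotes
        out.append(inner[i])
        i += 1
    return "".join(out)
```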
+
+     3.4.5.  QUOTED-STRINGS
+
+        Where permitted (i.e., in words in structured fields)  quoted-
+        strings  are  treated  as a single symbol.  That is, a quoted-
+        string is equivalent to an atom, syntactically.  If a  quoted-
+        string  is to be "folded" onto multiple lines, then the syntax
+        for folding must be adhered to.  (See the "Lexical Analysis of
+        Messages"  section  on "Folding Long Header Fields" above, and
+        the section on "Case  Independence"  below.)   Therefore,  the
+        official  semantics  do  not  "see" any bare CRLFs that are in
+        quoted-strings; however particular parsing programs  may  wish
+        to  note  their presence.  For such programs, it would be rea-
+        sonable to interpret a "CRLF LWSP-char" as being a CRLF  which
+        is  part  of the quoted-string; i.e., the CRLF is kept and the
+        LWSP-char is discarded.  Quoted CRLFs (i.e., a backslash  fol-
+        lowed  by  a CR followed by a LF) are also subject to rules of
+        folding, but the presence of the quoting character (backslash)
+        explicitly  indicates  that  the  CRLF  is  data to the quoted
+        string.  Stripping off the first following LWSP-char  is  also
+        appropriate when parsing quoted CRLFs.
+
+     3.4.6.  BRACKETING CHARACTERS
+
+        There is one type of bracket which must occur in matched pairs
+        and may have pairs nested within each other:
+
+            o   Parentheses ("(" and ")") are used  to  indicate  com-
+                ments.
+
+        There are three types of brackets which must occur in  matched
+        pairs, and which may NOT be nested:
+
+            o   Colon/semi-colon (":" and ";") are   used  in  address
+                specifications  to  indicate that the included list of
+                addresses are to be treated as a group.
+
+            o   Angle brackets ("<" and ">")  are  generally  used  to
+                indicate the presence of one machine-usable reference
+                (e.g., delimiting mailboxes), possibly including
+                source-routing to the machine.
+
+            o   Square brackets ("[" and "]") are used to indicate the
+                presence  of  a  domain-literal, which the appropriate
+                name-domain  is  to  use  directly,  bypassing  normal
+                name-resolution mechanisms.
+
+     3.4.7.  CASE INDEPENDENCE
+
+        Except as noted, alphabetic strings may be represented in  any
+        combination of upper and lower case.  The only syntactic units
+        which require preservation of case information are:
+
+                    -  text
+                    -  qtext
+                    -  dtext
+                    -  ctext
+                    -  quoted-pair
+                    -  local-part, except "Postmaster"
+
+        When matching any other syntactic unit, case is to be ignored.
+        For  example, the field-names "From", "FROM", "from", and even
+        "FroM" are semantically equal and should all be treated ident-
+        ically.
+
+        When generating these units, any mix of upper and  lower  case
+        alphabetic  characters  may  be  used.  The case shown in this
+        specification is suggested for message-creating processes.
+
+        Note:  The reserved local-part address unit, "Postmaster",  is
+               an  exception.   When  the  value "Postmaster" is being
+               interpreted, it must be  accepted  in  any  mixture  of
+               case, including "POSTMASTER", and "postmaster".
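+
+        These matching rules can be sketched in Python.  The function
+        names are hypothetical, and the sketch assumes plain ASCII
+        input, for which str.lower() is an adequate case fold.
+
```python
def same_field_name(a, b):
    """Field-names match without regard to case."""
    return a.lower() == b.lower()


def same_local_part(a, b):
    """Local-parts are case-sensitive, except the reserved "Postmaster"."""
    if a.lower() == "postmaster":
        return b.lower() == "postmaster"
    return a == b
```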
+
+     3.4.8.  FOLDING LONG HEADER FIELDS
+
+        Each header field may be represented on exactly one line  con-
+        sisting  of the name of the field and its body, and terminated
+        by a CRLF; this is what the parser sees.  For readability, the
+        field-body  portion of long header fields may be "folded" onto
+        multiple lines of the actual field.  "Long" is commonly inter-
+        preted  to  mean greater than 65 or 72 characters.  The former
+        length serves as a limit, when the message is to be viewed  on
+        most  simple terminals which use simple display software; how-
+        ever, the limit is not imposed by this standard.
+
+        Note:  Some display software often can selectively fold lines,
+               to  suit  the display terminal.  In such cases, sender-
+               provided  folding  can  interfere  with   the   display
+               software.
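+
+        A sender-side folding step might look like the Python sketch
+        below.  It is an illustration only: the name "fold" and the
+        default limit of 72 are assumptions, the input is an unfolded
+        field without its final CRLF, and the sketch folds at any SPACE
+        or HTAB rather than only at higher-level syntactic breaks.
+
```python
def fold(field, limit=72):
    """Fold a field line at white space so that lines fit within `limit`."""
    lines = []
    rest = field
    while len(rest) > limit:
        # Last SPACE or HTAB at or before the limit, ignoring position 0.
        cut = max(rest.rfind(" ", 1, limit + 1), rest.rfind("\t", 1, limit + 1))
        if cut <= 0:
            break  # no white space available, leave the remainder as is
        lines.append(rest[:cut])
        rest = rest[cut:]  # the continuation line begins with the LWSP-char
    lines.append(rest)
    return "\r\n".join(lines)
```
+
+        Because every continuation line begins with the LWSP-char at
+        which the line was cut, deleting the inserted CRLFs restores the
+        original field.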
+
+     3.4.9.  BACKSPACE CHARACTERS
+
+        ASCII BS characters (Backspace, decimal 8) may be included  in
+        texts and quoted-strings to effect overstriking.  However, any
+        use of backspaces which effects an overstrike to the  left  of
+        the beginning of the text or quoted-string is prohibited.
+
+     3.4.10.  NETWORK-SPECIFIC TRANSFORMATIONS
+
+        During transmission through heterogeneous networks, it may  be
+        necessary  to  force data to conform to a network's local con-
+        ventions.  For example, it may be required that a CR  be  fol-
+        lowed  either by LF, making a CRLF, or by <null>, if the CR is
+        to stand alone.  Such transformations are reversed, when  the
+        message exits that network.
+
+        When  crossing  network  boundaries,  the  message  should  be
+        treated  as  passing  through  two modules.  It will enter the
+        first module containing whatever network-specific  transforma-
+        tions  that  were  necessary  to  permit migration through the
+        "current" network.  It then passes through the modules:
+
+            o   Transformation Reversal
+
+                The "current" network's idiosyncracies are removed and
+                the  message  is returned to the canonical form speci-
+                fied in this standard.
+
+            o   Transformation
+
+                The "next" network's local idiosyncracies are  imposed
+                on the message.
+
+                                ------------------
+                    From   ==>  | Remove Net-A   |
+                    Net-A       | idiosyncracies |
+                                ------------------
+                                       ||
+                                       \/
+                                  Conformance
+                                  with standard
+                                       ||
+                                       \/
+                                ------------------
+                                | Impose Net-B   |  ==>  To
+                                | idiosyncracies |       Net-B
+                                ------------------
+
+
+     4.  MESSAGE SPECIFICATION
+
+     4.1.  SYNTAX
+
+     Note:  Due to an artifact of the notational conventions, the
+            syntax indicates that, when present, some fields must be in
+            a particular order.  Header fields  are  NOT  required  to
+            occur  in  any  particular  order, except that the message
+            body must occur AFTER  the  headers.   It  is  recommended
+            that,  if  present,  headers be sent in the order "Return-
+            Path", "Received", "Date",  "From",  "Subject",  "Sender",
+            "To", "cc", etc.
+
+            This specification permits multiple  occurrences  of  most
+            fields.   Except  as  noted,  their  interpretation is not
+            specified here, and their use is discouraged.
+
+          The following syntax for the bodies of various fields should
+     be  thought  of  as  describing  each field body as a single long
+     string (or line).  The "Lexical Analysis of Messages" section  on
+     "Long  Header Fields", above, indicates how such long strings can
+     be represented on more than one line in  the  actual  transmitted
+     message.
+
+     message     =  fields *( CRLF *text )       ; Everything after
+                                                 ;  first null line
+                                                 ;  is message body
+
+     fields      =    dates                      ; Creation time,
+                      source                     ;  author id & one
+                    1*destination                ;  address required
+                     *optional-field             ;  others optional
+
+     source      = [  trace ]                    ; net traversals
+                      originator                 ; original mail
+                   [  resent ]                   ; forwarded
+
+     trace       =    return                     ; path to sender
+                    1*received                   ; receipt tags
+
+     return      =  "Return-path" ":" route-addr ; return address
+
+     received    =  "Received"    ":"            ; one per relay
+                       ["from" domain]           ; sending host
+                       ["by"   domain]           ; receiving host
+                       ["via"  atom]             ; physical path
+                      *("with" atom)             ; link/mail protocol
+                       ["id"   msg-id]           ; receiver msg id
+                       ["for"  addr-spec]        ; initial form
+                        ";"    date-time         ; time received
+
+     originator  =   authentic                   ; authenticated addr
+                   [ "Reply-To"   ":" 1#address]
+
+     authentic   =   "From"       ":"   mailbox  ; Single author
+                 / ( "Sender"     ":"   mailbox  ; Actual submittor
+                     "From"       ":" 1#mailbox) ; Multiple authors
+                                                 ;  or not sender
+
+     resent      =   resent-authentic
+                   [ "Resent-Reply-To"  ":" 1#address]
+
+     resent-authentic =
+                     "Resent-From"      ":"   mailbox
+                 / ( "Resent-Sender"    ":"   mailbox
+                     "Resent-From"      ":" 1#mailbox  )
+
+     dates       =   orig-date                   ; Original
+                   [ resent-date ]               ; Forwarded
+
+     orig-date   =  "Date"        ":"   date-time
+
+     resent-date =  "Resent-Date" ":"   date-time
+
+     destination =  "To"          ":" 1#address  ; Primary
+                 /  "Resent-To"   ":" 1#address
+                 /  "cc"          ":" 1#address  ; Secondary
+                 /  "Resent-cc"   ":" 1#address
+                 /  "bcc"         ":"  #address  ; Blind carbon
+                 /  "Resent-bcc"  ":"  #address
+
+     optional-field =
+                    "Message-ID"        ":"   msg-id
+                 /  "Resent-Message-ID" ":"   msg-id
+                 /  "In-Reply-To"       ":"  *(phrase / msg-id)
+                 /  "References"        ":"  *(phrase / msg-id)
+                 /  "Keywords"          ":"  #phrase
+                 /  "Subject"           ":"  *text
+                 /  "Comments"          ":"  *text
+                 /  "Encrypted"         ":" 1#2word
+                 /  extension-field              ; To be defined
+                 /  user-defined-field           ; May be pre-empted
+
+     msg-id      =  "<" addr-spec ">"            ; Unique message id
+
+     extension-field =
+                   <Any field which is defined in a document
+                    published as a formal extension to this
+                    specification; none will have names beginning
+                    with the string "X-">
+
+     user-defined-field =
+                   <Any field which has not been defined
+                    in this specification or published as an
+                    extension to this specification; names for
+                    such fields must be unique and may be
+                    pre-empted by published extensions>
+
+     4.2.  FORWARDING
+
+          Some systems permit mail recipients to  forward  a  message,
+     retaining  the original headers, by adding some new fields.  This
+     standard supports such a service, through the "Resent-" prefix to
+     field names.
+
+          Whenever the string "Resent-" begins a field name, the field
+     has  the  same  semantics as a field whose name does not have the
+     prefix.  However, the message is assumed to have  been  forwarded
+     by  an original recipient who attached the "Resent-" field.  This
+     new field is treated as being more recent  than  the  equivalent,
+     original  field.   For  example, the "Resent-From" field indicates the
+     person that forwarded the message, whereas the "From" field indi-
+     cates the original author.
+
+          Use of such precedence  information  depends  upon  partici-
+     pants'  communication needs.  For example, this standard does not
+     dictate when a "Resent-From:" address should receive replies,  in
+     lieu of sending them to the "From:" address.
+
+     Note:  In general, the "Resent-" fields should be treated as con-
+            taining  a  set  of information that is independent of the
+            set of original fields.  Information for  one  set  should
+            not  automatically be taken from the other.  The interpre-
+            tation of multiple "Resent-" fields, of the same type,  is
+            undefined.
+
+          In the remainder of this specification, occurrences of legal
+     "Resent-"  fields  are treated identically with occurrences of
+     fields whose names do not contain this prefix.
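The treatment of "Resent-" fields in section 4.2 can be sketched as follows. This is a minimal Python illustration, not part of the standard: the function name `split_resent` and the representation of a header as a list of (name, body) pairs are assumptions made for the example. It keeps the original and "Resent-" sets apart, as the note in 4.2 recommends.

```python
def split_resent(fields):
    """Split (name, body) pairs into (original, resent) lists.

    Field names are compared without regard to case. The "Resent-"
    prefix is stripped from names in the resent list, so both sets
    can be looked up by the same base name.
    """
    prefix = "resent-"
    original, resent = [], []
    for name, body in fields:
        if name.lower().startswith(prefix):
            resent.append((name[len(prefix):], body))
        else:
            original.append((name, body))
    return original, resent


# Hypothetical header: one forwarded message with an original author.
fields = [
    ("From", "author@host.network"),
    ("Resent-From", "forwarder@host.network"),
    ("To", "first@host.network"),
]
original, resent = split_resent(fields)
```

The two lists are returned separately so that a caller never fills a missing "Resent-" value from the original set, or the reverse.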
+
+     4.3.  TRACE FIELDS
+
+          Trace information is used to provide an audit trail of  mes-
+     sage  handling.   In  addition,  it indicates a route back to the
+     sender of the message.
+
+          The list of known "via" and  "with"  values  is  registered
+     with  the  Network  Information  Center, SRI International, Menlo
+     Park, California.
+
+     4.3.1.  RETURN-PATH
+
+        This field  is  added  by  the  final  transport  system  that
+        delivers  the message to its recipient.  The field is intended
+        to contain definitive information about the address and  route
+        back to the message's originator.
+
+        Note:  The "Reply-To" field is added  by  the  originator  and
+               serves  to  direct  replies,  whereas the "Return-Path"
+               field is used to identify a path back to  the  origina-
+               tor.
+
+        While the syntax  indicates  that  a  route  specification  is
+        optional,  every attempt should be made to provide that infor-
+        mation in this field.
+
+     4.3.2.  RECEIVED
+
+        A copy of this field is added by each transport  service  that
+        relays the message.  The information in the field can be quite
+        useful for tracing transport problems.
+
+        The names of the sending  and  receiving  hosts  and  time-of-
+        receipt may be specified.  The "via" parameter may be used, to
+        indicate what physical mechanism the message  was  sent  over,
+        such  as  Arpanet or Phonenet, and the "with" parameter may be
+        used to indicate the mail-,  or  connection-,  level  protocol
+        that  was  used, such as the SMTP mail protocol, or X.25 tran-
+        sport protocol.
+
+        Note:  Several "with" parameters may  be  included,  to  fully
+               specify the set of protocols that were used.
+
+        Some transport services queue mail; the internal message iden-
+        tifier that is assigned to the message may be noted, using the
+        "id" parameter.  When the  sending  host  uses  a  destination
+        address specification that the receiving host reinterprets, by
+        expansion or transformation, the receiving host  may  wish  to
+        record  the original specification, using the "for" parameter.
+        For example, when a copy of mail is sent to the  member  of  a
+        distribution  list,  this  parameter may be used to record the
+        original address that was used to specify the list.
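As a rough sketch of how the "Received" parameters described above might be picked apart, the Python function below splits a field body into its keyword/value pairs and the trailing date-time. The name `parse_received` and the returned dictionary layout are assumptions for illustration. The "from" and "by" keywords come from the field syntax in section 4.1. Comments in parentheses and quoted strings are deliberately not handled.

```python
KEYWORDS = ("from", "by", "via", "with", "id", "for")


def parse_received(body):
    """Parse a Received field body into a dict (simplified sketch).

    The text after the last ";" is kept as the raw date-time string.
    Because "with" may occur several times, its values are collected
    in a list; every other keyword maps to a single string.
    """
    params_text, _, date_text = body.rpartition(";")
    result = {"with": [], "date-time": date_text.strip()}
    tokens = params_text.split()
    i = 0
    while i + 1 < len(tokens):
        key = tokens[i].lower()
        if key in KEYWORDS:
            value = tokens[i + 1]
            if key == "with":
                result["with"].append(value)
            else:
                result[key] = value
            i += 2
        else:
            i += 1  # skip anything this sketch does not understand
    return result


# Hypothetical field body; host names are made up.
info = parse_received(
    "from host-a.network by host-b.network via Arpanet with SMTP "
    "id <123@host-b.network> for list@host-b.network ; "
    "13 Aug 82 10:15:00 PDT"
)
```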
+
+     4.4.  ORIGINATOR FIELDS
+
+          The standard allows only a subset of the combinations possi-
+     ble  with the From, Sender, Reply-To, Resent-From, Resent-Sender,
+     and Resent-Reply-To fields.  The limitation is intentional.
+
+     4.4.1.  FROM / RESENT-FROM
+
+        This field contains the identity of the person(s)  who  wished
+        this  message to be sent.  The message-creation process should
+        default this field  to  be  a  single,  authenticated  machine
+        address,  indicating  the  AGENT  (person,  system or process)
+        entering the message.  If this is not done, the "Sender" field
+        MUST  be  present.  If the "From" field IS defaulted this way,
+        the "Sender" field is  optional  and  is  redundant  with  the
+        "From"  field.   In  all  cases, addresses in the "From" field
+        must be machine-usable (addr-specs) and may not contain  named
+        lists (groups).
+
+     4.4.2.  SENDER / RESENT-SENDER
+
+        This field contains the authenticated identity  of  the  AGENT
+        (person,  system  or  process)  that sends the message.  It is
+        intended for use when the sender is not the author of the mes-
+        sage,  or  to  indicate  who among a group of authors actually
+        sent the message.  If the contents of the "Sender" field would
+        be  completely  redundant  with  the  "From"  field,  then the
+        "Sender" field need not be present and its use is  discouraged
+        (though  still legal).  In particular, the "Sender" field MUST
+        be present if it is NOT the same as the "From" Field.
+
+        The Sender mailbox  specification  includes  a  word  sequence
+        which  must correspond to a specific agent (i.e., a human user
+        or a computer program) rather than a standard  address.   This
+        indicates  the  expectation  that  the field will identify the
+        single AGENT (person,  system,  or  process)  responsible  for
+        sending  the mail and not simply include the name of a mailbox
+        from which the mail was sent.  For example in the  case  of  a
+        shared login name, the name, by itself, would not be adequate.
+        The local-part address unit, which refers to  this  agent,  is
+        expected to be a computer system term, and not (for example) a
+        generalized person reference which can  be  used  outside  the
+        network text message context.
+
+        Since the critical function served by the  "Sender"  field  is
+        identification  of  the agent responsible for sending mail and
+        since computer programs cannot be held accountable  for  their
+        behavior, it is strongly recommended that when a computer pro-
+        gram generates a message, the HUMAN  who  is  responsible  for
+        that program be referenced as part of the "Sender" field mail-
+        box specification.
+
+     4.4.3.  REPLY-TO / RESENT-REPLY-TO
+
+        This field provides a general  mechanism  for  indicating  any
+        mailbox(es)  to which responses are to be sent.  Three typical
+        uses for this feature can  be  distinguished.   In  the  first
+        case,  the  author(s) may not have regular machine-based mail-
+        boxes and therefore wish(es) to indicate an alternate  machine
+        address.   In  the  second case, an author may wish additional
+        persons to be made aware of, or responsible for,  replies.   A
+        somewhat  different  use  may be of some help to "text message
+        teleconferencing" groups equipped with automatic  distribution
+        services:   include the address of that service in the "Reply-
+        To" field of all messages  submitted  to  the  teleconference;
+        then  participants  can  "reply"  to conference submissions to
+        guarantee the correct distribution of any submission of  their
+        own.
+
+        Note:  The "Return-Path" field is added by the mail  transport
+               service,  at the time of final delivery.  It is intended
+               to identify a path back to the originator of  the  mes-
+               sage.   The  "Reply-To"  field  is added by the message
+               originator and is intended to direct replies.
+
+     4.4.4.  AUTOMATIC USE OF FROM / SENDER / REPLY-TO
+
+        For systems which automatically  generate  address  lists  for
+        replies to messages, the following recommendations are made:
+
+            o   The "Sender" field mailbox should be sent  notices  of
+                any  problems in transport or delivery of the original
+                messages.  If there is no  "Sender"  field,  then  the
+                "From" field mailbox should be used.
+
+            o   The  "Sender"  field  mailbox  should  NEVER  be  used
+                automatically, in a recipient's reply message.
+
+            o   If the "Reply-To" field exists, then the reply  should
+                go to the addresses indicated in that field and not to
+                the address(es) indicated in the "From" field.
+
+            o   If there is a "From" field, but no  "Reply-To"  field,
+                the  reply should be sent to the address(es) indicated
+                in the "From" field.
+
+        Sometimes, a recipient may actually wish to  communicate  with
+        the  person  that  initiated  the  message  transfer.  In such
+        cases, it is reasonable to use the "Sender" address.
+
+        This recommendation is intended  only  for  automated  use  of
+        originator-fields  and is not intended to suggest that replies
+        may not also be sent to other recipients of messages.   It  is
+        up  to  the  respective  mail-handling programs to decide what
+        additional facilities will be provided.
+
+        Examples are provided in Appendix A.
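The recommendations in 4.4.4 amount to two small selection rules. A minimal Python sketch is given below. The function names and the representation of a header as a dictionary mapping field names to lists of address strings are assumptions made for this example.

```python
def reply_targets(headers):
    """Addresses an automatically generated reply should go to.

    "Reply-To" wins when present; otherwise "From" is used.
    "Sender" is never consulted here.
    """
    if headers.get("Reply-To"):
        return headers["Reply-To"]
    return headers.get("From", [])


def problem_notice_targets(headers):
    """Addresses that should be told about transport or delivery problems.

    "Sender" is preferred; "From" is the fallback.
    """
    if headers.get("Sender"):
        return headers["Sender"]
    return headers.get("From", [])


# Hypothetical message header.
headers = {
    "From": ["author@host.network"],
    "Sender": ["secretary@host.network"],
    "Reply-To": ["conference@host.network"],
}
replies = reply_targets(headers)
notices = problem_notice_targets(headers)
```

A recipient who explicitly wants to reach the person who initiated the transfer can still use the "Sender" address by hand; the sketch covers only the automatic case.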
+
+     4.5.  RECEIVER FIELDS
+
+     4.5.1.  TO / RESENT-TO
+
+        This field contains the identity of the primary recipients  of
+        the message.
+
+     4.5.2.  CC / RESENT-CC
+
+        This field contains the identity of  the  secondary  (informa-
+        tional) recipients of the message.
+
+     4.5.3.  BCC / RESENT-BCC
+
+        This field contains the identity of additional  recipients  of
+        the  message.   The contents of this field are not included in
+        copies of the message sent to the primary and secondary  reci-
+        pients.   Some  systems  may choose to include the text of the
+        "Bcc" field only in the author(s)'s  copy,  while  others  may
+        also include it in the text sent to all those indicated in the
+        "Bcc" list.
+
+     4.6.  REFERENCE FIELDS
+
+     4.6.1.  MESSAGE-ID / RESENT-MESSAGE-ID
+
+             This field contains a unique identifier  (the  local-part
+        address  unit)  which  refers to THIS version of THIS message.
+        The uniqueness of the message identifier is guaranteed by  the
+        host  which  generates  it.  This identifier is intended to be
+        machine readable and not necessarily meaningful to humans.   A
+        message  identifier pertains to exactly one instantiation of a
+        particular message; subsequent revisions to the message should
+        each receive new message identifiers.
+
+     4.6.2.  IN-REPLY-TO
+
+             The contents of this field identify  previous  correspon-
+        dence  which this message answers.  Note that if message iden-
+        tifiers are used in this  field,  they  must  use  the  msg-id
+        specification format.
+
+     4.6.3.  REFERENCES
+
+             The contents of this field identify other  correspondence
+        which  this message references.  Note that if message identif-
+        iers are used, they must use the msg-id specification format.
+
+     4.6.4.  KEYWORDS
+
+             This field contains keywords  or  phrases,  separated  by
+        commas.
+
+     4.7.  OTHER FIELDS
+
+     4.7.1.  SUBJECT
+
+             This is intended to provide a summary,  or  indicate  the
+        nature, of the message.
+
+     4.7.2.  COMMENTS
+
+             Permits adding text comments  onto  the  message  without
+        disturbing the contents of the message's body.
+
+     4.7.3.  ENCRYPTED
+
+             Sometimes,  data  encryption  is  used  to  increase  the
+        privacy  of  message  contents.   If the body of a message has
+        been encrypted, to keep its contents private, the  "Encrypted"
+        field  can be used to note the fact and to indicate the nature
+        of the encryption.  The first <word> parameter  indicates  the
+        software  used  to  encrypt the body, and the second, optional
+        <word> is intended to  aid  the  recipient  in  selecting  the
+        proper  decryption  key.   This  code word may be viewed as an
+        index to a table of keys held by the recipient.
+
+        Note:  Unfortunately, headers must contain envelope,  as  well
+               as  contents,  information.  Consequently, it is neces-
+               sary that they remain unencrypted, so that  mail  tran-
+               sport   services   may   access   them.   Since  names,
+               addresses, and "Subject"  field  contents  may  contain
+               sensitive  information,  this  requirement limits total
+               message privacy.
+
+             Names of encryption software are registered with the Net-
+        work  Information Center, SRI International, Menlo Park, Cali-
+        fornia.
+
+     4.7.4.  EXTENSION-FIELD
+
+             A limited number of common fields have  been  defined  in
+        this  document.   As  network mail requirements dictate, addi-
+        tional fields may be standardized.   To  provide  user-defined
+        fields  with  a  measure  of  safety,  in name selection, such
+        extension-fields will never have names  that  begin  with  the
+        string "X-".
+
+             Names of Extension-fields are registered with the Network
+        Information Center, SRI International, Menlo Park, California.
+
+     4.7.5.  USER-DEFINED-FIELD
+
+             Individual users of network mail are free to  define  and
+        use  additional  header  fields.   Such fields must have names
+        which are not already used in the current specification or  in
+        any definitions of extension-fields, and the overall syntax of
+        these user-defined-fields must conform to this specification's
+        rules   for   delimiting  and  folding  fields.   Due  to  the
+        extension-field  publishing  process,  the  name  of  a  user-
+        defined-field may be pre-empted.
+
+        Note:  The prefatory string "X-" will never  be  used  in  the
+               names  of Extension-fields.  This provides user-defined
+               fields with a protected set of names.
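The naming rules for fields can be illustrated with a small classifier. This is a sketch in Python under stated assumptions: the name `classify_field`, the three category labels, and the `DEFINED_FIELDS` set (collected from the field names that appear in this section) are illustrative choices, not part of the standard. Without a registry of published extensions, the code cannot tell an extension-field from an unprotected user-defined-field, so it reports both under one label.

```python
# Field names defined in this specification, in lower case.
DEFINED_FIELDS = {
    "return-path", "received",
    "from", "sender", "reply-to",
    "resent-from", "resent-sender", "resent-reply-to",
    "date", "resent-date",
    "to", "cc", "bcc", "resent-to", "resent-cc", "resent-bcc",
    "message-id", "resent-message-id", "in-reply-to", "references",
    "keywords", "subject", "comments", "encrypted",
}


def classify_field(name):
    """Classify a field name according to the rules of 4.7.4 and 4.7.5."""
    lowered = name.lower()
    if lowered in DEFINED_FIELDS:
        return "defined"
    if lowered.startswith("x-"):
        # Never used by extension-fields, so safe from pre-emption.
        return "user-defined (protected)"
    return "extension or user-defined"


kinds = [classify_field(n) for n in ("Subject", "X-Mailer", "Priority")]
```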
+
+     5.  DATE AND TIME SPECIFICATION
+
+     5.1.  SYNTAX
+
+     date-time   =  [ day "," ] date time        ; dd mm yy
+                                                 ;  hh:mm:ss zzz
+
+     day         =  "Mon"  / "Tue" /  "Wed"  / "Thu"
+                 /  "Fri"  / "Sat" /  "Sun"
+
+     date        =  1*2DIGIT month 2DIGIT        ; day month year
+                                                 ;  e.g. 20 Jun 82
+
+     month       =  "Jan"  /  "Feb" /  "Mar"  /  "Apr"
+                 /  "May"  /  "Jun" /  "Jul"  /  "Aug"
+                 /  "Sep"  /  "Oct" /  "Nov"  /  "Dec"
+
+     time        =  hour zone                    ; ANSI and Military
+
+     hour        =  2DIGIT ":" 2DIGIT [":" 2DIGIT]
+                                                 ; 00:00:00 - 23:59:59
+
+     zone        =  "UT"  / "GMT"                ; Universal Time
+                                                 ; North American : UT
+                 /  "EST" / "EDT"                ;  Eastern:  - 5/ - 4
+                 /  "CST" / "CDT"                ;  Central:  - 6/ - 5
+                 /  "MST" / "MDT"                ;  Mountain: - 7/ - 6
+                 /  "PST" / "PDT"                ;  Pacific:  - 8/ - 7
+                 /  1ALPHA                       ; Military: Z = UT;
+                                                 ;  A:-1; (J not used)
+                                                 ;  M:-12; N:+1; Y:+12
+                 / ( ("+" / "-") 4DIGIT )        ; Local differential
+                                                 ;  hours+min. (HHMM)
+
+     5.2.  SEMANTICS
+
+          If included, day-of-week must be the day implied by the date
+     specification.
+
+          Time zone may be indicated in several ways.  "UT" is Univer-
+     sal  Time  (formerly called "Greenwich Mean Time"); "GMT" is per-
+     mitted as a reference to Universal Time.  The  military  standard
+     uses  a  single  character for each zone.  "Z" is Universal Time.
+     "A" indicates one hour earlier, and "M" indicates 12  hours  ear-
+     lier;  "N"  is  one  hour  later, and "Y" is 12 hours later.  The
+     letter "J" is not used.  The other remaining two forms are  taken
+     from ANSI standard X3.51-1975.  One allows explicit indication of
+     the amount of offset from UT; the other uses  common  3-character
+     strings for indicating time zones in North America.
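A date-time in the syntax of section 5.1 can be read with a short parser. The Python sketch below is illustrative only: the name `parse_date_time`, the regular expression, and the `century` parameter used to expand the two-digit year are assumptions. It supports the named North American zones, "UT", "GMT", "Z" and the numeric differential; the remaining single-letter military zones are rejected rather than guessed at.

```python
import re
from datetime import datetime, timedelta, timezone

DAYS = ["Mon", "Tue", "Wed", "Thu", "Fri", "Sat", "Sun"]
MONTHS = ["Jan", "Feb", "Mar", "Apr", "May", "Jun",
          "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"]
# Offsets from UT in hours, as listed in the zone rule.
ZONES = {"UT": 0, "GMT": 0, "Z": 0,
         "EST": -5, "EDT": -4, "CST": -6, "CDT": -5,
         "MST": -7, "MDT": -6, "PST": -8, "PDT": -7}

DATE_TIME = re.compile(
    r"^(?:(?P<day>[A-Za-z]{3})\s*,\s*)?"
    r"(?P<dd>\d{1,2})\s+(?P<mon>[A-Za-z]{3})\s+(?P<yy>\d{2})\s+"
    r"(?P<hh>\d{2}):(?P<mm>\d{2})(?::(?P<ss>\d{2}))?\s+"
    r"(?P<zone>[A-Za-z]+|[+-]\d{4})$"
)


def parse_date_time(text, century=1900):
    """Parse an RFC 822 date-time into a timezone-aware datetime."""
    m = DATE_TIME.match(text.strip())
    if m is None:
        raise ValueError("not a date-time: %r" % text)
    month = MONTHS.index(m["mon"].capitalize()) + 1  # ValueError if unknown
    zone = m["zone"]
    if zone[0] in "+-":
        sign = -1 if zone[0] == "-" else 1
        offset = sign * timedelta(hours=int(zone[1:3]), minutes=int(zone[3:5]))
    elif zone.upper() in ZONES:
        offset = timedelta(hours=ZONES[zone.upper()])
    else:
        raise ValueError("unsupported zone: %r" % zone)
    stamp = datetime(century + int(m["yy"]), month, int(m["dd"]),
                     int(m["hh"]), int(m["mm"]), int(m["ss"] or 0),
                     tzinfo=timezone(offset))
    # 5.2: if present, day-of-week must be the day implied by the date.
    if m["day"] and m["day"].capitalize() != DAYS[stamp.weekday()]:
        raise ValueError("day-of-week does not match date")
    return stamp


stamp = parse_date_time("Fri, 13 Aug 82 10:15:00 PDT")
```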
+
+     6.  ADDRESS SPECIFICATION
+
+     6.1.  SYNTAX
+
+     address     =  mailbox                      ; one addressee
+                 /  group                        ; named list
+
+     group       =  phrase ":" [#mailbox] ";"
+
+     mailbox     =  addr-spec                    ; simple address
+                 /  phrase route-addr            ; name & addr-spec
+
+     route-addr  =  "<" [route] addr-spec ">"
+
+     route       =  1#("@" domain) ":"           ; path-relative
+
+     addr-spec   =  local-part "@" domain        ; global address
+
+     local-part  =  word *("." word)             ; uninterpreted
+                                                 ; case-preserved
+
+     domain      =  sub-domain *("." sub-domain)
+
+     sub-domain  =  domain-ref / domain-literal
+
+     domain-ref  =  atom                         ; symbolic reference
+
+     6.2.  SEMANTICS
+
+          A mailbox receives mail.  It is a  conceptual  entity  which
+     does  not necessarily pertain to file storage.  For example, some
+     sites may choose to print mail on their line printer and  deliver
+     the output to the addressee's desk.
+
+          A mailbox specification comprises a person, system  or  pro-
+     cess name reference, a domain-dependent string, and a name-domain
+     reference.  The name reference is optional and is usually used to
+     indicate  the  human name of a recipient.  The name-domain refer-
+     ence specifies a sequence of sub-domains.   The  domain-dependent
+     string is uninterpreted, except by the final sub-domain; the rest
+     of the mail service merely transmits it as a literal string.
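The three parts of a mailbox specification (name reference, domain-dependent string, name-domain reference) can be separated with a few string operations. The Python sketch below is a simplification made for illustration: `parse_mailbox` is a hypothetical name, and quoted strings or comments containing "<", ":" or "@" are not handled.

```python
def parse_mailbox(text):
    """Return (phrase, route, local_part, domain) for a simple mailbox.

    Accepts either a bare addr-spec or the "phrase route-addr" form.
    The route, if any, is returned as raw text without the final ":".
    """
    text = text.strip()
    phrase, route = "", ""
    if text.endswith(">") and "<" in text:
        phrase, _, inner = text[:-1].partition("<")
        phrase = phrase.strip()
        if inner.startswith("@"):
            # A route is a list of "@domain" items terminated by ":".
            route, _, inner = inner.partition(":")
        addr_spec = inner.strip()
    else:
        addr_spec = text
    local_part, at, domain = addr_spec.rpartition("@")
    if not at or not local_part or not domain:
        raise ValueError("not an addr-spec: %r" % addr_spec)
    return phrase, route, local_part, domain


# Hypothetical addresses.
simple = parse_mailbox("user@host.network")
routed = parse_mailbox(
    "George Jones <@host-a.network,@host-b.network:gjones@host-c.network>"
)
```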
+
+     6.2.1.  DOMAINS
+
+        A name-domain is a set of registered (mail)  names.   A  name-
+        domain  specification  resolves  to  a subordinate name-domain
+        specification  or  to  a  terminal  domain-dependent   string.
+        Hence,  domain  specification  is  extensible,  permitting any
+        number of registration levels.
+
+        Name-domains model a global, logical, hierarchical  addressing
+        scheme.   The  model is logical, in that an address specifica-
+        tion is related to name registration and  is  not  necessarily
+        tied  to  transmission  path.   The  model's  hierarchy  is  a
+        directed graph, called an in-tree, such that there is a single
+        path  from  the root of the tree to any node in the hierarchy.
+        If more than one path actually exists, they are considered  to
+        be different addresses.
+
+        The root node is common to all addresses; consequently, it  is
+        not  referenced.   Its  children  constitute "top-level" name-
+        domains.  Usually, a service has access to its own full domain
+        specification and to the names of all top-level name-domains.
+
+        The "top" of the domain addressing hierarchy -- a child of the
+        root  --  is  indicated  by  the right-most field, in a domain
+        specification.  Its child is specified to the left, its  child
+        to the left, and so on.
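Because the top of the hierarchy is written right-most, walking a domain from the root downward means reading its sub-domains in reverse. A small Python sketch follows; `domain_levels` is a hypothetical name, and the regular expression is an assumption that keeps a bracketed domain-literal together as one sub-domain.

```python
import re

# A sub-domain is either a bracketed domain-literal or a run of
# characters that contains no period.
SUB_DOMAIN = re.compile(r"\[[^\]]*\]|[^.]+")


def domain_levels(domain):
    """Return the sub-domains of a domain, top-level name-domain first."""
    return list(reversed(SUB_DOMAIN.findall(domain)))


levels = domain_levels("registry-A.registry-1.organization-X")
```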
+
+        Some groups provide formal registration services;  these  con-
+        stitute   name-domains   that  are  independent  logically  of
+        specific machines.  In addition, networks and machines  impli-
+        citly  compose name-domains, since their membership usually is
+        registered in name tables.
+
+        In the case of formal registration, an organization implements
+        a  (distributed)  data base which provides an address-to-route
+        mapping service for addresses of the form:
+
+                         person@registry.organization
+
+        Note that "organization" is a logical  entity,  separate  from
+        any particular communication network.
+
+        A mechanism for accessing "organization" is universally avail-
+        able.   That mechanism, in turn, seeks an instantiation of the
+        registry; its location is not indicated in the address specif-
+        ication.   It  is assumed that the system which operates under
+        the name "organization" knows how to find a subordinate regis-
+        try.  The registry will then use the "person" string to deter-
+        mine where to send the mail specification.
+
+        The latter,  network-oriented  case  permits  simple,  direct,
+        attachment-related address specification, such as:
+
+                              user@host.network
+
+        Once the network is accessed, it is expected  that  a  message
+        will  go  directly  to the host and that the host will resolve
+        the user name, placing the message in the user's mailbox.
+
+     6.2.2.  ABBREVIATED DOMAIN SPECIFICATION
+
+        Since any number of  levels  is  possible  within  the  domain
+        hierarchy,  specification  of  a  fully  qualified address can
+        become inconvenient.  This standard permits abbreviated domain
+        specification, in a special case:
+
+            For the address of  the  sender,  call  the  left-most
+            sub-domain  Level  N.   In a header address, if all of
+            the sub-domains above (i.e., to the right of) Level  N
+            are  the same as those of the sender, then they do not
+            have to appear in the specification.   Otherwise,  the
+            address must be fully qualified.
+
+            This feature is subject  to  approval  by  local  sub-
+            domains.   Individual  sub-domains  may  require their
+            member systems, which originate mail, to provide  full
+            domain  specification only.  When permitted, abbrevia-
+            tions may be present  only  while  the  message  stays
+            within the sub-domain of the sender.
+
+            Use of this mechanism requires the sender's sub-domain
+            to reserve the names of all top-level domains, so that
+            full specifications can be distinguished from abbrevi-
+            ated specifications.
+
+        For example, if a sender's address is:
+
+                 sender@registry-A.registry-1.organization-X
+
+        and one recipient's address is:
+
+                recipient@registry-B.registry-1.organization-X
+
+        and another's is:
+
+                recipient@registry-C.registry-2.organization-X
+
+        then ".registry-1.organization-X" need not be specified in the
+        message,  but  "registry-C.registry-2"  DOES  have  to be
+        specified.  That is, the first two addresses may  be  abbrevi-
+        ated, but the third address must be fully specified.
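The abbreviation rule and its inverse can be sketched in Python as below. The function names are assumptions, and so is the `top_level_names` argument, which stands in for the reserved list of top-level domain names that the rule requires. Domain-literals are not considered.

```python
def abbreviate_domain(sender_domain, other_domain):
    """Abbreviate other_domain relative to the sender, when permitted.

    If every sub-domain to the right of the left-most one matches the
    sender's, only the left-most sub-domain needs to appear.
    """
    sender_upper = sender_domain.lower().split(".")[1:]
    other = other_domain.split(".")
    other_upper = [part.lower() for part in other[1:]]
    if sender_upper and other_upper == sender_upper:
        return other[0]
    return other_domain


def expand_domain(sender_domain, header_domain, top_level_names):
    """Restore the full form of a possibly abbreviated domain.

    top_level_names is a set of lower-case top-level domain names; a
    domain ending in one of them is taken to be fully qualified.
    """
    parts = header_domain.split(".")
    if parts[-1].lower() in top_level_names:
        return header_domain
    return ".".join(parts + sender_domain.split(".")[1:])


sender = "registry-A.registry-1.organization-X"
short = abbreviate_domain(sender, "registry-B.registry-1.organization-X")
kept = abbreviate_domain(sender, "registry-C.registry-2.organization-X")
full = expand_domain(sender, short, {"organization-x"})
```

A relaying service would apply something like `expand_domain` to every header address before a message leaves the sender's sub-domain.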
+
+        When a message crosses a domain boundary, all  addresses  must
+        be  specified  in  the  full format, ending with the top-level
+        name-domain in the right-most field.  It is the responsibility
+        of  mail  forwarding services to ensure that addresses conform
+        with this requirement.  In the case of abbreviated  addresses,
+        the  relaying  service must make the necessary expansions.  It
+        should be noted that it often is difficult for such a  service
+        to locate all occurrences of address abbreviations.  For exam-
+        ple, it will not be possible to find such abbreviations within
+        the  body  of  the  message.   The "Return-Path" field can aid
+        recipients in recovering from these errors.
+
+        Note:  When passing any portion of an addr-spec onto a process
+               which  does  not interpret data according to this stan-
+               dard (e.g., mail protocol servers), there must  be  NO
+               LWSP-chars  preceding  or  following the at-sign or any
+               delimiting period ("."), such as  shown  in  the  above
+               examples,   and   only  ONE  SPACE  between  contiguous
+               <word>s.
+
+     6.2.3.  DOMAIN TERMS
+
+        A domain-ref must be THE official name of a registry, network,
+        or  host.   It  is  a  symbolic  reference, within a name sub-
+        domain.  At times, it is necessary to bypass standard  mechan-
+        isms  for  resolving  such  references,  using  more primitive
+        information, such as a network host address  rather  than  its
+        associated host name.
+
+        To permit such references, this standard provides the  domain-
+        literal  construct.   Its contents must conform with the needs
+        of the sub-domain in which it is interpreted.
+
+        Domain-literals which refer to domains within the ARPA  Inter-
+        net  specify  32-bit  Internet addresses, in four 8-bit fields
+        noted in decimal, as described in Request for  Comments  #820,
+        "Assigned Numbers."  For example:
+
+                                 [10.0.3.19]
+
+        Note:  THE USE OF DOMAIN-LITERALS IS STRONGLY DISCOURAGED.  It
+               is  permitted  only  as  a means of bypassing temporary
+               system limitations, such as name tables which  are  not
+               complete.
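For ARPA Internet domain-literals, the four decimal fields map directly onto a 32-bit address. The Python sketch below shows one way to perform the conversion; the name `parse_domain_literal` is an assumption made for the example.

```python
def parse_domain_literal(text):
    """Convert a literal such as "[10.0.3.19]" to a 32-bit integer."""
    if not (text.startswith("[") and text.endswith("]")):
        raise ValueError("domain-literal must be enclosed in brackets")
    fields = text[1:-1].split(".")
    if len(fields) != 4:
        raise ValueError("expected four decimal fields")
    value = 0
    for field in fields:
        if not (field.isascii() and field.isdigit()):
            raise ValueError("field is not decimal: %r" % field)
        octet = int(field)
        if octet > 255:
            raise ValueError("field does not fit in 8 bits: %r" % field)
        # Shift earlier fields up and append this 8-bit field.
        value = (value << 8) | octet
    return value


address = parse_domain_literal("[10.0.3.19]")
```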
+
+        The names of "top-level" domains, and  the  names  of  domains
+        in  the  ARPA Internet, are registered with the Network
+        Information Center, SRI International, Menlo Park, California.
+
+     6.2.4.  DOMAIN-DEPENDENT LOCAL STRING
+
+        The local-part of an  addr-spec  in  a  mailbox  specification
+        (i.e.,  the  host's  name for the mailbox) is understood to be
+        whatever the receiving mail protocol server allows.  For exam-
+        ple,  some systems do not understand mailbox references of the
+        form "P. D. Q. Bach", but others do.
+
+        This specification treats periods (".") as lexical separators.
+        Hence,  their  presence  in  local-parts which are not quoted-
+        strings, is detected.   However,  such  occurrences  carry  NO
+        semantics.  That is, if a local-part has periods within it, an
+        address parser will divide the local-part into several tokens,
+        but  the  sequence  of  tokens will be treated as one uninter-
+        preted unit.  The sequence  will  be  re-assembled,  when  the
+        address is passed outside of the system such as to a mail pro-
+        tocol service.
+
+        For example, the address:
+
+                           First.Last@Registry.Org
+
+        is legal and does not require the local-part to be  surrounded
+        with  quotation-marks.   (However,  "First  Last" DOES require
+        quoting.)  The local-part of the address, when passed  outside
+        of  the  mail  system,  within  the  Registry.Org  domain,  is
+        "First.Last", again without quotation marks.
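The point that periods in a local-part are lexical separators without meaning can be shown with a round trip. In this Python sketch the two function names are assumptions; a local-part that is a quoted-string is treated as a single token.

```python
def split_local_part(local_part):
    """Divide a local-part into tokens at periods.

    A quoted-string is returned as one token, since periods inside
    quotation marks are not separators.
    """
    if local_part.startswith('"') and local_part.endswith('"'):
        return [local_part]
    return local_part.split(".")


def join_local_part(tokens):
    """Re-assemble the tokens into the uninterpreted unit."""
    return ".".join(tokens)


tokens = split_local_part("First.Last")
restored = join_local_part(tokens)
```

The parser sees two tokens, but what is handed to a mail protocol service is the re-assembled string, unchanged.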
+
+     6.2.5.  BALANCING LOCAL-PART AND DOMAIN
+
+        In some cases, the boundary between local-part and domain  can
+        be  flexible.  The local-part may be a simple string, which is
+        used for the final determination of the  recipient's  mailbox.
+        All  other  levels  of  reference  are, therefore, part of the
+        domain.
+
+        For some systems, in the case of abbreviated reference to  the
+        local  and  subordinate  sub-domains,  it  may  be possible to
+        specify only one reference within the domain  part  and  place
+        the  other,  subordinate  name-domain  references  within  the
+        local-part.  This would appear as:
+
+                        mailbox.sub1.sub2@this-domain
+
+        Such a specification would be acceptable  to  address  parsers
+        which  conform  to  RFC  #733,  but  do not support this newer
+        Internet standard.  While contrary to the intent of this stan-
+        dard, the form is legal.
+
+        Also, some sub-domains have a specification syntax which  does
+        not conform to this standard.  For example:
+
+                      sub-net.mailbox@sub-domain.domain
+
+        uses a different parsing  sequence  for  local-part  than  for
+        domain.
+
+        Note:  As a rule,  the  domain  specification  should  contain
+               fields  which  are  encoded  according to the syntax of
+               this standard and which contain  generally-standardized
+               information.   The local-part specification should con-
+               tain only that portion of the  address  which  deviates
+               from the form or intention of the domain field.
+
+     6.2.6.  MULTIPLE MAILBOXES
+
+        An individual may have several mailboxes and wish  to  receive
+        mail  at  whatever  mailbox  is  convenient  for the sender to
+        access.  This standard does not provide a means of  specifying
+        "any member of" a list of mailboxes.
+
+        A set of individuals may wish to receive mail as a single unit
+        (i.e.,  a  distribution  list).  The <group> construct permits
+        specification of such a list.  Recipient mailboxes are  speci-
+        fied  within  the  bracketed  part (":" - ";").  A copy of the
+        transmitted message is to be  sent  to  each  mailbox  listed.
+        This  standard  does  not  permit  recursive  specification of
+        groups within groups.
+
+        While a list must be named, it is not required that  the  con-
+        tents  of  the  list be included.  In this case, the <address>
+        serves only as an indication of group distribution  and  would
+        appear in the form:
+
+                                    name:;
+
+        Some mail  services  may  provide  a  group-list  distribution
+        facility,  accepting  a single mailbox reference, expanding it
+        to the full distribution list, and relaying the  mail  to  the
+        list's  members.   This standard provides no additional syntax
+        for indicating such a  service.   Using  the  <group>  address
+        alternative,  while listing one mailbox in it, can mean either
+        that the mailbox reference will be expanded to a list or  that
+        there is a group with one member.
+
+     6.2.7.  EXPLICIT PATH SPECIFICATION
+
+        At times, a  message  originator  may  wish  to  indicate  the
+        transmission  path  that  a  message  should  follow.  This is
+        called source routing.  The normal addressing scheme, used  in
+        an  addr-spec,  is  carefully separated from such information;
+        the <route> portion of a route-addr is provided for such occa-
+        sions.  It specifies the sequence of hosts and/or transmission
+
+
+     August 13, 1982              - 32 -                      RFC #822
+
+
+ 
+     Standard for ARPA Internet Text Messages
+
+
+        services that are  to  be  traversed.   Both  domain-refs  and
+        domain-literals may be used.
+
+        Note:  The use of source routing is discouraged.   Unless  the
+               sender has special need of path restriction, the choice
+               of transmission route should be left to the mail  tran-
+               sport service.
+
+     6.3.  RESERVED ADDRESS
+
+          It often is necessary to send mail to a site, without  know-
+     ing  any  of its valid addresses.  For example, there may be mail
+     system dysfunctions, or a user may wish to find  out  a  person's
+     correct address, at that site.
+
+          This standard specifies a single, reserved  mailbox  address
+     (local-part)  which  is  to  be valid at each site.  Mail sent to
+     that address is to be routed to  a  person  responsible  for  the
+     site's mail system or to a person with responsibility for general
+     site operation.  The name of the reserved local-part address is:
+
+                                Postmaster
+
+     so that "Postmaster@domain" is required to be valid.
+
+     Note:  This reserved local-part must be  matched  without  sensi-
+            tivity to alphabetic case, so that "POSTMASTER", "postmas-
+            ter", and even "poStmASteR" is to be accepted.
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+     August 13, 1982              - 33 -                      RFC #822
+
+
+ 
+     Standard for ARPA Internet Text Messages
+
+
+     7.  BIBLIOGRAPHY
+
+
+     ANSI.  "USA Standard Code  for  Information  Interchange,"  X3.4.
+        American  National Standards Institute: New York (1968).  Also
+        in:  Feinler, E.  and J. Postel, eds., "ARPANET Protocol Hand-
+        book", NIC 7104.
+
+     ANSI.  "Representations of Universal Time, Local  Time  Differen-
+        tials,  and United States Time Zone References for Information
+        Interchange," X3.51-1975.  American National Standards  Insti-
+        tute:  New York (1975).
+
+     Bemer, R.W., "Time and the Computer."  In:  Interface  Age  (Feb.
+        1979).
+
+     Bennett, C.J.  "JNT Mail Protocol".  Joint Network Team,  Ruther-
+        ford and Appleton Laboratory:  Didcot, England.
+
+     Bhushan, A.K., Pogran, K.T., Tomlinson,  R.S.,  and  White,  J.E.
+        "Standardizing  Network  Mail  Headers,"   ARPANET Request for
+        Comments No. 561, Network Information Center  No.  18516;  SRI
+        International:  Menlo Park (September 1973).
+
+     Birrell, A.D., Levin, R.,  Needham,  R.M.,  and  Schroeder,  M.D.
+        "Grapevine:  An Exercise in Distributed Computing," Communica-
+        tions of the ACM 25, 4 (April 1982), 260-274.
+
+     Crocker,  D.H.,  Vittal,  J.J.,  Pogran,  K.T.,  Henderson,  D.A.
+        "Standard  for  the  Format  of  ARPA  Network  Text Message,"
+        ARPANET Request for  Comments  No.  733,  Network  Information
+        Center  No.  41952.   SRI International:  Menlo Park (November
+        1977).
+
+     Feinler, E.J. and Postel, J.B.  ARPANET Protocol  Handbook,  Net-
+        work  Information  Center  No.  7104   (NTIS AD A003890).  SRI
+        International:  Menlo Park (April 1976).
+
+     Harary, F.   "Graph  Theory".   Addison-Wesley:   Reading,  Mass.
+        (1969).
+
+     Levin, R. and Schroeder, M.  "Transport  of  Electronic  Messages
+        through  a  Network,"   TeleInformatics  79, pp. 29-33.  North
+        Holland (1979).  Also  as  Xerox  Palo  Alto  Research  Center
+        Technical Report CSL-79-4.
+
+     Myer, T.H. and Henderson, D.A.  "Message Transmission  Protocol,"
+        ARPANET  Request  for  Comments,  No. 680, Network Information
+        Center No. 32116.  SRI International:  Menlo Park (1975).
+
+
+     August 13, 1982              - 34 -                      RFC #822
+
+
+ 
+     Standard for ARPA Internet Text Messages
+
+
+     NBS.  "Specification of Message Format for Computer Based Message
+        Systems, Recommended Federal Information Processing Standard."
+        National  Bureau   of   Standards:    Gaithersburg,   Maryland
+        (October 1981).
+
+     NIC.  Internet Protocol Transition Workbook.  Network Information
+        Center,   SRI-International,  Menlo  Park,  California  (March
+        1982).
+
+     Oppen, D.C. and Dalal, Y.K.  "The Clearinghouse:  A Decentralized
+        Agent  for  Locating  Named  Objects in a Distributed Environ-
+        ment," OPD-T8103.  Xerox Office Products Division:  Palo Alto,
+        CA. (October 1981).
+
+     Postel, J.B.  "Assigned Numbers,"  ARPANET Request for  Comments,
+        No. 820.  SRI International:  Menlo Park (August 1982).
+
+     Postel, J.B.  "Simple Mail Transfer  Protocol,"  ARPANET  Request
+        for Comments, No. 821.  SRI International:  Menlo Park (August
+        1982).
+
+     Shoch, J.F.  "Internetwork naming, addressing  and  routing,"  in
+        Proc. 17th IEEE Computer Society International Conference, pp.
+        72-79, Sept. 1978, IEEE Cat. No. 78 CH 1388-8C.
+
+     Su, Z. and Postel, J.  "The Domain Naming Convention for Internet
+        User  Applications,"  ARPANET  Request  for Comments, No. 819.
+        SRI International:  Menlo Park (August 1982).
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+     August 13, 1982              - 35 -                      RFC #822
+
+
+ 
+     Standard for ARPA Internet Text Messages
+
+
+                                 APPENDIX
+
+
+     A.  EXAMPLES
+
+     A.1.  ADDRESSES
+
+     A.1.1.  Alfred Neuman <Neuman@BBN-TENEXA>
+
+     A.1.2.  Neuman@BBN-TENEXA
+
+             These two "Alfred Neuman" examples have identical  seman-
+        tics, as far as the operation of the local host's mail sending
+        (distribution) program (also sometimes  called  its  "mailer")
+        and  the remote host's mail protocol server are concerned.  In
+        the first example, the  "Alfred  Neuman"  is  ignored  by  the
+        mailer,  as "Neuman@BBN-TENEXA" completely specifies the reci-
+        pient.  The second example contains  no  superfluous  informa-
+        tion,  and,  again,  "Neuman@BBN-TENEXA" is the intended reci-
+        pient.
+
+        Note:  When the message crosses name-domain  boundaries,  then
+               these specifications must be changed, so as to indicate
+               the remainder of the hierarchy, starting with  the  top
+               level.
+
+     A.1.3.  "George, Ted" <Shared@Group.Arpanet>
+
+             This form might be used to indicate that a single mailbox
+        is  shared  by several users.  The quoted string is ignored by
+        the originating host's mailer, because  "Shared@Group.Arpanet"
+        completely specifies the destination mailbox.
+
+     A.1.4.  Wilt . (the  Stilt) Chamberlain@NBA.US
+
+             The "(the  Stilt)" is a comment, which is NOT included in
+        the  destination  mailbox  address  handed  to the originating
+        system's mailer.  The local-part of the address is the  string
+        "Wilt.Chamberlain", with NO space between the first and second
+        words.
+
+     A.1.5.  Address Lists
+
+     Gourmets:  Pompous Person <WhoZiWhatZit@Cordon-Bleu>,
+                Childs@WGBH.Boston, Galloping Gourmet@
+                ANT.Down-Under (Australian National Television),
+                Cheapie@Discount-Liquors;,
+       Cruisers:  Port@Portugal, Jones@SEA;,
+         Another@Somewhere.SomeOrg
+
+
+     August 13, 1982              - 36 -                      RFC #822
+
+
+ 
+     Standard for ARPA Internet Text Messages
+
+
+        This group list example points out the use of comments and the
+        mixing of addresses and groups.
+
+     A.2.  ORIGINATOR ITEMS
+
+     A.2.1.  Author-sent
+
+             George Jones logs into his host  as  "Jones".   He  sends
+        mail himself.
+
+            From:  Jones@Group.Org
+
+        or
+
+            From:  George Jones <Jones@Group.Org>
+
+     A.2.2.  Secretary-sent
+
+             George Jones logs in as Jones on his  host.   His  secre-
+        tary,  who logs in as Secy sends mail for him.  Replies to the
+        mail should go to George.
+
+            From:    George Jones <Jones@Group>
+            Sender:  Secy@Other-Group
+
+     A.2.3.  Secretary-sent, for user of shared directory
+
+             George Jones' secretary sends mail  for  George.  Replies
+        should go to George.
+
+            From:     George Jones<Shared@Group.Org>
+            Sender:   Secy@Other-Group
+
+        Note that there need not be a space between  "Jones"  and  the
+        "<",  but  adding a space enhances readability (as is the case
+        in other examples).
+
+     A.2.4.  Committee activity, with one author
+
+             George is a member of a committee.  He wishes to have any
+        replies to his message go to all committee members.
+
+            From:     George Jones <Jones@Host.Net>
+            Sender:   Jones@Host
+            Reply-To: The Committee: Jones@Host.Net,
+                                     Smith@Other.Org,
+                                     Doe@Somewhere-Else;
+
+        Note  that  if  George  had  not  included  himself   in   the
+
+
+     August 13, 1982              - 37 -                      RFC #822
+
+
+ 
+     Standard for ARPA Internet Text Messages
+
+
+        enumeration  of  The  Committee,  he  would not have gotten an
+        implicit reply; the presence of the  "Reply-to"  field  SUPER-
+        SEDES the sending of a reply to the person named in the "From"
+        field.
+
+     A.2.5.  Secretary acting as full agent of author
+
+             George Jones asks his secretary  (Secy@Host)  to  send  a
+        message for him in his capacity as Group.  He wants his secre-
+        tary to handle all replies.
+
+            From:     George Jones <Group@Host>
+            Sender:   Secy@Host
+            Reply-To: Secy@Host
+
+     A.2.6.  Agent for user without online mailbox
+
+             A friend  of  George's,  Sarah,  is  visiting.   George's
+        secretary  sends  some  mail to a friend of Sarah in computer-
+        land.  Replies should go to George, whose mailbox is Jones  at
+        Registry.
+
+            From:     Sarah Friendly <Secy@Registry>
+            Sender:   Secy-Name <Secy@Registry>
+            Reply-To: Jones@Registry.
+
+     A.2.7.  Agent for member of a committee
+
+             George's secretary sends out a message which was authored
+        jointly by all the members of a committee.  Note that the name
+        of the committee cannot be specified, since <group> names  are
+        not permitted in the From field.
+
+            From:   Jones@Host,
+                    Smith@Other-Host,
+                    Doe@Somewhere-Else
+            Sender: Secy@SHost
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+     August 13, 1982              - 38 -                      RFC #822
+
+
+ 
+     Standard for ARPA Internet Text Messages
+
+
+     A.3.  COMPLETE HEADERS
+
+     A.3.1.  Minimum required
+
+     Date:     26 Aug 76 1429 EDT        Date:     26 Aug 76 1429 EDT
+     From:     Jones@Registry.Org   or   From:     Jones@Registry.Org
+     Bcc:                                To:       Smith@Registry.Org
+
+        Note that the "Bcc" field may be empty, while the  "To"  field
+        is required to have at least one address.
+
+     A.3.2.  Using some of the additional fields
+
+     Date:     26 Aug 76 1430 EDT
+     From:     George Jones<Group@Host>
+     Sender:   Secy@SHOST
+     To:       "Al Neuman"@Mad-Host,
+               Sam.Irving@Other-Host
+     Message-ID:  <some.string@SHOST>
+
+     A.3.3.  About as complex as you're going to get
+
+     Date     :  27 Aug 76 0932 PDT
+     From     :  Ken Davis <KDavis@This-Host.This-net>
+     Subject  :  Re: The Syntax in the RFC
+     Sender   :  KSecy@Other-Host
+     Reply-To :  Sam.Irving@Reg.Organization
+     To       :  George Jones <Group@Some-Reg.An-Org>,
+                 Al.Neuman@MAD.Publisher
+     cc       :  Important folk:
+                   Tom Softwood <Balsa@Tree.Root>,
+                   "Sam Irving"@Other-Host;,
+                 Standard Distribution:
+                   /main/davis/people/standard@Other-Host,
+                   "<Jones>standard.dist.3"@Tops-20-Host>;
+     Comment  :  Sam is away on business. He asked me to handle
+                 his mail for him.  He'll be able to provide  a
+                 more  accurate  explanation  when  he  returns
+                 next week.
+     In-Reply-To: <some.string@DBM.Group>, George's message
+     X-Special-action:  This is a sample of user-defined field-
+                 names.  There could also be a field-name
+                 "Special-action", but its name might later be
+                 preempted
+     Message-ID: <4231.629.XYzi-What@Other-Host>
+
+
+
+
+
+
+     August 13, 1982              - 39 -                      RFC #822
+
+
+ 
+     Standard for ARPA Internet Text Messages
+
+
+     B.  SIMPLE FIELD PARSING
+
+          Some mail-reading software systems may wish to perform  only
+     minimal  processing,  ignoring  the internal syntax of structured
+     field-bodies and treating them the  same  as  unstructured-field-
+     bodies.  Such software will need only to distinguish:
+
+         o   Header fields from the message body,
+
+         o   Beginnings of fields from lines which continue fields,
+
+         o   Field-names from field-contents.
+
+          The abbreviated set of syntactic rules  which  follows  will
+     suffice  for  this  purpose.  It describes a limited view of mes-
+     sages and is a subset of the syntactic rules provided in the main
+     part of this specification.  One small exception is that the con-
+     tents of field-bodies consist only of text:
+
+     B.1.  SYNTAX
+
+
+     message         =   *field *(CRLF *text)
+
+     field           =    field-name ":" [field-body] CRLF
+
+     field-name      =  1*<any CHAR, excluding CTLs, SPACE, and ":">
+
+     field-body      =   *text [CRLF LWSP-char field-body]
+
+
+     B.2.  SEMANTICS
+
+          Headers occur before the message body and are terminated  by
+     a null line (i.e., two contiguous CRLFs).
+
+          A line which continues a header field begins with a SPACE or
+     HTAB  character,  while  a  line  beginning a field starts with a
+     printable character which is not a colon.
+
+          A field-name consists of one or  more  printable  characters
+     (excluding  colon,  space, and control-characters).  A field-name
+     MUST be contained on one line.  Upper and lower case are not dis-
+     tinguished when comparing field-names.
+
+
+
+
+
+
+
+     August 13, 1982              - 40 -                      RFC #822
+
+
+ 
+     Standard for ARPA Internet Text Messages
+
+
+     C.  DIFFERENCES FROM RFC #733
+
+          The following summarizes the differences between this  stan-
+     dard  and the one specified in Arpanet Request for Comments #733,
+     "Standard for the Format of ARPA  Network  Text  Messages".   The
+     differences  are  listed  in the order of their occurrence in the
+     current specification.
+
+     C.1.  FIELD DEFINITIONS
+
+     C.1.1.  FIELD NAMES
+
+        These now must be a sequence of  printable  characters.   They
+        may not contain any LWSP-chars.
+
+     C.2.  LEXICAL TOKENS
+
+     C.2.1.  SPECIALS
+
+        The characters period ("."), left-square  bracket  ("["),  and
+        right-square  bracket ("]") have been added.  For presentation
+        purposes, and when passing a specification to  a  system  that
+        does  not conform to this standard, periods are to be contigu-
+        ous with their surrounding lexical tokens.   No  linear-white-
+        space  is  permitted  between them.  The presence of one LWSP-
+        char between other tokens is still directed.
+
+     C.2.2.  ATOM
+
+        Atoms may not contain SPACE.
+
+     C.2.3.  SPECIAL TEXT
+
+        ctext and qtext have had backslash ("\") added to the list  of
+        prohibited characters.
+
+     C.2.4.  DOMAINS
+
+        The lexical tokens  <domain-literal>  and  <dtext>  have  been
+        added.
+
+     C.3.  MESSAGE SPECIFICATION
+
+     C.3.1.  TRACE
+
+        The "Return-path:" and "Received:" fields have been specified.
+
+
+
+
+
+     August 13, 1982              - 41 -                      RFC #822
+
+
+ 
+     Standard for ARPA Internet Text Messages
+
+
+     C.3.2.  FROM
+
+        The "From" field must contain machine-usable addresses  (addr-
+        spec).   Multiple  addresses may be specified, but named-lists
+        (groups) may not.
+
+     C.3.3.  RESENT
+
+        The meta-construct of prefacing field names  with  the  string
+        "Resent-"  has been added, to indicate that a message has been
+        forwarded by an intermediate recipient.
+
+     C.3.4.  DESTINATION
+
+        A message must contain at least one destination address field.
+        "To" and "CC" are required to contain at least one address.
+
+     C.3.5.  IN-REPLY-TO
+
+        The field-body is no longer a comma-separated list, although a
+        sequence is still permitted.
+
+     C.3.6.  REFERENCE
+
+        The field-body is no longer a comma-separated list, although a
+        sequence is still permitted.
+
+     C.3.7.  ENCRYPTED
+
+        A field has been specified that permits  senders  to  indicate
+        that the body of a message has been encrypted.
+
+     C.3.8.  EXTENSION-FIELD
+
+        Extension fields are prohibited from beginning with the  char-
+        acters "X-".
+
+     C.4.  DATE AND TIME SPECIFICATION
+
+     C.4.1.  SIMPLIFICATION
+
+        Fewer optional forms are permitted  and  the  list  of  three-
+        letter time zones has been shortened.
+
+     C.5.  ADDRESS SPECIFICATION
+
+
+
+
+
+
+     August 13, 1982              - 42 -                      RFC #822
+
+
+ 
+     Standard for ARPA Internet Text Messages
+
+
+     C.5.1.  ADDRESS
+
+        The use of quoted-string, and the ":"-atom-":" construct, have
+        been  removed.   An  address  now  is  either a single mailbox
+        reference or is a named list of addresses.  The  latter  indi-
+        cates a group distribution.
+
+     C.5.2.  GROUPS
+
+        Group lists are now required to have a name.   Group  lists
+        may not be nested.
+
+     C.5.3.  MAILBOX
+
+        A mailbox specification  may  indicate  a  person's  name,  as
+        before.   Such  a  named  list  no longer may specify multiple
+        mailboxes and may not be nested.
+
+     C.5.4.  ROUTE ADDRESSING
+
+        Addresses now are taken to be absolute, global specifications,
+        independent  of transmission paths.  The <route> construct has
+        been provided, to permit explicit specification  of  transmis-
+        sion  path.   RFC  #733's  use  of multiple at-signs ("@") was
+        intended as a general syntax  for  indicating  routing  and/or
+        hierarchical addressing.  The current standard separates these
+        specifications and only one at-sign is permitted.
+
+     C.5.5.  AT-SIGN
+
+        The string " at " no longer is used as an  address  delimiter.
+        Only at-sign ("@") serves the function.
+
+     C.5.6.  DOMAINS
+
+        Hierarchical, logical name-domains have been added.
+
+     C.6.  RESERVED ADDRESS
+
+     The local-part "Postmaster" has been reserved, so that users  can
+     be guaranteed at least one valid address at a site.
+
+
+
+
+
+
+
+
+
+
+     August 13, 1982              - 43 -                      RFC #822
+
+
+ 
+     Standard for ARPA Internet Text Messages
+
+
+     D.  ALPHABETICAL LISTING OF SYNTAX RULES
+
+     address     =  mailbox                      ; one addressee
+                 /  group                        ; named list
+     addr-spec   =  local-part "@" domain        ; global address
+     ALPHA       =  <any ASCII alphabetic character>
+                                                 ; (101-132, 65.- 90.)
+                                                 ; (141-172, 97.-122.)
+     atom        =  1*<any CHAR except specials, SPACE and CTLs>
+     authentic   =   "From"       ":"   mailbox  ; Single author
+                 / ( "Sender"     ":"   mailbox  ; Actual submittor
+                     "From"       ":" 1#mailbox) ; Multiple authors
+                                                 ;  or not sender
+     CHAR        =  <any ASCII character>        ; (  0-177,  0.-127.)
+     comment     =  "(" *(ctext / quoted-pair / comment) ")"
+     CR          =  <ASCII CR, carriage return>  ; (     15,      13.)
+     CRLF        =  CR LF
+     ctext       =  <any CHAR excluding "(",     ; => may be folded
+                     ")", "\" & CR, & including
+                     linear-white-space>
+     CTL         =  <any ASCII control           ; (  0- 37,  0.- 31.)
+                     character and DEL>          ; (    177,     127.)
+     date        =  1*2DIGIT month 2DIGIT        ; day month year
+                                                 ;  e.g. 20 Jun 82
+     dates       =   orig-date                   ; Original
+                   [ resent-date ]               ; Forwarded
+     date-time   =  [ day "," ] date time        ; dd mm yy
+                                                 ;  hh:mm:ss zzz
+     day         =  "Mon"  / "Tue" /  "Wed"  / "Thu"
+                 /  "Fri"  / "Sat" /  "Sun"
+     delimiters  =  specials / linear-white-space / comment
+     destination =  "To"          ":" 1#address  ; Primary
+                 /  "Resent-To"   ":" 1#address
+                 /  "cc"          ":" 1#address  ; Secondary
+                 /  "Resent-cc"   ":" 1#address
+                 /  "bcc"         ":"  #address  ; Blind carbon
+                 /  "Resent-bcc"  ":"  #address
+     DIGIT       =  <any ASCII decimal digit>    ; ( 60- 71, 48.- 57.)
+     domain      =  sub-domain *("." sub-domain)
+     domain-literal =  "[" *(dtext / quoted-pair) "]"
+     domain-ref  =  atom                         ; symbolic reference
+     dtext       =  <any CHAR excluding "[",     ; => may be folded
+                     "]", "\" & CR, & including
+                     linear-white-space>
+     extension-field =
+                   <Any field which is defined in a document
+                    published as a formal extension to this
+                    specification; none will have names beginning
+                    with the string "X-">
+
+
+     August 13, 1982              - 44 -                      RFC #822
+
+
+ 
+     Standard for ARPA Internet Text Messages
+
+
+     field       =  field-name ":" [ field-body ] CRLF
+     fields      =    dates                      ; Creation time,
+                      source                     ;  author id & one
+                    1*destination                ;  address required
+                     *optional-field             ;  others optional
+     field-body  =  field-body-contents
+                    [CRLF LWSP-char field-body]
+     field-body-contents =
+                   <the ASCII characters making up the field-body, as
+                    defined in the following sections, and consisting
+                    of combinations of atom, quoted-string, and
+                    specials tokens, or else consisting of texts>
+     field-name  =  1*<any CHAR, excluding CTLs, SPACE, and ":">
+     group       =  phrase ":" [#mailbox] ";"
+     hour        =  2DIGIT ":" 2DIGIT [":" 2DIGIT]
+                                                 ; 00:00:00 - 23:59:59
+     HTAB        =  <ASCII HT, horizontal-tab>   ; (     11,       9.)
+     LF          =  <ASCII LF, linefeed>         ; (     12,      10.)
+     linear-white-space =  1*([CRLF] LWSP-char)  ; semantics = SPACE
+                                                 ; CRLF => folding
+     local-part  =  word *("." word)             ; uninterpreted
+                                                 ; case-preserved
+     LWSP-char   =  SPACE / HTAB                 ; semantics = SPACE
+     mailbox     =  addr-spec                    ; simple address
+                 /  phrase route-addr            ; name & addr-spec
+     message     =  fields *( CRLF *text )       ; Everything after
+                                                 ;  first null line
+                                                 ;  is message body
+     month       =  "Jan"  /  "Feb" /  "Mar"  /  "Apr"
+                 /  "May"  /  "Jun" /  "Jul"  /  "Aug"
+                 /  "Sep"  /  "Oct" /  "Nov"  /  "Dec"
+     msg-id      =  "<" addr-spec ">"            ; Unique message id
+     optional-field =
+                 /  "Message-ID"        ":"   msg-id
+                 /  "Resent-Message-ID" ":"   msg-id
+                 /  "In-Reply-To"       ":"  *(phrase / msg-id)
+                 /  "References"        ":"  *(phrase / msg-id)
+                 /  "Keywords"          ":"  #phrase
+                 /  "Subject"           ":"  *text
+                 /  "Comments"          ":"  *text
+                 /  "Encrypted"         ":" 1#2word
+                 /  extension-field              ; To be defined
+                 /  user-defined-field           ; May be pre-empted
+     orig-date   =  "Date"        ":"   date-time
+     originator  =   authentic                   ; authenticated addr
+                   [ "Reply-To"   ":" 1#address] )
+     phrase      =  1*word                       ; Sequence of words
+
+
+
+
+     August 13, 1982              - 45 -                      RFC #822
+
+
+ 
+     Standard for ARPA Internet Text Messages
+
+
+     qtext       =  <any CHAR excepting <">,     ; => may be folded
+                     "\" & CR, and including
+                     linear-white-space>
+     quoted-pair =  "\" CHAR                     ; may quote any char
+     quoted-string = <"> *(qtext/quoted-pair) <">; Regular qtext or
+                                                 ;   quoted chars.
+     received    =  "Received"    ":"            ; one per relay
+                       ["from" domain]           ; sending host
+                       ["by"   domain]           ; receiving host
+                       ["via"  atom]             ; physical path
+                      *("with" atom)             ; link/mail protocol
+                       ["id"   msg-id]           ; receiver msg id
+                       ["for"  addr-spec]        ; initial form
+                        ";"    date-time         ; time received
+
+     resent      =   resent-authentic
+                   [ "Resent-Reply-To"  ":" 1#address] )
+     resent-authentic =
+                 =   "Resent-From"      ":"   mailbox
+                 / ( "Resent-Sender"    ":"   mailbox
+                     "Resent-From"      ":" 1#mailbox  )
+     resent-date =  "Resent-Date" ":"   date-time
+     return      =  "Return-path" ":" route-addr ; return address
+     route       =  1#("@" domain) ":"           ; path-relative
+     route-addr  =  "<" [route] addr-spec ">"
+     source      = [  trace ]                    ; net traversals
+                      originator                 ; original mail
+                   [  resent ]                   ; forwarded
+     SPACE       =  <ASCII SP, space>            ; (     40,      32.)
+     specials    =  "(" / ")" / "<" / ">" / "@"  ; Must be in quoted-
+                 /  "," / ";" / ":" / "\" / <">  ;  string, to use
+                 /  "." / "[" / "]"              ;  within a word.
+     sub-domain  =  domain-ref / domain-literal
+     text        =  <any CHAR, including bare    ; => atoms, specials,
+                     CR & bare LF, but NOT       ;  comments and
+                     including CRLF>             ;  quoted-strings are
+                                                 ;  NOT recognized.
+     time        =  hour zone                    ; ANSI and Military
+     trace       =    return                     ; path to sender
+                    1*received                   ; receipt tags
+     user-defined-field =
+                   <Any field which has not been defined
+                    in this specification or published as an
+                    extension to this specification; names for
+                    such fields must be unique and may be
+                    pre-empted by published extensions>
+     word        =  atom / quoted-string
+
+
+
+
+     August 13, 1982              - 46 -                      RFC #822
+
+
+ 
+     Standard for ARPA Internet Text Messages
+
+
+     zone        =  "UT"  / "GMT"                ; Universal Time
+                                                 ; North American : UT
+                 /  "EST" / "EDT"                ;  Eastern:  - 5/ - 4
+                 /  "CST" / "CDT"                ;  Central:  - 6/ - 5
+                 /  "MST" / "MDT"                ;  Mountain: - 7/ - 6
+                 /  "PST" / "PDT"                ;  Pacific:  - 8/ - 7
+                 /  1ALPHA                       ; Military: Z = UT;
+     <">         =  <ASCII quote mark>           ; (     42,      34.)
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+     August 13, 1982              - 47 -                      RFC #822
+
diff --git a/doc/uri.scm.doc b/doc/uri.scm.doc
new file mode 100644
index 0000000..ff44a8d
--- /dev/null
+++ b/doc/uri.scm.doc
@@ -0,0 +1,150 @@
+This file documents names specified in uri.scm.
+
+
+
+
+NOTES
+
+URIs have the following syntax:
+
+[scheme] : path [? search ] [# fragmentid]
+
+Parts in [] may be omitted. The last part is usually referred to as
+frag-id in this document.
+
+
+
+DEFINITIONS AND DESCRIPTIONS
+
+
+char-set
+uri-reserved
+
+A set of reserved characters (semicolon, slash, hash, question mark,
+colon and space).
+
+procedure 
+parse-uri uri-string --> (scheme, path, search, frag-id)
+
+Multiple-value return: scheme, path, search, frag-id, in this
+order. scheme, search and frag-id are either #f or a string. path is a
+nonempty list of strings. An empty path is a list containing the empty
+string. parse-uri tries to be tolerant of the various ways people build broken URIs out there on the Net (so it does not conform strictly to RFC 1630).
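+
+A usage sketch, assuming uri.scm is loaded. Because parse-uri returns
+four values, the call is wrapped in call-with-values (scsh's receive
+would work as well). The URI string is made up for this example, and no
+result is shown because the exact path list depends on the parser.
+
+(call-with-values
+  (lambda () (parse-uri "http://www.example.org/a/b?query#top"))
+  (lambda (scheme path search frag-id)
+    (list scheme path search frag-id)))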
+
+
+procedure
+unescape-uri string [start [end]] --> string
+
+Unescapes a string. This procedure should only be used *after* the URI
+has been parsed, since unescaping may introduce characters that break
+the parse (that is why escape sequences are used in URIs).
+Escape sequences have the form %hh, where each h is a hexadecimal
+digit. E.g. %20 is a space (ASCII character 32).
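+
+A small illustration, assuming uri.scm is loaded; the input strings are
+made up for this example. The second call shows why unescaping has to
+come after parsing: the result contains a slash that was not a path
+separator in the original URI.
+
+(unescape-uri "foo%20bar")    ==> "foo bar"
+(unescape-uri "a%2Fb")        ==> "a/b"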
+
+
+procedure
+hex-digit? character --> boolean
+
+Returns #t if character is a hexadecimal digit (i.e., one of 0-9, a-f,
+A-F), #f otherwise.
+
+
+procedure
+hexchar->int character --> number
+
+Translates the given character to an integer, e.g. (hexchar->int #\a)
+=> 10.
+
+
+procedure
+int->hexchar integer --> character
+
+Translates the given integer in the range 0-15 into a hexadecimal
+character (uses uppercase letters), e.g. (int->hexchar 14) => #\E.
+
+
+char-set
+uri-escaped-chars
+
+A set of characters that are escaped in URIs. These are the following
+characters: dollar ($), minus (-), underscore (_), at (@), dot (.),
+ampersand (&), exclamation mark (!), asterisk (*), backslash (\),
+double quote ("), single quote ('), open parenthesis ((), close
+parenthesis ()), comma (,), plus (+) and all other characters that are
+neither letters nor digits (such as space and control characters).
+
+
+procedure
+escape-uri string [escaped-chars] --> string
+
+Escapes the characters of string that are members of escaped-chars.
+escaped-chars defaults to uri-escaped-chars. Be careful when applying
+this procedure to chunks of text with syntactically meaningful reserved
+characters (e.g., paths with URI slashes or colons) -- they'll be
+escaped, and lose their special meaning. E.g. it would be a mistake to
+apply escape-uri to "//lcs.mit.edu:8001/foo/bar.html" because the
+slashes and colons would be escaped. Note that escape-uri cannot
+check for this, since doing so would defeat its purpose.
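+
+A sketch of the safer pattern, assuming uri.scm is loaded: escape each
+path component separately, and only then join the components with
+slashes. The component strings are invented for this example.
+
+(escape-uri "foo bar")                      ==> "foo%20bar"
+(map escape-uri '("my dir" "file name"))    ==> ("my%20dir" "file%20name")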
+
+
+procedure
+resolve-uri cscheme cp scheme p --> (scheme, path)
+
+Sorry, I can't figure out what resolve-uri is intended to do. Perhaps
+I will find out later.
+
+The code seems to have a bug: In the body of receive, there's a
+loop. j should, according to the comment, count sequential /. But j
+counts nothing in the body. Either zero is added ((lp (cdr cp-tail)
+(cons (car cp-tail) rhead) (+ j 0))) or j is set to 1 ((lp (cdr
+cp-tail) (cons (car cp-tail) rhead) 1))). Nevertheless, j is expected
+to reach value numsl that can be larger than one. So what? I am
+confused.
+
+
+procedure
+rev-append list-a list-b --> list
+
+Equivalent to (append (reverse list-a) list-b). The comment says it
+should be defined in a list package but I am wondering how often this
+will be used.
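+
+A minimal sketch of the same behaviour in plain Scheme. This is an
+illustration only, not necessarily the definition used in uri.scm: it
+moves the elements of list-a onto the front of list-b one at a time,
+which reverses them on the way.
+
+(define (rev-append list-a list-b)
+  (if (pair? list-a)
+      (rev-append (cdr list-a) (cons (car list-a) list-b))
+      list-b))
+
+(rev-append '(3 2 1) '(4 5))    ==> (1 2 3 4 5)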
+
+
+procedure
+split-uri-path uri start end --> list
+
+Splits uri at /'s. Only the substring given with start (inclusive) and
+end (exclusive) is considered. Start and end - 1 have to be within the
+range of the uri-string.  Otherwise an index-out-of-range exception
+will be raised. Example: (split-uri-path "foo/bar/colon" 4 11) ==>
+'("bar" "col")
+
+
+procedure
+simplify-uri-path path --> list
+
+Removes "." and ".." entries from path. The result is a (maybe empty)
+list representing a path that does not contain any "." or "..". The
+list can only be empty if the path did not start with "/" (for the
+rare occasion someone wants to simplify a relative path). The result
+is #f if the path tries to back up past root, for example by "/.." or
+"/foo/../.." or just "..". "//" may occur somewhere in the path; it
+refers to root, and ".." cannot back up past it.
+Examples: 
+(simplify-uri-path (split-uri-path "/foo/bar/baz/.." 0 15))
+==> '("" "foo" "bar")  
+
+(simplify-uri-path (split-uri-path "foo/bar/baz/../../.." 0 20))
+==> '()
+
+(simplify-uri-path (split-uri-path "/foo/../.." 0 10))
+==> #f          ; tried to back up past root
+
+(simplify-uri-path (split-uri-path "foo/bar//" 0 9))
+==> '("")       ; "//" refers to root
+
+(simplify-uri-path (split-uri-path "foo/bar/" 0 8))
+==> '("")       ; last "/" also refers to root
+
+(simplify-uri-path (split-uri-path "/foo/bar//baz/../.." 0 19))
+==> #f          ; tries to back up past root
diff --git a/doc/url.scm.doc b/doc/url.scm.doc
new file mode 100644
index 0000000..4819ca4
--- /dev/null
+++ b/doc/url.scm.doc
@@ -0,0 +1,69 @@
+This file documents names defined in url.scm
+
+
+
+
+NOTES
+
+
+
+
+DEFINITIONS AND DESCRIPTIONS
+
+
+userhost                           record
+
+A record containing the fields user, password, host and port. It
+describes path prefixes of the form
+//<user>:<password>@<host>:<port>/ and is created by parsing such a
+string. These prefixes are frequently used as the initial part of
+URLs describing Internet resources.
+
+
+parse-userhost path default        procedure
+
+Parse a URI path (a list representing a path, not a string!) into a
+userhost record. Default values are taken from the userhost record
+DEFAULT except for the host. Returns a userhost record on success, and
+#f if it cannot parse the path. It is an error if the specified path
+does not begin with '//..', as described under userhost.
+
+
+userhost-escaped-chars             char-set
+
+The union of uri-escaped-chars and the characters '@' and ':'. Used
+by the unparser.
+
+
+userhost->string userhost          procedure
+
+Unparses a userhost record to a string.
+
+
+http-url                           record
+
+Record containing the fields userhost (a userhost record), path (a
+path list), search and frag-id. The PATH slot of this record is the
+URL's path split at slashes, e.g., "foo/bar//baz/" => ("foo" "bar" ""
+"baz" ""). These elements are in raw, unescaped format. To convert
+back to a string, use (uri-path-list->path (map escape-uri pathlist)).
+
+
+parse-http-url path search frag-id       procedure
+
+Returns an http-url record. path, search and frag-id are the results of
+a parse-uri call on the initial URI. See the documentation of uri.scm
+for further details. search and frag-id are stored as they are. This
+parser decodes the path elements. It is an error if the path specifies
+a user or a password, as this is not allowed in HTTP URLs.
+
+
+default-http-userhost                    record
+
+A userhost record that specifies the port as 80 and everything else as
+#f.
+
+
+http-url->string http-url          procedure
+
+Unparses the given http-url to a string.
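+
+A sketch of how the procedures above fit together, assuming uri.scm
+and url.scm are loaded. The URL is made up for this example, and no
+result is shown because the exact output depends on the unparser.
+
+(call-with-values
+  (lambda () (parse-uri "http://www.example.org:8001/foo/bar.html"))
+  (lambda (scheme path search frag-id)
+    (http-url->string (parse-http-url path search frag-id))))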