Files in doc and doc/html added :).
parent 812784c6bf
commit 13b5e5e2d7
				|  | @ -0,0 +1,85 @@ | |||
| <HTML> | ||||
| <HEAD> | ||||
| <TITLE>The Scheme Underground Network Package</TITLE> | ||||
| </HEAD> | ||||
| 
 | ||||
| <BODY> | ||||
| <H1>The Scheme Underground Network Package</H1> | ||||
| I have written a set of libraries for doing Net hacking from Scheme/scsh. | ||||
| It includes: | ||||
| <DL> | ||||
| <DT> An smtp client library. | ||||
| <DD> Forge mail from the comfort of your own Scheme process. | ||||
| 
 | ||||
| <DT> rfc822 header library | ||||
| <DD> Read email-style headers. Useful in several contexts (smtp, http, etc.) | ||||
| 
 | ||||
| <DT> Simple structured HTML output library | ||||
| <DD> Balanced delimiters, etc. | ||||
| 
 | ||||
| <DT> The SU Web server | ||||
| <DD> This is a complete implementation of an HTTP 1.0 server in Scheme. | ||||
|      The server contains other standalone packages that may separately be of  | ||||
|      use: | ||||
|      <UL> | ||||
|      <LI> URI and URL parsers and unparsers. | ||||
|      <LI> A library to help writing CGI scripts in Scheme. | ||||
|      <LI> Server extensions for interfacing to CGI scripts. | ||||
|      <LI> Server extensions for uploading Scheme code. | ||||
|      </UL> | ||||
|     The server has three main design goals: | ||||
|     <DL> | ||||
|     <DT> Extensibility | ||||
|     <DD> The server is in fact nothing but extensions, using a mechanism | ||||
| 	 called "path handlers" to define URL-specific services. It has a toolkit | ||||
| 	 of services that can be used as-is, extended or built upon. | ||||
| 	 User extensions have exactly the same status as the base services. | ||||
| 
 | ||||
| 	<P> | ||||
| 	The extension mechanism allows for easy implementation of new services | ||||
| 	without the overhead of the CGI interface. Since the server is written | ||||
| 	on top of the Scheme shell, the full set of Unix system calls and | ||||
| 	program tools is available to the implementor. | ||||
| 
 | ||||
|     <DT> Mobile code | ||||
|     <DD> The server allows Scheme code to be uploaded for direct execution | ||||
| 	 inside the server. The server has complete control over the code, | ||||
| 	 and can safely execute it in restricted environments that do not | ||||
| 	 provide access to potentially dangerous primitives (such as the | ||||
| 	 "delete file" procedure.) | ||||
| 
 | ||||
| 
 | ||||
|     <DT> Clarity | ||||
|     <DD> I wrote this server to help myself understand the Web. It is voluminously | ||||
| 	 commented, and I hope it will prove to be an aid in understanding the | ||||
| 	 low-level details of the Web protocols. | ||||
|     </DL> | ||||
| 
 | ||||
|     <P> | ||||
|     The S.U. server has the ability to upload code from Web clients and  | ||||
|     execute that code on behalf of the client in a protected environment. | ||||
| 
 | ||||
|     <P> | ||||
|     Some <A HREF="su-httpd.html">simple documentation</A> on the server | ||||
|     is available. | ||||
| 
 | ||||
| </DL> | ||||
| 
 | ||||
| <H2>Obtaining the system</H2> | ||||
| The network code is available by | ||||
| <A HREF="ftp://ftp-swiss.ai.mit.edu/pub/scsh/contrib/net/net.tar.gz">ftp</A>. | ||||
| To run the server, you need our 0.4 release of  | ||||
| <A HREF="http://www-swiss.ai.mit.edu/scsh/scsh.html">scsh</A> | ||||
| which has just been released. | ||||
| 
 | ||||
| Beyond actually running the server, | ||||
| the separate parser libraries and other utilities may be of use as separate | ||||
| modules. | ||||
| 
 | ||||
| <ADDRESS><A HREF="http://www.ai.mit.edu/people/shivers/">Olin Shivers</A> | ||||
|        / <A HREF="plan-file">shivers@ai.mit.edu</A></ADDRESS> | ||||
| 
 | ||||
| </BODY> | ||||
| </HTML> | ||||
| 
 | ||||
| 
 | ||||
|  | @ -0,0 +1,482 @@ | |||
| <!-- check for *..* emphasis, etc., i.e., e.g. --> | ||||
| <HTML> | ||||
| <HEAD> | ||||
| <TITLE>The Scheme Underground Web system</TITLE> | ||||
| </HEAD> | ||||
| 
 | ||||
| <BODY> | ||||
| <H1>The Scheme Underground Web System</H1> | ||||
| 
 | ||||
| <ADDRESS><A HREF="http://www.ai.mit.edu/people/shivers/">Olin Shivers</A> | ||||
|        / <A HREF="plan-file">shivers@ai.mit.edu</A> | ||||
| </ADDRESS> | ||||
| July 1995 | ||||
| 
 | ||||
| <BLOCKQUOTE> | ||||
| Note: Netscape typesets description lists in a manner that makes the | ||||
| procedure descriptions below blur together, even in the absence of the | ||||
| HTML COMPACT attribute. You may just wish to print out a simple | ||||
| <A HREF="su-httpd.txt">ASCII version</A> of this note, instead. | ||||
| </BLOCKQUOTE> | ||||
| 
 | ||||
| 
 | ||||
| 
 | ||||
| <!----------------------------------------------------------------------------> | ||||
| <H2>Introduction</H2> | ||||
| 
 | ||||
| The | ||||
| <A HREF="http://www.ai.mit.edu/projects/su/su.html">Scheme underground</A> | ||||
| Web system is a package of | ||||
| <A HREF="http://www-swiss.ai.mit.edu/scheme-home.html">Scheme</A> | ||||
| code that provides | ||||
| utilities for interacting with the | ||||
| <A HREF="http://www.w3.org/">World-Wide Web</A>. | ||||
| This includes: | ||||
| <UL> | ||||
| <LI>  A Web server. | ||||
| <LI>  URI and URL parsers and un-parsers. | ||||
| <LI>  RFC822-style header parsers. | ||||
| <LI>  Code for performing structured html output | ||||
| <LI>  Code to assist in writing CGI Scheme programs | ||||
|       that can be used by any CGI-compliant HTTP server | ||||
|       (such as NCSA's httpd, or the S.U. Web server). | ||||
| </UL> | ||||
| 
 | ||||
|  <P> | ||||
| The code can be obtained via | ||||
| <A HREF="ftp://ftp-swiss.ai.mit.edu/pub/scsh/contrib/net/net.tar.gz"> | ||||
| anonymous ftp</A> | ||||
| and is implemented in | ||||
| <A HREF="http://www-swiss.ai.mit.edu/~jar/s48.html">Scheme 48</A>, | ||||
| using the system calls and support procedures of | ||||
| <A HREF="http://www-swiss.ai.mit.edu/scsh/scsh.html">scsh</A>, | ||||
| the Scheme Shell. | ||||
| The code was written to be clear and modifiable -- | ||||
| it is voluminously commented and all non-R4RS dependencies are | ||||
| described at the beginning of each source file. | ||||
| 
 | ||||
|  <P> | ||||
| I do not have the time to write detailed documentation for these packages. | ||||
| However, they are very thoroughly commented, and I strongly recommend | ||||
| reading the source files; they were written to be read, and the source | ||||
| code comments should provide a clear description of the system. | ||||
| The remainder of this note gives an overview of the server's basic | ||||
| architecture and interfaces. | ||||
| 
 | ||||
| <H2>The Scheme Underground Web Server</H2> | ||||
| 
 | ||||
| The server was designed with three principal goals in mind: | ||||
| <DL> | ||||
| <DT> Extensibility | ||||
| <DD> The server is designed to make it easy to extend the basic | ||||
|      functionality.  In fact, the server is nothing but extensions.  There is | ||||
|      no distinction between the set of basic services provided by the server | ||||
|      implementation and user extensions -- they are both implemented in | ||||
|      Scheme, and have equal status. The design is "turtles all the way down." | ||||
| 
 | ||||
| 
 | ||||
| <DT> Mobile code | ||||
| <DD> Because the server is written in Scheme 48, it is simple to use the | ||||
|      Scheme 48 module system to upload programs to the server for safe | ||||
|      execution within a protected, server-chosen environment. The server | ||||
|      comes with a simple example upload service to demonstrate this | ||||
|      capability. | ||||
| 
 | ||||
| 
 | ||||
| <DT> Clarity of implementation | ||||
| <DD> Because the server is written in a high-level language, it should make | ||||
|      for a clearer exposition of the HTTP protocol and the associated URL | ||||
|      and URI notations than one written in a low-level language such as C. | ||||
|      This also should help to make the server easy to modify and adapt to | ||||
|      different uses. | ||||
| </DL> | ||||
| 
 | ||||
| <!----------------------------------------------------------------------------> | ||||
| <H3>Basic server structure</H3> | ||||
| 
 | ||||
| The Web server is started by calling the <CODE>httpd</CODE> procedure, | ||||
| which takes one required and two optional arguments: | ||||
| <PRE> | ||||
|     (httpd <VAR>path-handler</VAR> [<VAR>port</VAR> <VAR>working-directory</VAR>]) | ||||
| </PRE> | ||||
| 
 | ||||
| The server accepts connections from the given port, which defaults to 80. | ||||
| The server runs with the working directory set to the given value, | ||||
| which defaults to | ||||
| <PRE> | ||||
|     /usr/local/etc/httpd | ||||
| </PRE> | ||||
| 
 | ||||
| 
 | ||||
|  <P> | ||||
| The server's basic loop is to wait on the port for a connection from an HTTP | ||||
| client. When it receives a connection, it reads in and parses the request into | ||||
| a special request data structure. Then the server forks a child process, which | ||||
| binds the current I/O ports to the connection socket, and then hands off to | ||||
| the top-level path handler (the first argument to <CODE>httpd</CODE>). | ||||
| The path-handler procedure is responsible for actually serving the request -- | ||||
| it can be any arbitrary computation. | ||||
| Its output goes directly back to the HTTP client that sent the request. | ||||
| 
 | ||||
|  <P> | ||||
| Before calling the path handler to service the request, the HTTP server | ||||
| installs an error handler that fields any uncaught error, sends an | ||||
| error reply to the client, and aborts the request transaction. Hence | ||||
| any error caused by a path-handler will be handled in a reasonable and | ||||
| robust fashion. | ||||
| 
 | ||||
|  <P> | ||||
| The basic server loop, and the associated request data structure are the fixed | ||||
| architecture of the S.U. Web server; its flexibility lies in the notion of | ||||
| path handlers. | ||||
| 
 | ||||
| 
 | ||||
| <!----------------------------------------------------------------------------> | ||||
| <H3>Path handlers</H3> | ||||
| 
 | ||||
| A path handler is a procedure taking two arguments: | ||||
| <PRE> | ||||
|     (path-handler <VAR>path</VAR> <VAR>req</VAR>) | ||||
| </PRE> | ||||
| 
 | ||||
| 
 | ||||
| The <VAR>req</VAR> argument is a request record giving all the details of the | ||||
| client's request; it has the following structure: | ||||
| <PRE> | ||||
|     (define-record request | ||||
|       method		; A string such as "GET", "PUT", etc. | ||||
|       uri		; The escaped URI string as read from request line. | ||||
|       url		; An http URL record (see url.scm). | ||||
|       version		; A (major . minor) integer pair. | ||||
|       headers		; An rfc822 header alist (see rfc822.scm). | ||||
|       socket)		; The socket connected to the client. | ||||
| </PRE> | ||||
| 
 | ||||
| The <VAR>path</VAR> argument is the URL's path, | ||||
| parsed and split at slashes into a string list. | ||||
| For example, if the Web client dereferences URL | ||||
| <PRE> | ||||
|     http://clark.lcs.mit.edu:8001/h/shivers/code/web.tar.gz | ||||
| </PRE> | ||||
| then the server would pass the following path to the top-level handler: | ||||
| <PRE> | ||||
|     ("h" "shivers" "code" "web.tar.gz") | ||||
| </PRE> | ||||
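<P>
The splitting step can be sketched in portable Scheme. This is only an
illustration: <CODE>split-path</CODE> is a hypothetical helper, not a procedure
from the server sources, and it assumes that empty path elements (such as the
one produced by a trailing slash) are simply dropped, which the real parser
may not do.

```scheme
;; Hypothetical helper, not taken from the server sources:
;; split a URL path string at slashes into a list of strings.
;; Simplification: empty elements are dropped.
(define (split-path path)
  (let loop ((chars (reverse (string->list path)))
             (current '())    ; characters of the element being built
             (elements '()))  ; completed elements, already in final order
    (cond ((null? chars)
           (if (null? current)
               elements
               (cons (list->string current) elements)))
          ((char=? (car chars) #\/)
           (loop (cdr chars)
                 '()
                 (if (null? current)
                     elements
                     (cons (list->string current) elements))))
          (else
           (loop (cdr chars) (cons (car chars) current) elements)))))

(split-path "/h/shivers/code/web.tar.gz")
;; => ("h" "shivers" "code" "web.tar.gz")
```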
| 
 | ||||
|  <P> | ||||
| The path argument's pre-parsed representation as a string list makes it easy | ||||
| for the path handler to implement recursive dispatch on URL paths. | ||||
| 
 | ||||
|  <P> | ||||
| Path handlers can do anything they like to respond to HTTP requests; they have | ||||
| the full range of Scheme to implement the desired functionality.  When | ||||
| handling HTTP requests that have an associated entity body (such as POST), the | ||||
| body should be read from the current input port. Path handlers should in all | ||||
| cases write their reply to the current output port. Path handlers should | ||||
| <EM>not</EM> perform I/O on the request record's socket. | ||||
| Path handlers are frequently called recursively, and doing I/O directly to the | ||||
| socket might bypass a filtering or other processing step interposed on the | ||||
| current I/O ports by some superior path handler. | ||||
| 
 | ||||
| <!----------------------------------------------------------------------------> | ||||
| <H3>Basic path handlers</H3> | ||||
| 
 | ||||
| Although the user can write any path-handler he likes, the S.U. server comes | ||||
| with a useful toolbox of basic path handlers that can be used and built upon: | ||||
| 
 | ||||
| <DL> | ||||
| 
 | ||||
| <DT> | ||||
| <CODE>(alist-path-dispatcher <VAR>ph-alist</VAR> <VAR>default-ph</VAR>) -> <VAR>path-handler</VAR> | ||||
| </CODE> | ||||
| <DD> | ||||
|     This procedure takes a string->path-handler alist, and a default | ||||
|     path handler, and returns a handler that dispatches on its path argument. | ||||
|     When the new path handler is applied to a path | ||||
|     <CODE>("foo" "bar" "baz")</CODE>, | ||||
|     it uses the first element of the path -- <CODE>"foo"</CODE> -- to | ||||
|     index into the alist. | ||||
|     If it finds an associated path handler in the alist, it | ||||
|     hands the request off to that handler, passing it the tail of the | ||||
|     path, <CODE>("bar" "baz")</CODE>. | ||||
|     On the other hand, if the path is empty, or the alist search does | ||||
|     not yield a hit, we hand off to the default path handler, | ||||
|     passing it the entire original path, <CODE>("foo" "bar" "baz")</CODE>. | ||||
| 
 | ||||
|     <P> | ||||
|     This procedure is how you say: "If the first element of the URL's path | ||||
|     is `foo', do X; if it's `bar', do Y; otherwise, do Z." If one takes | ||||
|     an object-oriented view of the process, an alist path-handler does | ||||
|     method lookup on the requested operation, dispatching off to the | ||||
|     appropriate method defined for the URL. | ||||
| 
 | ||||
|     <P> | ||||
|     The slash-delimited URI path structure implies an associated | ||||
|     tree of names. The path-handler system and the alist dispatcher | ||||
|     allow you to procedurally define the server's response to any arbitrary | ||||
|     subtree of the path space. | ||||
| 
 | ||||
|     <P> | ||||
|     Example: <br> | ||||
|     A typical top-level path handler is | ||||
| 
 | ||||
| <PRE> | ||||
|   (define ph | ||||
|     (alist-path-dispatcher | ||||
| 	`(("h"       . ,(home-dir-handler "public_html")) | ||||
| 	  ("cgi-bin" . ,(cgi-handler "/usr/local/etc/httpd/cgi-bin")) | ||||
| 	  ("seval"   . ,seval-handler)) | ||||
| 	(rooted-file-handler "/usr/local/etc/httpd/htdocs"))) | ||||
| </PRE> | ||||
| 
 | ||||
|     This means: | ||||
| <UL> | ||||
| <LI> If the path looks like <CODE>("h" "shivers" "code" "web.tar.gz")</CODE>, | ||||
|      pass the path <CODE>("shivers" "code" "web.tar.gz")</CODE> to a | ||||
|      home-directory path handler. | ||||
| 
 | ||||
| 
 | ||||
| <LI> If the path looks like <CODE>("cgi-bin" "calendar")</CODE>, | ||||
|      pass <CODE>("calendar")</CODE> off to the CGI path handler. | ||||
| 
 | ||||
| 
 | ||||
| <LI> If the path looks like <CODE>("seval" ...)</CODE>, | ||||
|      the tail of the path is passed off to the code-uploading seval | ||||
|      path handler. | ||||
| 
 | ||||
| <LI> Otherwise, the whole path is passed to a rooted file handler, which | ||||
|      will convert it into a filename, rooted at | ||||
|      <CODE>/usr/local/etc/httpd/htdocs</CODE>, and serve that file. | ||||
| </UL> | ||||
| 
 | ||||
| 
 | ||||
| <DT> <CODE>(home-dir-handler <VAR>subdir</VAR>) -> | ||||
|            <VAR>path-handler</VAR></CODE> | ||||
| <DD> | ||||
|     This procedure builds a path handler that does basic file serving | ||||
|     out of home directories. If the resulting path handler is passed | ||||
|     a path of <CODE>(<VAR>user</VAR> . <VAR>file-path</VAR>)</CODE>, | ||||
|     then it serves the file | ||||
| <PRE> | ||||
|     <VAR>user's-home-directory</VAR>/<VAR>subdir</VAR>/<VAR>file-path</VAR> | ||||
| </PRE> | ||||
|     The path handler only handles GET requests; the filename is not | ||||
|     allowed to contain <CODE>..</CODE> elements. | ||||
| 
 | ||||
| 
 | ||||
| <DT> | ||||
| <CODE>(tilde-home-dir-handler <VAR>subdir</VAR> <VAR>default-path-handler</VAR>) | ||||
|        -> <VAR>path-handler</VAR> | ||||
| </CODE> | ||||
| <DD> | ||||
|     This path handler examines the car of the path. If it is a string | ||||
|     beginning with a tilde, <em>e.g.</em>, "<CODE>~ziggy</CODE>", | ||||
|     then the string is taken | ||||
|     to mean a home directory, and the request is served similarly to a | ||||
|     <CODE>home-dir-handler</CODE> path handler. | ||||
|     Otherwise, the request is passed off | ||||
|     in its entirety to the default path handler. | ||||
| 
 | ||||
|     <P> | ||||
|     This procedure is useful for implementing servers that provide the | ||||
|     semantics of the NCSA httpd server. | ||||
| 
 | ||||
| 
 | ||||
| <DT> | ||||
| <CODE>(cgi-handler <VAR>cgi-directory</VAR>) -> <VAR>path-handler</VAR> | ||||
| </CODE> | ||||
| <DD> | ||||
|     This procedure returns a path-handler that passes the request off to some | ||||
|     program using the CGI interface. The script name is taken from the | ||||
|     car of the path; it is checked for occurrences of <CODE>..</CODE>'s. | ||||
|     If the path is | ||||
| <PRE> | ||||
|     ("my-prog" "foo" "bar") | ||||
| </PRE> | ||||
|     then the program executed is | ||||
| <PRE> | ||||
|     <VAR>cgi-directory</VAR>/my-prog | ||||
| </PRE> | ||||
|     <P> | ||||
|     When the CGI path handler builds the process environment for the | ||||
|     CGI script, several elements | ||||
|     (<em>e.g.</em>, <CODE>$PATH</CODE> and <CODE>$SERVER_SOFTWARE</CODE>) | ||||
|     are request-invariant, and can be computed at server start-up time. | ||||
|     This can be done by calling | ||||
| <PRE> | ||||
|     (initialise-request-invariant-cgi-env) | ||||
| </PRE> | ||||
|     when the server starts up. This is <EM>not</EM> necessary, | ||||
|     but will make CGI requests a little faster. | ||||
| 
 | ||||
| 
 | ||||
| <DT> | ||||
| <CODE>(rooted-file-handler <VAR>root-dir</VAR>) -> <VAR>path-handler</VAR> | ||||
| </CODE> | ||||
| <DD> | ||||
|     Returns a path handler that serves files from a particular root | ||||
|     in the file system. Only the GET operation is provided. The path | ||||
|     argument passed to the handler is converted into a filename, | ||||
|     and appended to <VAR>root-dir</VAR>. | ||||
|     The file name is checked for <CODE>..</CODE> components, | ||||
|     and the transaction is aborted if it contains any. Otherwise, the file is | ||||
|     served to the client. | ||||
| 
 | ||||
| <DT> | ||||
| <CODE>(null-path-handler <VAR>path</VAR> <VAR>req</VAR>)</CODE> | ||||
| <DD> | ||||
|     This path handler is useful as a default handler. It handles no requests, | ||||
|     always returning a "404 Not found" reply to the client. | ||||
| 
 | ||||
| </DL> | ||||
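<P>
The dispatch rule described for <CODE>alist-path-dispatcher</CODE> can be
modelled in a few lines of portable Scheme. This is a sketch under stated
assumptions, not the server's own code: <CODE>make-alist-dispatcher</CODE>,
<CODE>tag</CODE> and <CODE>demo-ph</CODE> are hypothetical names, and the toy
handlers return a list instead of writing a reply to the current output port.

```scheme
;; A model of the dispatch rule, not the server's implementation.
(define (make-alist-dispatcher ph-alist default-ph)
  (lambda (path req)
    (let ((entry (and (pair? path) (assoc (car path) ph-alist))))
      (if entry
          ((cdr entry) (cdr path) req)  ; hit: pass the tail of the path
          (default-ph path req)))))     ; miss or empty path: pass it all

;; Toy handlers that just report what they were given.
(define (tag name)
  (lambda (path req) (list name path)))

(define demo-ph
  (make-alist-dispatcher
   (list (cons "h" (tag 'home))
         (cons "cgi-bin" (tag 'cgi)))
   (tag 'default)))

(demo-ph '("h" "shivers" "code") 'dummy-req)  ; => (home ("shivers" "code"))
(demo-ph '("x" "y") 'dummy-req)               ; => (default ("x" "y"))
(demo-ph '() 'dummy-req)                      ; => (default ())
```

Because the result of the dispatcher is itself a path handler, dispatchers
can be nested to describe deeper subtrees of the path space.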
| 
 | ||||
| <!----------------------------------------------------------------------------> | ||||
| <H3>HTTP errors</H3> | ||||
| 
 | ||||
| Authors of path-handlers need to be able to handle errors in a reasonably | ||||
| simple fashion. The S.U. Web server provides a set of error conditions that | ||||
| correspond to the error replies in the HTTP protocol. These errors can be | ||||
| raised with the <CODE>http-error</CODE> procedure. | ||||
| When the server runs a path handler, | ||||
| it runs it in the context of an error handler that catches these errors, | ||||
| sends an error reply to the client, and closes the transaction. | ||||
| 
 | ||||
| <DL> | ||||
| 
 | ||||
| <DT> | ||||
| <CODE>(http-error <VAR>reply-code</VAR> <VAR>req</VAR> [<VAR>extra</VAR> ...])</CODE> | ||||
| <DD> | ||||
|     This raises an http error condition. The reply code is one of the | ||||
|     numeric HTTP error reply codes, which are bound to the variables | ||||
|     <CODE>http-reply/ok</CODE>, <CODE>http-reply/not-found</CODE>, | ||||
|     <CODE>http-reply/bad-request</CODE>, and so | ||||
|     forth. The <VAR>req</VAR> argument is the request record that caused | ||||
|     the error. | ||||
|     Any following <VAR>extra</VAR> args are passed along for | ||||
|     informational purposes. | ||||
|     Different HTTP errors take different types of extra arguments. | ||||
|     For example, the "301 moved permanently" and "302 moved temporarily" | ||||
|     replies use the first two <VAR>extra</VAR> values as the | ||||
|     <CODE>URI:</CODE> and <CODE>Location:</CODE> | ||||
|     fields in the reply header, respectively. See the clauses of the | ||||
|     <CODE>send-http-error-reply</CODE> procedure for details. | ||||
| 
 | ||||
| 
 | ||||
| <DT> | ||||
| <CODE>(send-http-error-reply <VAR>reply-code</VAR> <VAR>request</VAR> | ||||
|                              [<VAR>extra</VAR> ...]) | ||||
| </CODE> | ||||
| <DD> | ||||
|     This procedure writes an error reply out to the current output | ||||
|     port. If an error occurs during this process, it is caught, and | ||||
|     the procedure silently returns. The http server's standard error | ||||
|     handler passes all http errors raised during path-handler execution | ||||
|     to this procedure to generate the error reply before aborting the | ||||
|     request transaction. | ||||
| </DL> | ||||
| 
 | ||||
| <!----------------------------------------------------------------------------> | ||||
| <H3>Simple directory generation</H3> | ||||
| 
 | ||||
| Most path-handlers that serve files to clients eventually call an internal | ||||
| procedure named <CODE>file-serve</CODE>, | ||||
| which implements a simple directory-generation service using the | ||||
| following rules: | ||||
| <UL> | ||||
| <LI> If the filename has the <EM>form</EM> of a directory | ||||
|      (<EM>i.e.</EM>, it ends with a slash), | ||||
|      then <CODE>file-serve</CODE> actually looks for a | ||||
|      file named "<CODE>index.html</CODE>" in that directory. | ||||
| 
 | ||||
| <LI> If the filename names a directory, but is not in directory form | ||||
|       (<EM>i.e.</EM>, it doesn't end in a slash, | ||||
|       as in "<CODE>/usr/include</CODE>" or "<CODE>/usr/raj</CODE>"), | ||||
|       then <CODE>file-serve</CODE> sends back a "301 moved permanently" | ||||
|       message, | ||||
|       redirecting the client to a slash-terminated version of the original | ||||
|       URL. For example, the URL | ||||
| <PRE> | ||||
|     http://clark.lcs.mit.edu/~shivers | ||||
| </PRE> | ||||
|       would be redirected to | ||||
| <PRE> | ||||
|     http://clark.lcs.mit.edu/~shivers/ | ||||
| </PRE> | ||||
| 
 | ||||
| <LI> If the filename names a regular file, it is served to the client. | ||||
| </UL> | ||||
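<P>
The three rules can be written down as a small decision procedure. The sketch
below is a hypothetical model, not the code of <CODE>file-serve</CODE>: the
names <CODE>ends-with-slash?</CODE> and <CODE>classify-file-request</CODE> are
invented for illustration, and the <CODE>directory?</CODE> argument is a
caller-supplied predicate standing in for a real file-system test.

```scheme
;; Hypothetical model of the three rules; returns a description of the
;; action instead of performing it.
(define (ends-with-slash? fname)
  (let ((n (string-length fname)))
    (and (not (= n 0))
         (char=? (string-ref fname (- n 1)) #\/))))

(define (classify-file-request fname directory?)
  (cond ((ends-with-slash? fname)          ; directory form: use index.html
         (list 'serve (string-append fname "index.html")))
        ((directory? fname)                ; directory, but no slash: redirect
         (list 'redirect (string-append fname "/")))
        (else                              ; anything else: serve as a file
         (list 'serve fname))))

;; Pretend that only /usr/include is a directory.
(define (toy-directory? fname) (string=? fname "/usr/include"))

(classify-file-request "/usr/include/" toy-directory?)
;; => (serve "/usr/include/index.html")
(classify-file-request "/usr/include" toy-directory?)
;; => (redirect "/usr/include/")
(classify-file-request "/etc/motd" toy-directory?)
;; => (serve "/etc/motd")
```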
| 
 | ||||
| 
 | ||||
| <!----------------------------------------------------------------------------> | ||||
| <H3>Support procs</H3> | ||||
| 
 | ||||
| The source files contain a host of support procedures which will be of utility | ||||
| to anyone writing a custom path-handler. Read the files first. | ||||
| 
 | ||||
| 
 | ||||
| <!----------------------------------------------------------------------------> | ||||
| <H3>Losing</H3> | ||||
| 
 | ||||
| Be aware of two Unix problems, which may require workarounds: | ||||
| <OL> | ||||
| 
 | ||||
| <LI> | ||||
|    NeXTSTEP's Posix implementation of the <CODE>getpwnam()</CODE> routine | ||||
|    will silently tell you that every user has uid 0. This means | ||||
|    that if your server, running as root, does a | ||||
| <PRE> | ||||
|     (set-uid (user->uid "nobody")) | ||||
| </PRE> | ||||
|    it will essentially do a | ||||
| <PRE> | ||||
|     (set-uid 0) | ||||
| </PRE> | ||||
|    and you will thus still be running as root. | ||||
| 
 | ||||
|    <P> | ||||
|    The fix is to manually find out who user nobody is (he's -2 on my | ||||
|    system), and to hard-wire this into the server: | ||||
| <PRE> | ||||
|     (set-uid -2) | ||||
| </PRE> | ||||
|    This problem is NeXTSTEP-specific. If you are not using NeXTSTEP, | ||||
|    no problem. | ||||
| 
 | ||||
| 
 | ||||
| <LI> | ||||
|    On NeXTSTEP, the ip-address->host-name translation routine | ||||
|    (in C, <CODE>gethostbyaddr()</CODE>; in scsh, | ||||
|    <CODE>(host-info addr)</CODE>) does not | ||||
|    use the DNS system; it goes through NeXT's proprietary Netinfo | ||||
|    system, and may not return a fully-qualified domain name. For | ||||
|    example, on my system, I get "amelia-earhart", when I want | ||||
|    "amelia-earhart.lcs.mit.edu". Since the server uses this name | ||||
|    to construct redirection URLs to be sent back to the Web client, | ||||
|    it needs to be an FQDN. | ||||
| 
 | ||||
|    <P> | ||||
|    This problem may occur on other OS's; | ||||
|    I cannot determine if <CODE>gethostbyaddr()</CODE> | ||||
|    is required to return a FQDN or not. (I would appreciate hearing the | ||||
|    answer if you know; my local Internet gurus couldn't tell me.) | ||||
| 
 | ||||
|    <P> | ||||
|    If your system doesn't give you a complete Internet address when | ||||
|    you say | ||||
| <PRE> | ||||
|     (host-info:name (host-info (system-name))) | ||||
| </PRE> | ||||
|    then you have this problem. | ||||
| 
 | ||||
|    <P> | ||||
|    The server has a workaround. There is a procedure exported from | ||||
|    the httpd-core package: | ||||
| <PRE> | ||||
|     (set-my-fqdn name) | ||||
| </PRE> | ||||
|    Call this to crow-bar the server's idea of its own Internet host name | ||||
|    before running the server, and all will be well. | ||||
| </OL> | ||||
| 
 | ||||
| </BODY> | ||||
| </HTML> | ||||
										
											
											
										
									
								
							|  | @ -0,0 +1,161 @@ | |||
| This file documents names defined in rfc822.scm: | ||||
| 
 | ||||
| 
 | ||||
| 
 | ||||
| 
 | ||||
| NOTES | ||||
| 
 | ||||
| 
 | ||||
| 
 | ||||
| A note on line-terminators: | ||||
| 
 | ||||
| Line-terminating sequences are always a drag, because there's no | ||||
| agreement on them -- the Net protocols and DOS use cr/lf; Unix uses | ||||
| lf; the Mac uses cr. On one hand, you'd like to use the code for all | ||||
| of the above; on the other, you'd also like to use the code for strict | ||||
| applications that must not recognise bare cr's or lf's | ||||
| as terminators. | ||||
| 
 | ||||
| RFC 822 requires a cr/lf (carriage-return/line-feed) pair to terminate | ||||
| lines of text. On the other hand, careful perusal of the text shows up | ||||
| some ambiguities (there are maybe three or four of these, and I'm too | ||||
| lazy to write them all down). Furthermore, it is an unfortunate fact | ||||
| that many Unix apps separate lines of RFC 822 text with simple | ||||
| linefeeds (e.g., messages kept in /usr/spool/mail). As a result, this | ||||
| code takes a broad-minded view of line-terminators: lines can be | ||||
| terminated by either cr/lf or just lf, and either terminating sequence | ||||
| is trimmed. | ||||
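
The broad-minded trimming rule can be illustrated with a small sketch
in portable Scheme. It is not code from rfc822.scm: the names CR, LF
and TRIM-TERMINATOR are made up for this example, and the procedure
works on a string that has already been read rather than on a port.

```scheme
(define cr (integer->char 13))
(define lf (integer->char 10))

;; Strip one trailing lf or cr/lf from LINE.
;; A bare trailing cr is not a terminator here and is left alone.
(define (trim-terminator line)
  (let ((n (string-length line)))
    (cond ((and (not (= n 0))
                (char=? (string-ref line (- n 1)) lf))
           (if (and (not (= n 1))
                    (char=? (string-ref line (- n 2)) cr))
               (substring line 0 (- n 2))    ; cr/lf
               (substring line 0 (- n 1))))  ; lf only
          (else line))))

(trim-terminator (string #\a cr lf))  ; => "a"
(trim-terminator (string #\a lf))     ; => "a"
```

A stricter variant would accept only the cr/lf case and leave a bare
lf in place.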
| 
 | ||||
| If you need stricter parsing, you can call the lower-level procedures | ||||
| %READ-RFC822-FIELD and %READ-RFC822-HEADERS. They take the | ||||
| read-line procedure as an extra parameter. This means that you can | ||||
| pass in a procedure that recognises only cr/lf's, or only cr's (for a | ||||
| Mac app, perhaps), and you can determine whether or not the | ||||
| terminators get trimmed. However, your read-line procedure must | ||||
| indicate the header-terminating empty line by returning *either* the | ||||
| empty string or the two-char string cr/lf (or the EOF object). | ||||
| 
 | ||||
| 
 | ||||
| 
 | ||||
| 
 | ||||
| DEFINITIONS AND DESCRIPTIONS | ||||
| 
 | ||||
| 
 | ||||
| 
 | ||||
| (read-rfc822-field [port]) | ||||
| (%read-rfc822-field read-line port) | ||||
| 
 | ||||
| Read one field from the port, and return two values [NAME BODY]: | ||||
| 
 | ||||
|  - NAME	 Symbol such as 'subject or 'to. The field name is converted | ||||
|          to a symbol using the Scheme implementation's preferred | ||||
|          case. If the implementation reads symbols in a case-sensitive | ||||
|          fashion (e.g., scsh), lowercase is used. This means you can | ||||
|          compare these symbols to quoted constants using EQ?. When | ||||
|          printing these field names out, it looks best if you capitalise | ||||
|          them with (CAPITALIZE-STRING (SYMBOL->STRING FIELD-NAME)). | ||||
| 
 | ||||
|  - BODY	 List of strings which are the field's body, e.g.  | ||||
|          ("shivers@lcs.mit.edu"). Each list element is one line from | ||||
|          the field's body, so if the field spreads out over three lines, | ||||
|          then the body is a list of three strings. The terminating | ||||
|          cr/lf's are trimmed from each string. A leading space or a | ||||
|          leading horizontal tab is also trimmed, but one and only one. | ||||
| 
 | ||||
| When there are no more fields -- EOF or a blank line has terminated | ||||
| the header section -- then the procedure returns [#f #f]. | ||||
|   | ||||
| The %READ-RFC822-FIELD variant allows you to specify your own | ||||
| read-line procedure. The one used by READ-RFC822-FIELD terminates | ||||
| lines with either cr/lf or just lf, and it trims the terminator from | ||||
| the line. Your read-line procedure should trim the terminator of the | ||||
| line, so an empty line is returned as an empty string. | ||||
| 
 | ||||
| The procedures raise an error if the field that was read (the | ||||
| line returned by the read-line procedure) is not legal RFC 822 syntax. | ||||
| 
 | ||||
| 
 | ||||
| 
 | ||||
| read-rfc822-headers [port] | ||||
| %read-rfc822-headers read-line port | ||||
| 
 | ||||
| Read in and parse up a section of text that looks like the header | ||||
| portion of an RFC 822 message. Return an alist mapping a field name (a | ||||
| symbol such as 'date or 'subject) to a list of field bodies -- one for | ||||
| each occurrence of the field in the header. So if there are five | ||||
| "Received-by:" fields in the header, the alist maps 'received-by to a | ||||
| five element list. Each body is in turn represented by a list of | ||||
| strings -- one for each line of the field. So a field spread across | ||||
| three lines would produce a three element body. | ||||
| 
 | ||||
| The %READ-RFC822-HEADERS variant allows you to specify your own | ||||
| read-line procedure. See notes (A note on line-terminators) above for | ||||
| reasons why. | ||||
| 
 | ||||
| 
 | ||||
| 
 | ||||
| rejoin-header-lines alist [separator] | ||||
| 
 | ||||
| Takes a field alist such as is returned by READ-RFC822-HEADERS and | ||||
| returns an equivalent alist. Each body (string list) in the input | ||||
| alist is joined into a single string in the output alist. SEPARATOR is | ||||
| the string used to join these elements together; it defaults to a | ||||
| single space " ", but can usefully be "\n" or "\r\n". | ||||
| 
 | ||||
| To rejoin a single body list, use scsh's JOIN-STRINGS procedure. | ||||
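
The transformation can be modelled in portable Scheme as follows.
This is a sketch, not the code in rfc822.scm: JOIN-LINES and
REJOIN-MODEL are hypothetical names, and the sketch assumes that
every body (a list of lines) becomes one string.

```scheme
;; Hypothetical stand-in for joining one body with a separator.
(define (join-lines lines sep)
  (if (null? lines)
      ""
      (let loop ((result (car lines)) (rest (cdr lines)))
        (if (null? rest)
            result
            (loop (string-append result sep (car rest)) (cdr rest))))))

;; Model of the alist transformation: each body becomes one string.
(define (rejoin-model alist sep)
  (map (lambda (entry)
         (cons (car entry)
               (map (lambda (body) (join-lines body sep))
                    (cdr entry))))
       alist))

(rejoin-model '((to (" ziggy," " newts") (" gjs, tk"))) " ")
;; => ((to " ziggy,  newts" " gjs, tk"))
```

Note the doubled space in the first result: the separator is added
between lines as they are, including any leading blank a line keeps.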
| 
 | ||||
| 
 | ||||
| 
 | ||||
| For the following definitions' examples, let's use this set of | ||||
| RFC822 headers: | ||||
|      From: shivers | ||||
|      To: ziggy, | ||||
|        newts | ||||
|      To: gjs, tk | ||||
| 
 | ||||
| 
 | ||||
| 
 | ||||
| get-header-all headers name | ||||
| 
 | ||||
| returns all entries or #f, e.g. | ||||
| (get-header-all hdrs 'to)   -> ((" ziggy," " newts") (" gjs, tk")) | ||||
| 
 | ||||
| 
 | ||||
| 
 | ||||
| get-header-lines headers name | ||||
| 
 | ||||
| Returns all lines of the first entry or #f, e.g. | ||||
| (get-header-lines hdrs 'to) -> (" ziggy," " newts") | ||||
| 
 | ||||
| 
 | ||||
| 
 | ||||
| get-header headers name [separator] | ||||
| 
 | ||||
| Returns the first entry with the lines joined together by separator | ||||
| (newline, "\n", by default), e.g. | ||||
| (get-header hdrs 'to)       -> "ziggy,\n newts" | ||||
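The lookups above amount to an assq on the header alist. The following is a minimal sketch under that assumption; the -sketch names are hypothetical stand-ins, and the order of entries in hdrs is assumed.

```scheme
;; The example headers as an alist, in the shape described for
;; read-rfc822-headers (entry order is an assumption).
(define hdrs
  '((from (" shivers"))
    (to (" ziggy," " newts") (" gjs, tk"))))

;; All bodies recorded for NAME, or #f if the field is absent.
(define (get-header-all-sketch headers name)
  (let ((entry (assq name headers)))
    (and entry (cdr entry))))

;; The lines of the first body recorded for NAME, or #f.
(define (get-header-lines-sketch headers name)
  (let ((all (get-header-all-sketch headers name)))
    (and all (pair? all) (car all))))

(get-header-all-sketch hdrs 'to)    ; => ((" ziggy," " newts") (" gjs, tk"))
(get-header-lines-sketch hdrs 'to)  ; => (" ziggy," " newts")
(get-header-lines-sketch hdrs 'cc)  ; => #f
```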
| 
 | ||||
| 
 | ||||
| 
 | ||||
| htab | ||||
| 
 | ||||
| is the horizontal tab (ASCII code 9) | ||||
| 
 | ||||
| 
 | ||||
| 
 | ||||
| string->symbol-pref | ||||
| 
 | ||||
| is a procedure that takes a string and converts it to a symbol | ||||
| using the Scheme implementation's preferred case. The preferred case | ||||
| is determined by doing a symbol->string conversion of 'a. | ||||
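A minimal sketch of this case detection, assuming only standard Scheme procedures; the name string->symbol-pref-sketch is hypothetical and the code is not taken from the library.

```scheme
;; Sketch: fold S to whatever case the implementation uses when it
;; prints the symbol 'a, then intern the result.
(define (string->symbol-pref-sketch s)
  (let* ((probe (string-ref (symbol->string 'a) 0))
         (fold (if (char=? probe #\A) char-upcase char-downcase)))
    (string->symbol
     (list->string (map fold (string->list s))))))

;; Yields the symbol subject on a lower-case-preferring system and
;; SUBJECT on an upper-case-preferring one.
(string->symbol-pref-sketch "Subject")
```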
| 
 | ||||
| 
 | ||||
| 
 | ||||
| 
 | ||||
| DESIRABLE FUNCTIONALITY | ||||
| 
 | ||||
|  - Unfolding long lines. | ||||
|  - Lexing structured fields. | ||||
|  - Unlexing structured fields into canonical form. | ||||
|  - Parsing and unparsing dates. | ||||
|  - Parsing and unparsing addresses. | ||||
										
											
											
										
									
								
							|  | @ -0,0 +1,150 @@ | |||
| This file documents names specified in uri.scm. | ||||
| 
 | ||||
| 
 | ||||
| 
 | ||||
| 
 | ||||
| NOTES | ||||
| 
 | ||||
| URIs have the following syntax: | ||||
| 
 | ||||
| [scheme] : path [? search ] [# fragmentid] | ||||
| 
 | ||||
| Parts in [] may be omitted. The last part is usually referred to as | ||||
| frag-id in this document. | ||||
| 
 | ||||
| 
 | ||||
| 
 | ||||
| DEFINITIONS AND DESCRIPTIONS | ||||
| 
 | ||||
| 
 | ||||
| char-set | ||||
| uri-reserved | ||||
| 
 | ||||
| A set of reserved characters (semicolon, slash, hash, question mark, | ||||
| colon and space). | ||||
| 
 | ||||
| procedure  | ||||
| parse-uri uri-string --> (scheme, path, search, frag-id) | ||||
| 
 | ||||
| Multiple-value return: scheme, path, search, frag-id, in this | ||||
| order. scheme, search and frag-id are either #f or a string. path is a | ||||
| nonempty list of strings. An empty path is a list containing the empty | ||||
| string. parse-uri tries to be tolerant of the various ways people build | ||||
| broken URIs out there on the Net, so it does not strictly conform to | ||||
| RFC 1630. | ||||
| 
 | ||||
| 
 | ||||
| procedure | ||||
| unescape-uri string [start [end]] --> string | ||||
| 
 | ||||
| Unescapes a string. This procedure should only be used *after* the URL | ||||
| (!) has been parsed, since unescaping may introduce characters that blow | ||||
| up the parse (that's why escape sequences are used in URIs ;). | ||||
| Escape sequences have the form %hh, where each h is a hexadecimal | ||||
| digit. E.g. %20 is space (ASCII character 32). | ||||
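The %hh decoding can be sketched as follows. This is an illustration in portable Scheme, not the library's code: all three -sketch names are hypothetical, an ASCII-compatible character encoding is assumed, and the optional start and end arguments are left out.

```scheme
;; #t if C is one of 0-9, a-f or A-F.
(define (hex-digit-sketch? c)
  (or (and (char>=? c #\0) (char<=? c #\9))
      (and (char>=? c #\a) (char<=? c #\f))
      (and (char>=? c #\A) (char<=? c #\F))))

;; Value of a hexadecimal digit character, 0 through 15.
(define (hexchar->int-sketch c)
  (if (and (char>=? c #\0) (char<=? c #\9))
      (- (char->integer c) (char->integer #\0))
      (+ 10 (- (char->integer (char-downcase c)) (char->integer #\a)))))

;; Replace every %hh by the character with that code. A "%" that is
;; not followed by two hex digits is copied through unchanged.
(define (unescape-sketch s)
  (let loop ((chars (string->list s)) (acc '()))
    (cond ((null? chars) (list->string (reverse acc)))
          ((and (char=? (car chars) #\%)
                (pair? (cdr chars))
                (pair? (cddr chars))
                (hex-digit-sketch? (cadr chars))
                (hex-digit-sketch? (caddr chars)))
           (loop (cdddr chars)
                 (cons (integer->char
                        (+ (* 16 (hexchar->int-sketch (cadr chars)))
                           (hexchar->int-sketch (caddr chars))))
                       acc)))
          (else (loop (cdr chars) (cons (car chars) acc))))))

(unescape-sketch "foo%20bar")  ; => "foo bar"
```

Decoding "a%2Fb" this way yields "a/b", which shows why unescaping has to wait until after the path has been split at slashes.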
| 
 | ||||
| 
 | ||||
| procedure | ||||
| hex-digit? character --> boolean | ||||
| 
 | ||||
| Returns #t if character is a hexadecimal digit (i.e., one of 0-9, a-f, | ||||
| A-F), #f otherwise. | ||||
| 
 | ||||
| 
 | ||||
| procedure | ||||
| hexchar->int character --> number | ||||
| 
 | ||||
| Translates the given character to an integer, e.g. (hexchar->int #\a) | ||||
| => 10. | ||||
| 
 | ||||
| 
 | ||||
| procedure | ||||
| int->hexchar integer --> character | ||||
| 
 | ||||
| Translates the given integer from the range 0-15 into a hexadecimal | ||||
| character (uses uppercase letters), e.g. (int->hexchar 14) => #\E. | ||||
| 
 | ||||
| 
 | ||||
| char-set | ||||
| uri-escaped-chars | ||||
| 
 | ||||
| A set of characters that are escaped in URIs. These are the following | ||||
| characters: dollar ($), minus (-), underscore (_), at (@), dot (.), | ||||
| ampersand (&), exclamation mark (!), asterisk (*), backslash (\), | ||||
| double quote ("), single quote ('), open parenthesis ((), close | ||||
| parenthesis ()), comma (,), plus (+) and all other characters that are | ||||
| neither letters nor digits (such as space and control characters). | ||||
| 
 | ||||
| 
 | ||||
| procedure | ||||
| escape-uri string [escaped-chars] --> string | ||||
| 
 | ||||
| Escapes the characters of string that are members of escaped-chars. | ||||
| escaped-chars defaults to uri-escaped-chars. Be careful when applying | ||||
| this procedure to chunks of text with syntactically meaningful reserved | ||||
| characters (e.g., paths with URI slashes or colons) -- they'll be | ||||
| escaped, and lose their special meaning. E.g. it would be a mistake to | ||||
| apply escape-uri to "//lcs.mit.edu:8001/foo/bar.html" because the | ||||
| slashes and colons would be escaped. Note that escape-uri itself does | ||||
| not check for this. | ||||
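The pitfall can be seen in a small sketch of the escaping step. The names below are hypothetical, a predicate stands in for the escaped-chars set, and character codes below 256 are assumed; this is not the library's implementation.

```scheme
;; Hexadecimal digit character for an integer in the range 0-15.
(define (int->hexchar-sketch i)
  (string-ref "0123456789ABCDEF" i))

;; Replace every character for which ESCAPE? holds by %hh.
(define (escape-sketch s escape?)
  (let loop ((chars (string->list s)) (acc '()))
    (if (null? chars)
        (list->string (reverse acc))
        (let ((c (car chars)))
          (if (escape? c)
              (let ((n (char->integer c)))
                ;; ACC is built in reverse, so "%" is pushed first.
                (loop (cdr chars)
                      (cons (int->hexchar-sketch (remainder n 16))
                            (cons (int->hexchar-sketch (quotient n 16))
                                  (cons #\% acc)))))
              (loop (cdr chars) (cons c acc)))))))

;; Stand-in predicate: escape everything but letters and digits.
(define (not-alphanumeric? c)
  (not (or (char-alphabetic? c) (char-numeric? c))))

(escape-sketch "a b/c" not-alphanumeric?)  ; => "a%20b%2Fc"
```

The slash is escaped along with the space, so escaping should be applied to individual path elements and not to a whole path string.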
| 
 | ||||
| 
 | ||||
| procedure | ||||
| resolve-uri cscheme cp scheme p --> (scheme, path) | ||||
| 
 | ||||
| Sorry, I can't figure out what resolve-uri is intended to do. Perhaps | ||||
| I will find out later. | ||||
| 
 | ||||
| The code seems to have a bug: in the body of receive, there is a | ||||
| loop. According to the comment, j should count sequential slashes, but | ||||
| nothing in the body makes j count. Either zero is added ((lp (cdr cp-tail) | ||||
| (cons (car cp-tail) rhead) (+ j 0))) or j is set to 1 ((lp (cdr | ||||
| cp-tail) (cons (car cp-tail) rhead) 1))). Nevertheless, j is expected | ||||
| to reach the value numsl, which can be larger than one. So what? I am | ||||
| confused. | ||||
| 
 | ||||
| 
 | ||||
| procedure | ||||
| rev-append list-a list-b --> list | ||||
| 
 | ||||
| Performs a (append (reverse list-a) list-b). The comment says it | ||||
| should be defined in a list package but I am wondering how often this | ||||
| will be used. | ||||
| 
 | ||||
| 
 | ||||
| procedure | ||||
| split-uri-path uri start end --> list | ||||
| 
 | ||||
| Splits uri at /'s. Only the substring given with start (inclusive) and | ||||
| end (exclusive) is considered. Start and end - 1 have to be within the | ||||
| range of the uri-string.  Otherwise an index-out-of-range exception | ||||
| will be raised. Example: (split-uri-path "foo/bar/colon" 4 11) ==> | ||||
| '("bar" "col") | ||||
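A minimal sketch of splitting a substring at slashes, assuming only standard string procedures. The name split-path-sketch is hypothetical and the code is not taken from uri.scm.

```scheme
;; Split the part of S between START (inclusive) and END (exclusive)
;; at every "/". Scanning runs right to left so the result list comes
;; out in order without a final reverse.
(define (split-path-sketch s start end)
  (let loop ((i (- end 1)) (seg-end end) (acc '()))
    (cond ((< i start)
           (cons (substring s start seg-end) acc))
          ((char=? (string-ref s i) #\/)
           (loop (- i 1) i (cons (substring s (+ i 1) seg-end) acc)))
          (else (loop (- i 1) seg-end acc)))))

(split-path-sketch "foo/bar/colon" 4 11)  ; => ("bar" "col")
(split-path-sketch "foo/bar//baz/" 0 13)  ; => ("foo" "bar" "" "baz" "")
```

Adjacent slashes and a trailing slash produce empty strings, so the original string can be rebuilt from the list.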
| 
 | ||||
| 
 | ||||
| procedure | ||||
| simplify-uri-path path --> list | ||||
| 
 | ||||
| Removes "." and ".." entries from path. The result is a (maybe empty) | ||||
| list representing a path that does not contain any "." or "..". The | ||||
| list can only be empty if the path did not start with "/" (for the | ||||
| rare occasion someone wants to simplify a relative path). The result | ||||
| is #f if the path tries to back up past root, for example by "/.." or | ||||
| "/foo/../.." or just "..". "//" may occur somewhere in the path: it | ||||
| refers to root and does not count as backing up. | ||||
| Examples:  | ||||
| (simplify-uri-path (split-uri-path "/foo/bar/baz/.." 0 15)) | ||||
| ==> '("" "foo" "bar")   | ||||
| 
 | ||||
| (simplify-uri-path (split-uri-path "foo/bar/baz/../../.." 0 20)) | ||||
| ==> '() | ||||
| 
 | ||||
| (simplify-uri-path (split-uri-path "/foo/../.." 0 10)) | ||||
| ==> #f          ; tried to back up past root | ||||
| 
 | ||||
| (simplify-uri-path (split-uri-path "foo/bar//" 0 9)) | ||||
| ==> '("")       ; "//" refers to root | ||||
| 
 | ||||
| (simplify-uri-path (split-uri-path "foo/bar/" 0 8)) | ||||
| ==> '("")       ; last "/" also refers to root | ||||
| 
 | ||||
| (simplify-uri-path (split-uri-path "/foo/bar//baz/../.." 0 19)) | ||||
| ==> #f          ; tries to back up past root | ||||
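The behaviour shown in these examples can be reproduced with a small stack-based sketch. It is written only to match the examples above, not copied from uri.scm; the name simplify-sketch is hypothetical, and it takes the already split path list as its argument.

```scheme
;; STACK holds the simplified path in reverse order.
;;   "."   is dropped
;;   ".."  pops one element, or gives #f if that would pass root
;;   ""    (from a leading "/", "//" or a trailing "/") resets to root
(define (simplify-sketch path)
  (let loop ((rest path) (stack '()))
    (cond ((null? rest) (reverse stack))
          ((string=? (car rest) ".")
           (loop (cdr rest) stack))
          ((string=? (car rest) "..")
           (if (or (null? stack) (string=? (car stack) ""))
               #f
               (loop (cdr rest) (cdr stack))))
          ((string=? (car rest) "")
           (loop (cdr rest) (list "")))
          (else (loop (cdr rest) (cons (car rest) stack))))))

(simplify-sketch '("" "foo" "bar" "baz" ".."))          ; => ("" "foo" "bar")
(simplify-sketch '("foo" "bar" "baz" ".." ".." ".."))   ; => ()
(simplify-sketch '("" "foo" ".." ".."))                 ; => #f
(simplify-sketch '("foo" "bar" "" ""))                  ; => ("")
```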
|  | @ -0,0 +1,69 @@ | |||
| This file documents names defined in url.scm | ||||
| 
 | ||||
| 
 | ||||
| 
 | ||||
| 
 | ||||
| NOTES | ||||
| 
 | ||||
| 
 | ||||
| 
 | ||||
| 
 | ||||
| DEFINITIONS AND DESCRIPTIONS | ||||
| 
 | ||||
| 
 | ||||
| userhost                           record | ||||
| 
 | ||||
| A record containing the fields user, password, host and port. Created | ||||
| by parsing a string like //<user>:<password>@<host>:<port>/. The | ||||
| record describes path prefixes of this form, which are frequently used | ||||
| as the initial prefix of URLs describing Internet resources. | ||||
| 
 | ||||
| 
 | ||||
| parse-userhost path default | ||||
| 
 | ||||
| Parse a URI path (a list representing a path, not a string!) into a | ||||
| userhost record. Default values are taken from the userhost record | ||||
| DEFAULT except for the host. Returns a userhost record on success, and | ||||
| #f if it cannot parse the path. It is an error if the specified path | ||||
| does not begin with '//...', as described under userhost. | ||||
| 
 | ||||
| 
 | ||||
| userhost-escaped-chars             list | ||||
| 
 | ||||
| The union of uri-escaped-chars and the characters '@' and ':'. Used | ||||
| for the unparser. | ||||
| 
 | ||||
| 
 | ||||
| userhost->string userhost          procedure | ||||
| 
 | ||||
| Unparses a userhost record to a string. | ||||
| 
 | ||||
| 
 | ||||
| http-url                           record | ||||
| 
 | ||||
| Record containing the fields userhost (a userhost record), path (a | ||||
| path list), search and frag-id. The PATH slot of this record is the | ||||
| URL's path split at slashes, e.g., "foo/bar//baz/" => ("foo" "bar" "" | ||||
| "baz" ""). These elements are in raw, unescaped format. To convert | ||||
| back to a string, use (uri-path-list->path (map escape-uri pathlist)). | ||||
| 
 | ||||
| 
 | ||||
| parse-http-url path search frag-id       procedure | ||||
| 
 | ||||
| Returns a http-url record. path, search and frag-id are results of a | ||||
| parse-uri call on the initial uri. See there (uri.scm) for further | ||||
| details. search and frag-id are stored as they are. This parser | ||||
| decodes the path elements. It is an error if the path specifies a | ||||
| user or a password, as this is not allowed in http-urls. | ||||
| 
 | ||||
| 
 | ||||
| default-http-userhost                    record | ||||
| 
 | ||||
| A userhost record that specifies the port as 80 and everything else as | ||||
| #f. | ||||
| 
 | ||||
| 
 | ||||
| http-url->string http-url | ||||
| 
 | ||||
| Unparses the given http-url to a string. | ||||