Long obsolete.

2003-01-22 10:51:50 +00:00 · 2003-01-22 10:51:50 +00:00 · 4898196703
parent a95181e4bc
commit 4898196703
1 changed file with 0 additions and 352 deletions
--- a/scheme/httpd/su-httpd.txt
+++ b/scheme/httpd/su-httpd.txt
@@ -1,352 +0,0 @@
-The Scheme Underground Web system
-Olin Shivers
-7/95
-Additions by Mike Sperber, 10/96
-
-The Scheme Underground Web system is a package of Scheme code that provides
-utilities for interacting with the World-Wide Web.  This includes:
-
-    - A Web server.
-    - URI and URL parsers and un-parsers.
-    - RFC822-style header parsers.
-    - Code for performing structured HTML output.
-    - Code to assist in writing CGI Scheme programs
-      that can be used by any CGI-compliant HTTP server
-      (such as NCSA's httpd, or the S.U. Web server).
-
-The code can be obtained via anonymous ftp and is implemented in Scheme 48,
-using the system calls and support procedures of scsh, the Scheme Shell.  The
-code was written to be clear and modifiable -- it is voluminously commented
-and all non-R4RS dependencies are described at the beginning of each source
-file.
- 
-I do not have the time to write detailed documentation for these packages.
-However, they are very thoroughly commented, and I strongly recommend reading
-the source files; they were written to be read, and the source code comments
-should provide a clear description of the system.  The remainder of this note
-gives an overview of the server's basic architecture and interfaces.
-
-
-* The Scheme Underground Web Server
-The server was designed with three principal goals in mind:
-
-    - Extensibility 
-      The server is designed to make it easy to extend the basic
-      functionality.  In fact, the server is nothing but extensions.  There is
-      no distinction between the set of basic services provided by the server
-      implementation and user extensions -- they are both implemented in
-      Scheme, and have equal status. The design is "turtles all the way down."
-
-    - Mobile code
-      Because the server is written in Scheme 48, it is simple to use the
-      Scheme 48 module system to upload programs to the server for safe
-      execution within a protected, server-chosen environment. The server
-      comes with a simple example upload service to demonstrate this
-      capability.
-
-    - Clarity of implementation
-      Because the server is written in a high-level language, it should make
-      for a clearer exposition of the HTTP protocol and the associated URL
-      and URI notations than one written in a low-level language such as C.
-      This also should help to make the server easy to modify and adapt to
-      different uses.
-
-
-** Basic server structure
-
-The Web server is started by calling the HTTPD procedure, which takes 
-one required and two optional arguments:
-
-    (httpd path-handler [port working-directory])
-
-The server accepts connections on the given port, which defaults to 80.
-The server runs with the working directory set to the given value,
-which defaults to 
-    /usr/local/etc/httpd
- 
-The server's basic loop is to wait on the port for a connection from an HTTP
-client. When it receives a connection, it reads in and parses the request into
-a special request data structure. Then the server forks a child process, which
-binds the current I/O ports to the connection socket, and then hands off to
-the top-level path handler (the first argument to httpd).  The path-handler
-procedure is responsible for actually serving the request -- it can be any
-arbitrary computation.  Its output goes directly back to the HTTP client that
-sent the request.
- 
-Before calling the path handler to service the request, the HTTP server
-installs an error handler that fields any uncaught error, sends an
-error reply to the client, and aborts the request transaction. Hence
-any error caused by a path-handler will be handled in a reasonable and
-robust fashion.
-
-The basic server loop and the associated request data structure are the fixed
-architecture of the S.U. Web server; its flexibility lies in the notion of
-path handlers.
-
-
-** Path handlers
-
-A path handler is a procedure taking two arguments:
-
-    (path-handler path req)
-
-The REQ argument is a request record giving all the details of the
-client's request; it has the following structure:
-
-    (define-record request
-      method		; A string such as "GET", "PUT", etc.
-      uri		; The escaped URI string as read from request line.
-      url		; An http URL record (see url.scm).
-      version		; A (major . minor) integer pair.
-      headers		; An rfc822 header alist (see rfc822.scm).
-      socket)		; The socket connected to the client.
-
-
-The PATH argument is the URL's path, parsed and split at slashes into a string
-list.  For example, if the Web client dereferences URL
-
-    http://clark.lcs.mit.edu:8001/h/shivers/code/web.tar.gz
-
-then the server would pass the following path to the top-level handler:
-
-    ("h" "shivers" "code" "web.tar.gz")
- 
-The path argument's pre-parsed representation as a string list makes it easy
-for the path handler to implement recursive dispatch on URL paths.
- 
-Path handlers can do anything they like to respond to HTTP requests; they have
-the full range of Scheme to implement the desired functionality.  When
-handling HTTP requests that have an associated entity body (such as POST), the
-body should be read from the current input port. Path handlers should in all
-cases write their reply to the current output port. Path handlers should *not*
-perform I/O on the request record's socket.  Path handlers are frequently
-called recursively, and doing I/O directly to the socket might bypass a
-filtering or other processing step interposed on the current I/O ports by some
-superior path handler.
-
-
-*** Basic path handlers
-
-Although the user can write any path-handler he likes, the S.U. server comes
-with a useful toolbox of basic path handlers that can be used and built upon:
- 
-(alist-path-dispatcher ph-alist default-ph) -> path-handler
-    This procedure takes a string->path-handler alist, and a default 
-    path handler, and returns a handler that dispatches on its path argument.
-    When the new path handler is applied to a path ("foo" "bar" "baz"), 
-    it uses the first element of the path -- "foo" -- to index into 
-    the alist.  If it finds an associated path handler in the alist, it
-    hands the request off to that handler, passing it the tail of the path,
-    ("bar" "baz").  On the other hand, if the path is empty, or the alist
-    search does not yield a hit, we hand off to the default path handler, 
-    passing it the entire original path, ("foo" "bar" "baz").
-
-    This procedure is how you say: "If the first element of the URL's path
-    is `foo', do X; if it's `bar', do Y; otherwise, do Z." If one takes
-    an object-oriented view of the process, an alist path-handler does
-    method lookup on the requested operation, dispatching off to the
-    appropriate method defined for the URL.
-
-    The slash-delimited URI path structure implies an associated
-    tree of names. The path-handler system and the alist dispatcher
-    allow you to procedurally define the server's response to any 
-    arbitrary subtree of the path space.
-
-    Example: 
-    A typical top-level path handler is
-
-      (define ph
-	(alist-path-dispatcher
-	    `(("h"       . ,(home-dir-handler "public_html"))
-	      ("cgi-bin" . ,(cgi-handler "/usr/local/etc/httpd/cgi-bin"))
-	      ("seval"   . ,seval-handler))
-	    (rooted-file-handler "/usr/local/etc/httpd/htdocs")))
-
-
-    This means:
-    - If the path looks like ("h" "shivers" "code" "web.tar.gz"),
-      pass the path ("shivers" "code" "web.tar.gz") to a
-      home-directory path handler.
-
-    - If the path looks like ("cgi-bin" "calendar"), 
-      pass ("calendar") off to the CGI path handler.
-
-    - If the path looks like ("seval" ...), the tail of the path
-      is passed off to the code-uploading seval path handler.
-
-    - Otherwise, the whole path is passed to a rooted file handler, which
-      will convert it into a filename, rooted at /usr/local/etc/httpd/htdocs,
-      and serve that file.
-
-
-(home-dir-handler subdir) -> path-handler
-    This procedure builds a path handler that does basic file serving
-    out of home directories. If the resulting path handler is passed
-    a path of (<user> . <file-path>), then it serves the file 
-        <user's-home-directory>/<subdir>/<file-path>
-    The path handler only handles GET requests; the filename is not
-    allowed to contain .. elements.
-
- 
-(tilde-home-dir-handler subdir default-path-handler) -> path-handler
-    This path handler examines the car of the path. If it is a string
-    beginning with a tilde, e.g., "~ziggy", then the string is taken to
-    mean a home directory, and the request is served similarly to a
-    HOME-DIR-HANDLER path handler.  Otherwise, the request is passed off in
-    its entirety to the default path handler.
-    
-    This procedure is useful for implementing servers that provide the
-    semantics of the NCSA httpd server.
-
-
-(cgi-handler cgi-directory) -> path-handler
-    This procedure returns a path-handler that passes the request off to some
-    program using the CGI interface. The script name is taken from the
-    car of the path; it is checked for occurrences of ..'s. If the path is
-        ("my-prog" "foo" "bar")
-    then the program executed is
-        <cgi-directory>/my-prog
-    
-    When the CGI path handler builds the process environment for the
-    CGI script, several elements (e.g., $PATH and $SERVER_SOFTWARE)
-    are request-invariant, and can be computed at server start-up time.
-    This can be done by calling
-        (initialise-request-invariant-cgi-env)
-    when the server starts up. This is *not* necessary, but will make CGI
-    requests a little faster.
-
- 
-(rooted-file-handler root-dir) -> path-handler
-    Returns a path handler that serves files from a particular root in the
-    file system. Only the GET operation is provided. The path argument
-    passed to the handler is converted into a filename, and appended to
-    ROOT-DIR.  The file name is checked for .. components, and the
-    transaction is aborted if it has any. Otherwise, the file is served to the
-    client.
-
-
-(rooted-file-or-directory-handler root-dir icon-name) -> path-handler
-    The same as rooted-file-handler, except it can also serve
-    directory index listings for directories that do not contain a
-    file index.html.  ICON-NAME is an object describing how to get at
-    the various icons required for generating directory listings.  The
-    handler uses the icons provided by CERN httpd 3.0.  ICON-NAME can be
-    either a string, which is used as a prefix for generating the icon
-    URLs, or a procedure, which should accept an icon tag (read
-    httpd-handlers.scm for reference) and return an icon name.  If it
-    is neither, the handler just uses the plain icon name, which is almost
-    guaranteed not to work.
-
- 
-(null-path-handler path req)
-    This path handler is useful as a default handler. It handles no requests,
-    always returning a "404 Not found" reply to the client.
-
-
-** HTTP errors
-
-Authors of path-handlers need to be able to handle errors in a reasonably
-simple fashion. The S.U. Web server provides a set of error conditions that
-correspond to the error replies in the HTTP protocol. These errors can be
-raised with the HTTP-ERROR procedure.  When the server runs a path handler,
-it runs it in the context of an error handler that catches these errors,
-sends an error reply to the client, and closes the transaction.
- 
-(http-error reply-code req [extra ...])
-    This raises an HTTP error condition. The reply code is one of the
-    numeric HTTP error reply codes, which are bound to the variables
-    HTTP-REPLY/OK, HTTP-REPLY/NOT-FOUND, HTTP-REPLY/BAD-REQUEST, and so
-    forth. The REQ argument is the request record that caused the error.
-    Any following EXTRA args are passed along for informational purposes.
-    Different HTTP errors take different types of extra arguments.  For
-    example, the "301 moved permanently" and "302 moved temporarily"
-    replies use the first two extra values as the URI: and Location: fields
-    in the reply header, respectively. See the clauses of the
-    SEND-HTTP-ERROR-REPLY procedure for details.
- 
-(send-http-error-reply reply-code request [extra ...])
-    This procedure writes an error reply out to the current output
-    port. If an error occurs during this process, it is caught, and
-    the procedure silently returns. The HTTP server's standard error
-    handler passes all HTTP errors raised during path-handler execution
-    to this procedure to generate the error reply before aborting the
-    request transaction.
-
-
-** Simple directory generation
-
-Most path-handlers that serve files to clients eventually call an internal
-procedure named FILE-SERVE, which implements a simple directory-generation
-service using the following rules:
-
-    - If the filename has the *form* of a directory (i.e., it ends with a
-      slash), then FILE-SERVE actually looks for a file named "index.html"
-      in that directory.
-
-    - If the filename names a directory, but is not in directory form
-      (i.e., it doesn't end in a slash, as in "/usr/include" or "/usr/raj"),
-      then FILE-SERVE sends back a "301 moved permanently" message,
-      redirecting the client to a slash-terminated version of the original
-      URL. For example, the URL 
-    	http://clark.lcs.mit.edu/~shivers
-      would be redirected to 
-    	http://clark.lcs.mit.edu/~shivers/
-
-    - If the filename names a regular file, it is served to the client.
-
-
-** Support procs
-
-The source files contain a host of support procedures which will be of utility
-to anyone writing a custom path-handler. Read the files first.
-
-** Local customization
-
-   The httpd-core package exports a procedure:
-
-    (set-server/admin! admin-name)
-
-   which allows you to set the name of the site administrator.  If you
-   don't set this, Olin may get unwanted mail and visit
-   disproportionate violence on you in return.
-
-   There is a procedure exported from the httpd-core package:
-
-    (set-my-fqdn! name)
-
-   Call this to crow-bar the server's idea of its own Internet host
-   name before running the server, and all will be well.
-
-   You may want this for one of several reasons. On NeXTSTEP and on
-   systems that do DNS via NIS/Yellow Pages, you only get an
-   unqualified hostname.  Also, in case of aliased names, you just
-   might get the wrong one.  Furthermore, you may get screwed in the
-   presence of a server accelerator such as Squid.
-
-   There is a similar procedure in httpd-core:
-
-    (set-my-port! portnum)
-
-   Call this to set the local port of your server.  This may be
-   important to get redirection right in the presence of a web server
-   accelerator.
-
-** Losing
-
-Be aware of certain Unix problems which may require workarounds:
-1. NeXTSTEP's POSIX implementation of the getpwnam() routine
-   will silently tell you that every user has uid 0. This means
-   that if your server, running as root, does a
-   	(set-uid (user->uid "nobody"))
-   it will essentially do a
-   	(set-uid 0)
-   and you will thus still be running as root.
-   
-   The fix is to manually find out who user nobody is (he's -2 on my
-   system), and to hard-wire this into the server:
-   	(set-uid -2)
-   This problem is NeXTSTEP-specific. If you are not using NeXTSTEP,
-   no problem.
-
-
-
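For reference, the dispatch rule that the removed note describes for alist-path-dispatcher can be modelled in a few lines of portable Scheme. This is only a sketch under stated assumptions, not the S.U. server's code: make-alist-dispatcher and tagging-handler are hypothetical names, the toy handlers return a value instead of writing a reply to the current output port, and the request argument is simply #f.

```scheme
;; Model of the rule described for ALIST-PATH-DISPATCHER:
;;   - look up the first path element in the alist;
;;   - on a hit, call that handler with the TAIL of the path;
;;   - on a miss, or if the path is empty, call the default handler
;;     with the WHOLE original path.
;; MAKE-ALIST-DISPATCHER is a hypothetical name for this sketch.
(define (make-alist-dispatcher ph-alist default-ph)
  (lambda (path req)
    (let ((hit (and (pair? path)
                    (assoc (car path) ph-alist)))) ; assoc compares with equal?
      (if hit
          ((cdr hit) (cdr path) req)
          (default-ph path req)))))

;; Hypothetical toy handler: reports which handler ran and the path it saw.
;; Real path handlers would write their reply to the current output port.
(define (tagging-handler tag)
  (lambda (path req) (list tag path)))

;; Mirrors the shape of the top-level handler in the note, with toy
;; handlers standing in for home-dir-handler, cgi-handler, and
;; rooted-file-handler.
(define ph
  (make-alist-dispatcher
   (list (cons "h"       (tagging-handler 'home-dir))
         (cons "cgi-bin" (tagging-handler 'cgi)))
   (tagging-handler 'rooted-file)))

(ph '("h" "shivers" "code" "web.tar.gz") #f)
;; => (home-dir ("shivers" "code" "web.tar.gz"))
(ph '("cgi-bin" "calendar") #f)
;; => (cgi ("calendar"))
(ph '("index.html") #f)
;; => (rooted-file ("index.html"))
(ph '() #f)
;; => (rooted-file ())
```

Because a dispatcher is itself a path handler, one built this way can appear as an entry in another dispatcher's alist, which is how the note's "tree of names" view of the URL path space would be realised.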