From 0bbb13664e0481112dffa86cd3ac614720d9d6a9 Mon Sep 17 00:00:00 2001
From: sperber
Date: Tue, 22 Apr 2003 12:45:30 +0000
Subject: [PATCH] Long obsolete.

---
 doc/html/index.html    |  85 --------
 doc/html/su-httpd.html | 482 -----------------------------------------
 2 files changed, 567 deletions(-)
 delete mode 100644 doc/html/index.html
 delete mode 100644 doc/html/su-httpd.html

diff --git a/doc/html/index.html b/doc/html/index.html
deleted file mode 100644
index 0b8e359..0000000
--- a/doc/html/index.html
+++ /dev/null
@@ -1,85 +0,0 @@
- - -The Scheme Underground Network Package - - - -

The Scheme Underground Network Package

-I have written a set of libraries for doing Net hacking from Scheme/scsh. -It includes: -
-
An smtp client library. -
Forge mail from the comfort of your own Scheme process. - -
rfc822 header library -
Read email-style headers. Useful in several contexts (smtp, http, etc.) - -
Simple structured HTML output library -
Balanced delimiters, etc. - -
The SU Web server -
This is a complete implementation of an HTTP 1.0 server in Scheme. - The server contains other standalone packages that may separately be of - use: -
    -
  • URI and URL parsers and unparsers. -
  • A library to help writing CGI scripts in Scheme. -
  • Server extensions for interfacing to CGI scripts. -
  • Server extensions for uploading Scheme code. -
- The server has three main design goals: -
-
Extensibility -
The server is in fact nothing but extensions, using a mechanism - called "path handlers" to define URL-specific services. It has a toolkit - of services that can be used as-is, extended or built upon. - User extensions have exactly the same status as the base services. - -

- The extension mechanism allows for easy implementation of new services - without the overhead of the CGI interface. Since the server is written - on top of the Scheme shell, the full set of Unix system calls and - program tools is available to the implementor. - -

Mobile code -
The server allows Scheme code to be uploaded for direct execution - inside the server. The server has complete control over the code, - and can safely execute it in restricted environments that do not - provide access to potentially dangerous primitives (such as the - "delete file" procedure.) - - -
Clarity -
I wrote this server to help myself understand the Web. It is voluminously - commented, and I hope it will prove to be an aid in understanding the - low-level details of the Web protocols. -
- -

- The S.U. server has the ability to upload code from Web clients and - execute that code on behalf of the client in a protected environment. - -

- Some simple documentation on the server - is available. - -

- -

Obtaining the system

-The network code is available by -ftp. -To run the server, you need our 0.4 release of -scsh -which has just been released. - -Beyond actually running the server, -the separate parser libraries and other utilities may be of use as separate -modules. - -
Olin Shivers - / shivers@ai.mit.edu
- - - - -
diff --git a/doc/html/su-httpd.html b/doc/html/su-httpd.html
deleted file mode 100644
index 356aa37..0000000
--- a/doc/html/su-httpd.html
+++ /dev/null
@@ -1,482 +0,0 @@
- - - -The Scheme Underground Web system - - - -

The Scheme Underground Web System

- -
Olin Shivers - / shivers@ai.mit.edu -
-July 1995 - -
-Note: Netscape typesets description lists in a manner that makes the -procedure descriptions below blur together, even in the absence of the -HTML COMPACT attribute. You may just wish to print out a simple -ASCII version of this note, instead. -
- - - - -

Introduction

- -The -Scheme underground -Web system is a package of -Scheme -code that provides -utilities for interacting with the -World-Wide Web. -This includes: - - -

-The code can be obtained via - -anonymous ftp -and is implemented in -Scheme 48, -using the system calls and support procedures of -scsh, -the Scheme Shell. -The code was written to be clear and modifiable -- -it is voluminously commented and all non-R4RS dependencies are -described at the beginning of each source file. - -

-I do not have the time to write detailed documentation for these packages. -However, they are very thoroughly commented, and I strongly recommend -reading the source files; they were written to be read, and the source -code comments should provide a clear description of the system. -The remainder of this note gives an overview of the server's basic -architecture and interfaces. - -

The Scheme Underground Web Server

- -The server was designed with three principal goals in mind: -
-
Extensibility -
The server is designed to make it easy to extend the basic - functionality. In fact, the server is nothing but extensions. There is - no distinction between the set of basic services provided by the server - implementation and user extensions -- they are both implemented in - Scheme, and have equal status. The design is "turtles all the way down." - - -
Mobile code -
Because the server is written in Scheme 48, it is simple to use the - Scheme 48 module system to upload programs to the server for safe - execution within a protected, server-chosen environment. The server - comes with a simple example upload service to demonstrate this - capability. - - -
Clarity of implementation -
Because the server is written in a high-level language, it should make - for a clearer exposition of the HTTP protocol and the associated URL - and URI notations than one written in a low-level language such as C. - This also should help to make the server easy to modify and adapt to - different uses. -
- - -

Basic server structure

- -The Web server is started by calling the httpd procedure, -which takes one required and two optional arguments: -
-    (httpd path-handler [port working-directory])
-
- -The server accepts connections from the given port, which defaults to 80. -The server runs with the working directory set to the given value, -which defaults to -
-    /usr/local/etc/httpd
-
- - -

-The server's basic loop is to wait on the port for a connection from an HTTP -client. When it receives a connection, it reads in and parses the request into -a special request data structure. Then the server forks a child process, who -binds the current I/O ports to the connection socket, and then hands off to -the top-level path handler (the first argument to httpd). -The path-handler procedure is responsible for actually serving the request -- -it can be any arbitrary computation. -Its output goes directly back to the HTTP client that sent the request. - -

-Before calling the path handler to service the request, the HTTP server -installs an error handler that fields any uncaught error, sends an -error reply to the client, and aborts the request transaction. Hence -any error caused by a path-handler will be handled in a reasonable and -robust fashion. - -

-The basic server loop, and the associated request data structure are the fixed -architecture of the S.U. Web server; its flexibility lies in the notion of -path handlers. - - - -

Path handlers

- -A path handler is a procedure taking two arguments: -
-    (path-handler path req)
-
- - -The req argument is a request record giving all the details of the -client's request; it has the following structure: -
-    (define-record request
-      method		; A string such as "GET", "PUT", etc.
-      uri		; The escaped URI string as read from request line.
-      url		; An http URL record (see url.scm).
-      version		; A (major . minor) integer pair.
-      headers		; An rfc822 header alist (see rfc822.scm).
-      socket)		; The socket connected to the client.
-
- -The path argument is the URL's path, -parsed and split at slashes into a string list. -For example, if the Web client dereferences URL -
-    http://clark.lcs.mit.edu:8001/h/shivers/code/web.tar.gz
-
-then the server would pass the following path to the top-level handler: -
-    ("h" "shivers" "code" "web.tar.gz")
-
- -

-The path argument's pre-parsed representation as a string list makes it easy -for the path handler to implement recursive dispatch on URL paths. - -
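The splitting step described above can be sketched in portable Scheme. This is only an illustration: `split-path` is a hypothetical helper, not a procedure from the S.U. sources, and dropping empty components (from leading or doubled slashes) is an assumption about the desired behaviour.

```scheme
;; Hypothetical helper, not part of the S.U. sources.
;; Split a URL path string at slashes into a list of strings.
;; Assumption: empty components (leading or doubled slashes) are dropped.
(define (split-path str)
  (let loop ((chars (string->list str)) (current '()) (parts '()))
    ;; Push the component collected so far, if there is one.
    (define (flush)
      (if (null? current)
          parts
          (cons (list->string (reverse current)) parts)))
    (cond ((null? chars) (reverse (flush)))
          ((char=? (car chars) #\/) (loop (cdr chars) '() (flush)))
          (else (loop (cdr chars) (cons (car chars) current) parts)))))

(split-path "/h/shivers/code/web.tar.gz")
;; => ("h" "shivers" "code" "web.tar.gz")
```

A handler can then dispatch on the `car` of the list and pass the `cdr` to a sub-handler.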

-Path handlers can do anything they like to respond to HTTP requests; they have -the full range of Scheme to implement the desired functionality. When -handling HTTP requests that have an associated entity body (such as POST), the -body should be read from the current input port. Path handlers should in all -cases write their reply to the current output port. Path handlers should -not perform I/O on the request record's socket. -Path handlers are frequently called recursively, and doing I/O directly to the -socket might bypass a filtering or other processing step interposed on the -current I/O ports by some superior path handler. - - -

Basic path handlers

- -Although the user can write any path-handler he likes, the S.U. server comes -with a useful toolbox of basic path handlers that can be used and built upon: - -
- -
-(alist-path-dispatcher ph-alist default-ph) -> path-handler - -
- This procedure takes a string->path-handler alist, and a default - path handler, and returns a handler that dispatches on its path argument. - When the new path handler is applied to a path - ("foo" "bar" "baz"), - it uses the first element of the path -- "foo" -- to - index into the alist. - If it finds an associated path handler in the alist, it - hands the request off to that handler, passing it the tail of the - path, ("bar" "baz"). - On the other hand, if the path is empty, or the alist search does - not yield a hit, we hand off to the default path handler, - passing it the entire original path, ("foo" "bar" "baz"). - -

- This procedure is how you say: "If the first element of the URL's path - is `foo', do X; if it's `bar', do Y; otherwise, do Z." If one takes - an object-oriented view of the process, an alist path-handler does - method lookup on the requested operation, dispatching off to the - appropriate method defined for the URL. - -

- The slash-delimited URI path structure implies an associated - tree of names. The path-handler system and the alist dispatcher - allow you to procedurally define the server's response to any arbitrary - subtree of the path space. - -

- Example:
- A typical top-level path handler is - -

-  (define ph
-    (alist-path-dispatcher
-	`(("h"       . ,(home-dir-handler "public_html"))
-	  ("cgi-bin" . ,(cgi-handler "/usr/local/etc/httpd/cgi-bin"))
-	  ("seval"   . ,seval-handler))
-	(rooted-file-handler "/usr/local/etc/httpd/htdocs")))
-
- - This means: -
    -
  • If the path looks like ("h" "shivers" "code" "web.tar.gz"), - pass the path ("shivers" "code" "web.tar.gz") to a - home-directory path handler. - - -
  • If the path looks like ("cgi-bin" "calendar"), - pass ("calendar") off to the CGI path handler. - - -
  • If the path looks like ("seval" ...), - the tail of the path is passed off to the code-uploading seval - path handler. - -
  • Otherwise, the whole path is passed to a rooted file handler, who - will convert it into a filename, rooted at - /usr/local/etc/httpd/htdocs, and serve that file. -
- - -
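The dispatch rule described for alist-path-dispatcher can be sketched in a few lines of portable Scheme. This is a sketch, not the server's actual code: `sketch-alist-path-dispatcher`, `tag` and `demo-ph` are hypothetical names, and the toy handlers return a list instead of writing a reply to the current output port.

```scheme
;; Sketch of the dispatch rule; the real alist-path-dispatcher may
;; differ in detail. All names below are hypothetical.
(define (sketch-alist-path-dispatcher ph-alist default-ph)
  (lambda (path req)
    ;; Look up the first path element, if the path is non-empty.
    (let ((entry (and (pair? path) (assoc (car path) ph-alist))))
      (if entry
          ((cdr entry) (cdr path) req)   ; hit: pass the tail of the path
          (default-ph path req)))))      ; miss or empty path: pass it all

;; Toy handlers that just report which handler ran and what path it saw.
(define (tag name)
  (lambda (path req) (list name path)))

(define demo-ph
  (sketch-alist-path-dispatcher
   `(("h"       . ,(tag 'home))
     ("cgi-bin" . ,(tag 'cgi)))
   (tag 'default)))

(demo-ph '("h" "shivers" "code") #f)  ; => (home ("shivers" "code"))
(demo-ph '("foo" "bar") #f)           ; => (default ("foo" "bar"))
(demo-ph '() #f)                      ; => (default ())
```

Because a dispatcher is itself a path handler, dispatchers can be nested to cover deeper subtrees of the path space.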
(home-dir-handler subdir) -> - path-handler -
- This procedure builds a path handler that does basic file serving - out of home directories. If the resulting path handler is passed - a path of (user . file-path), - then it serves the file -
-    user's-home-directory/subdir/file-path
-
- The path handler only handles GET requests; the filename is not - allowed to contain .. elements. - - -
-(tilde-home-dir-handler subdir default-path-handler) - -> path-handler - -
- This path handler examines the car of the path. If it is a string - beginning with a tilde, e.g., "~ziggy", - then the string is taken - to mean a home directory, and the request is served similarly to a - home-dir-handler path handler. - Otherwise, the request is passed off - in its entirety to the default path handler. - -

- This procedure is useful for implementing servers that provide the - semantics of the NCSA httpd server. - - -

-(cgi-handler cgi-directory) -> path-handler - -
- This procedure returns a path-handler that passes the request off to some - program using the CGI interface. The script name is taken from the - car of the path; it is checked for occurrences of ..'s. - If the path is -
-    ("my-prog" "foo" "bar")
-
- then the program executed is -
-    cgi-directory/my-prog
-
-

- When the CGI path handler builds the process environment for the - CGI script, several elements - (e.g., $PATH and $SERVER_SOFTWARE) - are request-invariant, and can be computed at server start-up time. - This can be done by calling -

-    (initialise-request-invariant-cgi-env)
-
- when the server starts up. This is not necessary, - but will make CGI requests a little faster. - - -
-(rooted-file-handler root-dir) -> path-handler - -
- Returns a path handler that serves files from a particular root - in the file system. Only the GET operation is provided. The path - argument passed to the handler is converted into a filename, - and appended to root-dir. - The file name is checked for .. components, - and the transaction is aborted if it contains any. Otherwise, the file is - served to the client. - -
-(null-path-handler path req) -
- This path handler is useful as a default handler. It handles no requests, - always returning a "404 Not found" reply to the client. - -
- - -
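The path-to-filename conversion and the `..` check described for rooted-file-handler might look roughly like this. `path->rooted-file-name` is a hypothetical helper; returning `#f` stands in for aborting the transaction, which the real handler presumably does by raising an HTTP error.

```scheme
;; Hypothetical helper, not from the S.U. sources.
;; Turn a path (list of strings) into a file name under root-dir,
;; or return #f if any component is "..".
(define (path->rooted-file-name root-dir path)
  (let loop ((rest path) (name root-dir))
    (cond ((null? rest) name)
          ((string=? (car rest) "..") #f)   ; refuse to climb out of root-dir
          (else (loop (cdr rest)
                      (string-append name "/" (car rest)))))))

(path->rooted-file-name "/usr/local/etc/httpd/htdocs" '("code" "web.tar.gz"))
;; => "/usr/local/etc/httpd/htdocs/code/web.tar.gz"

(path->rooted-file-name "/usr/local/etc/httpd/htdocs" '(".." "etc" "passwd"))
;; => #f
```

Checking the parsed components, rather than the raw URI string, means the test sees the path exactly as the handler will use it.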

HTTP errors

- -Authors of path-handlers need to be able to handle errors in a reasonably -simple fashion. The S.U. Web server provides a set of error conditions that -correspond to the error replies in the HTTP protocol. These errors can be -raised with the http-error procedure. -When the server runs a path handler, -it runs it in the context of an error handler that catches these errors, -sends an error reply to the client, and closes the transaction. - -
- -
-(http-error reply-code req [extra ...]) -
- This raises an http error condition. The reply code is one of the - numeric HTTP error reply codes, which are bound to the variables - http-reply/ok, http-reply/not-found, - http-reply/bad-request, and so - forth. The req argument is the request record that caused - the error. - Any following extra args are passed along for - informational purposes. - Different HTTP errors take different types of extra arguments. - For example, the "301 moved permanently" and "302 moved temporarily" - replies use the first two extra values as the - URI: and Location: - fields in the reply header, respectively. See the clauses of the - send-http-error-reply procedure for details. - - -
-(send-http-error-reply reply-code request - [extra ...]) - -
- This procedure writes an error reply out to the current output - port. If an error occurs during this process, it is caught, and - the procedure silently returns. The http server's standard error - handler passes all http errors raised during path-handler execution - to this procedure to generate the error reply before aborting the - request transaction. -
- - -
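The control flow described above (a path handler raises an HTTP error, and the server's error handler takes over and ends the transaction) can be modelled in portable Scheme with an escape continuation. This is a toy model only: the real server uses its own error conditions, every name below is hypothetical, the literal 404 stands in for the server's reply-code variables, and the single global escape procedure is not re-entrant.

```scheme
;; Toy model only; all names are hypothetical.
;; current-abort holds the escape procedure of the running request.
(define current-abort #f)

;; Abandon the current handler, handing the error details to the
;; enclosing run-path-handler call.
(define (sketch-http-error reply-code req . extras)
  (current-abort (list 'error reply-code extras)))

;; Run a handler; return (ok result) on normal completion, or the
;; (error code extras) list if the handler called sketch-http-error.
(define (run-path-handler handler path req)
  (call-with-current-continuation
   (lambda (k)
     (set! current-abort k)
     (list 'ok (handler path req)))))

(run-path-handler (lambda (path req) "body") '() #f)
;; => (ok "body")

(run-path-handler (lambda (path req) (sketch-http-error 404 req)) '() #f)
;; => (error 404 ())
```

In the real server, the point where this model returns an `error` list is where the error reply would be written to the client.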

Simple directory generation

- -Most path-handlers that serve files to clients eventually call an internal -procedure named file-serve, -which implements a simple directory-generation service using the -following rules: - - - - -

Support procs

- -The source files contain a host of support procedures which will be of utility -to anyone writing a custom path-handler. Read the files first. - - - -

Losing

- -Be aware of two Unix problems, which may require workarounds: -
    - -
  1. - NeXTSTEP's Posix implementation of the getpwnam() routine - will silently tell you that every user has uid 0. This means - that if your server, running as root, does a -
    -    (set-uid (user->uid "nobody"))
    -
    - it will essentially do a -
    -    (set-uid 0)
    -
    - and you will thus still be running as root. - -

    - The fix is to manually find out who user nobody is (he's -2 on my - system), and to hard-wire this into the server: -

    -    (set-uid -2)
    -
    - This problem is NeXTSTEP specific. If you are not using NeXTSTEP, - no problem. - - -
  2. - On NeXTSTEP, the ip-address->host-name translation routine - (in C, gethostbyaddr(); in scsh, - (host-info addr)) does not - use the DNS system; it goes through NeXT's proprietary Netinfo - system, and may not return a fully-qualified domain name. For - example, on my system, I get "amelia-earhart", when I want - "amelia-earhart.lcs.mit.edu". Since the server uses this name - to construct redirection URL's to be sent back to the Web client, - they need to be FQDN's. - -

    - This problem may occur on other OS's; - I cannot determine if gethostbyaddr() - is required to return a FQDN or not. (I would appreciate hearing the - answer if you know; my local Internet gurus couldn't tell me.) - -

    - If your system doesn't give you a complete Internet address when - you say -

    -    (host-info:name (host-info (system-name)))
    -
    - then you have this problem. - -

    - The server has a workaround. There is a procedure exported from - the httpd-core package: -

    -    (set-my-fqdn name)
    -
    - Call this to crow-bar the server's idea of its own Internet host name - before running the server, and all will be well. -
- - -