First stab at rudimentary up-to-date documentation for the HTTP
server.
parent
9eeb665323
commit
020f8264f1
|
@ -1,256 +1,411 @@
|
|||
\chapter{HTTP server}\label{cha:httpd}
|
||||
%
|
||||
\begin{description}
|
||||
\item[Used files:] httpd/core.scm, httpd/handlers.scm, httpd/options.scm,
|
||||
\item[Name of the packages:] httpd-core, httpd-basic-handler, httpd-make-options
|
||||
\end{description}
|
||||
There are also some other files and packages that are used internally.
|
||||
%
|
||||
The SUnet HTTP Server is a complete industrial-strength implementation
|
||||
of the HTTP 1.0 protocol. It is highly configurable and allows the writing
|
||||
of dynamic web pages that run inside the server without going through
|
||||
complicated and slow protocols like CGI or FastCGI.
|
||||
|
||||
The SUnet web system is a collection of packages of \Scheme code that
|
||||
provides utilities for interacting with the World-Wide Web. This
|
||||
includes:
|
||||
\begin{itemize}
|
||||
\item A Web server.
|
||||
\item URI and URL parsers and un-parsers (see Chapters \ref{cha:uri}
|
||||
and \ref{cha:url}).
|
||||
\item RFC822-style header parsers (see Chapter \ref{cha:rfc822}).
|
||||
\item Code for performing structured HTML output.
|
||||
\item Code to assist in writing CGI \Scheme programs that can be used by
|
||||
any CGI-compliant HTTP server (such as NCSA's httpd, or the SUnet
|
||||
Web server).
|
||||
\end{itemize}
|
||||
\section{Starting and configuring the server}
|
||||
|
||||
The server has three main design goals:
|
||||
\begin{description}
|
||||
\item[Extensibility]
|
||||
The server is in fact nothing but extensions, using a mechanism
|
||||
called ``path handlers'' to define URL-specific services. It has a
|
||||
toolkit of services that can be used as-is, extended or built
|
||||
upon. User extensions have exactly the same status as the base
|
||||
services.
|
||||
All procedures described in this section are exported by the
|
||||
\texttt{httpd} structure.
|
||||
|
||||
The extension mechanism allows for easy implementation of new
|
||||
services without the overhead of the CGI interface. Since the
|
||||
server is written on top of the Scheme shell, the full set of Unix
|
||||
system calls and program tools is available to the implementor.
|
||||
|
||||
\item[Mobile code]
|
||||
The server allows Scheme code to be uploaded for direct execution
|
||||
inside the server. The server has complete control over the code,
|
||||
and can safely execute it in restricted environments that do not
|
||||
provide access to potentially dangerous primitives (such as the
|
||||
``delete file'' procedure.)
|
||||
|
||||
\item[Clarity]
|
||||
I\footnote{That's Olin Shivers (\ex{shivers@ai.mit.edu},
|
||||
\ex{http://www.\ob{}ai.\ob{}mit.\ob{}edu/\ob{}people/\ob{}shivers/}).
|
||||
For the rest of the documentation, if not mentioned otherwise,
|
||||
`I' refers to him.} wrote this server to help myself understand
|
||||
the Web. It is voluminously commented, and I hope it will prove to
|
||||
be an aid in understanding the low-level details of the Web
|
||||
protocols.
|
||||
|
||||
The SUnet web server has the ability to upload code from Web clients
|
||||
and execute that code on behalf of the client in a protected
|
||||
environment.
|
||||
|
||||
Some simple documentation on the server is available.
|
||||
\end{description}
|
||||
|
||||
\section{Basic server structure}
|
||||
|
||||
The Web server is started by calling the httpd procedure, which takes
|
||||
one argument, a \ex{httpd\=options}-record:
|
||||
The Web server is started by calling the \ex{httpd} procedure, which takes
|
||||
one argument, an options value:
|
||||
|
||||
\defun{httpd}{options}{\noreturn}
|
||||
\begin{desc}
|
||||
This procedure starts the server. The various \semvar{options} can
|
||||
be set via the options transformers that are explained below.
|
||||
This procedure starts the server. The \var{options} argument
|
||||
specifies various configuration parameters, explained below.
|
||||
|
||||
The server's basic loop is to wait on the port for a connection from
|
||||
an HTTP client. When it receives a connection, it reads in and
|
||||
parses the request into a special request data structure. Then the
|
||||
server forks a thread, who binds the current I/O ports to the
|
||||
server forks a thread which binds the current I/O ports to the
|
||||
connection socket, and then hands off to the top-level
|
||||
\semvar{path-handler} (the first argument to httpd). The
|
||||
\semvar{path-handler} procedure is responsible for actually serving
|
||||
the request -- it can be any arbitrary computation. Its output goes
|
||||
request handler (which must be specified in the options). The
|
||||
request handler is responsible for actually serving
|
||||
the request---it can be any arbitrary computation. Its output goes
|
||||
directly back to the HTTP client that sent the request.
|
||||
|
||||
Before calling the path handler to service the request, the HTTP
|
||||
Before calling the request handler to service the request, the HTTP
|
||||
server installs an error handler that fields any uncaught error,
|
||||
sends an error reply to the client, and aborts the request
|
||||
transaction. Hence any error caused by a path-handler will be
|
||||
transaction. Hence any error caused by a request handler will be
|
||||
handled in a reasonable and robust fashion.
|
||||
\end{desc}
|
||||
%
|
||||
The options argument can be constructed through a number of procedures
|
||||
with names of the form \texttt{with-\ldots}. Each of these procedures
|
||||
either creates a fresh options value or adds a configuration parameter
|
||||
to an old options argument. The configuration parameter value is
|
||||
always the first argument, the (old) options value the optional second
|
||||
one. Here they are:
|
||||
|
||||
The basic server loop, and the associated request data structure are
|
||||
the fixed architecture of the SUnet Web server; its flexibility lies
|
||||
in the notion of path handlers.
|
||||
\defun{with-port}{port [options]}{options}
|
||||
\begin{desc}
|
||||
This specifies the port on which the server listens. Defaults to 80.
|
||||
\end{desc}
|
||||
|
||||
\defun{with-port}{port \ovar{options}}{options}
|
||||
\defunx{with-root-directory}{root-directory
|
||||
\ovar{options}}{options}
|
||||
\defunx{with-fqdn}{fqdn \ovar{options}}{options}
|
||||
\defunx{with-reported-port}{reported-port
|
||||
\ovar{options}}{options}
|
||||
\defunx{with-path-handler}{path-handler
|
||||
\ovar{options}}{options}
|
||||
\defunx{with-server-admin}{mail-address
|
||||
\ovar{options}}{options}
|
||||
\defunx{with-simultaneous-requests}{requests
|
||||
\ovar{options}}{options}
|
||||
\defunx{with-logfile}{logfile \ovar{options}}{options}
|
||||
\defunx{with-syslog?}{syslog? \ovar{options}}{options}
|
||||
\defunx{with-resolve-ip?}{resolve-ip? \ovar{options}}{options}
|
||||
\defun{with-root-directory}{root-directory [options]}{options}
|
||||
\begin{desc}
|
||||
As noted above, these transformers set the options for the web
|
||||
server. Every transformer changes one aspect of the
|
||||
\semvar{options} (for the \ex{httpd}). If this optional argument is missing, the
|
||||
default values are used. These are the following:
|
||||
This specifies the current directory of the server. Note that this
|
||||
is \emph{not} the document root directory. Defaults to \texttt{/}.
|
||||
\end{desc}
|
||||
|
||||
\begin{tabular}{ll}
|
||||
\bf{transformer} & \bf{default value} \\
|
||||
\defun{with-fqdn}{fqdn [options]}{options}
|
||||
\begin{desc}
|
||||
This specifies the fully-qualified domain name the server uses in
|
||||
automatically generated replies, or \ex{\#f} if the server should
|
||||
query DNS for the fully-qualified domain name. Defaults to \ex{\#f}.
|
||||
\end{desc}
|
||||
|
||||
\defun{with-reported-port}{reported-port [options]}{options}
|
||||
\begin{desc}
|
||||
This specifies the port number the server uses in automatically
|
||||
generated replies or \ex{\#f} if the reported port is the same as
|
||||
the port the server is listening on. (This is useful if you're
|
||||
running the server through an accelerating proxy.) Defaults to
|
||||
\ex{\#f}.
|
||||
\end{desc}
|
||||
|
||||
\defun{with-server-admin}{mail-address [options]}{options}
|
||||
\begin{desc}
|
||||
This specifies the email address of the server administrator the
|
||||
server uses in automatically generated replies. Defaults to \ex{\#f}.
|
||||
\end{desc}
|
||||
|
||||
\defun{with-icon-name}{icon-name [options]}{options}
|
||||
\begin{desc}
|
||||
This specifies how to generate the links to various decorative icons
|
||||
for the listings. It can be a procedure which gets passed an
icon tag (a symbol) and is expected to return a link pointing to the icon. If
it is a string, that is taken as a prefix to which the icon tag is
|
||||
appended. If \ex{\#f}, just the plain file names will be used. Defaults to \ex{\#f}.
|
||||
|
||||
The valid icon tags, together with the default names of their icon
|
||||
files, are:
|
||||
|
||||
\begin{center}
|
||||
\begin{tabular}{|l|l|}
|
||||
\hline
|
||||
\ex{with\=port} & 80 \\
|
||||
\ex{with\=root\=directory} & ``\ex{/}'' \\
|
||||
\ex{with\=fqdn} & \sharpf \\
|
||||
\ex{with\=reported-port} & \sharpf \\
|
||||
\ex{with\=path\=handler} & \sharpf \\
|
||||
\ex{with\=server\=admin} & \sharpf \\
|
||||
\ex{with\=simultaneous\=requests} & \sharpf \\
|
||||
\ex{with\=logfile} & ``\ex{/logfile.log}''\\
|
||||
\ex{with\=syslog?} & \sharpt \\
|
||||
\ex{with\=resolve\=ip?} & \sharpt
|
||||
\texttt{directory} & \texttt{directory.xbm}\\\hline
|
||||
\texttt{text} & \texttt{text.xbm}\\\hline
|
||||
\texttt{doc} & \texttt{doc.xbm}\\\hline
|
||||
\texttt{image} & \texttt{image.xbm}\\\hline
|
||||
\texttt{movie} & \texttt{movie.xbm}\\\hline
|
||||
\texttt{audio} & \texttt{sound.xbm}\\\hline
|
||||
\texttt{archive} & \texttt{tar.xbm}\\\hline
|
||||
\texttt{compressed} & \texttt{compressed.xbm}\\\hline
|
||||
\texttt{uu} & \texttt{uu.xbm}\\\hline
|
||||
\texttt{binhex} & \texttt{binhex.xbm}\\\hline
|
||||
\texttt{binary} & \texttt{binary.xbm}\\\hline
|
||||
\texttt{blank} & \texttt{blank.xbm}\\\hline
|
||||
\texttt{back} & \texttt{back.xbm}\\\hline
|
||||
\texttt{unknown} & \texttt{unknown.xbm}\\\hline
|
||||
\end{tabular}
|
||||
|
||||
% that can be found in the \ex{httpd\=make\=options}-structure:
|
||||
% \ex{with\=port}, \ex{with\=root\=directory}, \ex{with\=fqdn},
|
||||
% \ex{with\=reported-port}, \ex{with\=path\=handler},
|
||||
% \ex{with\=server\=admin}, \ex{with\=simultaneous-requests},
|
||||
% \ex{with\=logfile}, \ex{with\=syslog?} that set the port the server
|
||||
% is listening to, the root-directory of the server, the FQDN of the
|
||||
% server, the port the server assumes it is listening to, the
|
||||
% path-handler of the server (see below), the mail-address of the
|
||||
% server-admin, the maximum number of simultaneous handled requests,
|
||||
% the name of the file or the port logging in the Common Log Format
|
||||
% (CLF) is output to and if the server shall create syslog messages,
|
||||
% respectively. The port defaults to 80, the root directory defaults
|
||||
% to ``\ex{/}'', the mail address of the server-admin defaults to
|
||||
% ``\ex{sperber@\ob{}informatik.\ob{}uni\=tuebingen.\ob{}de}'',
|
||||
% \FIXME{Why does the server admin mail address have
|
||||
% sperber@informatik... as default value?}logging is done to
|
||||
% ``\ex{httpd.log}'' and syslog is enabled. All other options default
|
||||
% to \sharpf.
|
||||
|
||||
For example
|
||||
\begin{alltt}
|
||||
(httpd (with-path-handler
|
||||
(rooted-file-handler "/usr/local/etc/httpd")
|
||||
(with-root-directory "/usr/local/etc/httpd")))
|
||||
\end{alltt}
|
||||
|
||||
starts the server on port 80 with
|
||||
``\ex{/usr/\ob{}local/\ob{}etc/\ob{}httpd}'' as root directory and
|
||||
lets it serve any file out from this directory.
|
||||
\ex{rooted\=file\=handler} creates a path handler and is explained
|
||||
below. You see, the transformers are used nested. So, every
|
||||
transformer changes one aspect of the options that the following
|
||||
transformer returns and the last transformer (here:
|
||||
\ex{with\=root\=directory}) changes an aspect of the default values
|
||||
|
||||
|
||||
\semvar{port} is the port the server is listening to,
|
||||
\semvar{root-directory} is the directory in the file system the
|
||||
server uses as root, \semvar{fqdn} is the fully qualified domain
|
||||
name the server reports, \semvar{reported-port} is the port the
|
||||
server reports it is listening to and \semvar{server-admin} is the
|
||||
mail address of the server admin. \semvar{requests} denote the
|
||||
maximum number of allowed simultaneous requests to the server.
|
||||
\sharpf\ means infinite. \semvar{logfile} is either a string, then
|
||||
it is the file name of the logfile, or a port, where the log entries
|
||||
are written to, or \sharpf, that means no logging is made. The
|
||||
logfile is in Common Log Format (CLF). To allow rotation of
|
||||
logfiles, the server will reopen the logfile when it receives the
|
||||
signal \texttt{USR1}. \semvar{syslog?} tells the server to write
|
||||
syslog messages (\sharpt) or not (\sharpf).
|
||||
Example icons can be found as part of the CERN httpd distribution
|
||||
at \url{http://www.w3.org/pub/WWW/Daemon/}.
|
||||
\end{center}
|
||||
\end{desc}
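
As an illustration of the procedure form of \ex{with-icon-name}, the
following sketch maps every icon tag to a file below a common URL
prefix. The prefix \texttt{/icons/} and the rule of appending
\texttt{.xbm} to the tag are assumptions made for this example only;
note that the default file names for \texttt{audio} and
\texttt{archive} in the table above do not follow that rule.

\begin{verbatim}
;; Hypothetical icon-name procedure: tag (a symbol) -> link (a string).
;; The "/icons/" prefix is an assumption, not a server default.
(define (icon-link tag)
  (string-append "/icons/" (symbol->string tag) ".xbm"))

;; Called with one argument, the transformer creates a fresh options value.
(define options
  (with-icon-name icon-link))
\end{verbatim}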
|
||||
|
||||
\section{Path handlers}
|
||||
\label{httpd:path-handlers}
|
||||
|
||||
A path handler is a procedure taking two arguments:
|
||||
\defun{path-handler}{path req}{value}
|
||||
\defun{with-request-handler}{request-handler [options]}{options}
|
||||
\begin{desc}
|
||||
The \semvar{req} argument is a request record giving all the details
|
||||
of the client's request; it has the following structure: \FIXME{Make
|
||||
the record's structure a table}
|
||||
\begin{alltt}
|
||||
(define-record request
|
||||
method ; A string such as "GET", "PUT", etc.
|
||||
uri ; The escaped URI string as read from request line.
|
||||
url ; An http URL record (see url.scm).
|
||||
version ; A (major . minor) integer pair.
|
||||
headers ; An rfc822 header alist (see rfc822.scm).
|
||||
socket) ; The socket connected to the client.
|
||||
\end{alltt}
|
||||
|
||||
The \semvar{path} argument is the URL's path, parsed and split at
|
||||
slashes into a string list. For example, if the Web client
|
||||
dereferences URL
|
||||
\codex{http://\ob{}clark.\ob{}lcs.\ob{}mit.\ob{}edu:\ob{}8001/\ob{}h/\ob{}shi\ob{}vers/\ob{}co\ob{}de/\ob{}web.\ob{}tar.\ob{}gz}
|
||||
then the server would pass the following path to the top-level
|
||||
handler: \ex{("h"\ob{} "shivers"\ob{} "code"\ob{}
|
||||
"web.\ob{}tar.\ob{}gz")}
|
||||
|
||||
The \semvar{path} argument's pre-parsed representation as a string
|
||||
list makes it easy for the path handler to implement recursive
|
||||
operations dispatch on URL paths.
|
||||
This specifies the request handler of the server to which the server
|
||||
delegates the actual work. More on that subject below in
|
||||
Section~\ref{httpd:request-handlers}. This parameter must be specified.
|
||||
\end{desc}
|
||||
|
||||
Path handlers can do anything they like to respond to HTTP requests;
|
||||
they have the full range of Scheme to implement the desired
|
||||
functionality. When handling HTTP requests that have an associated
|
||||
entity body (such as POST), the body should be read from the current
|
||||
input port. Path handlers should in all cases write their reply to the
|
||||
current output port. Path handlers should not perform I/O on the
|
||||
request record's socket. Path handlers are frequently called
|
||||
recursively, and doing I/O directly to the socket might bypass a
|
||||
filtering or other processing step interposed on the current I/O ports
|
||||
by some superior path handler.
|
||||
\defun{with-simultaneous-requests}{requests [options]}{options}
|
||||
\begin{desc}
|
||||
This specifies a limit on the number of simultaneous requests the
|
||||
server serves. If that limit is exceeded during operation, the
|
||||
server will hold off on new requests until the number of
|
||||
simultaneous requests has sunk below the limit again. If this
|
||||
parameter is \ex{\#f}, no limit is imposed. Defaults to \ex{\#f}.
|
||||
\end{desc}
|
||||
|
||||
\section{Basic path handlers}
|
||||
\defun{with-logfile}{logfile [options]}{options}
|
||||
\begin{desc}
|
||||
This specifies the name of a log file for the server where it writes
|
||||
Common Log Format logging information. It can also be a port in
|
||||
which case the information is logged to that port, or \ex{\#f} for
|
||||
no logging. Defaults to \ex{\#f}.
|
||||
|
||||
Although the user can write any path-handler he likes, the SUnet web server
|
||||
comes with a useful toolbox of basic path handlers that can be used
|
||||
and built upon (exported by the \ex{httpd\=basic\=handlers}-structure):
|
||||
To allow rotation of logfiles, the server re-opens the logfile
|
||||
whenever it receives a \texttt{USR1} signal.
|
||||
\end{desc}
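
For orientation, a Common Log Format entry consists of the client
host, two identity fields (commonly just \texttt{-}), a bracketed
timestamp, the quoted request line, the status code, and the number of
bytes sent. The following sketch assembles such a line in plain
Scheme; \texttt{clf-line} is a hypothetical helper for illustration
and not the procedure the server uses internally.

\begin{verbatim}
;; Build one Common Log Format entry from its parts.
;; HOST, DATE and REQUEST-LINE are strings; STATUS and BYTES are integers.
(define (clf-line host date request-line status bytes)
  (string-append host " - - [" date "] \"" request-line "\" "
                 (number->string status) " " (number->string bytes)))

(clf-line "127.0.0.1" "10/Oct/2000:13:55:36 -0700"
          "GET /index.html HTTP/1.0" 200 2326)
;; => 127.0.0.1 - - [10/Oct/2000:13:55:36 -0700] "GET /index.html HTTP/1.0" 200 2326
\end{verbatim}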
|
||||
|
||||
\begin{defundesc}{alist-path-dispatcher}{ph-alist default-ph}{path-handler}
|
||||
This procedure takes a \ex{string->\ob{}path\=handler} alist, and a
|
||||
default path handler, and returns a handler that dispatches on its
|
||||
path argument. When the new path handler is applied to a path
|
||||
\ex{("foo"\ob{} "bar"\ob{} "baz")}, it uses the first element of
|
||||
the path -- ``\ex{foo}'' -- to index into the alist. If it finds an
|
||||
associated path handler in the alist, it hands the request off to
|
||||
that handler, passing it the tail of the path, \ex{("bar"\ob{}
|
||||
"baz")}. On the other hand, if the path is empty, or the alist
|
||||
search does not yield a hit, we hand off to the default path
|
||||
handler, passing it the entire original path, \ex{("foo"\ob{}
|
||||
"bar"\ob{} "baz")}.
|
||||
\defun{with-syslog?}{syslog? [options]}{options}
|
||||
\begin{desc}
|
||||
This specifies whether the server will log information about
incoming requests to the Unix syslog facility. Defaults to \ex{\#t}.
|
||||
\end{desc}
|
||||
|
||||
\defun{with-resolve-ip?}{resolve-ip? [options]}{options}
|
||||
\begin{desc}
|
||||
This specifies whether the server writes the domain names rather
|
||||
than numerical IPs to the output log it produces. Defaults to
|
||||
\ex{\#t}.
|
||||
\end{desc}
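
Because each transformer accepts an existing options value as its
optional second argument, several parameters are set by nesting the
calls. The following sketch uses only procedures described in this
chapter (\ex{null-request-handler} is covered with the basic request
handlers); the port number and host name are placeholder values.

\begin{verbatim}
;; The innermost call creates a fresh options value; each enclosing
;; call adds one more parameter to it.
(define options
  (with-port 8080
    (with-fqdn "www.example.org"
      (with-syslog? #f
        (with-request-handler null-request-handler)))))

(httpd options)
\end{verbatim}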
|
||||
|
||||
To avoid parenthesitis, the \ex{make-httpd-options} procedure eases the
|
||||
construction of the options argument:
|
||||
|
||||
\defun{make-httpd-options}{transformer value \ldots}{options}
|
||||
\begin{desc}
|
||||
This constructs an options value from an argument list of parameter
|
||||
transformers and parameter values. The arguments come in pairs,
|
||||
each an option transformer from the list above, and a value for that
|
||||
parameter. \ex{Make-httpd-options} returns the resulting options value.
|
||||
\end{desc}
|
||||
|
||||
For example,
|
||||
\begin{alltt}
|
||||
(httpd (make-httpd-options
|
||||
with-request-handler (rooted-file-handler "/usr/local/etc/httpd")
|
||||
with-root-directory "/usr/local/etc/httpd"))
|
||||
\end{alltt}
|
||||
%
|
||||
starts the server on port 80 with
|
||||
\ex{/usr/local/etc/httpd} as its root directory and
|
||||
lets it serve any file out from this directory.
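
One way to picture what \ex{make-httpd-options} does is as a loop that
applies each transformer to its value and to the options accumulated
so far. The sketch below is an assumption about the general shape,
not the actual implementation: options are modelled as a plain
association list, and \texttt{make-transformer*},
\texttt{with-port*}, \texttt{with-logfile*} and
\texttt{make-options*} are hypothetical stand-ins.

\begin{verbatim}
;; Hypothetical model: an options value is an association list.
(define (make-transformer* key)
  (lambda (value . maybe-options)
    (cons (cons key value)
          (if (null? maybe-options) '() (car maybe-options)))))

(define with-port*    (make-transformer* 'port))
(define with-logfile* (make-transformer* 'logfile))

;; Consume the arguments two at a time: a transformer, then its value.
;; A trailing transformer without a value is ignored in this sketch.
(define (make-options* . args)
  (let loop ((args args) (options '()))
    (if (or (null? args) (null? (cdr args)))
        options
        (loop (cddr args)
              ((car args) (cadr args) options)))))

(make-options* with-port* 8080
               with-logfile* "httpd.log")
;; => ((logfile . "httpd.log") (port . 8080))
\end{verbatim}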
|
||||
% #### note about rooted-file-handler
|
||||
|
||||
|
||||
\section{Requests}
|
||||
\label{httpd:requests}
|
||||
|
||||
Request handlers operate on \textit{requests} which contain the
|
||||
information needed to generate a page. The relevant procedures to
|
||||
dissect requests are defined in the \texttt{httpd-requests} structure:
|
||||
|
||||
\defun{request?}{value}{boolean}
|
||||
\defunx{request-method}{request}{string}
|
||||
\defunx{request-uri}{request}{string}
|
||||
\defunx{request-url}{request}{url}
|
||||
\defunx{request-version}{request}{pair}
|
||||
\defunx{request-headers}{request}{list}
|
||||
\defunx{request-socket}{request}{socket}
|
||||
\begin{desc}
|
||||
These procedures inspect request values. \ex{Request?} is a predicate
for requests. \ex{Request-method} extracts the method of the HTTP
request; it's a string such as \verb|"GET"| or \verb|"PUT"|.
\ex{Request-uri} returns the escaped URI string as read from the request
line. \ex{Request-url} returns an HTTP URL value (see the
description of the \ex{url} structure in \ref{secchap:url}).
\ex{Request-version} returns a \verb|(major . minor)| integer pair
representing the version specified in the HTTP request.
\ex{Request-headers} returns an association list of header field
names and their values, each represented by a list of strings, one
for each line. \ex{Request-socket} returns the socket connected
|
||||
to the client.\footnote{Request handlers should not perform I/O on the
|
||||
request record's socket. Request handlers are frequently called
|
||||
recursively, and doing I/O directly to the socket might bypass a
|
||||
filtering or other processing step interposed on the current I/O ports
|
||||
by some superior request handler.}
|
||||
\end{desc}
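
As a small illustration, the accessors can be combined to reconstruct
a summary of the request line. \texttt{request-summary} is a
hypothetical helper; it relies only on the accessors described above
and on the version being a pair of integers.

\begin{verbatim}
;; Hypothetical helper: rebuild something like "GET /index.html HTTP/1.0"
;; from a request value.
(define (request-summary req)
  (let ((version (request-version req)))
    (string-append (request-method req)
                   " "
                   (request-uri req)
                   " HTTP/"
                   (number->string (car version))
                   "."
                   (number->string (cdr version)))))
\end{verbatim}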
|
||||
|
||||
\section{Responses}
|
||||
\label{sec:http-responses}
|
||||
|
||||
A request handler must return a \textit{response} value representing the
|
||||
content to be sent to the client. The machinery presented here for
|
||||
constructing responses lives in the \ex{httpd-responses} structure.
|
||||
|
||||
\defun{make-response}{status-code maybe-message seconds mime extras
|
||||
body}{response}
|
||||
\begin{desc}
|
||||
This procedure constructs a response value. \var{Status-code} is an
|
||||
HTTP status code (more on that below). \var{Maybe-message} is a
message elaborating on the circumstances of the status code; it can
also be \sharpf{}, meaning that the server should send a default
message associated with the status code. \var{Seconds} is a natural
|
||||
number indicating the time the content was created, typically the
|
||||
value of \verb|(time)|. \var{Mime} is a string indicating the MIME
|
||||
type of the response (such as \verb|"text/html"| or
|
||||
\verb|"application/octet-stream"|). \var{Extras} is an association
|
||||
list with extra headers to be added to the response; its elements
|
||||
are pairs, each of which consists of a symbol representing the field
|
||||
name and a string representing the field value. \var{Body}
|
||||
represents the body of the response; more on that below.
|
||||
\end{desc}
|
||||
|
||||
\defun{make-error-response}{status-code request [message] extras \ldots}{response}
|
||||
\begin{desc}
|
||||
This is a helper procedure for constructing error responses.
|
||||
\var{Status-code} is the status code of the response (see below). \var{Request}
|
||||
is the request that led to the error. \var{Message} is an optional
|
||||
string containing an error message written in HTML, and \var{extras}
|
||||
are further optional arguments containing further message lines to
|
||||
be added to the web page that's generated.
|
||||
|
||||
\ex{Make-error-response} constructs a response value which generates
|
||||
a web page containing a short explanatory message for the error at hand.
|
||||
\end{desc}
|
||||
|
||||
\begin{table}[htb]
|
||||
\centering
|
||||
\begin{tabular}{|l|l|l|}
|
||||
\hline
Name & Number & Default message\\\hline
|
||||
ok & 200 & OK\\\hline
|
||||
created & 201 & Created\\\hline
|
||||
accepted & 202 & Accepted\\\hline
|
||||
prov-info & 203 & Provisional Information\\\hline
|
||||
no-content & 204 & No Content\\\hline
|
||||
|
||||
mult-choice & 300 & Multiple Choices\\\hline
|
||||
moved-perm & 301 & Moved Permanently\\\hline
|
||||
moved-temp & 302 & Moved Temporarily\\\hline
|
||||
method & 303 & Method (obsolete)\\\hline
|
||||
not-mod & 304 & Not Modified\\\hline
|
||||
|
||||
bad-request & 400 & Bad Request\\\hline
|
||||
unauthorized & 401 & Unauthorized\\\hline
|
||||
payment-req & 402 & Payment Required\\\hline
|
||||
forbidden & 403 & Forbidden\\\hline
|
||||
not-found & 404 & Not Found\\\hline
|
||||
method-not-allowed & 405 & Method Not Allowed\\\hline
|
||||
none-acceptable & 406 & None Acceptable\\\hline
|
||||
proxy-auth-required & 407 & Proxy Authentication Required\\\hline
|
||||
timeout & 408 & Request Timeout\\\hline
|
||||
conflict & 409 & Conflict\\\hline
|
||||
gone & 410 & Gone\\\hline
|
||||
internal-error & 500 & Internal Server Error\\\hline
|
||||
not-implemented & 501 & Not Implemented\\\hline
|
||||
bad-gateway & 502 & Bad Gateway\\\hline
|
||||
service-unavailable & 503 & Service Unavailable\\\hline
|
||||
gateway-timeout & 504 & Gateway Timeout\\\hline
|
||||
\end{tabular}
|
||||
\caption{HTTP status codes}
|
||||
\label{tab:status-code-names}
|
||||
\end{table}
|
||||
|
||||
\dfn{status-code}{\synvar{name}}{status-code}{syntax}
|
||||
\defunx{name->status-code}{symbol}{status-code}
|
||||
\defunx{status-code-number}{status-code}{integer}
|
||||
\defunx{status-code-message}{status-code}{string}
|
||||
\begin{desc}
|
||||
The \ex{status-code} syntax returns a status code where
|
||||
\synvar{name} is the name from Table~\ref{tab:status-code-names}.
|
||||
\ex{Name->status-code} also returns a status code for a name
|
||||
represented as a symbol. For a given status code,
|
||||
\ex{status-code-number} extracts its number, and
|
||||
\ex{status-code-message} extracts its associated default message.
|
||||
\end{desc}
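
The following sketch shows how status codes and the response
constructors might fit together. \texttt{not-found-response} and
\texttt{html-response} are hypothetical helpers; the \texttt{body}
argument is passed through unchanged because the representation of
response bodies is not covered at this point.

\begin{verbatim}
;; Look up a status code by name, statically and dynamically, and
;; extract the number listed for it in the status code table.
(status-code-number (status-code not-found))
(status-code-number (name->status-code 'not-found))

;; An error page for a request that could not be served.
(define (not-found-response req)
  (make-error-response (status-code not-found) req
                       "The requested page does not exist."))

;; A successful HTML response; #f selects the default message.
(define (html-response body)
  (make-response (status-code ok) #f (time) "text/html" '() body))
\end{verbatim}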
|
||||
|
||||
\section{Request Handlers}
|
||||
|
||||
A request handler generates the actual content for a request; request
|
||||
handlers form a simple algebra and may be combined and composed in
|
||||
various ways.
|
||||
|
||||
|
||||
A request handler is a procedure of two arguments like this:
|
||||
\defun{request-handler}{path req}{response}
|
||||
\begin{desc}
|
||||
\var{Req} is a request. The \semvar{path} argument is the URL's
|
||||
path, parsed and split at slashes into a string list. For example,
|
||||
if the Web client dereferences URL
|
||||
%
|
||||
\begin{verbatim}
|
||||
http://clark.lcs.mit.edu:8001/h/shivers/code/web.tar.gz
|
||||
\end{verbatim}
|
||||
then the server would pass the following path to the top-level
|
||||
handler:
|
||||
%
|
||||
\begin{verbatim}
|
||||
("h" "shivers" "code" "web.tar.gz")
|
||||
\end{verbatim}
|
||||
%
|
||||
The \var{path} argument's pre-parsed representation as a string
|
||||
list makes it easy for the request handler to dispatch recursively
on URL paths.
|
||||
|
||||
The request handler must return an HTTP response.
|
||||
\end{desc}
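
To make the path representation concrete, the following
self-contained sketch splits a slash-delimited string into such a
list. \texttt{split-path} is a hypothetical helper written for this
illustration; it is not the server's own parser, and a real URL parser
also has to handle escape sequences, which this sketch ignores.

\begin{verbatim}
;; Split STR at slashes, dropping empty components.
;; Walks backwards so the result list comes out in order.
(define (split-path str)
  (let loop ((i (- (string-length str) 1))
             (end (string-length str))
             (parts '()))
    (cond ((< i 0)
           (if (> end 0)
               (cons (substring str 0 end) parts)
               parts))
          ((char=? (string-ref str i) #\/)
           (loop (- i 1)
                 i
                 (if (< (+ i 1) end)
                     (cons (substring str (+ i 1) end) parts)
                     parts)))
          (else (loop (- i 1) end parts)))))

(split-path "/h/shivers/code/web.tar.gz")
;; => ("h" "shivers" "code" "web.tar.gz")
\end{verbatim}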
|
||||
|
||||
\subsection{Basic Request Handlers}
|
||||
|
||||
The web server comes with a useful toolbox of basic request handlers
|
||||
that can be used and built upon. The following procedures are
|
||||
exported by the \ex{httpd\=basic\=handlers} structure:
|
||||
|
||||
\defvar{null-request-handler}{request-handler}
|
||||
\begin{desc}
|
||||
This request handler always generates a \ex{not-found} error
response, no matter what the request is.
|
||||
\end{desc}
|
||||
|
||||
\defun{make-predicate-handler}{predicate handler
|
||||
default-handler}{request-handler}
|
||||
\begin{desc}
|
||||
The request handler returned by this procedure first calls
|
||||
\var{predicate} on its path and request; it then acts like
|
||||
\var{handler} if the predicate returned a true value, and like
|
||||
\var{default-handler} if the predicate returned \sharpf.
|
||||
\end{desc}
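
Conceptually, such a combinator is a small higher-order procedure.
The following sketch shows one plausible definition; the starred name
marks it as an illustration rather than the server's own definition.
The second part uses the documented \ex{make-predicate-handler} to
restrict an arbitrary handler to GET requests, with
\texttt{get-only} being a hypothetical name.

\begin{verbatim}
;; Illustrative definition: choose between two handlers per request.
(define (make-predicate-handler* predicate handler default-handler)
  (lambda (path req)
    (if (predicate path req)
        (handler path req)
        (default-handler path req))))

;; Usage with the documented combinator: only answer GET requests,
;; and report not-found for everything else.
(define (get-only handler)
  (make-predicate-handler
   (lambda (path req) (string=? (request-method req) "GET"))
   handler
   null-request-handler))
\end{verbatim}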
|
||||
|
||||
\defun{make-host-name-handler}{hostname handler default-handler}{request-handler}
|
||||
\begin{desc}
|
||||
The request handler returned by this procedure compares the host
|
||||
name specified in the request with \var{hostname}: if they match, it
|
||||
acts like \var{handler}, otherwise, it acts like
|
||||
\var{default-handler}.
|
||||
\end{desc}
|
||||
|
||||
\defun{make-path-predicate-handler}{predicate handler
|
||||
default-handler}{request-handler}
|
||||
\begin{desc}
|
||||
The request handler returned by this procedure first calls
|
||||
\var{predicate} on its path; it then acts like \var{handler} if the
|
||||
predicate returned a true value, and like \var{default-handler} if
|
||||
the predicate returned \sharpf.
|
||||
\end{desc}
|
||||
|
||||
\defun{make-path-prefix-handler}{path-prefix handler default-handler}{request-handler}
|
||||
\begin{desc}
|
||||
This constructs a request handler that checks whether
\var{path-prefix} (a string) is the first element of the requested
path. If so, it calls \var{handler} on the rest of the path and
the original request. Otherwise, the handler acts like
|
||||
\var{default-handler}.
|
||||
\end{desc}
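
The prefix test and the stripping of the first path element can be
sketched as follows. This is an illustrative definition under the
assumption that paths are lists of strings as described above; the
starred name indicates that it is not the server's own code.

\begin{verbatim}
;; Illustrative definition: strip a matching first path element.
(define (make-path-prefix-handler* path-prefix handler default-handler)
  (lambda (path req)
    (if (and (pair? path)
             (string=? path-prefix (car path)))
        (handler (cdr path) req)            ; rest of the path
        (default-handler path req))))       ; entire original path
\end{verbatim}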
|
||||
|
||||
\defun{alist-path-dispatcher}{handler-alist default-handler}{request-handler}
|
||||
\begin{desc}
|
||||
This procedure takes as arguments an alist mapping strings to request
|
||||
handlers, and a default request handler, and returns a handler that
|
||||
dispatches on its path argument. When the new request handler is
|
||||
applied to a path
|
||||
\begin{verbatim}
|
||||
("foo" "bar" "baz")
|
||||
\end{verbatim}
|
||||
it uses the
|
||||
first element of the path---\ex{foo}---to index into the
|
||||
alist. If it finds an associated request handler in the alist, it
|
||||
hands the request off to that handler, passing it the tail of the
|
||||
path, in this case
|
||||
\begin{verbatim}
|
||||
("bar" "baz")
|
||||
\end{verbatim}
|
||||
%
|
||||
On the other hand, if the path is
|
||||
empty, or the alist search does not yield a hit, we hand off to the
|
||||
default request handler, passing it the entire original path,
|
||||
\begin{verbatim}
|
||||
("foo" "bar" "baz")
|
||||
\end{verbatim}
|
||||
%
|
||||
This procedure is how you say: ``If the first element of the URL's
|
||||
path is `foo', do X; if it's `bar', do Y; otherwise, do Z.'' If one
|
||||
takes an object-oriented view of the process, an alist path-handler
|
||||
does method lookup on the requested operation, dispatching off to
|
||||
the appropriate method defined for the URL.
|
||||
|
||||
path is `foo', do X; if it's `bar', do Y; otherwise, do Z.''
|
||||
The slash-delimited URI path structure implies an associated tree of
|
||||
names. The path-handler system and the alist dispatcher allow you to
|
||||
names. The request-handler system and the alist dispatcher allow you to
|
||||
procedurally define the server's response to any arbitrary subtree
|
||||
of the path space.
|
||||
|
||||
Example: A typical top-level path handler is
|
||||
Example: A typical top-level request handler is
|
||||
\begin{alltt}
|
||||
(define ph
|
||||
(alist-path-dispatcher
|
||||
|
@ -265,9 +420,9 @@ and built upon (exported by the \ex{httpd\=basic\=handlers}-structure):
|
|||
\item If the path looks like \ex{("h"\ob{} "shivers"\ob{}
|
||||
"code"\ob{} "web.\ob{}tar.\ob{}gz")}, pass the path
|
||||
\ex{("shivers"\ob{} "code"\ob{} "web.\ob{}tar.\ob{}gz")} to a
|
||||
home-directory path handler.
|
||||
home-directory request handler.
|
||||
\item If the path looks like \ex{("cgi-\ob{}bin"\ob{} "calendar")},
|
||||
pass ("calendar") off to the CGI path handler.
|
||||
pass \ex{("calendar")} off to the CGI request handler.
|
||||
\item If the path looks like \ex{("seval"\ob{} \ldots)}, the tail
|
||||
of the path is passed off to the code-uploading seval path
|
||||
handler.
|
||||
|
@ -276,133 +431,21 @@ and built upon (exported by the \ex{httpd\=basic\=handlers}-structure):
|
|||
\ex{/usr/\ob{}lo\ob{}cal/\ob{}etc/\ob{}httpd/\ob{}htdocs},
|
||||
and serve that file.
|
||||
\end{itemize}
|
||||
\end{defundesc}
|
||||
\end{desc}
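
The dispatch rule of \ex{alist-path-dispatcher} can be modelled in a
few lines of plain Scheme. The sketch below is an illustration under
stated assumptions, not the server's implementation:
\texttt{alist-path-dispatcher*} is a hypothetical name, and the toy
handlers return lists instead of responses so that the dispatch
decision is visible.

\begin{verbatim}
;; Illustrative definition: look up the first path element in the alist.
(define (alist-path-dispatcher* handler-alist default-handler)
  (lambda (path req)
    (let ((entry (and (pair? path)
                      (assoc (car path) handler-alist))))
      (if entry
          ((cdr entry) (cdr path) req)      ; hit: pass the tail
          (default-handler path req)))))    ; miss or empty path

(define dispatch
  (alist-path-dispatcher*
   (list (cons "foo" (lambda (path req) (list 'foo-handler path)))
         (cons "bar" (lambda (path req) (list 'bar-handler path))))
   (lambda (path req) (list 'default-handler path))))

(dispatch '("foo" "bar" "baz") 'dummy-request)
;; => (foo-handler ("bar" "baz"))
(dispatch '("quux") 'dummy-request)
;; => (default-handler ("quux"))
(dispatch '() 'dummy-request)
;; => (default-handler ())
\end{verbatim}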
|
||||
|
||||
\begin{defundesc}{home-dir-handler}{subdir}{path-handler}
|
||||
This procedure builds a path handler that does basic file serving
|
||||
out of home directories. If the resulting \semvar{path-handler} is
|
||||
passed a path of \ex{(user . file\=path)}, then it serves the file
|
||||
\ex{user's\=ho\ob{}me\=di\ob{}rec\ob{}to\ob{}ry/\ob{}sub\ob{}dir/\ob{}file\=path}
|
||||
\subsection{Static Content Request Handlers}
|
||||
|
||||
The path handler only handles GET requests; the filename is not
|
||||
allowed to contain \ex{..} elements.
|
||||
\end{defundesc}
|
||||
The request handlers described in this section are for serving static
|
||||
content off directory trees in the file system. They live in the
|
||||
\ex{httpd-file-directory-handlers} structure.
|
||||
|
||||
\begin{defundesc}{tilde-home-dir-handler}{subdir default-path-handler}{path-handler}
|
||||
This path handler examines the car of the path. If it is a string
|
||||
beginning with a tilde, e.g., \ex{"~ziggy"}, then the string is
|
||||
taken to mean a home directory, and the request is served similarly
|
||||
to a home-dir-handler path handler. Otherwise, the request is passed
|
||||
off in its entirety to the \semvar{default-path-handler}.
|
||||
|
||||
This procedure is useful for implementing servers that provide the
|
||||
semantics of the NCSA httpd server.
|
||||
\end{defundesc}
|
||||
|
||||
\begin{defundesc}{cgi-handler}{cgi-directory}{path-handler}
|
||||
This procedure returns a path-handler that passes the request off to
|
||||
some program using the CGI interface. The script name is taken from
|
||||
the car of the path; it is checked for occurrences of \ex{..}'s. If
|
||||
the path is \ex{("my\=prog"\ob{} "foo"\ob{} "bar")} then the
|
||||
program executed is
|
||||
\ex{cgi\=di\ob{}rec\ob{}to\ob{}ry/\ob{}my\=prog}.
|
||||
|
||||
When the CGI path handler builds the process environment for the CGI
|
||||
script, several elements (e.g., \ex{\$PATH} and \ex{\$SERVER\_SOFTWARE}) are request-invariant, and can be
|
||||
computed at server start-up time. This can be done by calling
|
||||
\codex{(initialise-request-invariant-cgi-env)}
|
||||
when the server starts up. This is not necessary, but will make CGI
|
||||
requests a little faster.
|
||||
\end{defundesc}
|
||||
|
||||
\begin{defundesc}{rooted-file-handler}{root-dir}{path-handler}
|
||||
Returns a path handler that serves files from a particular root in
|
||||
the file system. Only the GET operation is provided. The path
|
||||
argument passed to the handler is converted into a filename, and
|
||||
appended to root-dir. The file name is checked for \ex{..}
|
||||
components, and the transaction is aborted if any are found. Otherwise,
|
||||
the file is served to the client.
|
||||
\end{defundesc}
|
||||
|
||||
\begin{defundesc}{rooted-file-or-directory-handler}{root
|
||||
icon-name}{path-handler}
|
||||
|
||||
Like \ex{rooted-file-handler}, but also serves directory indices for directories without
|
||||
\ex{index.\ob{}html}. \semvar{icon-name} specifies how to generate
|
||||
the links to various decorative icons for the listings. It can
|
||||
be a procedure which gets passed one of the icon tags listed below and
|
||||
is expected to return a link pointing to the icon. If it is a string,
|
||||
that is taken as prefix to which the file names of the tags listed
|
||||
below are appended.
|
||||
|
||||
\begin{tabular}{ll}
|
||||
Tag & Icon's file name \\
|
||||
\hline
|
||||
\ex{directory} & \ex{directory.xbm}\\
|
||||
\ex{text} & \ex{text.xbm}\\
|
||||
\ex{doc} & \ex{doc.xbm}\\
|
||||
\ex{image} & \ex{image.xbm}\\
|
||||
\ex{movie} & \ex{movie.xbm}\\
|
||||
\ex{audio} & \ex{sound.xbm}\\
|
||||
\ex{archive} & \ex{tar.xbm}\\
|
||||
\ex{compressed} & \ex{compressed.xbm}\\
|
||||
\ex{uu} & \ex{uu.xbm}\\
|
||||
\ex{binhex} & \ex{binhex.xbm}\\
|
||||
\ex{binary} & \ex{binary.xbm}\\
|
||||
\ex{blank} & \ex{blank.xbm}\\
|
||||
\ex{back} & \ex{back.xbm}\\
|
||||
\ex{\it{}else} & \ex{unknown.xbm}\\
|
||||
\end{tabular}
|
||||
\end{defundesc}
|
||||
|
||||
\begin{defundesc}{null-path-handler}{path req}{\noreturn}
|
||||
This path handler is useful as a default handler. It handles no
|
||||
requests, always returning a ``404 Not found'' reply to the client.
|
||||
\end{defundesc}
|
||||
|
||||
\section{HTTP errors}
|
||||
|
||||
Authors of path-handlers need to be able to handle errors in a
|
||||
reasonably simple fashion. The SUnet Web server provides a set of error
|
||||
conditions that correspond to the error replies in the HTTP protocol.
|
||||
These errors can be raised with the \ex{http\=error} procedure. When
|
||||
the server runs a path handler, it runs it in the context of an error
|
||||
handler that catches these errors, sends an error reply to the client,
|
||||
and closes the transaction.
|
||||
|
||||
\begin{defundesc}{http-error}{reply-code req \ovar{extra \ldots}}{\noreturn}
|
||||
This raises an http error condition. The reply code is one of the
|
||||
numeric HTTP error reply codes, which are bound to the variables
|
||||
\ex{http\=re\ob{}ply/\ob{}ok, http\=re\ob{}ply/\ob{}not\=found,
|
||||
http\=re\ob{}ply/\ob{}bad\=request}, and so forth. The
|
||||
\semvar{req} argument is the request record that caused the error.
|
||||
Any following extra args are passed along for informational
|
||||
purposes. Different HTTP errors take different types of extra
|
||||
arguments. For example, the ``301 moved permanently'' and ``302
|
||||
moved temporarily'' replies use the first two extra values as the
|
||||
\ex{URI:} and \ex{Lo\-ca\-tion:} fields in the reply header,
|
||||
respectively. See the clauses of the
|
||||
\ex{send\=http\=er\ob{}ror\=re\ob{}ply} procedure for details.
|
||||
\end{defundesc}
|
||||
|
||||
\begin{defundesc}{send-http-error-reply}{reply-code request \ovar{extra \ldots}}{\noreturn}
|
||||
This procedure writes an error reply out to the current output port.
|
||||
If an error occurs during this process, it is caught, and the
|
||||
procedure silently returns. The http server's standard error handler
|
||||
passes all http errors raised during path-handler execution to this
|
||||
procedure to generate the error reply before aborting the request
|
||||
transaction.
|
||||
\end{defundesc}
|
||||
|
||||
\section{Simple directory generation}
|
||||
|
||||
Most path-handlers that serve files to clients eventually call an
|
||||
internal procedure named \ex{file\=serve}, which implements a simple
|
||||
directory-generation service using the following rules:
|
||||
The request handlers in this section eventually call an internal
|
||||
procedure named \ex{file\=serve} to serve files; it implements a
|
||||
simple directory-generation service using the following rules:
|
||||
\begin{itemize}
|
||||
\item If the filename has the form of a directory (i.e., it ends with
|
||||
a slash), then \ex{file\=serve} actually looks for a file named
|
||||
``index.html'' in that directory.
|
||||
\ex{index.html} in that directory.
|
||||
\item If the filename names a directory, but is not in directory form
|
||||
(i.e., it doesn't end in a slash, as in
|
||||
``\ex{/usr/\ob{}in\ob{}clu\ob{}de}'' or ``\ex{/usr/\ob{}raj}''),
|
||||
|
@ -416,47 +459,72 @@ directory-generation service using the following rules:
|
|||
client.
|
||||
\end{itemize}
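
As a rough illustration of the first rule only, the following sketch
tests whether a file name is in directory form and, if so, appends
\ex{index.html}. The procedure names are hypothetical and this is not
the code of \ex{file\=serve} itself.
\begin{verbatim}
;; Hypothetical helpers, for illustration only.
(define (directory-form? fname)
  (let ((len (string-length fname)))
    (and (> len 0)
         (char=? (string-ref fname (- len 1)) #\/))))

(define (index-file-for fname)
  (if (directory-form? fname)
      (string-append fname "index.html")
      fname))

(index-file-for "/usr/include/")   ; => "/usr/include/index.html"
(index-file-for "/usr/include")    ; => "/usr/include"
\end{verbatim}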
|
||||
|
||||
\defun{rooted-file-handler}{root-dir}{request-handler}
|
||||
\begin{desc}
|
||||
This returns a request handler that serves files from a particular
|
||||
root in the file system. Only the \ex{GET} operation is provided.
|
||||
The path argument passed to the handler is converted into a
|
||||
filename, and appended to root-dir. The file name is checked for
|
||||
\ex{..} components, and the transaction is aborted if any are found.
|
||||
Otherwise, the file is served to the client.
|
||||
\end{desc}
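
The check for \ex{..} components can be pictured as follows. This is
a minimal sketch, assuming the path arrives as a list of strings; the
name \ex{has-dotdot?} is hypothetical and the server's own check may
differ in detail.
\begin{verbatim}
;; Hypothetical illustration: does any path component equal ".."?
(define (has-dotdot? path)
  (cond ((null? path) #f)
        ((string=? (car path) "..") #t)
        (else (has-dotdot? (cdr path)))))

(has-dotdot? '("docs" "index.html"))      ; => #f
(has-dotdot? '("docs" ".." "etc"))        ; => #t
\end{verbatim}
Rejecting such paths keeps a request from climbing out of the
directory tree rooted at \var{root-dir}.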
|
||||
|
||||
\defun{rooted-file-or-directory-handler}{root}{request-handler}
|
||||
\begin{desc}
|
||||
Like \ex{rooted-file-handler}, but also serves directory indices for directories without
|
||||
\ex{index.html}.
|
||||
\end{desc}
|
||||
|
||||
\defun{home-dir-handler}{subdir}{request-handler}
|
||||
\begin{desc}
|
||||
This procedure builds a request handler that does basic file serving
|
||||
out of home directories. If the resulting \var{request-handler} is
|
||||
passed a path of the form \ex{(\var{user} . \var{file-path})}, then it serves the file
|
||||
\ex{\var{subdir}/\var{file-path}} inside the user's home directory.
|
||||
|
||||
The request handler only handles GET requests; the filename is not
|
||||
allowed to contain \ex{..} elements.
|
||||
\end{desc}
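
The mapping from a path to a file name can be sketched as below. The
sketch assumes the user's home directory has already been looked up and
is passed in as a string; both procedure names are hypothetical, and
the real handler performs the lookup and the \ex{..} check itself.
\begin{verbatim}
;; Hypothetical illustration only.
(define (join-with-slashes components)
  (if (null? (cdr components))
      (car components)
      (string-append (car components) "/"
                     (join-with-slashes (cdr components)))))

;; path has the form (user . file-path); home is the home
;; directory of that user.
(define (home-file-name home subdir path)
  (join-with-slashes (cons home (cons subdir (cdr path)))))

(home-file-name "/home/shivers" "public_html"
                '("shivers" "code" "web.tar.gz"))
;; => "/home/shivers/public_html/code/web.tar.gz"
\end{verbatim}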
|
||||
|
||||
\defun{tilde-home-dir-handler}{subdir
|
||||
default-request-handler}{request-handler}
|
||||
\begin{desc}
|
||||
This returns a request handler that examines the car of the path. If
|
||||
it is a string beginning with a tilde, e.g., \ex{"~ziggy"}, then the
|
||||
string is taken to mean a home directory, and the request is served
|
||||
similarly to a home-dir-handler request handler. Otherwise, the
|
||||
request is passed off in its entirety to the
|
||||
\var{default-request-handler}.
|
||||
\end{desc}
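
The test on the car of the path might look like the following sketch.
The name \ex{tilde-user} is hypothetical; it returns the user name if
the first path component starts with a tilde and \ex{\#f} otherwise, in
which case the request would go to the default handler.
\begin{verbatim}
;; Hypothetical illustration only.
(define (tilde-user path)
  (and (pair? path)
       (let* ((s (car path))
              (len (string-length s)))
         (and (> len 1)
              (char=? (string-ref s 0) #\~)
              (substring s 1 len)))))

(tilde-user '("~ziggy" "index.html"))   ; => "ziggy"
(tilde-user '("docs" "index.html"))     ; => #f
\end{verbatim}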
|
||||
|
||||
\section{CGI Server}
|
||||
|
||||
\begin{defundesc}{cgi-handler}{bin-dir \ovar{cgi-bin-dir}}{path-handler}
|
||||
Returns a path handler (see \ref{httpd:path-handlers} for details
|
||||
about path handlers) for cgi-scripts located in
|
||||
\semvar{bin-dir}. \semvar{cgi-bin-dir} specifies the value of the
|
||||
\ex{PATH} variable of the environment the cgi-scripts run in. It defaults
|
||||
to
|
||||
``\ex{/bin:\ob{}/usr/bin:\ob{}/usr/ucb:\ob{}/usr/bsd:\ob{}/usr/local/bin}''
|
||||
but is overwritten by the current \ex{PATH} environment variable at
|
||||
the time \ex{cgi-handler} is called. The cgi-scripts are called as
|
||||
specified by CGI/1.1\footnote{see
|
||||
\ex{http://hoohoo.ncsa.uiuc.edu/cgi/interface.html} for a sort of
|
||||
specification.}.
|
||||
|
||||
\begin{itemize}
|
||||
\item Various environment variables are set (like
|
||||
\ex{QUERY\_STRING} or \ex{REMOTE\_HOST}).
|
||||
\item ISINDEX queries get their arguments as command line arguments.
|
||||
\item Scripts are handled differently according to their name:
|
||||
\defun{cgi-handler}{bin-dir [cgi-bin-path]}{request-handler}
|
||||
\begin{desc}
|
||||
Returns a request handler for CGI scripts located in
|
||||
\var{bin-dir}. \var{Cgi-bin-path} specifies the value of the
|
||||
\ex{PATH} variable of the environment the CGI scripts run in. It defaults
|
||||
to
|
||||
\begin{verbatim}
|
||||
/bin:/usr/bin:/usr/ucb:/usr/bsd:/usr/local/bin
|
||||
\end{verbatim}
|
||||
The CGI scripts are called as specified by CGI/1.1\footnote{see
|
||||
\url{http://hoohoo.ncsa.uiuc.edu/cgi/interface.html} for a sort of
|
||||
specification.}.
|
||||
|
||||
Note that the CGI handler looks at the name of the CGI script to
|
||||
determine how it should be handled:
|
||||
\begin{itemize}
|
||||
|
||||
\item If the name of the script starts with `\ex{nph-}', its reply
|
||||
is read, the RFC~822-fields like ``Content-Type'' and ``Status''
|
||||
is read, the RFC~822-fields like \ex{Content-Type} and \ex{Status}
|
||||
are parsed and the client is sent back a real HTTP reply,
|
||||
containing the rest of the script's output.
|
||||
|
||||
\item If the name of the script doesn't start with `\ex{nph-}',
|
||||
its output is sent back to the client directly. If its return code
|
||||
is not zero, an error message is generated.
|
||||
|
||||
\end{itemize}
|
||||
\end{itemize}
|
||||
\end{defundesc}
|
||||
|
||||
\section{Support procs}
|
||||
|
||||
The source files contain a host of support procedures which will be of
|
||||
utility to anyone writing a custom path-handler. Read the files first.
|
||||
\FIXME{Let us read the files and paste the contents here.}
|
||||
\end{desc}
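
The decision based on the script name amounts to a prefix test, which
can be sketched as follows. The name \ex{nph-script?} is hypothetical;
the sketch only shows how the two cases are told apart, not how the
script's output is processed afterwards.
\begin{verbatim}
;; Hypothetical illustration only.
(define (nph-script? name)
  (let ((prefix "nph-"))
    (and (>= (string-length name) (string-length prefix))
         (string=? (substring name 0 (string-length prefix))
                   prefix))))

(nph-script? "nph-counter")   ; => #t
(nph-script? "calendar")      ; => #f
\end{verbatim}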
|
||||
|
||||
%%% Local Variables:
|
||||
%%% mode: latex
|
||||
|
|