Compare commits

...

102 Commits

Author SHA1 Message Date
vibr bccf27785d rename READ-AND-DISCARD-TRAILER to DISCARD-TRAILER for naming consistency 2005-04-20 11:40:23 +00:00
vibr d06479ee4b adapt httpd-handler-lib-interface and httpd-handler structure to
implementation of chunked transfer coding in handler-lib
2005-04-16 20:49:29 +00:00
vibr 15049e1c58 implementing chunked transfer coding:
*new proc  READ-MESSAGE-BODY: a high-level interface for reading in
message bodies (should be used by all handlers)
*new proc READ-ORDINARY-BODY: reads in message bodies with no transfer
coding applied
*new proc READ-CHUNKED-BODY: reads in message bodies in chunked
transfer coding
*new proc GET-CHUNK-SIZE: reads in and parses the size of the next
chunk in a chunked message body
*helper procs READ-AND-DISCARD-TRAILER and DISCARD-LINE-TERMINATOR
*minor changes to GET-NUMERIC-FIELD-VALUE (cosmetic)
2005-04-16 20:45:33 +00:00
vibr 555d52806d modify seval-handler:
*don't use concept of reader-writer-body (which is broken), use
writer-body instead -> seval-handler now works correctly
*use new interface READ-MESSAGE-BODY from handler-lib for reading in the
message body
*rename READ-REQUEST-SEXP to PARSE-REQUEST-SEXP
*catch errors thrown by READ in PARSE-REQUEST-SEXP to answer 400
instead of 500 for requests whose message body doesn't contain a valid
s-expression
2005-04-16 20:33:35 +00:00
vibr 36db985453 in PARSE-HTTP-REQUEST: catch errors thrown by READ-RFC822-HEADERS and
answer 400 instead of 500 for requests with bad headers
2005-04-16 20:22:41 +00:00
vibr 0554b2d494 move copyright declarations to COPYING 2005-04-16 20:19:32 +00:00
vibr 630c77d83f syntax of comma separated list which is the field value of the
Transfer-Encoding header: needs a single SP less
2005-04-15 15:50:30 +00:00
vibr d174ad3954 add a copy of the regexp definitions HEX and ESCAPED from structure
url to resolve mutual dependency
2005-04-15 15:34:48 +00:00
vibr 184c284c4a move these procs from lib/sunet-utilities to httpd/handler-lib:
GET-SOCKET-HOST-STRING
GET-NUMERIC-FIELD-VALUE
GET-BODY-LENGTH-FROM-CONTENT-LENGTH
CHUNKED-TRANSFER-CODING?
reason: they are httpd-specific in using httpd's data types
2005-04-15 15:23:32 +00:00
vibr d915722a9b implementing chunked transfer coding:
* new predicate CHUNKED-TRANSFER-CODING?
 tests wether a request's entity body is sent in chunked
 transfer-encoding
2005-04-15 12:31:43 +00:00
vibr 4c1e1a16a8 trivia: towards a more modern HTML: closing slash within empty HTML
elements
*new: EMIT-EMPTY-TAG
*use EMIT-EMPTY-TAG instead of EMIT-TAG where appropriate
2005-04-14 21:15:21 +00:00
vibr e8dc69b745 we must send a "Connection: close" header as long as we don't support
persistent connections
2005-04-14 19:17:16 +00:00
vibr db826a9c1f add comment explaining why the concept of http-reader-writer-body
(which is used by the seval-handler only) doesn't work
2005-04-14 15:18:54 +00:00
vibr 8e7e071db2 fix MAKE-CLF (which prepended superfluous slash to requested resource) 2005-04-14 14:42:03 +00:00
vibr 9d3ddd79b9 *delete GET-SOCKET-HOSTNAME-AND-PORTNUMBER (was buggy anyway)
*rewrite GET-SOCKET-HOST-STRING
2005-04-14 14:06:00 +00:00
vibr 8974332da1 Location header must be an absolute URL:
*adapt file-dir-handler's 301 response
*new procs GET-SOCKET-HOSTNAME-AND-PORTNUMBER, GET-SOCKET-HOST-STRING
2005-04-14 11:38:37 +00:00
vibr 96b485294f *new predicate ABSOLUTE-URL?
*remove URI-HAS-PROTOCOL?
2005-04-14 11:32:38 +00:00
vibr f605367c1a answer 400 (Bad Request) for 1.1-Requests which don't include a Host
header:
*new proc CHECK-HOST-HEADER
2005-04-13 20:53:53 +00:00
vibr 453a7cdde6 correct call of http-error in MAKE-HOST-NAME-HANDLER 2005-04-13 20:48:57 +00:00
vibr 97f730075d trivia: formatting of error responses 2005-04-13 20:46:40 +00:00
vibr 40d7c923a2 answer 505 (Version not supported) for requests with major version >
implemented version:
*new proc CHECK-MAJOR-HTTP-VERSION
*new case in MAKE-ERROR-RESPONSE
2005-04-13 19:35:22 +00:00
vibr fe6b3fffac change URL-Parser's interface: parser now preserves info whether
Request-URI's path ends with a slash.
(See http://httpd.apache.org/docs-2.0/misc/rewriteguide.html ->
"Trailing slash problem" for reasons).
2005-04-13 10:32:29 +00:00
vibr 9118345aaa typos 2005-04-13 10:25:14 +00:00
vibr 33b3eb8df7 better documentation of HTTP-URL->URL-STRING and HTTP-URL-PATH->PATH-STRING 2005-04-10 15:30:32 +00:00
vibr d209db26d8 document new URL parser 2005-04-10 15:17:50 +00:00
vibr 0c7c957f2b add comment 2005-04-10 15:15:55 +00:00
vibr a44c53bc67 typos + emphasis 2005-04-10 13:14:02 +00:00
vibr e9bc839cd5 finally adapt documentation to new uri lib procs 2005-04-10 13:03:33 +00:00
vibr 90fc61473e add comment on assumptions about entity in request to seval-handler 2005-04-06 22:49:50 +00:00
vibr c9c45eae6e better comment 2005-04-06 22:47:14 +00:00
vibr 9342e0e593 *replace call of PARSE-REQUEST-URI (relict of old URL parser)
with call of URL-STRING->HTTP-URL
2005-04-06 22:45:48 +00:00
vibr 2dcdd41ed9 *reinsert lost line
*replace calls of UNESCAPE-URI (relict of old URL parser) with UNESCAPE
*simplify URI-HAS-PROTOCOL?
2005-04-06 22:43:53 +00:00
vibr 512ccfaed3 removal of old URL parser relicts:
*replace calls of UNESCAPE-URI with UNESCAPE
*remove imports of uri package where no longer needed
2005-04-06 22:41:10 +00:00
vibr ed53670895 *add comment explaining why we don't need to support "*" and authority
Request-URIS
*comment on when unescaping is done
*don't unescape query when building a http-url record
*don't escape query when unparsing a http-url record.
*remove PARSED-URI->HTTP-URL, integrate its code into
URL-STRING->HTTP-URL
*add dummy encoder ESCAPE-NOT-UNRESERVED-CHARS
2005-04-06 22:31:33 +00:00
vibr 745a123735 adapt code to remove calls of URI-PATH->URI (relict of old URL parser) 2005-04-06 12:32:17 +00:00
vibr 61a63b4d4b adapt URI-HAS-PROTOCOL? to new URL parser 2005-04-06 11:59:05 +00:00
vibr 0de6fe79b4 replace calls of HTTP-URL->STRING (relict of old URL parser)
with calls of HTTP-URL->URL-STRING
2005-04-06 11:44:28 +00:00
vibr 17a46a7e71 rename URI-STRING->HTTP-URL to URL-STRING->HTTP-URL 2005-04-06 11:35:37 +00:00
vibr 5836ae567b export HTTP-URL-PATH->PATH-STRING in url-interface 2005-04-05 18:45:53 +00:00
vibr 9399bf9397 remove MY-REPORTED-PORT (relict of old URL parser) 2005-04-05 18:45:02 +00:00
vibr da10de6309 add function HTTP-URL-PATH->PATH-STRING (unparses path in http-url record into string) 2005-04-05 18:43:23 +00:00
vibr e5c8cae17f rename HTTP-URL->URI-STRING to HTTP-URL->URL-STRING 2005-04-05 10:59:13 +00:00
vibr 1e93a6cb9f rename PARSE-URI to PARSE-URL 2005-04-05 10:50:12 +00:00
vibr c3b855ae22 minor changes:
*HTTP1.1 (this is wishful thinking)
*we don't have a general URI parser, just a URL parser
*typos
*alles irdische hat einen namen
2005-04-04 21:23:16 +00:00
vibr 3548b25c26 rename HTTP-URL-SEARCH to HTTP-URL-QUERY (adaption to restructured
http-url type in url.scm)
2005-04-04 21:13:35 +00:00
vibr d0c64d371a further restrict url-interface 2005-04-04 21:10:05 +00:00
vibr 46645ccd58 *remove definition of PARSE-REQUEST-URI (relict of Olin's old URL parser)
*use URI-STRING->HTTP-URL instead
2005-04-04 21:07:55 +00:00
vibr d864e4da80 correct reference in comment 2005-04-04 20:57:36 +00:00
vibr cf747a97b4 *add solution for mistake in RFC 2616 (where query part of
Request-URIs is only allowed for absoluteURIs)
*rename PARSE-HTTP-URL to URI-STRING->HTTP-URL
2005-04-04 15:35:50 +00:00
vibr ba78eba433 *add comment on another mistake in RFC 2616 (query part of
Request-URIs only allowed for absoluteURIs)
*add copy of Appendix A of RFC 2396 for convenience
2005-04-04 13:36:54 +00:00
vibr 69948e9561 adapt definitions of interfaces and structures to new URI
parsing framework
2004-10-18 18:33:19 +00:00
vibr ed1e4428c5 *move general procedures ESCAPE, UNESCAPE and their helper procs from
url.scm to uri.scm
-->Parser/Unparser for HTTP 1.1 URIs is now complete and resides in
url.scm; Encoder/Decoder applicable to URIs in general resides in uri.scm.
(All has been rewritten from scratch, next to nothing of Olin's code
is left).
<--
2004-10-18 18:23:03 +00:00
vibr 584bfa2cdb *work around scsh bug (?) in definition of charset UNRESERVED
*new procs for encoding URIs:
-general proc ESCAPE taking an RegExp representing forbidden chars as argument
-specialized procs ESCAPE-SEGMENT, ESCAPE-QUERY
*new helper proc ASCII->ESCAPED
*use ESCAPE-SEGMENT and ESCAPE-URI in HTTP-URL->URI-STRING
2004-10-18 17:35:40 +00:00
vibr c48446ba7f *remove ESCAPE-URI (didn't reliably differentiate between different portions of
a URI)
*move INT->HEXCHAR to url.scm
2004-10-18 16:37:32 +00:00
vibr 44a8ef28be *new procedure UNESCAPE (unescape URI-components using RegExps)
*move HEXCHAR->INT from uri.scm to here
*use UNESCAPE in PARSED-URI->HTTP-URL
2004-10-14 17:18:24 +00:00
vibr 9e71b351d4 *remove UNESCAPE-URI, HEX-DIGIT?
*move HEXCHAR->INT to url.scm
2004-10-14 17:14:44 +00:00
vibr 932f03a638 *fix two typos in RegExps
*better comments for PARSE-URI, SPLIT-PATH
*make SPLIT-PATH really accept PARSE-URI's return values
*restructure record-type HTTP-URL
*new procedure PARSED-URI->HTTP-URL
*rewrite PARSE-HTTP-URL to use PARSE-URI and PARSED-URI->HTTP-URL
*remove out-dated comments
*remove out-dated procedure PARSE-HTTP-URL-STRING
*remove DEFAULT-HTTP-SERVER (relict of server record-type)
*rewrite HTTP-URL->STRING
*rename HTTP-URL->STRING to HTTP-URL->URI-STRING
2004-10-11 17:01:32 +00:00
vibr fe08e779f0 remove record-type server and associated procedures
(being a relict of parsing of general URIs)
2004-10-11 09:05:24 +00:00
vibr 41d3e29766 *add comment on how ambiguous definition of Request_URIs in RFC 2616
is 'solved' by uri-parser

*add comment on record-type server and associated procedures
2004-10-11 08:54:41 +00:00
vibr 8de8e01f0d adapt documentation to reflect removal of old uri-parser and addition
of new one
2004-10-10 18:30:45 +00:00
vibr a1e79c4fc7 parse HTTP 1.1 URIs:
* add RegExps
* add proc PARSE-URI
* add proc SPLIT-ABS-PATH
2004-10-06 19:10:49 +00:00
vibr d9950a9b0b remove PARSE-URI from exports of module uri 2004-10-06 19:08:19 +00:00
vibr 2cb8502f9e remove uri-parser PARSE-URI
(completely out-of-date, has never seen RFC 2396)
2004-10-06 13:33:45 +00:00
vibr 649f374e8b add comment on uri-parser 2004-10-06 13:29:06 +00:00
vibr 53e3e9672f adapt to RFC terminology:
rename PARSE-HTTP-SERVERS-URL-FRAGMENT to PARSE-REQUEST-URI
rename variable uri-string to request-uri
2004-10-05 10:24:29 +00:00
vibr cd22ab11d4 Corrected bug: SEVAL now generates a response-body even if the request's body isn't
form-url encoded or doesn't contain a program
2004-08-15 12:44:55 +00:00
vibr 38f2594ba5 emit less newlines 2004-08-15 12:40:06 +00:00
vibr 35565068fb char-set:blank = LWS from RFC 2616 (after folding) 2004-08-15 12:02:36 +00:00
vibr ffac0ebcac simplified get-numeric-field-value (now uses string-trim-both),
adapted packages.scm
2004-08-15 11:49:15 +00:00
vibr 8bf71fc3a5 get-numeric-field-value: correct error message 2004-08-15 11:06:39 +00:00
vibr 44100cbf5e Added type NO-BODY for responses which must not contain a message-body
(201, 304, 404). Added export of type-predicate no-body?.  Adapted
SEND-HTTP-RESPONSE to check for no-body responses. Extended
MAKE-ERROR-RESPONSE to make responses 201, 304, 404.
2004-08-15 11:03:28 +00:00
vibr 0bb601a0e0 make server-generated webpages XHTML 1.0 Strict: emit prolog
(not tested)
2004-08-14 22:08:07 +00:00
vibr 549594bef4 with-tag* outputs no newlines 2004-08-14 22:05:06 +00:00
vibr ef48e4e5ae make-host-name-handler now uses get-header from sunet-utilities.scm 2004-08-14 22:03:19 +00:00
vibr 8cf841bad3 rename get-content-length to get-body-length-from-content-length
move get-body-length-from-content-length and get-numeric-field-value
from seval.scm to sunet-utilities.scm

adapt packages.scm
2004-08-14 21:58:11 +00:00
vibr f8559581d2 GET-NUMERIC-FIELD-VALUE now uses GET-HEADER from utilities.scm
and returns #f if GET-HEADER does so

adapted GET-CONTENT-LENGHT
2004-08-14 21:18:12 +00:00
vibr aea0e950ba removed definition of GET-HEADER which conflicted with definition of
GET-HEADER in utilities.scm and wasn't exported anyway
2004-08-14 20:54:57 +00:00
vibr ffbe3b21cd factor out parsing of content-length header value -> GET-CONTENT-LENGTH
generalize parsing of content-length header value to parse all header field
values of the form 1*DIGIT -> GET-NUMERIC-FIELD-VALUE

check for valid content-length header in SEVAL before answering 200

TODO: SEVAL is still buggy for request with invalid _body_
2004-08-14 19:07:23 +00:00
vibr 9fcfcf36f0 add comments explaining why seval-handler is buggy 2004-08-14 15:47:29 +00:00
vibr 06ec0f0293 server-generated webpages -> XHTML 1.0 Strict:
add XML namespace declaration to html element
2004-08-13 15:49:19 +00:00
vibr 6969b80206 - move XHTML stuff from responses.scm to htmlout.scm
- new function EMIT-PROLOG
- define XML namespace declaration
- adapt macro %hack-attr-elt: special treatment of xmlnsdecl-attr
(this is not nice, but the only alternative was hard-coding the XML
namespace declaration into the various handlers)
- element names -> lower case
2004-08-13 15:46:53 +00:00
vibr 96f0ae41d5 -move XHTML stuff from responses.scm to htmlout.scm
-don't use WRITE-STRING with \n (\n within strings is not in R5RS),
instead use FORMAT with ~%
-use EMIT-PROLOG and EMIT-TAG from htmlout.scm
2004-08-13 15:37:31 +00:00
vibr c089e26e96 move XHTML stuff from responses.scm to htmlout.scm, adapt
packages.scm
2004-08-13 15:34:00 +00:00
vibr a9ae5061d0 make server generated html pages XHTML 1.0 Strict:
element names -> lower case
2004-08-13 15:26:50 +00:00
vibr 4d7f10960c fix typo 2004-08-11 19:38:16 +00:00
vibr 7bdd94cdb5 "moved-temp" -> "temp-redirect"
(HTTP/1.1: use 307 instead of 302 for unambiguity)
2004-08-11 19:06:05 +00:00
vibr 7b6f5675af allow header in 405 answers 2004-08-11 14:53:11 +00:00
vibr 880a05229c -adapt calls of make-error-response
-allow header in 405 answers
-answer 501, not 405, for unimplemented/unrecognized methods
2004-08-11 14:51:51 +00:00
vibr 5f64e72cd0 make-error-response:
more meaningful error description: show parsed uri
2004-08-11 14:48:11 +00:00
vibr 3abe557a86 make-error-response in response.scm uses http-url->string 2004-08-11 14:45:32 +00:00
vibr 8b09f2b338 -adapt calls of make-error-response
-answer 501, not 405, for unimplemented/unrecognized methods
2004-08-11 12:28:46 +00:00
vibr 1c4445933d adapt docu of make-error-response to its reworked interface 2004-08-11 10:43:26 +00:00
vibr f22f43ccd1 make-error-response:
use stuff in req where req isn't #f
2004-08-11 10:17:14 +00:00
vibr 2ee378aea9 add comments:
-make clear difference between http-error and
fatal-syntax-error
-refer to make-error-response for args of http-error
2004-08-11 09:20:31 +00:00
vibr a3dd880c7a +catch calls of make-error-response with too few arguments 2004-08-10 14:26:50 +00:00
vibr 4b37826de8 +comment/question 2004-08-10 14:25:55 +00:00
vibr 1bdac52ad6 adapt calls of make-error-response: remove tautologous description of errors 2004-07-30 22:26:50 +00:00
vibr 7c7be57a22 changes to make-error-response:
(1)no special treatment of first "optional" argument
- rationale: a different number of args is required depending on the
specific error code

(2)use close-html to write out all _effectively_ optional args

(3)generated html-pages are valid XHTML 1.0 Strict

(4)require header Allow for 405 errors

(5)more exact descriptions of errors in generated html-pages

TODO: handle calls of make-error-response with too little args
2004-07-30 22:25:03 +00:00
vibr 63e4761c58 adapt calls of make-error-response:
no URI header field in 1.1
2004-07-29 16:09:31 +00:00
vibr 5e14a326b9 adapt make-error-response:
- no URI header field in 1.1
- new 30x status codes in 1.1
2004-07-29 16:08:30 +00:00
vibr 1b4bdb59c6 updated status codes 2004-07-16 15:09:55 +00:00
vibr f96d93b355 re-commit revision 1.22, this time to branch http-1-1 2004-05-27 14:47:46 +00:00
33 changed files with 1321 additions and 986 deletions

View File

@ -1,7 +1,10 @@
Copyright (c) 1994 by Brian D. Carlstrom and Olin Shivers.
Copyright (c) 1995-1996 by Olin Shivers.
Copyright (c) 1996-2001 by Mike Sperber.
Copyright (c) 1999-2001 by Martin Gasbichler.
Copyright (c) 1998-2001 by Eric Marsden.
Copyright (c) 2001-2003 by Andreas Bernauer.
Copyright (c) 2004-2005 by Viola Brunner.
All rights reserved.
Redistribution and use in source and binary forms, with or without

11
README
View File

@ -9,15 +9,15 @@ Scsh's facilities for multi-threaded systems programming und Unix.
SUnet includes the following components:
* The SUnet Web server
This is a highly configurable HTTP 1.0 server in Scheme.
The server is accompanied some libraries which may also
This is a highly configurable HTTP 1.1 server in Scheme.
The server is accompanied by some libraries which may also
be used separately:
* URI and URL parsers and unparsers
* an URL parser and unparser
* a library for writing CGI scripts in Scheme
* server extensions for interfacing to CGI scripts
* server extensions for uploading Scheme code
* simple structured HTML output library
* a simple structured HTML output library
The server also ships with a sophisticated interface for writing
server-side Web applications called "SUrflets".
@ -110,8 +110,7 @@ scsh@zurich.ai.mit.edu
Relax, hack, and enjoy!
Dr. S.
Dr. S.
Michael Sperber
Martin Gasbichler
Eric Marsden
Andreas Bernauer

View File

@ -204,34 +204,41 @@ constructing responses lives in the \ex{httpd-responses} structure.
\var{Location} must be URI-encoded and begin with a slash.
\end{desc}
\defun{make-error-response}{status-code request [message] extras \ldots}{response}
\defun{make-error-response}{status-code request extra \ldots}{response}
\begin{desc}
This is a helper procedure for constructing error responses.
\var{code} is status code of the response (see below). \var{Request}
is the request that led to the error. \var{Message} is an optional
string containing an error message written in HTML, and \var{extras}
are further optional arguments containing further message lines to
\ex{Make-error-response} returns a response value the body of which
is a web page explaining the error at hand.
\var{status-code} is the status code of the response (see below).
\var{request}
is the request that led to the error. \var{extra} are the further
arguments required for this specific \var{status-code} and
optionally further information-bits (preferably strings in HTML) to
be added to the web page that's generated.
\ex{Make-error-response} constructs a response value which generates
a web page containg a short explanatory message for the error at hand.
\end{desc}
\begin{table}[htb]
\centering
\begin{tabular}{|l|l|l|}
\hline
continue & 100 & Continue\\\hline
switch-protocol & 101 & Switching Protocol\\\hline
ok & 200 & OK\\\hline
created & 201 & Created\\\hline
accepted & 202 & Accepted\\\hline
prov-info & 203 & Provisional Information\\\hline
non-author-info & 203 & Non-Authoritative Information\\\hline
no-content & 204 & No Content\\\hline
reset-content & 205 & Reset Content\\\hline
partial-content & 206 & Partial Content\\\hline
mult-choice & 300 & Multiple Choices\\\hline
moved-perm & 301 & Moved Permanently\\\hline
moved-temp & 302 & Moved Temporarily\\\hline
method & 303 & Method (obsolete)\\\hline
found & 302 & Found\\\hline
see-other & 303 & See other\\\hline
not-mod & 304 & Not Modified\\\hline
use-proxy & 305 & Use Proxy\\\hline
temp-redirect & 307 & Temporary Redirect\\\hline
bad-request & 400 & Bad Request\\\hline
unauthorized & 401 & Unauthorized\\\hline
@ -239,16 +246,26 @@ constructing responses lives in the \ex{httpd-responses} structure.
forbidden & 403 & Forbidden\\\hline
not-found & 404 & Not Found\\\hline
method-not-allowed & 405 & Method Not Allowed\\\hline
none-acceptable & 406 & None Acceptable\\\hline
proxy-auth-required & 407 & Proxy Authentication Required\\\hline
not-acceptable & 406 & Not Acceptable\\\hline
proxy-auth-required &407 & Proxy Authentication Required\\\hline
timeout & 408 & Request Timeout\\\hline
conflict & 409 & Conflict\\\hline
gone & 410 & Gone\\\hline
length-required & 411 & Length Required\\\hline
precon-failed & 412 & Precondition Failed\\\hline
req-ent-too-large & 413 & Request Entity Too Large\\\hline
req-uri-too-large & 414 & Request URI Too Large\\\hline
unsupp-media-type & 415 & Unsupported Media Type\\\hline
req-range-not-sat & 416 & Requested Range Not Satisfiable\\\hline
expectation-failed & 417 & Expectation Failed\\\hline
internal-error & 500 & Internal Server Error\\\hline
not-implemented & 501 & Not Implemented\\\hline
bad-gateway & 502 & Bad Gateway\\\hline
service-unavailable & 503 & Service Unavailable\\\hline
service-unavailable &503 & Service Unavailable\\\hline
gateway-timeout & 504 & Gateway Timeout\\\hline
version-not-supp & 505 & HTTP Version Not Supported\\\hline
\end{tabular}
\caption{HTTP status codes}
\label{tab:status-code-names}
@ -334,8 +351,8 @@ exported by the \ex{httpd\=basic\=handlers} structure:
\defvar{null-request-handler}{request-handler}
\begin{desc}
This request handler always generated a \ex{not-found} error
response, no patter what the request is.
This request handler always generates a \ex{not-found} error
response, no matter what the request is.
\end{desc}
\defun{make-predicate-handler}{predicate handler

View File

@ -1,164 +1,48 @@
\chapter{Parsing and Processing URIs}\label{cha:uri}
\chapter{Processing URIs}\label{cha:uri}
The \ex{uri} structure contains a library for dealing with URIs.
The \ex{uri} module contains library functions for dealing with URIs.
\section{Notes on URI Syntax}
A URI (Uniform Resource Identifier) is of following syntax:
%
\begin{inset}
[\var{scheme}] \verb|:| \var{path} [\verb|?| \var{search}] [\verb|#| \var{fragid}]
\end{inset}
%
Parts in brackets may be omitted.
The generic syntax of URI (Uniform Resource Identifier) is defined in
RFC 2396; see Appendix A for a collected BNF of URI.
The URI contains characters like \verb|:| to indicate its different
parts. Some special characters are \emph{escaped} if they are a
regular part of a name and not indicators for the structure of a URI.
Escape sequences are of following scheme: \verb|%|\var{h}\var{h} where \var{h}
is a hexadecimal digit. The hexadecimal number refers to the
ASCII of the escaped character, e.g.\ \verb|%20| is space (ASCII
32) and \verb|%61| is `a' (ASCII 97). This module
provides procedures to escape and unescape strings that are meant to
be used in a URI.
Within URI non-printable Ascii characters are represented by an
\emph{escape encoding}. \emph{Reserved} characters used as
delimiters indicating the different parts of a URI also must be
\emph{escaped} if they are to be regular data of a URI component. The
set of characters actually \emph{reserved} within any given URI
component is defined by that component. Therefore
\emph{escaping} can only be done when the URI is being created from
its component parts; likewise, a URI must be separated into its
component parts before \emph{unescaping} can be done.
Escape sequences are of the following scheme: \verb|%| \var{h}\var{h}
where \var{h}\var{h} are the two hexadecimal digits representing the octet code. For
example \verb|%20| is the escaped encoding for the US-ASCII space character.
\section{Procedures}
\defun{parse-uri} {uri-string } {scheme path search
frag-id} \label{proc:parse-uri}
\defun{unescape}{string}{string}
\begin{desc}
Parses an \var{uri\=string} into its four fields.
The fields are \emph{not} unescaped, as the rules for
parsing the \var{path} component in particular need unescaped
text, and are dependent on \var{scheme}. The URL parser is
responsible for doing this. If the \var{scheme}, \var{search}
or \var{fragid} portions are not specified, they are \sharpf.
Otherwise, \var{scheme}, \var{search}, and \var{fragid} are
strings. \var{path} is a non-empty string list---the path split
at slashes.
\end{desc}
Here is a description of the parsing technique. It is inwards from
both ends:
\begin{itemize}
\item First, the code searches forwards for the first reserved
character (\verb|=|, \verb|;|, \verb|/|, \verb|#|, \verb|?|,
\verb|:| or \verb|space|). If it's a colon, then that's the
\var{scheme} part, otherwise there is no \var{scheme} part. At
all events, it is removed.
\item Then the code searches backwards from the end for the last reserved
char. If it's a sharp, then that's the \var{fragid} part---remove it.
\item Then the code searches backwards from the end for the last reserved
char. If it's a question-mark, then that's the \var{search}
part----remove it.
\item What's left is the path. The code split it at slashes. The
empty string becomes a list containing the empty string.
\end{itemize}
%
This scheme is tolerant of the various ways people build broken
URI's out there on the Net\footnote{So it does not absolutely conform
to RFC~1630.}, e.g.\ \verb|=| is a reserved character, but used
unescaped in the search-part. It was given to me\footnote{That's
Olin Shivers.} by Dan Connolly of the W3C and slightly modified.
\defun{unescape-uri}{string [start] [end]}{string}
\begin{desc}
\ex{Unescape-uri} unescapes a string. If \var{start} and/or \var{end} are
specified, they specify start and end positions within \var{string}
should be unescaped.
\ex{Unescape} unescapes a string.
\end{desc}
%
This procedure should only be used \emph{after} the URI was parsed,
since unescaping may introduce characters that blow up the
parse---that's why escape sequences are used in URIs.
This procedure may only be used \emph{after} the URI was parsed into
its component parts (see above).
\defvar{uri-escaped-chars}{char-set}
\defun{escape} {string regexp} {string}
\begin{desc}
This is a set of characters (in the sense of SRFI~14) which are
escaped in URIs. RFC 2396 defines this set as all characters which
are neither letters, nor digits, nor one of the following characters:
\verb|-|, \verb|_|, \verb|.|, \verb|!|, %$
\verb|~|, \verb|*|, \verb|'|, \verb|(|, \verb|)|.
\ex{Escape} replaces reserved or excluded characters in \var{string}
by their escaped representation. \var{regexp} defines which
characters are reserved or excluded within the particular URI component
being escaped.
\end{desc}
\defun{escape-uri} {string [escaped-chars]} {string}
\begin{desc}
This procedure escapes characters of \var{string} that are in
\var{escaped\=chars}. \var{Escaped\=chars} defaults to
\ex{uri\=escaped\=chars}.
\end{desc}
%
Be careful with using this procedure to chunks of text with
syntactically meaningful reserved characters (e.g., paths with URI
slashes or colons)---they'll be escaped, and lose their special
meaning. E.g.\ it would be a mistake to apply \ex{escape-uri} to
\begin{verbatim}
//lcs.mit.edu:8001/foo/bar.html
\end{verbatim}
%
because the sla\-shes and co\-lons would be escaped.
\defun{split-uri}{uri start end} {list}
\begin{desc}
This procedure splits \var{uri} at slashes. Only the substring given
with \var{start} (inclusive) and \var{end} (exclusive) as indices is
considered. \var{start} and $\var{end} - 1$ have to be within the
range of \var{uri}. Otherwise an \ex{index-out-of-range} exception
will be raised.
Example: \codex{(split-uri "foo/bar/colon" 4 11)} returns
\codex{("bar" "col")}
\end{desc}
\defun{uri-path->uri}{path}{string}
\begin{desc}
This procedure generates a path out of a URI path list by inserting
slashes between the elements of \var{plist}.
\end{desc}
%
If you want to use the resulting string for further operation, you
should escape the elements of \var{plist} in case they contain
slashes, like so:
%
\begin{verbatim}
(uri-path->uri (map escape-uri pathlist))
\end{verbatim}
\defun{simplify-uri-path}{path}{list}
\begin{desc}
This procedure simplifies a URI path. It removes \verb|"."| and
\verb|"/.."| entries from path, and removes parts before a root.
The result is a list, or \sharpf{} if the path tries to back up past
root.
\end{desc}
%
According to RFC~2396, relative paths are considered not to start with
\verb|/|. They are appended to a base URL path and then simplified.
So before you start to simplify a URL try to find out if it is a
relative path (i.e. it does not start with a \verb|/|).
Examples:
%
\begin{alltt}
(simplify-uri-path (split-uri "/foo/bar/baz/.." 0 15))
\(\Rightarrow\) ("" "foo" "bar")
(simplify-uri-path (split-uri "foo/bar/baz/../../.." 0 20))
\(\Rightarrow\) ()
(simplify-uri-path (split-uri "/foo/../.." 0 10))
\(\Rightarrow\) #f
(simplify-uri-path (split-uri "foo/bar//" 0 9))
\(\Rightarrow\) ("")
(simplify-uri-path (split-uri "foo/bar/" 0 8))
\(\Rightarrow\) ("")
(simplify-uri-path (split-uri "/foo/bar//baz/../.." 0 19))
\(\Rightarrow\) #f
\end{alltt}
This procedure may only be used on a URI \emph{component part}, not on a
complete URI made up of several component parts (see above). Use it to
write specialized escape-procedures for the respective component
parts. (See the \ex{url} module for examples).
%%% Local Variables:
%%% mode: latex

View File

@ -1,110 +1,74 @@
\chapter{Parsing and Processing URLs}\label{cha:url}
%
This modules contains procedures to parse and unparse URLs. Until
now, only the parsing of HTTP URLs is implemented.
The \ex{url} module contains procedures to parse and unparse HTTP 1.1 Request-URIs.
\section{Server Records}
A \textit{server} value describes path prefixes of the form
\var{user}:\var{password}@\var{host}:\var{port}. These are
frequently used as the initial prefix of URLs describing Internet
resources.
\defun{make-server}{user password host port}{server}
\defunx{server?}{thing}{boolean}
\defunx{server-user}{server}{string-or-\sharpf}
\defunx{server-password}{server}{string-or-\sharpf}
\defunx{server-host}{server}{string-or-\sharpf}
\defunx{server-port}{server}{string-or-\sharpf}
\defun{url-string->http-url}{string}{http-url}
\begin{desc}
\ex{Make-server} creates a new server record. Each slot is a
decoded string or \sharpf. (\var{Port} is also a string.)
\ex{server?} is the corresponding predicate, \ex{server-user},
\ex{server-password}, \ex{server-host} and \ex{server-port}
are the correspondig selectors.
\ex{Url-string->http-url} parses the Request-URI \var{string} into a
\ex{http-url} record.
\end{desc}
\defun{parse-server}{path default}{server}
\defunx{server->string}{server}{string}
\defun{http-url?}{thing}{boolean}
\begin{desc}
\ex{Parse-server} parses a URI path \var{path} (a list representing
a path, not a string) into a server value. Default values are taken
from the server \var{default} except for the host. The values
are unescaped and stored into a server record that is returned.
\ex{Fatal-syntax-error} is called, if the specified path has no
initial to slashes (i.e., it starts with `//\ldots').
\ex{server->string} just does the inverse job: it unparses
\var{server} into a string. The elements of the record
are escaped before they are put together.
Example:
\begin{alltt}
> (define default (make-server "andreas" "se ret" "www.sf.net" "80"))
> (server->string default)
"andreas:se\%20ret@www.sf.net:80"
> (parse-server '("" "" "foo\%20bar@www.scsh.net" "docu" "index.html")
default)
'#{server}
> (server->string ##)
"foo\%20bar:se\%20ret@www.scsh.net:80"
\end{alltt}
%
For details about escaping and unescaping see Chapter~\ref{cha:uri}.
\ex{http-url?} is the predicate for the \ex{http-url} record.
\end{desc}
\section{HTTP URLs}
\defun{make-http-url}{server path search frag-id}{http-url}
\defunx{http-url?}{thing}{boolean}
\defunx{http-url-server}{http-url}{server}
\defun{http-url-host}{http-url}{string or \sharpf}
\defunx{http-url-port}{http-url}{integer or \sharpf}
\defunx{http-url-path}{http-url}{list}
\defunx{http-url-search}{http-url}{string-or-\sharpf}
\defunx{http-url-frag-ment-identifier}{http-url}{string-or-\sharpf}
%
\defunx{http-url-query}{http-url}{string or \sharpf}
\begin{desc}
\ex{Make-http-url} creates a new \ex{httpd-url} record.
\var{Server} is a record, containing the initial part of the address
(like \ex{anonymous@clark.lcs.mit.edu:80}). \var{Path} contains the
URL's URI path ( a list). These elements are in raw, unescaped
format. To convert them back to a string, use
\ex{(uri-path->uri (map escape-uri pathlist))}. \var{Search}
and \var{frag-id} are the last two parts of the URL. (See
Chapter~\ref{cha:uri} about parts of an URI.)
\ex{Http-url?} is the predicate for HTTP URL values, and
\ex{http-url-server}, \ex{http-url-path}, \ex{http-url-search} and
\ex{http-url-fragment-identifier} are the corresponding selectors.
\ex{http-url-host}, \ex{http-url-port}, \ex{http-url-path} and
\ex{http-url-query} are the selectors for the \ex{http-url} record.
The \var{host} slot is a non-empty string or \sharpf.
The \var{port} slot is an integer or \sharpf.
The \var{path} slot is a list of strings containing the
Request-URI's path split at slashes and \emph{unescaped}.If the
Request-URI's path ends with a slash, an empty string is inserted as
the last element of the list.
The \var{query} slot is an non-empty-string, still in its
\emph{escaped} representation, or \sharpf.
\end{desc}
%
Examples for Request-URI strings and the slots of the corresponding
http-url record: \nopagebreak
\begin{alltt}
"http://foo.bar.org:7777///foo%20foo//bar.htm?bulb%20bulb"
\(\Rightarrow\) "foo.bar.org" 7777 '("foo foo" "bar.htm") "bulb%20bulb"
"http://foo.bar.org"
\(\Rightarrow\) "foo.bar.org" #f '() #f
"http://foo.bar.org//"
\(\Rightarrow\) "foo.bar.org" #f '("") #f
"/foo%20foo//bar.htm?bulb%20bulb"
\(\Rightarrow\) #f #f '("foo foo" "bar.htm") "bulb%20bulb"
"/foo%20foo//?bulb%20bulb"
\(\Rightarrow\) #f #f '("foo foo" "") "bulb%20bulb"
"/"
\(\Rightarrow\) #f #f '("") #f
\end{alltt}
\defun{http-url->url-string}{http-url}{string}
\begin{desc}
\ex{http-url->url-string} unparses a \ex{http-url} record and returns the
Request-URI \ex{string} of the original HTTP Request.
\end{desc}
\defun{parse-http-url}{path search frag-id}{http-url}
\begin{defundescx}{http-url->string}{http-url}{string}
This constructs an HTTP URL record from a URI path (a list of path
components), a search, and a frag-id component.
\ex{Http-url->string} just does the inverse job. It converts an
HTTP URL record into a string.
\end{defundescx}
%
Note: The URI parser \ex{parse-uri} maps a string to four parts:
\var{scheme}, \var{path}, \var{search} and \var{frag-id} (see
Section~\ref{proc:parse-uri} for details). If \var{scheme} is
\ex{http}, then the other three parts can be passed to
\ex{parse-http-url}, which parses them into a \ex{http-url} record.
All strings come back from the URI parser encoded. \var{Search} and
\var{frag-id} are left that way; this parser decodes the path
elements. The first two list elements of the path indicating the
leading double-slash are omitted.
The following procedure combines the jobs of \ex{parse-uri} and
\ex{parse-http-url}:
\defun{parse-http-url-string}{string}{http-url}
\defun{http-url-path->path-string}{http-url-path}{string}
\begin{desc}
This parses an HTTP URL and returns the corresponding URL value; it
calls \ex{fatal-syntax-error} if the URL string doesn't have an
\ex{http} scheme.
\ex{http-url-path->url-string} unparses the \ex{http-url-path} field of
an http-url record into its corresponding part of the Request-URI
\ex{string} of the original HTTP Request (re-escaping the path).
\end{desc}
%%% Local Variables:

View File

@ -5,6 +5,10 @@
(let ((surflets? (get-option-value 'with-surflets)))
(install-directory-contents "scheme" 'scheme)
(install-directory "web-server" 'misc-shared)
(install-file "start-web-server" 'misc-shared "web-server")
(install-file "start-extended-web-server" 'misc-shared "web-server")
(if surflets?
(install-file "start-surflet-server" 'misc-shared "web-server"))
(install-directory-contents "doc" 'doc)
(let ((doc-dir (get-directory 'doc #t))

View File

@ -103,7 +103,7 @@
(nph? (string-prefix? "nph-" prog)) ; PROG starts with "nph-" ?
; why did we had (string-suffix? "-nph" prog) here?
(search (http-url-search (request-url req))) ; Compute the
(search (http-url-query (request-url req))) ; Compute the
(argv (if (and search (not (string-index search #\=))) ; argv list.
(split-and-decode-search-spec search)
'()))
@ -124,25 +124,29 @@
(case (file-not-executable? filename)
((search-denied permission)
(make-error-response (status-code forbidden) req
"Permission denied."))
"No permission to search directory."))
((no-directory nonexistent)
(make-error-response (status-code not-found) req
"File or directory doesn't exist."))
(make-error-response (status-code not-found) req))
(else
(if nph?
(cgi-make-nph-response (run/port* doit))
(cgi-make-response (run/port* doit) path req)))))
(else
(make-error-response (status-code method-not-allowed) req request-method))))))
((string=? request-method "HEAD")
(make-error-response (status-code method-not-allowed) req "GET, POST"))
(else
(make-error-response (status-code not-implemented) req))))))
(define (split-and-decode-search-spec s)
(let recur ((i 0))
(cond
((string-index s #\+ i) => (lambda (j) (cons (unescape-uri s i j)
((string-index s #\+ i) => (lambda (j) (cons (unescape s i j)
(recur (+ j 1)))))
(else (list (unescape-uri s i (string-length s)))))))
(else (list (unescape s i (string-length s)))))))
;;; Compute the CGI scripts' process environment by adding the standard CGI
@ -171,14 +175,14 @@
(headers (request-headers req))
;; Compute the $PATH_INFO and $PATH_TRANSLATED strings.
(path-info (uri-path->uri path-suffix)) ; No encode or .. check.
(path-info (string-join path-suffix "/")) ; No encode or .. check.
(path-translated (path-list->file-name path-info bin-dir))
;; Compute the $SCRIPT_PATH string.
(url-path (http-url-path (request-url req)))
(script-path (take (- (length url-path) (length path-suffix))
url-path))
(script-name (uri-path->uri script-path)))
(script-name (string-join script-path "/")))
(receive (rhost rport)
(socket-address->internet-address raddr)
@ -201,7 +205,7 @@
,@request-invariant-cgi-env ; Stuff that never changes (see cgi-handler).
,@(cond ((http-url-search (request-url req)) =>
,@(cond ((http-url-query (request-url req)) =>
(lambda (srch) `(("QUERY_STRING" . ,srch))))
(else '()))
@ -259,9 +263,8 @@
(request-method req))
(if loc
(if (uri-has-protocol? (string-trim loc))
(make-error-response (status-code moved-perm) req
loc loc)
(if (absolute-url? (url-string->http-url (string-trim loc)))
(make-error-response (status-code moved-perm) req loc)
(make-redirect-response (string-trim loc)))
;; Send the response header back to the client
(make-response ;code message seconds mime extras body
@ -283,11 +286,6 @@
(make-writer-body (lambda (out options)
(copy-inport->outport script-port out)))))
(define (uri-has-protocol? loc)
(receive (proto path search frag)
(parse-uri loc)
(if proto #t #f)))
(define (extract-status-code-and-text status req)
(with-fatal-error-handler*
(lambda (c d)

View File

@ -2,10 +2,6 @@
;;; This file is part of the Scheme Untergrund Networking package.
;;; Copyright (c) 1994 by Brian D. Carlstrom and Olin Shivers.
;;; Copyright (c) 1996-2002 by Mike Sperber.
;;; Copyright (c) 2000-2002 by Martin Gasbichler.
;;; Copyright (c) 2002 by Andreas Bernauer.
;;; For copyright information, see the file COPYING which comes with
;;; the distribution.
@ -17,12 +13,13 @@
;;; a complete server, you need to define request handlers (see below) --
;;; they determine how requests are to be handled.
;;;
;;; The RFC detailing the HTTP 1.0 protocol, RFC 1945, can be found at
;;; http://www.w3.org/Protocols/rfc1945/rfc1945
;;; See RFC 2616 for the specification of the HTTP/1.1 protocol.
;;;
;;; The server is compatible with previous versions of HTTP in the way
;;; described in RFC 2616 19.6. See RFC 1945 for the specification of
;;; HTTP/1.0 and 0.9.
(define server/protocol "HTTP/1.0")
(define (httpd options)
(let ((port (httpd-options-port options))
(root-dir (httpd-options-root-directory options))
@ -154,23 +151,23 @@
(values #f
(apply make-error-response (status-code bad-request)
#f ; No request yet.
"Request parsing error -- report to client maintainer."
(condition-stuff c))))
((not (and (exception? c)
(eq? (exception-reason c)
(enum exception os-error))))
;; try to send bug report to client
(eq? (exception-reason c);;?? ->
(enum exception os-error))));;?? ->
;;which cases is this supposed to catch excactly? broken
;;connection to client? If so, does it work?
(values #f
(apply make-error-response (status-code internal-error)
#f ; don't know
"Internal error occured while processing request"
c)))
(else
(decline))))
(lambda ()
(let ((initial-req (parse-http-request sock options)))
(let redirect-loop ((req initial-req))
(check-major-http-version initial-req)
(check-host-header initial-req)
(let redirect-loop ((req initial-req))
(let response-loop ((response ((httpd-options-request-handler options)
(http-url-path (request-url req))
req)))
@ -181,7 +178,7 @@
(socket:inport sock))))
((nph-response? response)
(values req response))
((eq? (response-code response) (status-code redirect))
((eq? (response-code response) (status-code redirect));internal redirect
(redirect-loop (redirect-request req response sock options)))
(else
(values req response)))))))))
@ -193,6 +190,9 @@
options)
)))))
;;; REDIRECT-REQUEST relies on that nothing is read out from SOCKET.
(define (redirect-request req response socket options)
(let* ((new-location-uri (redirect-body-location (response-body response)))
(url (with-fatal-error-handler*
@ -206,7 +206,7 @@
;; (future) NOTE: With this, a redirection may change the
;; protocol in use (currently, the server only supports one of
;; it). This might be inapplicable.
(parse-http-servers-url-fragment new-location-uri socket options)))))
(url-string->http-url new-location-uri)))))
(make-request "GET"
new-location-uri
@ -215,18 +215,7 @@
'() ; no rfc822 headers
(request-socket req))))
;;;; HTTP request parsing
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
;;;; This code provides procedures to read requests from an input
;;;; port.
;;; Read and parse an http request from INPORT.
;;;
;;; Note: this parser parses the URI into an http URL record. If the URI
;;; isn't an http URL, the parser fails. This may not be right. There's
;;; nothing in the http protocol to prevent you from passing a non-http
;;; URI -- what this would mean, however, is not clear. Like so much of
;;; the Web, the protocols are redundant, underconstrained, and ill-specified.
(define (parse-http-request sock options)
(let ((line (read-crlf-line (socket:inport sock))))
@ -250,46 +239,15 @@
((3) (parse-http-version (caddr elts)))
(else (fatal-syntax-error "Bad Request Line."))))
(meth (car elts))
(uri-string (cadr elts))
(url (parse-http-servers-url-fragment uri-string sock options))
(request-uri (cadr elts))
(url (url-string->http-url request-uri))
(headers (if (equal? version '(0 . 9))
'()
(read-rfc822-headers (socket:inport sock)))))
(make-request meth uri-string url version headers sock)))))
;;; Parse the URL, but if it begins without the "http://host:port"
;;; prefix, interpolate one from SOCKET. It would be sleazier but
;;; faster if we just computed the default host and port at
;;; server-startup time, instead of on every request.
;;; REDIRECT-REQUEST relys on that nothing is read out from SOCKET.
(define (parse-http-servers-url-fragment uri-string socket options)
(receive (scheme path search frag-id) (parse-uri uri-string)
(if frag-id ; Can't have a #frag part.
(fatal-syntax-error "HTTP URL contains illegal #<fragment> suffix."
uri-string)
(if scheme
(if (string-ci=? scheme "http") ; Better be an http url.
(parse-http-url path search #f)
(fatal-syntax-error "Non-HTTP URL" uri-string))
;; Interpolate the server struct from our net connection.
(if (and (pair? path) (string=? (car path) ""))
(let* ((addr (socket-local-address socket))
(local-name (or (httpd-options-fqdn options)
(socket-address->fqdn addr)))
(portnum (or (httpd-options-reported-port options)
(my-reported-port addr))))
(make-http-url (make-server #f #f
local-name
(number->string portnum))
(map unescape-uri (cdr path)) ; Skip initial /.
search
#f))
(fatal-syntax-error "Path fragment must begin with slash"
uri-string))))))
(with-fatal-error-handler
(lambda (c decline)
(fatal-syntax-error "Illegal RFC 822 field syntax of request headers"))
(read-rfc822-headers (socket:inport sock))))))
(make-request meth request-uri url version headers sock)))))
(define parse-http-version
@ -303,6 +261,19 @@
(lose vstring))))))
;;; check whether the request's major HTTP version is greater than the
;;; server's major HTTP version; if so, send 505 (Version not supported).
(define (check-major-http-version req)
(if (> (car (request-version req)) (car http-version))
(http-error (status-code version-not-supp) req)))
(define (check-host-header req)
(if (not (version< (request-version req) '(1 . 1)))
(or (get-header (request-headers req) 'host)
(http-error (status-code bad-request) req "Missing Host header"))))
;;; Split string into a list of whitespace-separated strings.
;;; This could have been trivially defined in scsh as (field-splitter " \t\n")
;;; but I hand-coded it because it's short, and I didn't want invoke the
@ -322,7 +293,7 @@
(else '()))))
(define (send-http-headers response port)
(display server/protocol port)
(display (version->string http-version) port)
(write-char #\space port)
(display (status-code-number (response-code response)) port)
(write-char #\space port)
@ -334,7 +305,8 @@
(send-http-header-fields
(list (cons 'server (string-append "Scheme Untergrund " sunet-version-identifier))
(cons 'content-type (response-mime response))
(cons 'date (rfc822-time->string (response-seconds response))))
(cons 'date (rfc822-time->string (response-seconds response)))
(cons 'connection "close"))
port)
(send-http-header-fields (response-extras response) port)
@ -358,7 +330,8 @@
(else
(if (not (v0.9-request? request))
(send-http-headers response output-port))
(if (not (string=? (request-method request) "HEAD"))
(if (not (or (string=? (request-method request) "HEAD")
(no-body? (response-body response)))) ;; response messages which MUST NOT include a message-body
(display-http-body (response-body response) input-port output-port options))
(http-log request (response-code response)))))
@ -370,7 +343,3 @@
(write-crlf port))
headers))
(define (my-reported-port addr)
(receive (ip-addr portnum) (socket-address->internet-address addr)
portnum))

View File

@ -2,7 +2,6 @@
;;; This file is part of the Scheme Untergrund Networking package.
;;; Copyright (c) 1995 by Olin Shivers.
;;; For copyright information, see the file COPYING which comes with
;;; the distribution.
@ -23,19 +22,37 @@
(define http-error? (condition-predicate 'http-error))
;; See make-error-response for what you have to stuff into args for
;; each status-code. (All http-errors will be caught by the top-level
;; error-handler of process-toplevel-request, and will be turned into
;; calls of make-error-response).
(define (http-error status-code req . args)
(apply signal 'http-error status-code req args))
;;; Syntax error condition
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
;;; Scheme 48 has a "syntax error" error condition, but it isn't an error
;;; condition! It's a warning condition. I don't understand this.
;;; We define a *fatal* syntax error here for the parsers to use.
;; fatal-syntax-error isn't really a different type of error - it's
;; just an abbreviated notation:
;; (fatal-syntax-error msg irritants)
;; is equivalent to
;; (http-error (status-code bad-request) #f msg irritants)
;; -> use fatal-syntax-error where the client request cannot be parsed
;; because of bad syntax
(define-condition-type 'fatal-syntax-error '(error))
(define fatal-syntax-error? (condition-predicate 'fatal-syntax-error))
;; as with http-errors fatal-syntax-errors will be caught by the
;; top-level error-handler of process-toplevel-request and turned into
;; calls of make-error-response
(define (fatal-syntax-error msg . irritants)
(apply signal 'fatal-syntax-error msg irritants))

View File

@ -211,7 +211,7 @@
;;; ROOTED-FILE-OR-DIRECTORY-HANDLER for examples on how to feed this.
(define (make-rooted-file-path-response root file-path file-serve-response req options)
(if (http-url-search (request-url req))
(if (http-url-query (request-url req))
(make-error-response (status-code bad-request) req
"Indexed search not provided for this URL.")
(cond ((dotdot-check root file-path) =>
@ -252,17 +252,24 @@
(send-file-response fname info req options))
((directory) ; Send back a redirection "foo" -> "foo/"
(let* ((url (request-url req))
(url-string (http-url->url-string url))
(location-prefix
(if (absolute-url? url)
url-string
(string-append
"http://" (get-socket-host-string req) url-string))) ;we don't support virtual hosts yet!
(location (string-append location-prefix "/")))
(make-error-response
(status-code moved-perm) req
(string-append (request-uri req) "/")
(string-append (http-url->string (request-url req))
"/")))
(status-code moved-perm) req location)))
(else (make-error-response (status-code forbidden) req)))))
(else
((string=? request-method "POST")
(make-error-response (status-code method-not-allowed) req
request-method))))))
"GET, HEAD"))
(else
(make-error-response (status-code not-implemented) req))))))
(define (directory-index-serve-response fname file-path req options)
(file-serve-response (string-append fname "index.html") file-path req options))
@ -361,7 +368,7 @@
((directory) "[DIR ]")
(else "[????]"))))
(if icon-name
(emit-tag port 'img
(emit-empty-tag port 'img
(cons 'src icon-name)
(cons 'alt tag-name))
(display tag-name port))
@ -415,7 +422,8 @@
(file-directory-options-back-icon-url options))
(blank-icon
(file-directory-options-blank-icon-url options)))
(with-tag port html ()
(emit-prolog port)
(with-tag port html (xmlnsdecl-attr)
(let ((title (string-append "Index of /"
(string-join file-path "/"))))
(with-tag port head ()
@ -425,16 +433,16 @@
(with-tag port pre ()
(if blank-icon
(display "[ ]" port)
(emit-tag port 'img
(emit-empty-tag port 'img
(cons 'src blank-icon)
(cons 'alt " ")))
(write-string "Name " port)
(write-string "Last modified " port)
(write-string "Size " port)
(write-string "Description" port)
(emit-tag port 'hr)
(emit-empty-tag port 'hr)
(if back-icon
(emit-tag port 'img
(emit-empty-tag port 'img
(cons 'src back-icon)
(cons 'alt "[UP ]"))
(display "[UP ]" port))
@ -444,11 +452,14 @@
(write-string "Parent directory" port))
(write-crlf port)))
(let ((n-files (directory-index req fname port options)))
(emit-tag port 'hr)
(emit-empty-tag port 'hr)
(format port "~d files" n-files))))))))))))
(else
(make-error-response (status-code method-not-allowed) req
request-method)))))
((string=? request-method "POST")
(make-error-response (status-code method-not-allowed) req
"GET, HEAD"))
(else
(make-error-response (status-code not-implemented) req)))))
(define (index-or-directory-serve-response fname file-path req options)
(let ((index-fname (string-append fname "index.html")))

View File

@ -0,0 +1,195 @@
;;; A library of procs for request handlers.
;;; This file is part of the Scheme Untergrund Networking package.
;;; For copyright information, see the file COPYING which comes with
;;; the distribution.
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
;;; Reading in the message body of a request.
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
;;; Read in the message body of an HTTP message and return it as a string.
;;;
;;; READ-MESSAGE-BODY handles ordinary message bodies as well as
;;; message bodies to which the transfer coding "chunked" has been
;;; applied.
;;;
;;; Note: all request handlers should use READ-MESSAGE-BODY, and should not
;;; read in message bodies by themselves: READ-MESSAGE-BODY implements
;;; the correct order in which a message body's length is determined.
;;; (See RFC 2616, 4.4 for precedence of Transfer-encoding header over Content-length header.)
(define (read-message-body req)
(let ((inport (socket:inport (request-socket req))))
(if (chunked-transfer-coding? req)
(read-chunked-body inport)
(read-ordinary-body inport req))))
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
;;; Reading in ordinary bodies (no transfer coding applied)
;;; Read in the message body, return it as a string.
(define (read-ordinary-body inport req)
(let* ((body-length (get-body-length-from-content-length req)) ;make sure we have a valid Content-length header in request
(maybe-body (read-string body-length inport)))
(or maybe-body
(fatal-syntax-error "EOF while reading in message body"))))
;;Get length of the request's message body from Content-length header or
;;throw fatal-syntax-error if no such header
(define (get-body-length-from-content-length req)
(let
;;try to get field value of first Content-length header (RFC 2616 allows only one Content-length: header)
((maybe-length (get-numeric-field-value req 'content-length)))
(or maybe-length
(fatal-syntax-error "No Content-Length header in request"))))
;; GET-NUMERIC-FIELD-VALUE
;; generalized function to get a field value of the form 1*DIGIT
;; req is a request record, field-name a symbol
;; check wether a header-field with name field-name is contained in req;
;; if not, return #f,
;; else, take the first such header field and check wether its field-content conforms to
;; field-content = *LWS 1*DIGIT *LWS
;; if so, return digit as a number
(define (get-numeric-field-value req field-name)
(let
;;try to get first "field-name" header
((field-content (get-header (request-headers req) field-name)))
(if field-content ;; request contained "field-name" header
(let ;;see * below
((field-value (string->number (string-trim-both field-content char-set:blank)))) ;;char-set:blank = Space + Tab = LWS from RFC2616 after folding
(if (and field-value (>= field-value 0)) ;;yes, field value contained only digits.
field-value
(fatal-syntax-error
(format #f
"~A header contained only whitespace, or characters other than digits, or whitespace between digits"
field-name))))
#f)))
;;* RFC 2616, 4.2: The field-content does not include any leading or
;;trailing LWS: linear white space occurring before the first
;;non-whitespace character of the field-value or after the last
;;non-whitespace character of the field-value. Such leading or
;;trailing LWS MAY be removed without changing the semantics of the
;;field value.
;;(probably read-rfc822-headers in rfc822.scm should do the job of skipping leading and trailing whitespace?)
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
;;; Decoding chunked entity bodies: "chunked transfer-coding"
;;; Is the request's entity body sent in chunked transfer-encoding?
;;; (See RFC 2616, 14.41 and 3.6 for syntax and semantics of Transfer-Encoding header.)
(define (chunked-transfer-coding? req)
(let ((field-value (get-header (request-headers req) 'transfer-encoding)))
(if (not field-value)
#f
; the field value is a comma-separated list of transfer-codings (3.6),
; extract the last transfer-coding in the list
(let* ((reversed-field-value (string-reverse field-value))
(index ; does the list contain more than one element?
(string-contains reversed-field-value " ,"))
(last-transfer-coding
(if index
(string-trim (string-reverse (string-take reversed-field-value index)))
(string-trim field-value))))
; the first token of the (extracted last) transfer-coding must be "chunked" to indicate chunked transfer coding
(string-prefix? "chunked" last-transfer-coding)))))
;;; Read in the chunked entity body, return it as a string.
;;; (See RFC 2616, 3.6.1 and 19.4.6 for the composition of chunked entity bodies.)
(define (read-chunked-body inport)
(let read-chunks ((chunk-size (get-chunk-size inport))
(res ""))
(if (= 0 chunk-size) ;last-chunk
(begin (discard-trailer inport); see comment *1
res)
(let ((maybe-chunk-data (read-string chunk-size inport)))
(if maybe-chunk-data
(begin (discard-line-terminator inport)
(read-chunks (get-chunk-size inport) (string-append res maybe-chunk-data)))
(fatal-syntax-error "EOF while reading in chunk-data in chunked entity body"))))))
;comment *1:
;
;This is were we don't achieve conditional compliance: we ought to read
;in the entity-headers in the trailer and incorporate them into the
;request record. Within our current scheme (where reading in the
;entity-body is the request-handlers job - while the request-handler
;is only called _after_ the request record has been built) this is not
;possible.
;
;Note that in their current state (04/2005) the handlers actually
;disregard most request headers anyway (even the request headers
;parsed into the request record).
;
;Alternatively we could check the request for a Trailer header and
;respond with 500 if we find one. Problem here: the clients "SHOULD
;include a Trailer header field in a message using chunked
;transfer-coding with a non-empty trailer" (14.40) - they are not
;obliged to! So even if we check for a Trailer header we may still
;silently disregard a trailer.
;;;Read in a chunk-size line within a chunked entity body; return the chunk-size as an integer.
;;; (See RFC 2616, 3.6.1 and 19.4.6 for the composition of chunked entity bodies.)
(define (get-chunk-size inport)
(let ((chunk-size-line (read-crlf-line inport)))
(if (eof-object? chunk-size-line)
(fatal-syntax-error "EOF while reading in chunk-size in chunked entity body")
(let* ((chunk-extensions-index (string-contains chunk-size-line "; "))
(hex-string (if chunk-extensions-index
(string-take chunk-size-line chunk-extensions-index)
chunk-size-line))
(chunk-size-int (string->number (string-trim-both hex-string char-set:blank) 16)))
(if (and chunk-size-int (>= chunk-size-int 0)) ; yes, chunk-size contained only hex chars
chunk-size-int
(fatal-syntax-error "Chunk-size within chunked entity body is incorrect or syntactically faulty"))))))
(define (discard-trailer inport)
(with-fatal-error-handler
(lambda (c decline)
(fatal-syntax-error "Illegal RFC 822 field syntax within trailer of entity message body"))
(read-rfc822-headers inport)))
(define (discard-line-terminator inport)
(read-char inport) (read-char inport)) ;;assuming the line terminator is CRLF as required by RFC 2616
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
;;; Misc
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
;; interpolate host info from our request's net connection.
;; return string. Example: "134.2.12.72:7777"
(define (get-socket-host-string req)
(let ((addr (socket-local-address (request-socket req))))
(call-with-values
(lambda ()(socket-address->internet-address addr))
(lambda (ipaddr portnum)
(string-append (format-internet-host-address ipaddr) ":" (number->string portnum))))))

View File

@ -23,7 +23,7 @@
;;; lookup to determine how to implement a given operation on a particular
;;; path.
;;;
;;; The REQUEST is a request record, as defined in httpd-core.scm, containing
;;; The REQUEST is a request record, as defined in request.scm, containing
;;; the details of the client request.
;; general request handler combinator:
@ -48,20 +48,14 @@
(make-predicate-handler
(lambda (path req)
;; we expect only one host-header-field
(let ((body (string-trim (get-header (request-headers req) 'host))))
(or (string-ci=? hostname body)
(string-prefix-ci? (string-append hostname ":") body))))
(let ((maybe-val (get-header (request-headers req) 'host)))
(if maybe-val
(let ((val (string-trim maybe-val)))
(or (string-ci=? hostname val)
(string-prefix-ci? (string-append hostname ":") val)))
(http-error (status-code bad-request) req "No Host: header"))))
handler default-handler))
(define (get-header headers tag)
(cond
((assq tag headers) => cdr)
(else
(http-error (status-code bad-request) #f
(string-append "Request did not contain "
(symbol->string tag)
" header")))))
;; selects handler according to path-prefix
;; if path-prefix matches, handler is called without the path-prefix
(define (make-path-prefix-handler path-prefix handler default-handler)

View File

@ -117,7 +117,7 @@
((list? parse-info) ; it's an info path
(lambda (url)
(values parse-info
(unescape-uri (http-url-search url)))))
(unescape (http-url-query url)))))
(else
(let ((info-path
((infix-splitter ":")
@ -128,7 +128,7 @@
"")))))
(lambda (url)
(values info-path
(unescape-uri (http-url-search url))))))))
(unescape (http-url-query url))))))))
(make-reference
(cond
((procedure? reference) reference)
@ -163,19 +163,21 @@
'()
(make-writer-body
(lambda (out options)
(emit-prolog out)
(receive (find-entry node-name) (parse-info-url (request-url req))
(display-node node-name
(file-finder find-entry)
(referencer make-reference (request-url req) out)
icon-name
out))
(with-tag out address ()
(with-tag out address ();; this is outside the html element?
(write-string address out)))))))
(else
((or (string=? request-method "HEAD")
(string=? request-method "POST"))
(make-error-response (status-code method-not-allowed) req
request-method)))))))
"GET"))
(else
(make-error-response (status-code not-implemented) req)))))))
(define split-header-line
(let ((split (infix-splitter (make-regexp "(, *)|( +)|( *\t *)")))
@ -232,7 +234,7 @@
(string-append "(" file ")" node))))
(define (display-icon file alt out)
(emit-tag out 'img
(emit-empty-tag out 'img
(cons 'src file)
(cons 'alt alt)
(cons 'align "bottom")))
@ -243,7 +245,7 @@
(let ((file (or node-file file)))
(with-tag out a ((href (make-reference
old-entry
(escape-uri (unparse-node-name file node)))))
(escape-not-unreserved-chars (unparse-node-name file node)))))
(if (and (not (null? maybe-icon))
(car maybe-icon))
(display-icon (car maybe-icon) (cadr maybe-icon) out))
@ -280,18 +282,18 @@
(emit-title out (string-append "Info Node: "
(unparse-node-name file node)))
(with-tag out h1 ()
(emit-tag out 'img
(emit-empty-tag out 'img
(cons 'src (icon-name 'info))
(cons 'alt "Info Node")
(cons 'align 'bottom))
(write-string (unparse-node-name file node) out))
(emit-tag out 'hr)
(emit-empty-tag out 'hr)
(maybe-display-header next (icon-name 'next) "[Next]")
(maybe-display-header previous (icon-name 'previous) "[Previous]")
(maybe-display-header up (icon-name 'up) "[Up]")
(if (or next previous up)
(emit-tag out 'hr)))
(emit-empty-tag out 'hr)))
;; Text
@ -438,7 +440,7 @@
(receive (port file-header node-header up-header prev-header next-header)
(find-node file node find-file)
(with-tag out html ()
(with-tag out html (xmlnsdecl-attr)
(with-tag out head ()
(display-title file node-header up-header
prev-header next-header

View File

@ -123,8 +123,8 @@
(socket-remote-address (request-socket req)))
(format-internet-host-address host-address))
(request-method req) ; request method
(uri-path->uri
(http-url-path (request-url req))) ; requested file
(http-url-path->path-string
(http-url-path (request-url req))) ; requested file (escaped as it was in original request)
(version->string (request-version req)) ; protocol version
(status-code-number status-code)
23 ; filesize (unknown)
@ -169,11 +169,21 @@
(or (maybe-dns-lookup remote-ip) "-")
(format-date "[~d/~b/~Y:~H:~M:~S +0000]" (date)) ; +0000 as we don't know
(string-join (list request-type
(string-append "/" requested-file)
requested-file
protocol))
; Unfortunately, we first split the request line into
; method/request-type etc. and put it together here.
; Files conform to CLF are expected to print the original line.
; --> Shouldn't be a problem: the original request
; line is reconstructed almost completely:
; requested-file (i.e. http-url->url-string url) is
; exactly the original Request_URI (apart from
; multiple slashes, which are thrown away),
; request-type and protocol are the original.
; --> Only number of slashes in Request_URI and
; whitespace between parts of Request-Line can differ.
(or http-code "-")
(or filesize "-")
(if (string? referer) (string-trim referer) '-)

View File

@ -1,7 +1,7 @@
;;;; HTTP request
;;; This file is part of the Scheme Untergrund Networking package.
;;; Copyright (c) 1996 by Olin Shivers.
;;; For copyright information, see the file COPYING which comes with
;;; the distribution.
@ -10,7 +10,7 @@
(define-record-type request :request
(make-request method uri url version headers socket)
request?
(method request-method) ; A string such as "GET", "PUT", etc.
(method request-method) ; A string such as "GET", "POST", etc.
(uri request-uri) ; The escaped URI string as read from request line.
(url request-url) ; An http URL record (see url.scm).
(version request-version) ; A (major . minor) integer pair.
@ -26,6 +26,7 @@
(request-version req)
(request-headers req)
(request-socket req))))
;;; A http protocol version is an integer pair: (major . minor).
(define (version< v1 v2)

View File

@ -1,10 +1,10 @@
;;; This file is part of the Scheme Untergrund Networking package.
;;; Copyright (c) 1994 by Brian D. Carlstrom and Olin Shivers.
;;; Copyright (c) 2002 by Mike Sperber.
;;; For copyright information, see the file COPYING which comes with
;;; the distribution.
(define http-version '(1 . 1));server's HTTP-version is only hardcoded here!
(define-record-type http-response :http-response
(make-response code message seconds mime extras body)
response?
@ -21,6 +21,12 @@
;;representing the field value.
(body response-body));; message-body
;;TODO: mime shouldn't be a field in http-response, because it needn't be present for
;;responses which don't include a message-body.
;;Instead treat mime-type like any other header.
;;(Not urgent, as RFC 2616 doesn't prohibit presence of Content-Type header field
;;in body-less responses).
;; This is mainly for nph-... CGI scripts.
;; This means that the body will output the entire MIME message, not
;; just the part after the headers.
@ -40,6 +46,15 @@
writer-body?
(proc writer-body-proc))
;; the concept of http-reader-writer-body doesn't work: status-line
;; and headers of the response (i.e. the whole http-response record)
;; have to be built _before_ we have seen the entity-body of the
;; request. (Not until display-http-body hands over the iport to
;; reader-writer-body the entity-body can be read in). If the
;; entity-body is erroneous or if we encounter a server internal error
;; while reading in the entity-body we are not able to send an
;; appropriate response. (At that point of time we already sent
;; status-line and response-headers!)
(define-record-type http-reader-writer-body :http-reader-writer-body
(make-reader-writer-body proc)
reader-writer-body?
@ -50,6 +65,14 @@
redirect-body?
(location redirect-body-location))
;; type for responses which MUST NOT include a body (101, 204, 304)
(define-enumerated-type no-body :no-body
no-body?
no-body-elements
no-body-name
no-body-index
(none))
(define (display-http-body body iport oport options)
(cond
((writer-body? body)
@ -66,36 +89,56 @@
(number status-code-number)
(message status-code-message)
(
(continue 100 "Continue")
(switch-protocol 101 "Switching Protocols")
(ok 200 "OK")
(created 201 "Created")
(accepted 202 "Accepted")
(prov-info 203 "Provisional Information")
(non-author-info 203 "Non-Authoritative Information")
(no-content 204 "No Content")
(reset-content 205 "Reset Content")
(partial-content 206 "Partial Content")
(mult-choice 300 "Multiple Choices")
(moved-perm 301 "Moved Permanently")
(moved-temp 302 "Moved Temporarily")
(method 303 "Method (obsolete)")
(found 302 "Found");;use 303 or 307 for unambiguity;
;;use 302 for compatibility with
;;pre-1.1-clients
(see-other 303 "See other");;client is expected to
;;perform a GET on new URI
(not-mod 304 "Not Modified")
(use-proxy 305 "Use Proxy")
(temp-redirect 307 "Temporary Redirect");;analogous to "302
;;Moved Temporarily"
;;in RFC1945
(bad-request 400 "Bad Request")
(unauthorized 401 "Unauthorized")
(payment-req 402 "Payment Required")
(payment-required 402 "Payment Required")
(forbidden 403 "Forbidden")
(not-found 404 "Not Found")
(method-not-allowed 405 "Method Not Allowed")
(none-acceptable 406 "None Acceptable")
(not-acceptable 406 "Not Acceptable")
(proxy-auth-required 407 "Proxy Authentication Required")
(timeout 408 "Request Timeout")
(conflict 409 "Conflict")
(gone 410 "Gone")
(gone 410 "Gone")
(length-required 411 "Length Required")
(precon-failed 412 "Precondition Failed")
(req-ent-too-large 413 "Request Entity Too Large")
(req-uri-too-large 414 "Request URI Too Large")
(unsupp-media-type 415 "Unsupported Media Type")
(req-range-not-sat 416 "Requested Range Not Satisfiable")
(expectation-failed 417 "Expectation Failed")
(internal-error 500 "Internal Server Error")
(not-implemented 501 "Not Implemented")
(bad-gateway 502 "Bad Gateway")
(service-unavailable 503 "Service Unavailable")
(gateway-timeout 504 "Gateway Timeout")
(version-not-supp 505 "HTTP Version Not Supported")
(redirect -301 "Internal redirect")))
(define (name->status-code name)
@ -122,147 +165,183 @@
(else
(loop (+ i 1)))))))
;;; (make-error-response status-code req [message . extras])
;;; (make-error-response status-code req [extras])
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
;;;
;;; As a special case, request REQ is allowed to be #f, meaning we haven't
;;; even had a chance to parse and construct the request. This is only used
;;; for 400 BAD-REQUEST error report.
;;; even had a chance to parse and construct the request. This can be the case for
;;; internal-error, bad-request, (possibly bad-gateway and ...?)
(define (make-error-response code req . args)
(let* ((message (and (pair? args) (car args)))
(extras (if (pair? args) (cdr args) '()))
(generic-title (lambda (port)
(title-html port
(status-code-message code))))
(send-message (lambda (port)
(if message
(format port "<BR>~%Further Information: ~A<BR>~%" message))))
(close-html (lambda (port)
(for-each (lambda (x) (format port "<BR>~s~%" x)) extras)
(write-string "</BODY>\n" port)))
(create-response
(lambda (headers writer-proc)
(make-response code
#f
(time)
"text/html"
headers
(make-writer-body writer-proc)))))
(define (make-error-response code req . extras)
(let*
;;catch server internal errors coming off by calls of make-error-response with too few arguments
((assert (lambda (n)
(if (< (length extras) n)
(make-error-response (status-code internal-error) req
"Too few arguments to make-error-response"))))
(generic-title (lambda (port)
(title-html port
(status-code-message code))))
(close-html (lambda (port args)
(if (not (null? args))
(format port "<br/>~%Further Information:~%"))
(for-each (lambda (x) (format port "<br/>~%~A~%" x)) args)
(format port "</p>~%</body>~%</html>~%")))
(create-response
(lambda (headers body)
(make-response code
#f
(time)
"text/html"
headers
body)))
(create-writer-body-response
(lambda (headers writer-proc)
(create-response headers (make-writer-body writer-proc))))
(create-no-body-response
(lambda (headers)
(create-response headers (no-body none)))))
(cond
;; This error response requires two args: message is the new URI: field,
;; and the first EXTRA is the older Location: field.
((or (eq? code (status-code moved-temp))
(eq? code (status-code moved-perm)))
(create-response
(list (cons 'uri message)
(cons 'location (car extras)))
;;this response requires one arg:
;;the value of the Upgrade field header,
;;which must be a string listing the protocols which are being switched
;;for example "HTTP/2.0, IRC/6.9"
((eq? code (status-code switch-protocol));; server currently doesn't have ability to switch protocols
(assert 1)
(create-no-body-response
(list (cons 'upgrade (car extras))
(cons 'connection "upgrade")))) ;; need this, because Upgrade header field only applies to immediate connection
((eq? code (status-code no-content))
(create-no-body-response '()))
;; This error response requires one arg:
;; the value of the Location field header,
;; which must be a single absolute URI
((or (eq? code (status-code found));302
(eq? code (status-code see-other));303
(eq? code (status-code temp-redirect));307
(eq? code (status-code moved-perm)));301
(assert 1)
(create-writer-body-response
(list (cons 'location (car extras)))
(lambda (port options)
(title-html port "Document moved")
(format port
"This document has ~A moved to a <A HREF=\"~A\">new location</A>.~%"
(if (eq? code (status-code moved-temp))
"temporarily"
"permanently")
message)
(close-html port))))
"The requested resource has moved ~A to a <a href=\"~A\">new location</a>.~%"
(if (eq? code (status-code moved-perm))
"permanently"
"temporarily")
(car extras))
(close-html port (cdr extras)))))
((eq? code (status-code bad-request))
(create-response
((eq? code (status-code not-mod))
(create-no-body-response '())) ;;see RCF 2616 10.3.5: this is only a valid answer if the server never sends
;;any of the headers Expires, Cache-Control, Vary for this resource
((eq? code (status-code bad-request))
(create-writer-body-response
'()
(lambda (port options)
(generic-title port)
(write-string "<P>Client sent a query that this server could not understand.\n"
port)
(send-message port)
(close-html port))))
(lambda (port options)
(generic-title port)
(format port "The request the client sent could not be understood by this server due to malformed syntax.~% Report to client maintainer.~%")
(close-html port extras))))
;; This error response requires one arg:
;; the value of the Allow field header,
;; which must be a string listing the valid methods for the requested resource
;; Ex.: "GET, HEAD, POST"
((eq? code (status-code method-not-allowed))
(create-response
'()
(assert 1)
(create-writer-body-response
(list (cons 'allow (car extras)))
(lambda (port options)
(generic-title port)
(write-string "<P>Method not allowed.\n" port)
(send-message port)
(close-html port))))
(format port "The method ~A is not allowed on the requested resource ~A.~%"
(request-method req) (http-url->url-string (request-url req)))
(close-html port (cdr extras)))))
;; This error response requires one arg:
;; the value of the WWW-Authenticate header field,
;; which must be a challenge (as described in RFC 2617)
((eq? code (status-code unauthorized))
(create-response
(list (cons 'WWW-Authenticate message)) ; Vas is das?
;; Vas das is? See: http://www.w3.org/Protocols/rfc2616/rfc2616-sec14.html#sec14.47
;; message should be a challenge(?)
(assert 1)
(create-writer-body-response
(list (cons 'WWW-Authenticate (car extras)))
(lambda (port options)
(title-html port "Authorization Required")
(write-string "<P>Browser not authentication-capable or\n" port)
(write-string "authentication failed.\n" port)
(send-message port)
(close-html port))))
(title-html port "Authentication Required")
(format port "Client not authentication-capable or authentication failed.~%")
(close-html port (cdr extras)))))
((eq? code (status-code forbidden))
(create-response
(create-writer-body-response
'()
(lambda (port options)
(title-html port "Request not allowed.")
(format port
"Your client does not have permission to perform a ~A~%"
(request-method req))
(format port "operation on url ~a.~%" (request-uri req))
(send-message port)
(close-html port))))
(format port "The request the client sent is not allowed.~% Retrying won't help.~%")
(close-html port extras))))
((eq? code (status-code not-found))
(create-response
(create-writer-body-response
'()
(lambda (port options)
(title-html port "URL not found")
(write-string
"<P>The requested URL was not found on this server.\n"
port)
(send-message port)
(close-html port))))
(title-html port "Resource not found")
(format port "The requested resource ~A was not found on this server.~%"
(http-url->url-string (request-url req)))
(close-html port extras))))
((eq? code (status-code internal-error))
(create-response
(create-writer-body-response
'()
(lambda (port options)
(generic-title port)
(format port "The server encountered an internal error or
misconfiguration and was unable to complete your request.
<P>
Please inform the server administrator, ~A, of the circumstances leading to
the error, and time it occured.~%"
(format port "This server encountered an internal error or misconfiguration and was unable to complete your request.~%<br/>~%Please inform the server administrator ~A of the circumstances leading to the error, and the time it occured.~%"
(or (httpd-options-server-admin options)
"[no mail address available]"))
(send-message port)
(close-html port))))
(close-html port extras))))
((eq? code (status-code not-implemented))
(create-response
(create-writer-body-response
'()
(lambda (port options)
(generic-title port)
(format port "This server does not currently implement
the requested method (~A).~%"
(format port "This server does not recognize or does not implement the requested method ~A.~%"
(request-method req))
(send-message port)
(close-html port))))
(close-html port extras))))
((eq? code (status-code bad-gateway))
(create-response
(create-writer-body-response
'()
(lambda (port options)
(generic-title port)
(format port "An error occured while waiting for the
response of a gateway.~%")
(send-message port)
(close-html port)))))))
(format port "This server received an invalid response from the upstream server it accessed in attempting to fulfill the request.~%")
(close-html port extras))))
((eq? code (status-code version-not-supp))
(create-writer-body-response
'()
(lambda (port options)
(generic-title port)
(format port "This server does not support the requested HTTP major version ~D.~%The highest HTTP major version supported is 1.~%"
(car (request-version req)))
; (format port "This server does not support the requested HTTP major version ~D.~%The highest HTTP major version supported is ~D.~%"
; (car (request-version req))
; (car http-version))
(close-html port extras)))))))
(define (title-html out message)
(format out "<HEAD>~%<TITLE>~%~A~%</TITLE>~%</HEAD>~%~%" message)
(format out "<BODY>~%<H1>~A</H1>~%" message))
;;produce valid XHTML 1.0 Strict
(emit-prolog out)
(emit-tag out 'html xmlnsdecl-attr)
(format out "~%<head>~%<title>~%~A~%</title>~%</head>~%" message)
(format out "<body>~%<h1>~A</h1>~%<p>~%" message))
;; Creates a redirect response. The server will serve the new file
;; indicated by NEW-LOCATION. NEW-LOCATION must be uri-encoded and
@ -270,7 +349,7 @@ response of a gateway.~%")
;; the browser won't notice the redirect. Thus, it will keep the
;; original URL. For "real" redirections, use
;; (make-error-response (status-code moved-perm) req
;; "new-location" "new-location").
;; "new-location").
(define (make-redirect-response new-location)
(make-response
(status-code redirect)

View File

@ -20,7 +20,7 @@
((list? finder)
(lambda (url)
(values finder
(unescape-uri (http-url-search url))
(unescape (http-url-query url))
'())))
(else
(let ((man-path
@ -32,7 +32,7 @@
"")))))
(lambda (url)
(values man-path
(unescape-uri (http-url-search url))
(unescape (http-url-query url))
'()))))))
(reference-template
(cond
@ -62,17 +62,22 @@
'()
(make-writer-body
(lambda (out options)
(emit-prolog out)
(receive (man-path entry and-then)
(parse-man-url (request-url req))
(emit-man-page man-binary nroff-binary rman-binary
gzcat-binary
entry man man-path and-then reference-template out))
(with-tag out address ()
(with-tag out address () ;;außerhalb des html elements?
(display address out)))))))
(else
((or (string=? request-method "HEAD")
(string=? request-method "POST"))
(make-error-response (status-code method-not-allowed) req
request-method)))))))
"GET"))
(else
(make-error-response (status-code not-implemented) req)))))))
(define (cat-man-page key section out)
(let ((title (if section

View File

@ -38,69 +38,73 @@
(cond
((string=? request-method "POST") ; Could do others also.
(seval path req))
((or (string=? request-method "HEAD")
(string=? request-method "GET"))
(make-error-response (status-code method-not-allowed) req
"POST"))
(else
(make-error-response (status-code method-not-allowed) req request-method)))))
(make-error-response (status-code not-implemented) req)))))
(define (seval path req)
(let* ((message-body (read-message-body req))
(sexp (parse-request-sexp message-body)))
(make-response
(status-code ok)
#f
(time)
"text/html"
'()
(make-reader-writer-body
(lambda (iport oport options)
(let ((sexp (read-request-sexp req iport)))
(http-syslog (syslog-level debug) "read sexp: ~a" sexp)
(with-tag oport HEAD ()
(newline oport)
(emit-title oport "Scheme program output"))
(newline oport)
(with-tag oport BODY ()
(newline oport)
(do/timeout
10
(receive vals
;; Do the computation.
(begin (emit-header oport 2 "Output from execution")
(newline oport)
(with-tag oport PRE ()
(newline oport)
(force-output oport); In case we're gunned down.
(with-current-output-port oport
(eval-safely sexp))))
;; Pretty-print the returned value(s).
(emit-header oport 2 "Return value(s)")
(with-tag oport PRE ()
(for-each (lambda (val) (p val oport))
vals))))))))))
(make-writer-body
(lambda (oport options)
(http-syslog (syslog-level debug) "read sexp: ~a" sexp)
(emit-prolog oport)
(with-tag oport html (xmlnsdecl-attr)
(newline oport)
(with-tag oport head ()
(newline oport)
(emit-title oport "Scheme program output")
(newline oport))
(newline oport)
(with-tag oport body ()
(newline oport)
(do/timeout
10
(receive vals
;; Do the computation.
(begin (emit-header oport 1 "Output from execution")
(newline oport)
(with-tag oport pre ()
(newline oport)
(force-output oport); In case we're gunned down.
(with-current-output-port oport
(eval-safely sexp))))
;; Pretty-print the returned value(s).;; hier noch mal newline rausschreiben?
(emit-header oport 1 "Return value(s)")
(with-tag oport pre ()
(for-each (lambda (val) (p val oport))
vals)))))))))))
;;; Read an HTTP request entity body from stdin. The Content-length:
;;; element of request REQ's header tells how many bytes to this entity
;;; is. The entity should be a URI-encoded form body. Pull out the
;;; program=<stuff>
;;; string, extract <stuff>, uri-decode it, parse that into an s-expression,
;;; and return it.
;;; Parse the request's message body.
(define (read-request-sexp req iport)
(cond
((get-header (request-headers req) 'content-length) =>
(lambda (cl-str) ; Take the first Content-length: header,
(let* ((cl-start (string-skip cl-str char-set:whitespace)) ; skip whitespace,
(cl (if cl-start ; & convert to
(string->number (substring cl-str ; a number.
cl-start
(string-length cl-str)))
0)) ; All whitespace?? -- WTF.
(qs (read-string cl iport)) ; Read in CL chars,
(q (parse-html-form-query qs)) ; and parse them up.
(s (cond ((assoc "program" q) => cdr)
(else (error "No program in entity body.")))))
(http-syslog (syslog-level debug)
"Seval sexp: ~s" s)
(read (make-string-input-port s)))))
(else (error "No `Content-length:' field in POST request."))))
;;; We assume, that the entity is "form-url encoded" data (see
;;; parse-forms.scm for a description of this encoding). This
;;; assumption is rather strange - it may safely be made only if
;;; there's a "Content-type: application/x-www-form-urlencoded" header.
;;; Pull out the program=<stuff> string, extract <stuff>,
;;; parse that into an s-expression, and return it.
(define (parse-request-sexp body)
(let* ((parsed-html-form-query (parse-html-form-query body))
(program (cond ((assoc "program" parsed-html-form-query) => cdr)
(else (fatal-syntax-error "No program was found in request's message body.")))))
(http-syslog (syslog-level debug)
"Seval sexp: ~s" program)
(with-fatal-error-handler
(lambda (c decline)
(fatal-syntax-error "The program in the request's message body isn't a valid s-expression"))
(read (make-string-input-port program))))) ;; return first sexp, discard others

View File

@ -26,7 +26,7 @@
'application/x-www-form-urlencoded' as content-type"))
(cond
((string=? request-method "GET")
(form-query-list (http-url-search
(form-query-list (http-url-query
(surflet-request-url surflet-request))))
((string=? request-method "POST")
(or (cached-bindings surflet-request)

View File

@ -443,6 +443,7 @@
shift-reset ;SHIFT and RESET
(subset srfi-1 (alist-cons alist-delete!))
srfi-6 ;string-ports
(subset srfi-13 (string-join))
srfi-14 ;CHAR-SET:DIGIT
srfi-27 ;random numbers
surflet-requests ;requests for surflets
@ -626,7 +627,6 @@
(define-structure surflets/addresses surflets/addresses-interface
(open scheme
srfi-23 ;error
(subset uri (escape-uri))
define-record-types
(subset surflets/utilities (generate-unique-name)))
(files addresses))
@ -644,7 +644,7 @@
(open scheme
surflets/input-field-value
surflets/addresses
(subset uri (unescape-uri)))
(subset uri (unescape)))
(files returned-via))
(define-structure surflets/outdaters surflets/outdaters-interface
@ -659,7 +659,7 @@
weak ;weak pointers
surflets/utilities ;form-query-list
surflet-requests
(subset url (http-url-search))
(subset url (http-url-query))
(subset srfi-14 (char-set:digit))
(subset srfi-13 (string-index string-trim))
(subset srfi-1 (filter))

View File

@ -44,7 +44,7 @@
(lambda (path req)
(if (pair? path) ; need at least one element
(let ((request-method (request-method req))
(path-string (uri-path->uri path)))
(path-string (string-join path "/")))
(if (or (string=? request-method "GET")
(string=? request-method "POST"))
(make-input-response

View File

@ -11,6 +11,34 @@
;;; HTML text representation -- surrounding it with single or double quotes,
;;; as appropriate, etc.
;;XHTML 1.0 Strict
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
;; a well-formed XML document begins with a prolog;
;; this is the prolog for an XHTML 1.0 strict document:
(define XMLdecl
"<?xml version=\"1.0\" encoding=\"iso-8859-1\" ?>")
(define doctypedecl
"<!DOCTYPE html PUBLIC \"-//W3C//DTD XHTML 1.0 Strict//EN\" \"http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd\">")
(define (emit-prolog out)
(display XMLdecl out)
(newline out)
(display doctypedecl out)
(newline out))
;; the root element html must contain an xmlns declaration for the
;; XHTML namespace, which ist defined to be
;; http://www.w3.org/1999/xhtml
(define xmlnsval "http://www.w3.org/1999/xhtml")
;; for use with emit-tag and with-tag:
(define xmlnsdecl-attr (cons 'xmlns xmlnsval))
;;; Printing HTML tags.
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
;;; All the emit-foo procedures have the same basic calling conventions:
@ -42,6 +70,26 @@
attrs)
(display #\> out)))
;;; Empty elements, e.g. <hr />
(define (emit-empty-tag out tag . attrs)
(let ((out (fmt->port out)))
(display "<" out)
(display tag out)
(for-each (lambda (attr)
(display #\space out)
(cond ((pair? attr) ; name="val"
(display (car attr) out)
(display "=\"" out) ; Should check for
(display (cdr attr) out) ; internal double-quote
(display #\" out)) ; etc.
(else
(display attr out)))) ; name
attrs)
(display " /" out)
(display #\> out)))
;;; </tag>
@ -49,7 +97,7 @@
(format out "</~a>" tag))
;;; <P>
;;; <p>
(define (emit-p . args) ; (emit-p [out attr1 ...])
(receive (out attrs) (if (pair? args)
@ -61,13 +109,13 @@
(apply emit-tag out 'p attrs)))
;;; <TITLE> Make Money Fast!!! </TITLE>
;;; <title> Make Money Fast!!! </title>
(define (emit-title out title) ; Takes no attributes.
(format out "<title>~a~%</title>~%" title))
(format out "<title>~a</title>" title))
(define (emit-header out level text . attribs)
(apply with-tag* out (string-append "H" (number->string level))
(apply with-tag* out (string-append "h" (number->string level))
(lambda () (display text (fmt->port out)))
attribs))
@ -90,11 +138,11 @@
;;; instead of (NAME VALUE).
;;;
;;; For example,
;;; (let ((hp "http://clark.lcs.mit.edu/~shivers")) ; My home page.
;;; (with-tag port A ((href hp-url) (name "hp"))
;;; (let ((hp-url "http://clark.lcs.mit.edu/~shivers")) ; My home page.
;;; (with-tag port a ((href hp-url) (name "hp"))
;;; (display "home page" port)))
;;; outputs
;;; <A href="http://clark.lcs.mit.edu/~shivers" name="hp">home page</A>
;;; <a href="http://clark.lcs.mit.edu/~shivers" name="hp">home page</a>
(define-syntax with-tag
(syntax-rules ()
@ -107,9 +155,11 @@
;;; Why can't this be a LET-SYNTAX inside of WITH-TAG?
(define-syntax %hack-attr-elt
(syntax-rules () ; Build attribute-list element:
(syntax-rules (xmlnsdecl-attr) ; Build attribute-list element:
((%hack-attr-elt (name val)) ; (name elt) => (cons 'name elt)
(cons 'name val))
((%hack-attr-elt xmlnsdecl-attr)
xmlnsdecl-attr)
((%hack-attr-elt name) 'name))) ; name => 'name
@ -191,3 +241,4 @@
(if (null? maybe-port)
(write-string (escape-html s))
(write-string (escape-html s) (fmt->port (car maybe-port)))))

View File

@ -12,12 +12,12 @@
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
;;; The form's field data are turned into a single string, of the form
;;; name=val&name=val
;;; where the <name> and <val> parts are URI encoded to hide their
;;; &, =, and + chars, among other things. After URI encoding, the
;;; space chars are converted to + chars, just for fun. It is important
;;; to encode the spaces this way, because the perfectly general %xx escape
;;; mechanism might be insufficiently confusing. This variant encoding is
;;; called "form-url encoding."
;;; where the <name> and <val> parts are URI encoded to hide their &,
;;; =, and + chars and other reserves or excluded characters. After
;;; URI encoding, the space chars are converted to + chars, just for
;;; fun. It is important to encode the spaces this way, because the
;;; perfectly general %xx escape mechanism might be insufficiently
;;; confusing. This variant encoding is called "form-url encoding."
;;;
;;; If the form's method is POST,
;;; Browser sends the form's field data in the entity block, e.g.,
@ -32,6 +32,7 @@
;;;
;;; In either case, the data is "form-url encoded" (as described above).
(define (parse-html-form-query q)
(let ((qlen (string-length q)))
(let recur ((i 0))
@ -46,11 +47,11 @@
(else '()))))) ; BOGUS STRING -- Issue a warning.
;;; Map plus characters to spaces, then do URI decoding.
;;; Map plus characters to spaces, then unescape.
(define (unescape-uri+ s . maybe-start/end)
(let-optionals maybe-start/end ((start 0)
(end (string-length s)))
(unescape-uri (string-map (lambda (c) (if (char=? c #\+) #\space c))
(unescape (string-map (lambda (c) (if (char=? c #\+) #\space c))
(if (and (zero? start)
(= end (string-length s)))
s ; Gratuitous optimisation.

View File

@ -2,10 +2,10 @@
;;; This file is part of the Scheme Untergrund Networking package.
;;; Copyright (c) 2002 by Andreas Bernauer.
;;; For copyright information, see the file COPYING which comes with
;;; the distribution.
;;; interpolate hostname or IP address from socket local address. return a string
(define (host-name-or-ip addr)
(with-fatal-error-handler
(lambda (condition more)
@ -68,7 +68,6 @@
(lambda ()
(release-lock lock))))
;; Get Header from (RFC822 like) header alist
(define (get-header headers tag)
(cond ((assq tag headers) => cdr)

View File

@ -3,93 +3,51 @@
;;; This file is part of the Scheme Untergrund Networking package.
;;; Copyright (c) 1995 by Olin Shivers.
;;; Copyright (c) 2004 by Viola Brunner.
;;; For copyright information, see the file COPYING which comes with
;;; the distribution.
;;; URI syntax -- [scheme] : path [? search ] [# fragmentid]
;;; References:
;;; - http://www.w3.org/Addressing/rfc1630.txt
;;; Original RFC
;;; - http://www.w3.org/hypertext/WWW/Addressing/URL/URI_Overview.html
;;; General Web page of URI pointers.
;;; RFC 2396 Uniform Resource Identifiers (URI): Generic Syntax
(define uri-reserved (string->char-set ";/#?: ="))
(define uri-reserved-sans-= (char-set-delete uri-reserved #\=))
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
;;; decode a URI
;;; walk over string s and unescape all occurrences of RegExp 'escaped' (see url.scm).
(define (parse-uri s)
(let* ((slen (string-length s))
;; Search forwards for colon (or intervening reserved char).
(rs1 (string-index s uri-reserved)) ; 1st reserved char
(colon (and rs1 (char=? (string-ref s rs1) #\:) rs1))
(path-start (if colon (+ colon 1) 0))
;copy from url.scm:
(define hex (rx hex-digit))
(define escaped (rx (: "%" ,hex ,hex)))
;; Search backwards for # (or intervening reserved char).
(rs-last (string-index-right s uri-reserved))
(sharp (and rs-last (char=? (string-ref s rs-last) #\#) rs-last))
;; Search backwards for ? (or intervening reserved char).
;; (NB: #\= may be after #\? and before #\#)
(rs-penult (string-index-right s
uri-reserved-sans-=
path-start
(or sharp slen)))
(ques (and rs-penult (char=? (string-ref s rs-penult) #\?) rs-penult))
(path-end (or ques sharp slen)))
(values (and colon (substring s 0 colon))
(split-uri s path-start path-end)
(and ques (substring s (+ ques 1) (or sharp slen)))
(and sharp (substring s (+ sharp 1) slen)))))
;;; Remark:
;;; we assume no non-ASCII characters occur in the URI; therefore the
;;; ascii table is used for conversion of the octet the hexnumber
;;; represents to a char.
;;; Caution:
;;; Don't use this proc until *after* you've parsed the URL -- unescaping
;;; might introduce reserved chars (like slashes and colons) that could
;;; blow your parse.
;;; a URI must be separated into its components (for a HTTP-URL e.g. parsed by
;;; PARSE-URL) before the escaped characters within those components
;;; can be safely decoded. Don't use UNESCAPE on an unparsed URI.
(define (unescape-uri s . maybe-start/end)
(let-optionals maybe-start/end ((start 0)
(end (string-length s)))
(let* ((esc-seq? (lambda (i) (and (< (+ i 2) end)
(char=? (string-ref s i) #\%)
(hex-digit? (string-ref s (+ i 1)))
(hex-digit? (string-ref s (+ i 2))))))
(hits (let lp ((i start) (hits 0)) ; count # of esc seqs.
(if (< i end)
(if (esc-seq? i)
(lp (+ i 3) (+ hits 1))
(lp (+ i 1) hits))
hits))))
(if (and (zero? hits) (zero? start) (= end (string-length s)))
s
(let* ((nlen (- (- end start) (* hits 2))) ; the new length
; of the
; unescaped
; string stores
; the result
(ns (make-string nlen)))
(define (unescape s)
(regexp-fold
escaped
(lambda (start-search match res)
(let* ((start-match (match:start match))
(hexchar-low (string-ref s (+ start-match 2)))
(hexchar-high (string-ref s (+ start-match 1)))
(hex-low (hexchar->int hexchar-low))
(hex-high (hexchar->int hexchar-high))
(ascii (+ (* 16 hex-high) hex-low)))
(string-append
res
(substring s start-search start-match)
(string (ascii->char ascii)))))
""
s
(lambda (start-search res)
(string-append res (substring s start-search (string-length s))))))
(let lp ((i start) (j 0)) ; sweep over the string
(if (< j nlen)
(lp (cond
((esc-seq? i) ; unescape
; escape-sequence
(string-set! ns j
(let ((d1 (string-ref s (+ i 1)))
(d2 (string-ref s (+ i 2))))
(ascii->char (+ (* 16 (hexchar->int d1))
(hexchar->int d2)))))
(+ i 3))
(else (string-set! ns j (string-ref s i))
(+ i 1)))
(+ j 1))))
ns)))))
(define hex-digit?
(let ((hex-digits (string->char-set "0123456789abcdefABCDEF")))
(lambda (c) (char-set-contains? hex-digits c))))
; make use of the fact that numbers and characters are in order in the ascii table
(define (hexchar->int c)
@ -101,100 +59,47 @@
(char->ascii #\a))
10))))
(define int->hexchar
(let ((table '#(#\0 #\1 #\2 #\3 #\4 #\5 #\6 #\7 #\8 #\9
#\A #\B #\C #\D #\E #\F)))
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
;;; encode a URI:
;;; replace characters which are reserved or excluded by their escaped representation.
;;; Caution:
;;; Each component of a URI may have its own set of characters that are reserved,
;;; -> differentiate between components by writing specialized procedures
;;; (see url.scm for examples)
;;; Caution:
;;; don't encode an already encoded string; #\% chars would be escaped again.
;;; escape occurrences of RegExp regexp in string s
(define (escape s regexp)
(regexp-fold
regexp
(lambda (start-search match res)
(let* ((start-match (match:start match))
(forbidden-char (string-ref s start-match)))
(string-append
res
(substring s start-search start-match)
(ascii->escaped (char->ascii forbidden-char)))))
""
s
(lambda (start-search res)
(string-append res (substring s start-search (string-length s))))))
;;;generate string representing hex-ascii-code for the decimal-ascii-code DEC-INT
(define (ascii->escaped dec-int)
(let* ((hex-int-high (bitwise-and (arithmetic-shift dec-int -4) #xF))
(hex-int-low (bitwise-and dec-int #xF)))
(string-append
"%" (int->hexstring hex-int-high) (int->hexstring hex-int-low))))
(define int->hexstring
(let ((table '#("0" "1" "2" "3" "4" "5" "6" "7" "8" "9"
"A" "B" "C" "D" "E" "F")))
(lambda (i) (vector-ref table i))))
;;; Caution:
;;; All reserved chars (e.g., slash, sharp, colon) get escaped: "=;/#?: "
;;; So don't apply this proc to chunks of text with syntactically meaningful
;;; reserved chars (e.g., paths with URI slashes or colons) -- they'll be
;;; escaped, and lose their special meaning. E.g. it would be a mistake
;;; to apply ESCAPE-URI to "//lcs.mit.edu:8001/foo/bar.html" because the
;;; slashes and colons would be escaped.
(define uri-escaped-chars
(char-set-complement
;; RFC 2396 (URI Generic Syntax) specifies unreserved = alphanum | mark
(char-set-union char-set:letter+digit
(string->char-set "-_.!~*'()"))))
;;; Takes a set of chars to escape. This is because we sometimes need to
;;; escape larger sets of chars for different parts of a URI.
(define (escape-uri s . maybe-escaped-chars)
(let-optionals maybe-escaped-chars ((escaped-chars uri-escaped-chars))
(let ((nlen (string-fold
(lambda (c i)
(+ i
(if (char-set-contains? escaped-chars c)
3
1)))
0
s))) ; new length of escaped string
(if (= nlen (string-length s))
s
(let ((ns (make-string nlen)))
(string-fold
(lambda (c i) ; replace each occurance of an
; character to escape with %ff where ff
; is the ascii-code in hexadecimal
; notation
(+ i (cond
((char-set-contains? escaped-chars c)
(string-set! ns i #\%)
(let* ((d (char->ascii c))
(dhi (bitwise-and (arithmetic-shift d -4) #xF))
(dlo (bitwise-and d #xF)))
(string-set! ns (+ i 1)
(int->hexchar dhi))
(string-set! ns (+ i 2)
(int->hexchar dlo)))
3)
(else (string-set! ns i c)
1))))
0
s)
ns)))))
;;; Cribbed from scsh's fname.scm
(define (split-uri uri start end) ; Split at /'s (infix grammar).
(let split ((i start)) ; "" -> ("")
(cond
((>= i end) '(""))
((string-index uri #\/ i) =>
(lambda (slash)
(cons (substring uri i slash)
(split (+ slash 1)))))
(else (list (substring uri i end))))))
;;; The elements of PLIST must be escaped in case they contain slashes.
;;; This procedure doesn't escape them for you; you must do that yourself:
;;; (uri-path->uri (map escape-uri pathlist))
(define (uri-path->uri plist)
(string-join plist "/")) ; Insert slashes between elts of PLIST.
(define (simplify-uri-path p)
(if (null? p)
#f ; P must be non-null
(let lp ((path-list (cdr p))
(stack (list (car p))))
(if (null? path-list) ; we're done
(reverse stack)
(cond
((string=? (car path-list) "..") ; back up
; neither the empty path nor root
(if (not (or (null? stack) (string=? (car stack) "")))
(lp (cdr path-list) (cdr stack))
#f))
((string=? (car path-list) ".") ; leave this
(lp (cdr path-list) stack))
((string=? (car path-list) "") ; back to root
(lp (cdr path-list) '("")))
(else ; usual segment
(lp (cdr path-list) (cons (car path-list) stack))))))))

View File

@ -1,163 +1,399 @@
;;; URL parsing and unparsing -*- Scheme -*-
;;; HTTP 1.1 Request-URI parsing and unparsing -*- Scheme -*-
;;; This file is part of the Scheme Untergrund Networking package.
;;; Copyright (c) 1995 by Olin Shivers.
;;; For copyright information, see the file COPYING which comes with
;;; the distribution.
;;; I'm only implementing HTTP URL's right now.
;;; References:
;;; - http://www.w3.org/Addressing/rfc1738.txt
;;; Original RFC
;;; - http://www.w3.org/hypertext/WWW/Addressing/URL/Overview.html
;;; General Web page of URI pointers.
;;; RFC 2616 Hypertext Transfer Protocol -- HTTP/1.1
;;; RFC 2396 Uniform Resource Identifiers (URI): Generic Syntax
;;;
;;; RFC 2616 adopts definitions of regexps from RFC 2396
;;; (see copy of Appendix A of RFC 2396 below)
;;; Unresolved issues:
;;; - The server parser shouldn't substitute default values --
;;; that should happen in a separate step.
;;; Note: there are 2 Problems in RFC 2616 concerning URIS:
;;; The steps in hacking a URL are:
;;; - Take the UID, parse it, and resolve it with the context UID, if any.
;;; - Consult the UID's <scheme>. Pick the appropriate URL parser and parse.
;;; Problem 1:
;;; RFC 2616 is ambiguous in defining Request_URIS:
;;;
;;; section 5.1.2 states:
;;; HTTP 1.1 Request-URIS are of the form
;;; Request-URI = "*" | absoluteURI | abs_path | authority
;;;
;;; whilst section 3.2.2 defines the 'http_URL'
;;; http_URL = "http://" host [ ":" port ] [ abs_path [ "?" query ]]
;;;
;;; Solution to Problem 1:
;;; Since allowing for general absoluteURIs doesn't make too much sense
;;; we implement Request_URIs of the form
;;; Request-URI = ( http_URL | abs_path) ["#" fragment]
;;; where http_URL is a only a subset of absoluteURI
;;; Server strings: //<user>:<password>@<host>:<port>/
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
;;; A SERVER record describes path-prefixes of the form
;;; //<user>:<password>@<host>:<port>/
;;; These are frequently used as the initial prefix of URL's describing
;;; Internet resources.
(define-record-type server :server ; Each slot is a decoded string or #f.
(make-server user password host port)
server?
(user server-user)
(password server-password)
(host server-host)
(port server-port))
;;; Parse a URI path (a list representing a path, not a string!) into
;;; a server record. Default values are taken from the server
;;; record DEFAULT except for the host. Returns a server record if
;;; it wins. CADDR drops the server portion of the path. In fact,
;;; fatal-syntax-error is called, if the path doesn't start with '//'.
;
(define (parse-server path default)
(if (and (pair? path) ; The thing better begin
(string=? (car path) "") ; with // (i.e., have two
(pair? (cdr path)) ; initial "" elements).
(string=? (cadr path) ""))
(let* ((uhs (caddr path)) ; Server string.
(uhs-len (string-length uhs))
(at (string-index uhs #\@)) ; Usr:passwd at-sign, if any.
(colon1 (and at (string-index uhs #\:))) ; Usr:passwd colon,
(colon1 (and colon1 (< colon1 at) colon1)) ; if any.
(colon2 (string-index uhs #\: (or at 0)))) ; Host:port colon, if any.
(make-server (if at
(unescape-uri uhs 0 (or colon1 at))
(server-user default))
(if colon1
(unescape-uri uhs (+ colon1 1) at)
(server-password default))
(unescape-uri uhs (if at (+ at 1) 0)
(or colon2 uhs-len))
(if colon2
(unescape-uri uhs (+ colon2 1) uhs-len)
(server-port default))))
(fatal-syntax-error "URL must begin with //..." path)))
;;; Unparser
(define server-escaped-chars
(char-set-union uri-escaped-chars ; @ and : are also special
(string->char-set "@:"))) ; in UH strings.
(define (server->string uh)
(let* ((us (server-user uh))
(pw (server-password uh))
(ho (server-host uh))
(po (server-port uh))
;; Encode before assembly in case pieces contain colons or at-signs.
(e (lambda (s) (escape-uri s server-escaped-chars)))
(user/passwd (if us
`(,(e us) . ,(if pw `(":" ,(e pw) "@") '("@")))
'()))
(host/port (if ho
`(,(e ho) . ,(if po `(":" ,(e po)) '()))
'())))
(apply string-append (append user/passwd host/port))))
;;; Problem 2:
;;; according to RFC 2616, section 5.1.2, the Request-URI may only
;;; have a [? query] part if it's an absoluteURI; on the other hand
;;; only requests being made to proxies are supposed to use
;;; absoluteURIs; abs_path is the normal case. So this must be a mistake.
;;; See also http://skrb.org/ietf/http_errata.html#uriquery
;;;
;;; Solution to Problem 2:
;;, we implement Request_URIs of the form
;;; Request-URI = ( http_URL | abs_path ["?" query] ) ["#" fragment]
;;; HTTP URL parsing
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
;;; Note: we don't have to support Request-URIS of the form "*" or
;;; authority, because these are not used with the any of the methods
;;; HEAD, GET and POST, which are the only methods we implement so
;;; far.
;;; The PATH slot of this record is the URL's path split at slashes,
;;; e.g., "foo/bar//baz/" => ("foo" "bar" "" "baz" "")
;;; These elements are in raw, unescaped format. To convert back to
;;; a string, use (uri-path->uri (map escape-uri pathlist)).
;;; Here we depart from the RFCs:
;;; RFC 2616 and 1945 disallow a #fragment-suffix of the Request-URI.
;;; For compatibility with buggy clients we _do_ allow for it.
;;; (Apache does so, too).
;;; RexExps for Request-URIs as scsh SREs
;;; stick to RFC terminology throughout
;;; (see copy of Appendix A of RFC 2396 below)
;;;
;;; we implement Request_URIs of the form
;;; Request-URI = ( http_URL | abs_path ["?" query] ) ["#" fragment]
(define digit (rx numeric))
(define alpha (rx alphabetic))
(define alphanum (rx alphanumeric))
(define hex (rx hex-digit))
(define escaped (rx (: "%" ,hex ,hex)))
(define mark (rx ( "-_.!~*'()")))
(define unreserved (rx (~ (~ (| ,alphanum ,mark)))))
(define reserved (rx ( ";/?:@&=+$,")))
(define uric (rx (| ,reserved ,unreserved ,escaped)))
(define fragment (rx (* ,uric)))
(define query (rx (* ,uric)))
(define pchar-charset (rx ( ":@&=+$,")))
(define pchar (rx (| ,unreserved ,escaped ,pchar-charset)))
(define param (rx (* ,pchar)))
(define segment (rx (:
(* ,pchar)
(* (: ";" ,param)))))
(define path-segments (rx (:
,segment
(* (: "/" ,segment)))))
(define abs_path (rx (:
"/"
,path-segments)))
(define port (rx (* ,digit)))
(define IPv4address (rx (+ ,digit) "." (+ ,digit) "." (+ ,digit) "." (+ ,digit)))
(define toplabel (rx (:
(|
,alpha
(:
,alpha
(* (| ,alphanum "-"))
,alphanum)))))
(define domainlabel (rx (:
(|
,alphanum
(: ,alphanum
(* (| ,alphanum "-"))
,alphanum)))))
(define hostname (rx (:
(* (: ,domainlabel "."))
,toplabel
(? "."))))
(define host (rx (| ,hostname ,IPv4address)))
(define http_URL (rx (:
"http://"
(submatch ,host)
(?
(: ":" (submatch ,port)))
(?
(: (submatch ,abs_path)
(?
(: "?" (submatch ,query))))))))
(define http_URL_with_frag (rx (: bos
,@http_URL
(? (: "#" ,fragment))
eos)))
(define abs_path_with_frag (rx (: bos
(submatch ,abs_path)
(? (: "?" (submatch ,query)))
(? (: "#" ,fragment))
eos)))
(define Request-URI (rx (| ,@http_URL_with_frag ,@abs_path_with_frag)))
;;; parse a HTTP 1.1 Request_URI
;;;
;;; return matches of regexps host, port, abs_path, query;
;;;
;;; If request-uri is a relative URI, host and port are #f;
;;; port and query are also #f if they are not given.
;;; If there's no abs_path given, or abs_path is "/", path is the empty list;
;;; otherwise it is a list containing the path's segments.
;;;
;;; Caution: parse-url doesn't unescape anything yet!
(define (parse-url request-uri)
(cond
((regexp-search abs_path_with_frag request-uri)
=> (lambda (match)
(let ((path (split-abs-path (match:substring match 1)))
(query (match:substring match 2)))
(values #f #f path query))))
((regexp-search http_URL_with_frag request-uri)
=>(lambda (match)
(let ((host (match:substring match 1))
(port (match:substring match 2))
(path (split-abs-path (match:substring match 3)))
(query (match:substring match 4)))
(values host port path query))))
(else
(fatal-syntax-error "Request-URI syntactically faulty"))))
;;; split the string abs-path at slashes, return list of 'segments' (see RegExp definition above).
;;;
;;; SPLIT-PATH assumes abs-path if either #f or matches the RegExp abs_path,
;;; no checks are done.
;;;
;;; Remark: abs_path allows for strings containing several consecutive slashes;
;;; SPLIT-ABS-PATH treats them as one slash.
;;; (e.g., "/foo///bar//baz" => ("foo" "bar" "baz"))
;;;
;;; Note: we have to differentiate between paths with trailing
;;; slash(es) and paths without and hand that information over
;;; to the request handler. (See
;;; http://httpd.apache.org/docs-2.0/misc/rewriteguide.html ->
;;;"Trailing Slash problem" for the reasons.)
;;; If there is one or more trailing slash(es) the last element of the
;;; returned list will be an empty string.
;;; (e.g., "/foo///bar//baz//" => ("foo" "bar" "baz" ""))
(define (split-abs-path abs-path)
(if abs-path
(let* ((trailing-slash (char=? #\/ (string-ref abs-path (- (string-length abs-path) 1))))
(last-element (if trailing-slash '("") '())))
(regexp-fold-right
(rx (+ (~ ("/"))))
(lambda (match i res)
(cons (match:substring match 0) res))
last-element
abs-path))
'()))
;;; record type HTTP-URL for Request_URIs
;;;
;;; The HOST slot is a non-empty-string or #f.
;;;
;;; The PORT slot is an integer or #f.
;;;
;;; The PATH slot is a list of strings containing the Request_URI's
;;; path split at slashes and unescaped. If the Request_URI's path
;;; ends with a slash, an empty string is inserted as the last element
;;; of the list.
;;; (e.g., "/foo///bar//baz" => ("foo" "bar" "baz"))
;;; (e.g., "/foo///bar//baz//" => ("foo" "bar" "baz" ""))
;;;
;;; The QUERY slot is an non-empty-string, still in its escaped
;;; representation, or #f.
;;; Caution: the path slot of a http-url record has already been
;;; UNESCAPED; don't unescape it a second time!
;;; The query slot is still in its escaped representation.
(define-record-type http-url :http-url
(make-http-url server path search fragment-identifier)
(make-http-url host port path query)
http-url?
(server http-url-server) ; Initial //anonymous@clark.lcs.mit.edu:80/
(path http-url-path) ; Rest of path, split at slashes & decoded.
(search http-url-search)
(fragment-identifier http-url-fragment-identifier))
(host http-url-host)
(port http-url-port)
(path http-url-path)
(query http-url-query))
;;; The URI parser (parse-uri in uri.scm) maps a string to four parts:
;;; <scheme> : <path> ? <search> # <frag-id> <scheme>, <search>, and
;;; <frag-id> are strings; <path> is a non-empty string list -- the
;;; URI's path split at slashes. Optional parts of the URI, when
;;; missing, are specified as #f. If <scheme> is "http", then the
;;; other three parts can be passed to PARSE-HTTP-URL, which parses
;;; them into a HTTP-URL record. All strings come back from the URI
;;; parser encoded. SEARCH and FRAG-ID are left that way; this parser
;;; decodes the path elements.
;;;
;;; Returns a HTTP-URL record, if possible. Otherwise
;;; FATAL-SYNTAX-ERROR is called.
;;; Is http-url of the form http_URL, i.e. absolute?
(define (absolute-url? http-url)
(http-url-host http-url))
(define (parse-http-url path search frag-id)
(let ((uh (parse-server path default-http-server)))
(if (or (server-user uh) (server-password uh))
(fatal-syntax-error
"HTTP URL's may not specify a user or password field" path))
;;; parse a HTTP 1.1. Request_URI into a http-url record
(make-http-url uh (map unescape-uri (cdddr path)) search frag-id)))
(define (url-string->http-url uri-string)
(receive (host port path query)
(parse-url uri-string)
(let ((portnumber (and port (string->number port)))
(unescaped-path (map unescape path)))
(make-http-url host portnumber unescaped-path query))))
(define (parse-http-url-string string)
(call-with-values
(lambda () (parse-uri string))
(lambda (scheme path search frag-id)
(if (string=? scheme "http")
(parse-http-url path search frag-id)
(fatal-syntax-error "not an HTTP URL" path)))))
;;; Unparse a http-url record into its corresponding Request_URI
;;; Default http port is 80.
(define default-http-server (make-server #f #f #f "80"))
;;; The following holds (apart from multiple slashes in the path,
;;; which are removed by url-string->http-url):
;;; (http-url->url-string (url-string->http-url <request-uri-string>)) == <request-uri-string>
(define (http-url->url-string http-url)
;;; Unparse.
(let* ((host (http-url-host http-url))
(scheme-and-host-string
(if host
(string-append "http://" host)
""))
(define (http-url->string url)
(string-append "http://"
(server->string (http-url-server url))
"/"
(uri-path->uri (map escape-uri (http-url-path url)))
(cond ((http-url-search url) =>
(lambda (s) (string-append "?" s)))
(else ""))
(cond ((http-url-fragment-identifier url) =>
(lambda (fi) (string-append "#" fi)))
(else ""))))
(port (http-url-port http-url))
(port-string
(if port
(string-append ":" (number->string port))
""))
(path (http-url-path http-url))
(path-string
(fold-right
(lambda (segment res)
(string-append "/" (escape-segment segment) res))
""
path))
(query (http-url-query http-url))
(query-string (if query
(string-append "?" query)
"")))
(string-append scheme-and-host-string port-string path-string query-string)))
;;; Unparse the http-url-path field of an http-url record into its
;;; corresponding part of the Request_URI
(define (http-url-path->path-string http-url-path)
(fold-right
(lambda (segment res)
(string-append "/" (escape-segment segment) res))
""
http-url-path))
;;; decoding and encoding Request-URIs:
;;; to decode Request-URIs use UNESCAPE from uri.scm
;;; encode Request-URIs:
;;; Each component of a URI may have its own set of characters that are reserved,
;;; -> differentiate between components.
;;; not allowed within component 'segment' in 'abs_path'
(define segment-reserved-and-excluded (rx (~ ,unreserved ,pchar-charset (";"))))
;;; not allowed within component 'query'
(define query-reserved-and-excluded (rx (~ ,unreserved ,reserved )))
;;; encode 'abs_path' portion of a URI:
;;; use SPLIT-PATH to split abs_path into its segments,
;;; then apply ESCAPE-SEGMENT to the segments.
(define (escape-segment segment)
(escape segment segment-reserved-and-excluded))
;;; encode 'query' portion of a URI
(define (escape-query query)
(escape query query-reserved-and-excluded))
;;; encode something we don't know: escape all but the unreserved characters.
(define (escape-not-unreserved-chars something)
(escape something (rx (~ ,unreserved))))
;; Appendix A of RFC 2396
;;
;A. Collected BNF for URI
; URI-reference = [ absoluteURI | relativeURI ] [ "#" fragment ]
; absoluteURI = scheme ":" ( hier_part | opaque_part )
; relativeURI = ( net_path | abs_path | rel_path ) [ "?" query ]
; hier_part = ( net_path | abs_path ) [ "?" query ]
; opaque_part = uric_no_slash *uric
; uric_no_slash = unreserved | escaped | ";" | "?" | ":" | "@" |
; "&" | "=" | "+" | "$" | ","
; net_path = "//" authority [ abs_path ]
; abs_path = "/" path_segments
; rel_path = rel_segment [ abs_path ]
; rel_segment = 1*( unreserved | escaped |
; ";" | "@" | "&" | "=" | "+" | "$" | "," )
; scheme = alpha *( alpha | digit | "+" | "-" | "." )
; authority = server | reg_name
; reg_name = 1*( unreserved | escaped | "$" | "," |
; ";" | ":" | "@" | "&" | "=" | "+" )
; server = [ [ userinfo "@" ] hostport ]
; userinfo = *( unreserved | escaped |
; ";" | ":" | "&" | "=" | "+" | "$" | "," )
; hostport = host [ ":" port ]
; host = hostname | IPv4address
; hostname = *( domainlabel "." ) toplabel [ "." ]
; domainlabel = alphanum | alphanum *( alphanum | "-" ) alphanum
; toplabel = alpha | alpha *( alphanum | "-" ) alphanum
; IPv4address = 1*digit "." 1*digit "." 1*digit "." 1*digit
; port = *digit
; path = [ abs_path | opaque_part ]
; path_segments = segment *( "/" segment )
; segment = *pchar *( ";" param )
; param = *pchar
; pchar = unreserved | escaped |
; ":" | "@" | "&" | "=" | "+" | "$" | ","
; query = *uric
; fragment = *uric
; uric = reserved | unreserved | escaped
; reserved = ";" | "/" | "?" | ":" | "@" | "&" | "=" | "+" |
; "$" | ","
; unreserved = alphanum | mark
; mark = "-" | "_" | "." | "!" | "~" | "*" | "'" |
; "(" | ")"
; escaped = "%" hex hex
; hex = digit | "A" | "B" | "C" | "D" | "E" | "F" |
; "a" | "b" | "c" | "d" | "e" | "f"
; alphanum = alpha | digit
; alpha = lowalpha | upalpha
; lowalpha = "a" | "b" | "c" | "d" | "e" | "f" | "g" | "h" | "i" |
; "j" | "k" | "l" | "m" | "n" | "o" | "p" | "q" | "r" |
; "s" | "t" | "u" | "v" | "w" | "x" | "y" | "z"
; upalpha = "A" | "B" | "C" | "D" | "E" | "F" | "G" | "H" | "I" |
; "J" | "K" | "L" | "M" | "N" | "O" | "P" | "Q" | "R" |
; "S" | "T" | "U" | "V" | "W" | "X" | "Y" | "Z"
; digit = "0" | "1" | "2" | "3" | "4" | "5" | "6" | "7" |
; "8" | "9"

View File

@ -3,10 +3,6 @@
;;; This file is part of the Scheme Untergrund Networking package.
;;; Copyright (c) 1994 by Brian D. Carlstrom and Olin Shivers.
;;; Copyright (c) 1996-2002 by Mike Sperber.
;;; Copyright (c) 2000-2002 by Martin Gasbichler.
;;; Copyright (c) 1998-2001 by Eric Marsden.
;;; For copyright information, see the file COPYING which comes with
;;; the distribution.
@ -19,6 +15,7 @@
(define-interface htmlout-interface
(export emit-tag
emit-empty-tag
emit-close-tag
emit-p
@ -29,7 +26,10 @@
with-tag*
escape-html
emit-text))
emit-text
emit-prolog
xmlnsdecl-attr))
(define-interface smtp-interface
(export smtp-send-mail
@ -49,37 +49,23 @@
rfc822-time->string))
(define-interface uri-interface
(export parse-uri
uri-escaped-chars
unescape-uri
escape-uri
split-uri
uri-path->uri
simplify-uri-path))
(export unescape
escape))
(define-interface url-interface
(export server?
make-server
server-user
server-password
server-host
server-port
parse-server
server->string
(export escape-not-unreserved-chars
escaped
http-url?
make-http-url
http-url-server
http-url-host
http-url-port
http-url-path
http-url-search
http-url-fragment-identifier
http-url-query
parse-http-url
parse-http-url-string
http-url->string))
absolute-url?
url-string->http-url
http-url->url-string
http-url-path->path-string))
(define-interface ftp-library-interface
(export copy-port->port-binary
@ -341,7 +327,8 @@
version->string))
(define-interface httpd-responses-interface
(export make-response response?
(export http-version
make-response response?
response-code
response-message
response-seconds
@ -358,6 +345,7 @@
make-writer-body writer-body?
make-reader-writer-body reader-writer-body?
make-redirect-body redirect-body? redirect-body-location
no-body?
display-http-body
status-code?
@ -370,6 +358,10 @@
make-error-response
make-redirect-response))
(define-interface httpd-handler-lib-interface
(export get-socket-host-string
read-message-body))
(define-interface httpd-basic-handlers-interface
(export make-predicate-handler
make-path-predicate-handler
@ -469,19 +461,14 @@
(define-structure uri uri-interface
(open scheme-with-scsh
(subset srfi-13 (string-index string-index-right string-fold string-join))
let-opt
receiving
ascii
bitwise
field-reader-package)
bitwise)
(files (lib uri)))
(define-structure url url-interface
(open scheme-with-scsh
define-record-types
receiving
(subset srfi-13 (string-index))
(subset srfi-1 (fold-right))
uri
httpd-errors)
(files (lib url)))
@ -599,7 +586,7 @@
(open scheme-with-scsh
format-net
sigevents
(subset srfi-13 (string-join))
(subset srfi-13 (string-join string-skip string-trim-both))
dns
let-opt ; :optional
locks
@ -642,7 +629,6 @@
rfc822
handle ; ignore-errors
conditions ; condition-stuff
uri
url
format-net
rate-limit ; rate-limiting stuff
@ -660,6 +646,7 @@
httpd-logging
httpd-requests
httpd-responses
httpd-handler-lib
sunet-version
)
@ -696,8 +683,7 @@
i/o ; make-null-output-port
locks
receiving
uri ; uri-path->uri
url ; http-url-path
url ; http-url-path, http-url-path->path-string
httpd-requests ; request record
httpd-responses
formats
@ -721,6 +707,8 @@
(define-structure httpd-responses httpd-responses-interface
(open scheme
(subset scsh (format-date write-string time date))
url
htmlout
syslog
define-record-types
finite-types
@ -730,12 +718,26 @@
httpd-read-options)
(files (httpd response)))
(define-structure httpd-handler-lib httpd-handler-lib-interface
(open scheme-with-scsh
crlf-io ; read-crlf-line
rfc822 ;read-rfc822-headers
format-net ;format-internet-host-address
(subset srfi-13 (string-trim-both string-trim string-prefix? string-reverse string-contains string-take))
handle-fatal-error
sunet-utilities ;get-header
httpd-requests
httpd-responses
httpd-errors)
(files (httpd handler-lib)))
(define-structure httpd-basic-handlers httpd-basic-handlers-interface
(open scheme-with-scsh
rfc822
httpd-requests ; REQUEST record type, v0.9-request
(subset srfi-1 (fold-right))
(subset srfi-13 (string-trim string-prefix-ci?))
sunet-utilities
httpd-responses
httpd-errors
)
@ -748,6 +750,7 @@
httpd-requests
httpd-responses
httpd-errors
httpd-handler-lib
httpd-basic-handlers
httpd-read-options
url
@ -768,7 +771,7 @@
httpd-requests ; v0.9-request
httpd-responses
httpd-logging ; http-log
uri ; UNESCAPE-URI
httpd-handler-lib
htmlout ; Formatted HTML output
pp
(subset srfi-13 (string-skip))
@ -777,7 +780,8 @@
handle ; IGNORE-ERROR
parse-html-forms ; PARSE-HTML-FORM-QUERY
threads ; SLEEP
sunet-utilities ; GET-HEADER
sunet-utilities
handle-fatal-error
)
(files (httpd seval)))
@ -815,7 +819,7 @@
(define-structure httpd-cgi-handlers httpd-cgi-handlers-interface
(open scheme-with-scsh
(subset srfi-1 (alist-delete))
(subset srfi-13 (string-prefix? string-index string-trim substring/shared))
(subset srfi-13 (string-prefix? string-index string-trim substring/shared string-join))
rfc822
crlf-io ; WRITE-CRLF
uri
@ -825,6 +829,7 @@
httpd-responses
httpd-basic-handlers ; HTTP-HOMEDIR, SERVE-ROOTED-FILE-PATH
httpd-errors ; HTTP-ERROR
httpd-handler-lib
httpd-file-directory-handlers ; dot-dot-check, copy-inport->outport
sunet-version
formats

View File

@ -111,11 +111,6 @@ exec scsh -lel SSAX-4.9/load.scm -lel sunet-2.1/load.scm -dm -o http-test -e mai
(else
(error "Internal error, option not found" option alist))))
(define (become-nobody-if-root)
(cond ((zero? (user-uid))
(set-gid (->gid "nobody"))
(set-uid (->uid "nobody")))))
(define (main args)
(with-cwd
(file-name-directory (car args))
@ -126,6 +121,9 @@ exec scsh -lel SSAX-4.9/load.scm -lel sunet-2.1/load.scm -dm -o http-test -e mai
(log-file-name . "/tmp/httpd.log")
(requests . 5)))
(options (make-options-from-args (cdr args) default-options)))
(cond ((zero? (user-uid))
(set-gid (->gid "nobody"))
(set-uid (->uid "nobody"))))
(format #t "Going to run Webserver with:
htdocs-dir: ~a
@ -147,15 +145,15 @@ exec scsh -lel SSAX-4.9/load.scm -lel sunet-2.1/load.scm -dm -o http-test -e mai
with-simultaneous-requests (lookup-option options 'requests)
with-syslog? #t
with-log-file (lookup-option options 'log-file-name)
with-post-bind-thunk become-nobody-if-root
with-request-handler
(alist-path-dispatcher
(list (cons "seval" seval-handler)
(list (cons "h" (home-dir-handler "public_html"))
(cons "seval" seval-handler)
;; You may want to adapt this to your site.
;; call like http://localhost:8080/man/man?ssh(1)
(cons "man" (rman-handler 'man
'nroff
"/usr/X11R6/bin/rman"
"/usr/bin/rman"
"/usr/bin/zcat"
#f "man?%s(%s)"
"Generated by rman-gateway"))
@ -165,9 +163,8 @@ exec scsh -lel SSAX-4.9/load.scm -lel sunet-2.1/load.scm -dm -o http-test -e mai
"Generated by info-gateway"))
(cons "cgi-bin" (cgi-handler
(lookup-option options 'cgi-bin-dir))))
(tilde-home-dir-handler "public_html"
(rooted-file-or-directory-handler
(lookup-option options htdocs-dir)))))))))
(rooted-file-or-directory-handler
(lookup-option options 'htdocs-dir))))))))
))
;; EOF

View File

@ -11,8 +11,10 @@ exec scsh -lel SSAX-4.9/load.scm -lel sunet-2.1/load.scm -dm -o surflet-server -
httpd-make-options
httpd-basic-handlers
httpd-file-directory-handlers
httpd-cgi-handlers
httpd-seval-handlers
; cgi-server
; seval-handler
; rman-gateway
; info-gateway
surflet-handler
surflet-handler/options
let-opt
@ -27,7 +29,6 @@ exec scsh -lel SSAX-4.9/load.scm -lel sunet-2.1/load.scm -dm -o surflet-server -
(format #f
"Usage: start-surflet-server
[-h DIR | --htdocs-dir=DIR] [-s DIR | --surflet-dir=DIR]
[--cgi-bin-dir=DIR]
[-i DIR | --images-dir=DIR] [-p NUM | --port=NUM]
[-l FILE | --log-file-name=FILE] [-r NUM | --requests=NUM]
[--help]
@ -35,14 +36,14 @@ exec scsh -lel SSAX-4.9/load.scm -lel sunet-2.1/load.scm -dm -o surflet-server -
with
htdocs-dir directory of html files (default: root/htdocs)
surflet-dir directory of SUrflet files (default: root/surflets)
cgi-bin-dir directory of cgi files (default: root/cgi-bin)
images-dir directory of images files (default: root/img)
port port server is listening to (default: 8080)
port port server is listening to (default: 8008)
log-file-name directory where to store the logfile in CLF
(default: /tmp/httpd.log)
requests maximal amount of simultaneous requests (default 5)
--help show this help
"))
NOTE: This is the SUrflet-server. It does not support cgi-bin.~%"))
(define (display-usage)
(display (usage) (current-error-port))
@ -82,9 +83,6 @@ exec scsh -lel SSAX-4.9/load.scm -lel sunet-2.1/load.scm -dm -o surflet-server -
(surflet-dir-option
(option '(#\s "surflet-dir") #t #f
(absolute-file-option-proc 'surflet-dir)))
(cgi-bin-dir-option
(option '(#\c "cgi-bin-dir") #t #f
(absolute-file-option-proc 'cgi-bin-dir)))
(images-dir-option
(option '(#\i "images-dir") #t #f
(absolute-file-option-proc 'images-dir)))
@ -103,7 +101,6 @@ exec scsh -lel SSAX-4.9/load.scm -lel sunet-2.1/load.scm -dm -o surflet-server -
(display-usage)))))
(args-fold arg-list
(list htdocs-dir-option surflet-dir-option
cgi-bin-dir-option
images-dir-option port-option
log-file-name-option requests-option
help-option)
@ -131,38 +128,34 @@ exec scsh -lel SSAX-4.9/load.scm -lel sunet-2.1/load.scm -dm -o surflet-server -
(main `(main ,@(car args)))
(main '(main))))
(define (become-nobody-if-root)
(cond ((zero? (user-uid))
(set-gid (->gid "nobody"))
(set-uid (->uid "nobody")))))
(define (main args)
(with-cwd
(file-name-directory (car args))
(let* ((default-options
`((htdocs-dir . ,(absolute-file-name "root/htdocs"))
(surflet-dir . ,(absolute-file-name "root/surflets"))
(cgi-bin-dir . ,(absolute-file-name "root/cgi-bin"))
(images-dir . ,(absolute-file-name "root/img"))
(port . 8080)
(port . 8008)
(log-file-name . "/tmp/httpd.log")
(requests . 5)))
(options (make-options-from-args (cdr args) default-options)))
(cond ((zero? (user-uid))
(set-gid (->gid "nobody"))
(set-uid (->uid "nobody"))))
(format #t "Going to run SUrflet server with:
htdocs-dir: ~a
surflet-dir: ~a
cgi-bin-dir: ~a
images-dir: ~a
port: ~a
log-file-name: ~a
a maximum of ~a simultaneous requests, syslogging activated,
and home-dir-handler (public_html) activated.
NOTE: This is the SUrflet server. It does not support cgi.
"
(lookup-option options 'htdocs-dir)
(lookup-option options 'surflet-dir)
(lookup-option options 'cgi-bin-dir)
(lookup-option options 'images-dir)
(lookup-option options 'port)
(lookup-option options 'log-file-name)
@ -175,7 +168,6 @@ exec scsh -lel SSAX-4.9/load.scm -lel sunet-2.1/load.scm -dm -o surflet-server -
with-simultaneous-requests (lookup-option options 'requests)
with-syslog? #t
with-log-file (lookup-option options 'log-file-name)
with-post-bind-thunk become-nobody-if-root
;; The following settings are made to avoid dns lookups.
with-reported-port (lookup-option options 'port)
with-fqdn "localhost"
@ -183,8 +175,7 @@ exec scsh -lel SSAX-4.9/load.scm -lel sunet-2.1/load.scm -dm -o surflet-server -
with-request-handler
(alist-path-dispatcher
(list
(cons "cgi-bin" (cgi-handler (lookup-option options 'cgi-bin-dir)))
(cons "seval" seval-handler)
(cons "h" (home-dir-handler "public_html"))
(cons "source" (rooted-file-or-directory-handler
(lookup-option options 'surflet-dir)
(with-file-name->content-type
@ -198,9 +189,8 @@ exec scsh -lel SSAX-4.9/load.scm -lel sunet-2.1/load.scm -dm -o surflet-server -
(cons "surflet" (surflet-handler
(with-surflet-path
(lookup-option options 'surflet-dir)))))
(tilde-home-dir-handler "public_html"
(rooted-file-or-directory-handler
(lookup-option options 'htdocs-dir)))))))))
(rooted-file-or-directory-handler
(lookup-option options 'htdocs-dir))))))))
))
;; EOF

View File

@ -9,24 +9,19 @@
<ul>
<li><a href="/sunet-manual/index.html">SUnet release manual</a></li>
<li><a href="/cgi-bin/comments.sh">A small CGI script</a></li>
<li><a href="/index-surflet.html">SUrflets homepage</a>
(<code>start-surflet-server</code> only)</li>
<li><a href="/index-surflet.html">SUrflets homepage</a></li>
<li><a href="seval.html">Computing Scheme Forms
Interactively</a></li>
<li><a href="files/text.txt">Text file</a></li>
<li><a href="files">Directory</a></li>
<li><a href="files/zipped.gz">Compressed File</a></li>
<li><a href="index.html">This file</a></li>
<li><a href="man/man?man(1)">man ls</a>
(<code>start-extended-web-server</code> only)</li>
<li><a href="info/info?(info.info)Top">Info page for Info</a>
(<code>start-extended-web-server</code> only)</li></li>
</ul>
<br>
<hr>
<!-- Created: Thu Aug 22 16:44:16 CEST 2002 -->
<!-- hhmts start -->
Last modified: Mon May 17 10:13:07 MST 2004
Last modified: Wed Apr 23 09:25:58 MST 2003
<!-- hhmts end -->
</body>
</html>

View File

@ -28,7 +28,7 @@
(p (url "/" "Return to main menu") (br)
(url "add-html.scm" "Start new calculation."))))))))
(let* ((bindings (form-query-list
(http-url-search (surflet-request-url result))))
(http-url-query (surflet-request-url result))))
(number (string->number
(extract-single-binding "number" bindings))))
(if number
@ -53,7 +53,7 @@
(a (@ (href "javascript:history.go(-2)")) "New calculation (same session)")(br)
(a (@ (href ,new-url)) "Close this session")))))))
;; How to clear session data and go to another HTML page:
(send-error (status-code moved-temp) req
(send-error (status-code temp-redirect) req
"/" "/")
))
; ))

View File

@ -2,7 +2,7 @@
(open surflet-requests ; SURFLET-REQUEST-url
httpd-responses ; MAKE-RESPONSE
parse-html-forms ; PARSE-HTML-FORM-QUERY
url ; HTTP-url-SEARCH
url ; http-url-query
srfi-1 ; FILTER
surflet-handler/surflets ; SEND/SUSPEND, SEND/FINISH
surflet-handler/primitives ; MAKE-SURFLET-RESPONSE
@ -79,7 +79,7 @@
(let* ((title (if (pair? maybe-title) (car maybe-title) #f))
(result (send/suspend (make-get-number-page input-text title)))
(bindings (parse-html-form-query
(http-url-search (surflet-request-url result))))
(http-url-query (surflet-request-url result))))
(number (string->number
(extract-single-binding "number" bindings))))
(if number
@ -96,7 +96,7 @@
(send/suspend make-result-page)
;; This finishes the session and does a redirect to the root
;; page.
(send-error (status-code moved-temp) #f "/" "/"))
(send-error (status-code temp-redirect) #f "/" "/"))
))