\chapter{Parsing and Processing URLs}\label{cha:url}
%
This module contains procedures to parse and unparse URLs. So far,
only the parsing of HTTP URLs is implemented.

\section{Server Records}

A \textit{server} value describes a path prefix of the form
\var{user}:\var{password}@\var{host}:\var{port}. Such prefixes are
frequently used as the initial part of URLs describing Internet
resources.

\defun{make-server}{user password host port}{server}
\defunx{server?}{thing}{boolean}
\defunx{server-user}{server}{string-or-\sharpf}
\defunx{server-password}{server}{string-or-\sharpf}
\defunx{server-host}{server}{string-or-\sharpf}
\defunx{server-port}{server}{string-or-\sharpf}
\begin{desc}
\ex{Make-server} creates a new server record. Each slot is a
decoded string or \sharpf. (\var{Port} is also a string.)

\ex{server?} is the corresponding predicate; \ex{server-user},
\ex{server-password}, \ex{server-host} and \ex{server-port}
are the corresponding selectors.
\end{desc}
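The following transcript is a minimal sketch of how the constructor,
predicate and selectors fit together. It assumes that the URL
procedures are available at the REPL; the field values are made up for
illustration, and the results shown follow directly from the
descriptions above.
\begin{alltt}
> (define s (make-server "andreas" #f "www.scsh.net" "80"))
> (server? s)
#t
> (server-user s)
"andreas"
> (server-password s)    ; no password was given
#f
> (server-port s)        ; the port is a string, not a number
"80"
\end{alltt}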

\defun{parse-server}{path default}{server}
\defunx{server->string}{server}{string}
\begin{desc}
\ex{Parse-server} parses a URI path \var{path} (a list representing
a path, not a string) into a server value. Default values are taken
from the server \var{default}, except for the host. The values
are unescaped and stored in a server record, which is returned.
\ex{Fatal-syntax-error} is called if the specified path does not
begin with two slashes (i.e., if it does not start with `//\ldots').

\ex{server->string} does the inverse job: it unparses
\var{server} into a string. The elements of the record
are escaped before they are put together.

Example:
\begin{alltt}
> (define default (make-server "andreas" "se ret" "www.sf.net" "80"))
> (server->string default)
"andreas:se\%20ret@www.sf.net:80"
> (parse-server '("" "" "foo\%20bar@www.scsh.net" "docu" "index.html")
                default)
'#\{server\}
> (server->string ##)
"foo\%20bar:se\%20ret@www.scsh.net:80"
\end{alltt}
%
For details about escaping and unescaping see Chapter~\ref{cha:uri}.
\end{desc}

\section{HTTP URLs}

\defun{parse-uri}{uri-string}{host port path query}\label{proc:parse-uri}
\begin{desc}
Parses an HTTP 1.1 \var{uri-string} into its four fields.
The fields returned are \emph{not} decoded.
If \var{uri-string} is not an http URL but an abs\_path,
the \var{host}, \var{port}, and \var{query} portions are not
specified and are returned as \sharpf.
Otherwise, \var{host}, \var{port}, and \var{query} are
strings. \var{Path} is a non-empty list of strings, namely the path
split at slashes.
\end{desc}
This parser deviates from RFC 2616 in that it allows
a fragment suffix. Furthermore, it recognizes only http URLs, not
absolute URIs in general.
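
A caller has to receive all four fields at once. The following is a
minimal sketch, assuming that the fields are delivered as multiple
return values in the order given in the signature above; the URL is a
made-up example, and no results are shown because they depend on
details of the parser.
\begin{alltt}
(call-with-values
  (lambda () (parse-uri "http://www.scsh.net/docu/index.html"))
  (lambda (host port path query)
    ;; host, port and query are strings or #f;
    ;; path is a list of strings (still encoded)
    (list host port path query)))
\end{alltt}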

\defun{make-http-url}{server path search frag-id}{http-url}
\defunx{http-url?}{thing}{boolean}
\defunx{http-url-server}{http-url}{server}
\defunx{http-url-path}{http-url}{list}
\defunx{http-url-search}{http-url}{string-or-\sharpf}
\defunx{http-url-fragment-identifier}{http-url}{string-or-\sharpf}
%
\begin{desc}
\ex{Make-http-url} creates a new \ex{http-url} record.
\var{Server} is a record containing the initial part of the address
(like \ex{anonymous@clark.lcs.mit.edu:80}). \var{Path} contains the
URL's URI path (a list). These elements are in raw, unescaped
format. To convert them back to a string, use
\ex{(uri-path->uri (map escape-uri pathlist))}. \var{Search}
and \var{frag-id} are the last two parts of the URL. (See
Chapter~\ref{cha:uri} about the parts of a URI.)

\ex{Http-url?} is the predicate for HTTP URL values, and
\ex{http-url-server}, \ex{http-url-path}, \ex{http-url-search} and
\ex{http-url-fragment-identifier} are the corresponding selectors.
\end{desc}
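As a minimal sketch, an HTTP URL record can be built by hand from a
server record and a path list. The host and path used here are made up
for illustration, and the transcript assumes that the URL procedures
are available at the REPL; the results shown follow from the
descriptions above.
\begin{alltt}
> (define server (make-server #f #f "www.scsh.net" "80"))
> (define url (make-http-url server '("docu" "index.html") #f #f))
> (http-url? url)
#t
> (http-url-path url)      ; the path list is stored as given
'("docu" "index.html")
> (http-url-search url)    ; no search part was supplied
#f
\end{alltt}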

\defun{parse-http-url}{path search frag-id}{http-url}
\begin{defundescx}{http-url->string}{http-url}{string}
\ex{Parse-http-url} constructs an HTTP URL record from a URI path
(a list of path components), a search, and a frag-id component.

\ex{Http-url->string} does the inverse job: it converts an
HTTP URL record into a string.
\end{defundescx}
%
Note: The URI parser \ex{parse-uri} maps a string to four parts:
\var{scheme}, \var{path}, \var{search} and \var{frag-id} (see
Section~\ref{proc:parse-uri} for details). If \var{scheme} is
\ex{http}, then the other three parts can be passed to
\ex{parse-http-url}, which parses them into an \ex{http-url} record.
All strings come back from the URI parser encoded. \var{Search} and
\var{frag-id} are left that way; \ex{parse-http-url} decodes the path
elements. The first two list elements of the path, which indicate the
leading double slash, are omitted.

The following procedure combines the jobs of \ex{parse-uri} and
\ex{parse-http-url}:

\defun{parse-http-url-string}{string}{http-url}
\begin{desc}
This parses an HTTP URL and returns the corresponding URL value; it
calls \ex{fatal-syntax-error} if the URL string doesn't have an
\ex{http} scheme.
\end{desc}
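A typical use is to parse a complete URL string and then take the
record apart with the selectors described earlier. The following is a
sketch only: the URL is a made-up example, and no results are shown
because they depend on details of the parser.
\begin{alltt}
(define url
  (parse-http-url-string "http://www.scsh.net/docu/index.html"))

(http-url? url)                      ; check that we got an http-url record
(server-host (http-url-server url))  ; host part of the URL
(http-url-path url)                  ; decoded path components as a list
(http-url->string url)               ; unparse the record into a string
\end{alltt}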

%%% Local Variables:
%%% mode: latex
%%% TeX-master: "man"
%%% End: