adapt documentation to reflect removal of old uri-parser and addition
of new one
parent
a1e79c4fc7
commit
8de8e01f0d
|
@ -1,6 +1,8 @@
|
||||||
\chapter{Parsing and Processing URIs}\label{cha:uri}
|
\chapter{Parsing and Processing URIs}\label{cha:uri}
|
||||||
|
|
||||||
The \ex{uri} structure contains a library for dealing with URIs.
|
The \ex{uri} structure contains a library for dealing with URIs.
|
||||||
|
It is out of date by now: it is built upon RFC 1630 of 1994, which
|
||||||
|
was replaced by RFC 1738, RFC 1808, and finally RFC 2396 of 1998.
|
||||||
|
|
||||||
\section{Notes on URI Syntax}
|
\section{Notes on URI Syntax}
|
||||||
|
|
||||||
|
@ -24,43 +26,6 @@ be used in a URI.
|
||||||
|
|
||||||
\section{Procedures}
|
\section{Procedures}
|
||||||
|
|
||||||
\defun{parse-uri} {uri-string } {scheme path search
|
|
||||||
frag-id} \label{proc:parse-uri}
|
|
||||||
\begin{desc}
|
|
||||||
Parses an \var{uri\=string} into its four fields.
|
|
||||||
The fields are \emph{not} unescaped, as the rules for
|
|
||||||
parsing the \var{path} component in particular need unescaped
|
|
||||||
text, and are dependent on \var{scheme}. The URL parser is
|
|
||||||
responsible for doing this. If the \var{scheme}, \var{search}
|
|
||||||
or \var{fragid} portions are not specified, they are \sharpf.
|
|
||||||
Otherwise, \var{scheme}, \var{search}, and \var{fragid} are
|
|
||||||
strings. \var{path} is a non-empty string list---the path split
|
|
||||||
at slashes.
|
|
||||||
\end{desc}
|
|
||||||
|
|
||||||
Here is a description of the parsing technique. It works inwards from
|
|
||||||
both ends:
|
|
||||||
\begin{itemize}
|
|
||||||
\item First, the code searches forwards for the first reserved
|
|
||||||
character (\verb|=|, \verb|;|, \verb|/|, \verb|#|, \verb|?|,
|
|
||||||
\verb|:| or \verb|space|). If it's a colon, then that's the
|
|
||||||
\var{scheme} part, otherwise there is no \var{scheme} part. At
|
|
||||||
all events, it is removed.
|
|
||||||
\item Then the code searches backwards from the end for the last reserved
|
|
||||||
char. If it's a sharp, then that's the \var{fragid} part---remove it.
|
|
||||||
\item Then the code searches backwards from the end for the last reserved
|
|
||||||
char. If it's a question-mark, then that's the \var{search}
|
|
||||||
part---remove it.
|
|
||||||
\item What's left is the path. The code splits it at slashes. The
|
|
||||||
empty string becomes a list containing the empty string.
|
|
||||||
\end{itemize}
|
|
||||||
%
|
|
||||||
This scheme is tolerant of the various ways people build broken
|
|
||||||
URIs out there on the Net\footnote{So it does not absolutely conform
|
|
||||||
to RFC~1630.}, e.g.\ \verb|=| is a reserved character, but used
|
|
||||||
unescaped in the search-part. It was given to me\footnote{That's
|
|
||||||
Olin Shivers.} by Dan Connolly of the W3C and slightly modified.
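
The inwards-from-both-ends technique described above can be sketched as follows. This is only an illustrative Python sketch, not the library's Scheme code: the names \verb|RESERVED|, \verb|_last_reserved| and \verb|parse_uri| are hypothetical, \verb|None| stands in for \sharpf, and skipping \verb|=| while looking for the question mark is an assumption based on the remark that \verb|=| is tolerated unescaped in the search part.

```python
# Reserved characters named in the description ("space" is the blank).
RESERVED = set("=;/#?: ")


def _last_reserved(s, reserved):
    """Return the index of the last character of s found in reserved, or None."""
    for i in range(len(s) - 1, -1, -1):
        if s[i] in reserved:
            return i
    return None


def parse_uri(uri):
    """Split uri into (scheme, path, search, frag_id); nothing is unescaped."""
    scheme = search = frag_id = None

    # Forwards: the first reserved character ends the scheme only if it is ':'.
    first = next((i for i, c in enumerate(uri) if c in RESERVED), None)
    if first is not None and uri[first] == ":":
        scheme, uri = uri[:first], uri[first + 1:]

    # Backwards: the last reserved character starts the fragment only if it is '#'.
    last = _last_reserved(uri, RESERVED)
    if last is not None and uri[last] == "#":
        frag_id, uri = uri[last + 1:], uri[:last]

    # Backwards again for '?'. Assumption: '=' is ignored in this pass so that
    # an unescaped '=' inside the search part does not hide the '?'.
    last = _last_reserved(uri, RESERVED - {"="})
    if last is not None and uri[last] == "?":
        search, uri = uri[last + 1:], uri[:last]

    # What is left is the path; "" splits to [""], so the list is never empty.
    return scheme, uri.split("/"), search, frag_id
```

Under these assumptions, \verb|parse_uri("")| yields a path of \verb|[""]| and \verb|None| for the other three fields, matching the rule that the empty string becomes a list containing the empty string.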
|
|
||||||
|
|
||||||
\defun{unescape-uri}{string [start] [end]}{string}
|
\defun{unescape-uri}{string [start] [end]}{string}
|
||||||
\begin{desc}
|
\begin{desc}
|
||||||
\ex{Unescape-uri} unescapes a string. If \var{start} and/or \var{end} are
|
\ex{Unescape-uri} unescapes a string. If \var{start} and/or \var{end} are
|
||||||
|
|
|
@ -56,6 +56,21 @@ For details about escaping and unescaping see Chapter~\ref{cha:uri}.
|
||||||
|
|
||||||
\section{HTTP URLs}
|
\section{HTTP URLs}
|
||||||
|
|
||||||
|
\defun{parse-uri} {uri-string } {host port path query} \label{proc:parse-uri}
|
||||||
|
\begin{desc}
|
||||||
|
Parses an HTTP 1.1 \var{uri\=string} into its four fields.
|
||||||
|
The fields returned are \emph{not} decoded.
|
||||||
|
If \var{uri\=string} is not an http URL but an abs\_path, it has no
|
||||||
|
\var{host} or \var{port}. Any of the \var{host}, \var{port},
|
||||||
|
and \var{query} portions that are not specified are \sharpf.
|
||||||
|
Otherwise, \var{host}, \var{port}, and \var{query} are
|
||||||
|
strings. \var{path} is a non-empty string list---the path split
|
||||||
|
at slashes.
|
||||||
|
\end{desc}
|
||||||
|
This parser does not strictly conform to RFC 2616, in that it allows
|
||||||
|
a fragment suffix. Furthermore, only http URLs, not absolute URIs in general, are
|
||||||
|
recognized.
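
The shape of the four returned fields may be easier to see in a small sketch. The following Python code is an illustration under stated assumptions, not the library's Scheme implementation: the name \verb|parse_http_uri| is hypothetical, \verb|None| stands in for \sharpf, a fragment suffix is accepted and simply discarded, and the leading slash of the path is dropped before splitting. The last two choices are guesses, since the text above does not say how the real parser treats them.

```python
def parse_http_uri(uri):
    """Return (host, port, path, query); absent parts are None, nothing is decoded."""
    host = port = None

    # Assumption: a fragment suffix is tolerated and thrown away.
    rest = uri.partition("#")[0]

    if rest.lower().startswith("http://"):
        rest = rest[len("http://"):]
        slash = rest.find("/")
        if slash < 0:
            # No abs_path given: treat it as the root path.
            authority, rest = rest, "/"
        else:
            authority, rest = rest[:slash], rest[slash:]
        host, _, port_text = authority.partition(":")
        port = port_text or None  # the port stays a string, as in the text above
    elif not rest.startswith("/"):
        # Only http URLs and abs_paths are recognized.
        raise ValueError("not an http URL or abs_path: %r" % uri)

    rest, has_query, query_text = rest.partition("?")
    query = query_text if has_query else None

    # Assumption: drop the leading slash, then split; "/" gives [""].
    path = rest[1:].split("/")
    return host, port, path, query
```

With these assumptions, \verb|parse_http_uri("/a/b")| returns \verb|None| for \var{host}, \var{port} and \var{query}, and the list \verb|["a", "b"]| for \var{path}.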
|
||||||
|
|
||||||
\defun{make-http-url}{server path search frag-id}{http-url}
|
\defun{make-http-url}{server path search frag-id}{http-url}
|
||||||
\defunx{http-url?}{thing}{boolean}
|
\defunx{http-url?}{thing}{boolean}
|
||||||
\defunx{http-url-server}{http-url}{server}
|
\defunx{http-url-server}{http-url}{server}
|
||||||
|
|