\chapter{Processing URIs}\label{cha:uri}

The \ex{uri} module contains library functions for dealing with URIs.

\section{Notes on URI Syntax}
The generic syntax of URIs (Uniform Resource Identifiers) is defined in
RFC 2396; see Appendix A of the RFC for a collected BNF of URIs.

Within a URI, non-printable ASCII characters are represented by an
\emph{escape encoding}. \emph{Reserved} characters, which serve as
delimiters between the different parts of a URI, must also be
\emph{escaped} if they are to appear as regular data in a URI component. The
set of characters actually \emph{reserved} within any given URI
component is defined by that component. Therefore,
\emph{escaping} can only be done when the URI is being created from
its component parts; likewise, a URI must be separated into its
component parts before \emph{unescaping} can be done.

Escape sequences have the form \verb|%|\var{h}\var{h},
where \var{h}\var{h} are the two hexadecimal digits representing the octet code. For
example, \verb|%20| is the escape encoding of the US-ASCII space character.
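
The escape-sequence format can be illustrated with a minimal Python sketch.
It is not taken from the \ex{uri} module; the names \verb|escape_octet| and
\verb|unescape_octet| are hypothetical and exist only for this example.

```python
HEX_DIGITS = "0123456789abcdefABCDEF"


def escape_octet(octet):
    """Return the escape sequence %hh for one octet (0..255)."""
    if not 0 <= octet <= 0xFF:
        raise ValueError("octet must be in the range 0..255")
    return "%{:02X}".format(octet)


def unescape_octet(seq):
    """Return the octet encoded by a three-character sequence %hh."""
    if (len(seq) != 3 or seq[0] != "%"
            or seq[1] not in HEX_DIGITS or seq[2] not in HEX_DIGITS):
        raise ValueError("not an escape sequence: {!r}".format(seq))
    return int(seq[1:], 16)


# The US-ASCII space character has octet code 0x20.
print(escape_octet(ord(" ")))  # prints %20
```

The sketch emits uppercase hexadecimal digits and accepts either case when
decoding; that choice is an assumption of the example.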
\section{Procedures}
\defun{unescape}{string}{string}
\begin{desc}
\ex{Unescape} returns a copy of \var{string} in which every escape
sequence is replaced by the character it encodes.
\end{desc}
%
This procedure may only be used \emph{after} the URI has been parsed into
its component parts (see above).
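
Why the order matters can be seen in the following Python sketch. The
function \verb|unescape| here is a hypothetical stand-in written for this
example, not the procedure of the \ex{uri} module, and it assumes that every
escaped octet denotes a single character.

```python
import re


def unescape(s):
    """Replace each %hh by the character with that octet code (stand-in)."""
    return re.sub(r"%([0-9A-Fa-f]{2})",
                  lambda m: chr(int(m.group(1), 16)), s)


# A path whose middle segment is the data "a/b"; its slash is escaped.
path = "docs/a%2Fb/c"

# Split on the delimiter first, then unescape each component part.
segments = [unescape(seg) for seg in path.split("/")]

# Unescaping first turns the escaped slash into a delimiter.
broken = unescape(path).split("/")

print(segments)
print(broken)
```

Splitting first yields three segments, one of which contains a slash as
data; unescaping first yields four segments, and the original structure
can no longer be recovered.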
\defun{escape}{string regexp}{string}
\begin{desc}
\ex{Escape} replaces reserved or excluded characters in \var{string}
by their escaped representation. \var{regexp} defines which
characters are reserved or excluded within the particular URI component
being escaped.
\end{desc}

This procedure may only be used on a URI \emph{component part}, not on a
complete URI made up of several component parts (see above). Use it to
write specialized escape procedures for the respective component
parts. (See the \ex{url} module for examples.)
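
The idea of a general escape procedure that is specialized per component
can be sketched in Python as follows. This is an illustration under stated
assumptions, not the code of the \ex{uri} or \ex{url} modules: the two
character sets and the names \verb|escape_segment| and
\verb|escape_query_value| are hypothetical, and characters are encoded as
UTF-8 octets before escaping.

```python
import re


def escape(s, reserved):
    """Escape every character of s that the compiled pattern matches."""
    def encode(match):
        octets = match.group(0).encode("utf-8")
        return "".join("%{:02X}".format(b) for b in octets)
    return reserved.sub(encode, s)


# Hypothetical reserved sets; a real component defines its own.
SEGMENT_RESERVED = re.compile(r"[/;?%\s]")
QUERY_VALUE_RESERVED = re.compile(r"[&=+%\s]")


def escape_segment(s):
    """Specialized escape procedure for one path segment."""
    return escape(s, SEGMENT_RESERVED)


def escape_query_value(s):
    """Specialized escape procedure for one query value."""
    return escape(s, QUERY_VALUE_RESERVED)


print(escape_segment("a/b c"))
print(escape_query_value("a/b c"))
```

The same input is escaped differently by the two procedures: the slash is
escaped in a path segment but left alone in a query value, because each
component reserves a different set of characters.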
%%% Local Variables:
%%% mode: latex
%%% TeX-master: "man"
%%% End: