\chapter{Processing URIs}\label{cha:uri}

The \ex{uri} module contains library functions for dealing with URIs.

\section{Notes on URI Syntax}

The generic syntax of URIs (Uniform Resource Identifiers) is defined in
RFC~2396; see Appendix~A of that RFC for a collected BNF of the URI
syntax.

Within a URI, non-printable ASCII characters are represented by an
\emph{escape encoding}. \emph{Reserved} characters, which serve as
delimiters between the different parts of a URI, must also be
\emph{escaped} if they are to appear as regular data within a URI
component. The set of characters actually \emph{reserved} within any
given URI component is defined by that component. Therefore
\emph{escaping} can only be done while the URI is being assembled from
its component parts; likewise, a URI must be separated into its
component parts before \emph{unescaping} can be done.

Escape sequences have the form \verb|%|\var{h}\var{h}, where
\var{h}\var{h} are the two hexadecimal digits representing the octet
code. For example, \verb|%20| is the escaped encoding of the US-ASCII
space character.

\section{Procedures}

\defun{unescape}{string}{string}
\begin{desc}
  \ex{Unescape} unescapes a string.
\end{desc}
%
This procedure may only be used \emph{after} the URI has been parsed
into its component parts (see above).

\defun{escape} {string regexp} {string}
\begin{desc}
  \ex{Escape} replaces reserved or excluded characters in \var{string}
  by their escaped representation. \var{regexp} defines which
  characters are reserved or excluded within the particular URI
  component being escaped.
\end{desc}
%
This procedure may only be used on a URI \emph{component part}, not on
a complete URI made up of several component parts (see above). Use it
to write specialized escape procedures for the respective component
parts. (See the \ex{url} module for examples.)

%%% Local Variables:
%%% mode: latex
%%% TeX-master: "man"
%%% End:
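The escaping rules described in this chapter can be illustrated with a small sketch. It is written in Python rather than in the library's own language and is not the \ex{uri} module's implementation; the names \texttt{escape}, \texttt{unescape} and \texttt{PATH\_SEGMENT\_RESERVED} are hypothetical, and encoding non-ASCII characters as UTF-8 octets is an assumption that goes beyond RFC~2396. The sketch only mirrors the idea that the caller supplies a pattern naming the characters reserved or excluded in one component, and that splitting must precede unescaping.

```python
import re

# Hypothetical pattern for a single path segment: everything except the
# "unreserved" characters of RFC 2396 (alphanumerics and - _ . ! ~ * ' ( ))
# is escaped. Note that "%" itself matches and is therefore escaped too.
PATH_SEGMENT_RESERVED = re.compile(r"[^A-Za-z0-9_.!~*'()-]")

_ESCAPE_SEQUENCE = re.compile(r"%([0-9A-Fa-f]{2})")


def escape(component: str, reserved: re.Pattern) -> str:
    """Replace every character matched by `reserved` with %HH sequences.

    Assumption: characters are turned into octets via UTF-8.
    """
    def encode(match):
        octets = match.group(0).encode("utf-8")
        return "".join(f"%{octet:02X}" for octet in octets)

    return reserved.sub(encode, component)


def unescape(component: str) -> str:
    """Replace every %HH sequence with the character whose code is HH.

    Only meaningful for a single component, after the URI has been split.
    """
    return _ESCAPE_SEQUENCE.sub(lambda m: chr(int(m.group(1), 16)), component)


# Assemble a path from two segments; the first one contains a literal "/".
segments = ["a/b", "c d"]
path = "/".join(escape(s, PATH_SEGMENT_RESERVED) for s in segments)

# Split first, then unescape: the two original segments are recovered.
split_then_unescape = [unescape(s) for s in path.split("/")]

# Unescape first, then split: the escaped "/" is mistaken for a delimiter.
unescape_then_split = unescape(path).split("/")
```

In this sketch \texttt{split\_then\_unescape} recovers the two original segments, while \texttt{unescape\_then\_split} yields three pieces because the escaped slash is mistaken for a delimiter. This is the reason unescaping is restricted to already separated component parts.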