\chapter{Processing URIs}\label{cha:uri}
The \ex{uri} module contains library functions for dealing with URIs.
\section{Notes on URI Syntax}
The generic syntax of URIs (Uniform Resource Identifiers) is defined in
RFC~2396; see its Appendix~A for a collected BNF for URIs.
Within a URI, non-printable ASCII characters are represented by an
\emph{escape encoding}. \emph{Reserved} characters, which serve as
delimiters between the different parts of a URI, must also be
\emph{escaped} if they are to be regular data of a URI component. The
set of characters actually \emph{reserved} within any given URI
component is defined by that component. Therefore
\emph{escaping} can only be done when the URI is being created from
its component parts; likewise, a URI must be separated into its
component parts before \emph{unescaping} can be done.
Escape sequences have the form \verb|%|\var{h}\var{h},
where \var{h}\var{h} are the two hexadecimal digits representing the octet code. For
example, \verb|%20| is the escape sequence for the US-ASCII space character.
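
To illustrate this form, the following sketch builds the escape
sequence for a single character. It is written in portable Scheme;
\verb|char->escape-sequence| is a hypothetical helper and not part of
the \ex{uri} module, and the sketch assumes that \verb|char->integer|
returns the ASCII code of the character.
\begin{verbatim}
;; Hypothetical helper, not part of the uri module.
;; Assumption: char->integer yields the ASCII code of c, below 256.
(define (char->escape-sequence c)
  (let ((code (char->integer c)))
    (string-append "%"
                   (if (< code 16) "0" "")   ; pad to two hex digits
                   (number->string code 16))))

(char->escape-sequence #\space)            ; => "%20"
(char->escape-sequence #\%)                ; => "%25"
(char->escape-sequence (integer->char 9))  ; => "%09" (horizontal tab)
\end{verbatim}
For codes that contain the hexadecimal digits A to F, the case of the
letters produced by \verb|number->string| depends on the Scheme system;
RFC~2396 permits both upper-case and lower-case hexadecimal digits.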
\section{Procedures}
\defun{unescape}{string}{string}
\begin{desc}
\ex{Unescape} replaces each escape sequence in \var{string} by the
character it represents.
\end{desc}
%
This procedure may only be used \emph{after} the URI has been parsed into
its component parts (see above).
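
The following sketch shows one way in which unescaping could work, and
why it has to wait until the URI is split. It is not the implementation
used by the \ex{uri} module: \verb|sketch-unescape| is a hypothetical
name, the code assumes well-formed input (every \verb|%| that starts an
escape sequence is followed by two hexadecimal digits), and it assumes
that \verb|integer->char| maps an ASCII code to the corresponding
character.
\begin{verbatim}
;; Hypothetical sketch, not the uri module's implementation.
;; Assumption: integer->char maps an ASCII code to its character.
(define (sketch-unescape s)
  (let ((len (string-length s)))
    (let loop ((i 0) (acc '()))
      (cond ((>= i len)
             (list->string (reverse acc)))
            ;; "%hh": decode the two hex digits into one character
            ((and (char=? (string-ref s i) #\%)
                  (<= (+ i 3) len))
             (loop (+ i 3)
                   (cons (integer->char
                          (string->number (substring s (+ i 1) (+ i 3)) 16))
                         acc)))
            ;; any other character is copied unchanged
            (else
             (loop (+ i 1) (cons (string-ref s i) acc)))))))

(sketch-unescape "a%20b")     ; => "a b"
(sketch-unescape "a%2Fb/c")   ; => "a/b/c"
\end{verbatim}
The second call shows the problem with unescaping too early: the path
\verb|a%2Fb/c| has two segments, \verb|a%2Fb| and \verb|c|, but after
unescaping the whole string the escaped slash can no longer be told
apart from the delimiter.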
\defun{escape}{string regexp}{string}
\begin{desc}
\ex{Escape} replaces reserved or excluded characters in \var{string}
by their escaped representation. \var{regexp} defines which
characters are reserved or excluded within the particular URI component
being escaped.
\end{desc}
This procedure may only be used on a URI \emph{component part}, not on a
complete URI made up of several component parts (see above). Use it to
write specialized escape procedures for the respective component
parts. (See the \ex{url} module for examples.)
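
As a rough illustration of such a specialized procedure, the sketch
below escapes a single path segment. To stay within portable Scheme it
does not call \ex{escape} itself: \verb|sketch-escape| is a hypothetical
stand-in that takes a predicate where \ex{escape} takes a
\var{regexp}, and \verb|hex-escape| and \verb|escape-path-segment| are
likewise made up for this example. The set of characters escaped here is
chosen for illustration only, and the code assumes that
\verb|char->integer| returns the ASCII code of the character.
\begin{verbatim}
;; Hypothetical sketch, not the uri module's implementation.
;; Assumption: char->integer yields the ASCII code of c, below 256.
(define (hex-escape c)
  (let ((code (char->integer c)))
    (string-append "%"
                   (if (< code 16) "0" "")
                   (number->string code 16))))

;; Replace every character for which must-escape? holds
;; by its escape sequence; copy all other characters.
(define (sketch-escape s must-escape?)
  (let loop ((chars (string->list s)) (acc '()))
    (cond ((null? chars)
           (apply string-append (reverse acc)))
          ((must-escape? (car chars))
           (loop (cdr chars) (cons (hex-escape (car chars)) acc)))
          (else
           (loop (cdr chars) (cons (string (car chars)) acc))))))

;; Specialized for one path segment: "/" delimits segments and "%"
;; introduces escape sequences, so both must be escaped as data.
(define (escape-path-segment seg)
  (sketch-escape seg (lambda (c) (memv c '(#\/ #\% #\space)))))

(escape-path-segment "50% off")   ; => "50%25%20off"
\end{verbatim}
A procedure for a different component, such as a query, would pass a
different set of characters, which is why the set is a parameter rather
than being fixed.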
%%% Local Variables:
%%% mode: latex
%%% TeX-master: "man"
%%% End: