Reasonably complete and up-to-date docs for RFC822.

sperber 2003-01-21 09:00:56 +00:00
parent be87dc978e
commit 7b82bb70e0
1 changed files with 52 additions and 84 deletions


@ -1,47 +1,67 @@
\chapter{RFC~822 Library}\label{cha:rfc822} \chapter{RFC~822 Library}\label{cha:rfc822}
% %
The \ex{rfc822} structure provides rudimentary support for parsing The \ex{rfc822} structure provides rudimentary support for parsing
headers according to RFC 822 \textit{Standard for the format of ARPA headers according to RFC~822 \textit{Standard for the format of ARPA
Internet text messages}. These headers show up in SMTP messages, Internet text messages}. These headers show up in SMTP messages,
HTTP headers, etc. HTTP headers, etc.
\defun{read-rfc822-field} {[port] [read-line]} {name body} An RFC~822 header field consists of a \textit{field name} and a
\textit{field body}, like so:
%
\begin{verbatim}
Subject: RFC 822 can format itself in the ARPA
\end{verbatim}
%
Here, the field name is `\ex{Subject}', and the field body is `\ex{
RFC 822 can format itself in the ARPA}' (note the leading space).
The field body can be spread over several lines:
%
\begin{verbatim}
Subject: RFC 822 can format itself
in the ARPA
\end{verbatim}
%
In this case, RFC~822 specifies that the meaning of the field body is
actually all the lines of the body concatenated, without the
intervening line breaks.
The \ex{rfc822} structure provides two sets of parsing
procedures---one represents field bodies in the RFC-822-specified
meaning, as a single string, the other (with \ex{-with-line-breaks}
appended to the names) reflects the line breaks and represents the
bodies as a list of strings, one for each line. The latter set is only
marginally useful---mainly for code that needs to output headers in
the same form as they were originally provided.
\defun{read-rfc822-field}{[port] [read-line]}{name body}
\defun{read-rfc822-field-with-line-breaks}{[port] [read-line]}{name body-lines}
\begin{desc} \begin{desc}
Read one field from the port, and return two values: Read one field from the port, and return two values:
%
\begin{description} \begin{description}
\item[\var{name}] This is a symbol describing the RFC 822 field \item[\var{name}] This is a symbol describing the field
name, such as \ex{subject} or \ex{to}. The symbol consists of all name, such as \ex{subject} or \ex{to}. The symbol consists of all
lower-case letters.\footnote{In fact, \ex{read-rfc822-field} lower-case letters.\footnote{In fact, \ex{read-rfc822-field}
uses the preferred case for symbols of the underlying Scheme uses the preferred case for symbols of the underlying Scheme
implementation which, in the case of scsh, happens to be lower-case.} implementation which, in the case of scsh, happens to be lower-case.}
\item[\var{body}] This is list of strings which are the field's \item[\var{body} or \var{body-lines}] This is the field body.
body, e.g. Each list element is one line from the field's body, \var{Body} is a single string, \var{body-lines} is a list of
so if the field spreads out over three lines, then the body is a strings, one for each line of the body. In each case,
list of three strings. The terminating \ex{cr}/\ex{lf}'s are the terminating \ex{cr}/\ex{lf}'s (but nothing else) are
trimmed from each string. Note that header bodies frequently contain trimmed from each string.
space after the colon like this:
%
\begin{verbatim}
Subject: RFC 822 can format itself in the ARPA
\end{verbatim}
%
In this case, \var{body} will be
\begin{verbatim}
(" RFC 822 can format itself in the ARPA")
\end{verbatim}
\end{description} \end{description}
% %
When there are no more fields---EOF or a blank line has terminated When there are no more fields---EOF or a blank line has terminated
the header section---then \ex{read-rfc822-field} returns [\sharpf\ \sharpf]. the header section---then both procedures return [\sharpf\
\sharpf].
\var{Port} is an optional input port to read from---it defaults to \var{Port} is an optional input port to read from---it defaults to
the value of \ex{(current-input-port)}. the value of \ex{(current-input-port)}.
\var{Read-line} is an optional parameter specifying a procedure of \var{Read-line} is an optional parameter specifying a procedure of
one argument (the input port) used to read the raw header lines. one argument (the input port) used to read the raw header lines.
The default used by \ex{read-rfc822-field} terminates lines with The default used by these procedures terminates lines with
either \ex{cr}/\ex{lf} or just \ex{lf}, and it trims the terminator either \ex{cr}/\ex{lf} or just \ex{lf}, and it trims the terminator
from the line. This procedure should trim the terminator of the from the line. This procedure should trim the terminator of the
line, so an empty line is returned as an empty string. line, so an empty line is returned as an empty string.
@ -51,74 +71,22 @@ Subject: RFC 822 can format itself in the ARPA
RFC~822. RFC~822.
\end{desc} \end{desc}
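The termination convention described above suggests a simple loop for
consuming a whole header section. The following is a minimal sketch,
not part of the library: \ex{print-header-fields} is a hypothetical
helper, and the code assumes that the \ex{rfc822} structure is open and
that scsh's \ex{receive} form is available for binding the two return
values.
%
\begin{verbatim}
;; Hypothetical helper, for illustration only.
;; Reads fields from PORT until read-rfc822-field returns #f for the
;; name, which signals EOF or the blank line ending the header section.
(define (print-header-fields port)
  (let loop ()
    (receive (name body) (read-rfc822-field port)
      (if name
          (begin
            (write name)        ; field name, a symbol
            (display ": ")
            (write body)        ; field body, a single string
            (newline)
            (loop))))))
\end{verbatim}
%
A call such as \ex{(print-header-fields (current-input-port))} would
then print one line per field. Using
\ex{read-rfc822-field-with-line-breaks} in place of
\ex{read-rfc822-field} should yield a list of strings for each body
instead of a single string.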
\defun{read-rfc822-headers} {[port] [read-line]} {association-list} \defun{read-rfc822-headers} {[port] [read-line]} {alist}
\defunx{read-rfc822-headers-with-line-breaks} {[port] [read-line]} {alist}
\begin{desc} \begin{desc}
This procedure reads in and parses a section of text that looks like This procedure reads in and parses a section of text that looks like
the header portion of an RFC~822 message. It returns an association the header portion of an RFC~822 message. It returns an association
list mapping a field name (a symbol such as 'date or 'subject) to a list mapping field names (symbols such as \ex{date} or \ex{subject}) to
list of field bodies----one for each occurrence of the field in the field bodies. The representation of the field bodies is as with
header. So if there are five \ex{Received-by} fields in the header, \ex{read-rfc822-field} and \ex{read-rfc822-field-with-line-breaks}.
the alist maps \ex{received-by} to a five-element list. Each body is
in turn represented by a list of strings----one for each line of the
field. So a field spread across three lines would produce a
three-element body.
\var{Port} and \var{read-line} are as with \ex{read-rfc822-field}. These procedures preserve the order of the header fields. Note that
\end{desc} several header fields might share the same field name---in that
case, the returned alist will contain several entries with the same
\ex{car}.
\defun{rejoin-header-lines} {alist [separator]} {association list} \var{Port} and \var{read-line} are as with \ex{read-rfc822-field}
\begin{desc} and \ex{read-rfc822-field-with-line-breaks}.
Takes a field \var{alist} such as is returned by
\ex{read-rfc822-headers} and returns an equivalent association list.
Each body (\str list) in the input \var{alist} is joined into a
single list in the output alist. \var{separator} is the string used
to join these elements together; it defaults to a single space, but
can usefully be ``\verb|\n|'' (linefeed) or ``\verb|\r\n|''
(carriage-return/line-feed).
To rejoin a single body list, use scsh's \ex{join-strings}
procedure.
\end{desc}
%
For the following definitions' examples, let's use this set of
RFC~822 headers:
\begin{alltt}
From: shivers
To: ziggy,
newts
To: gjs, tk
\end{alltt}
%
\defun{get-header-all} {headers name} {string list list}
\begin{desc}
Returns all entries or \sharpf, e.g.\
\codex{(get-header-all hdrs 'to)}
returns
\codex{'((" ziggy," " newts") (" gjs, tk"))}
\end{desc}
\defun{get-header-lines} {headers name} {string list}
\begin{desc}
Returns all lines of the first entry or \sharpf, e.g.\
\codex{(get-header-lines hdrs 'to)}
returns
\codex{(" ziggy," " newts")}
\end{desc}
\defun{get-header} {headers name [separator]} {string}
\begin{desc}
Returns the first entry with the lines joined together by separator
(newline by default), e.g.\
\codex{(get-header hdrs 'to)}
returns
\begin{alltt}
" ziggy,
newts"
\end{alltt}
%
Note that \ex{newts} is led by two spaces.
\end{desc} \end{desc}
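Because the alist returned by \ex{read-rfc822-headers} may contain
several entries with the same \ex{car}, a plain \ex{assq} lookup finds
only the first of them. The following is a minimal sketch of collecting
every body for a given field name. \ex{header-bodies} is a hypothetical
helper, not part of the library, and the code assumes that each alist
entry is a pair whose \ex{car} is the field name and whose \ex{cdr} is
the field body.
%
\begin{verbatim}
;; Hypothetical helper, for illustration only.
;; Returns the bodies of all entries in ALIST whose car is NAME,
;; in the order in which they appear in the header.
;; Assumes each entry has the shape (name . body).
(define (header-bodies alist name)
  (let loop ((entries alist) (bodies '()))
    (cond ((null? entries) (reverse bodies))
          ((eq? (car (car entries)) name)
           (loop (cdr entries)
                 (cons (cdr (car entries)) bodies)))
          (else (loop (cdr entries) bodies)))))
\end{verbatim}
%
For instance, \ex{(header-bodies (read-rfc822-headers) 'to)} would
gather the bodies of all \ex{To} fields read from the current input
port, and it returns the empty list if the field does not occur.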
%%% Local Variables: %%% Local Variables: