This file documents names defined in rfc822.scm:

NOTES

A note on line-terminators:

    Line-terminating sequences are always a drag, because there's no
    agreement on them -- the Net protocols and DOS use cr/lf; Unix uses
    lf; the Mac uses cr. On one hand, you'd like to use the code for all
    of the above; on the other, you'd also like to use the code for
    strict applications that must not recognise bare cr's or lf's as
    terminators.

    RFC 822 requires a cr/lf (carriage-return/line-feed) pair to
    terminate lines of text. On the other hand, careful perusal of the
    text shows up some ambiguities (there are maybe three or four of
    these, and I'm too lazy to write them all down). Furthermore, it is
    an unfortunate fact that many Unix apps separate lines of RFC 822
    text with simple linefeeds (e.g., messages kept in /usr/spool/mail).
    As a result, this code takes a broad-minded view of
    line-terminators: lines can be terminated by either cr/lf or just
    lf, and either terminating sequence is trimmed.

    If you need stricter parsing, you can call the lower-level
    procedures %READ-RFC822-FIELD and %READ-RFC822-HEADERS. They take
    the read-line procedure as an extra parameter. This means that you
    can pass in a procedure that recognises only cr/lf's, or only cr's
    (for a Mac app, perhaps), and you can determine whether or not the
    terminators get trimmed. However, your read-line procedure must
    indicate the header-terminating empty line by returning *either* the
    empty string or the two-char string cr/lf (or the EOF object).

DEFINITIONS AND DESCRIPTIONS

(read-rfc822-field [port])
(%read-rfc822-field read-line port)

    Read one field from the port, and return two values [NAME BODY]:

    - NAME  Symbol such as 'subject or 'to. The field name is converted
      to a symbol using the Scheme implementation's preferred case. If
      the implementation reads symbols in a case-sensitive fashion
      (e.g., scsh), lowercase is used. This means you can compare these
      symbols to quoted constants using EQ?.
      When printing these field names out, it looks best if you
      capitalise them with
      (CAPITALIZE-STRING (SYMBOL->STRING FIELD-NAME)).

    - BODY  List of strings which are the field's body, e.g.
      ("shivers@lcs.mit.edu"). Each list element is one line from the
      field's body, so if the field spreads out over three lines, then
      the body is a list of three strings. The terminating cr/lf's are
      trimmed from each string. A leading space or a leading horizontal
      tab is also trimmed, but one and only one.

    When there are no more fields -- EOF or a blank line has terminated
    the header section -- then the procedure returns [#f #f].

    The %READ-RFC822-FIELD variant allows you to specify your own
    read-line procedure. The one used by READ-RFC822-FIELD terminates
    lines with either cr/lf or just lf, and it trims the terminator from
    the line. Your read-line procedure should trim the terminator of the
    line, so an empty line is returned as an empty string.

    The procedures raise an error if the field just read (the line
    returned by the read-line procedure) is not legal RFC 822 syntax.

read-rfc822-headers [port]
%read-rfc822-headers read-line port

    Read in and parse up a section of text that looks like the header
    portion of an RFC 822 message. Return an alist mapping a field name
    (a symbol such as 'date or 'subject) to a list of field bodies --
    one for each occurrence of the field in the header. So if there are
    five "Received-by:" fields in the header, the alist maps
    'received-by to a five-element list. Each body is in turn
    represented by a list of strings -- one for each line of the field.
    So a field spread across three lines would produce a three-element
    body.

    The %READ-RFC822-HEADERS variant allows you to specify your own
    read-line procedure. See the notes above (A note on
    line-terminators) for reasons why.

rejoin-header-lines alist [separator]

    Takes a field alist such as is returned by READ-RFC822-HEADERS and
    returns an equivalent alist.
    Each body (string list) in the input alist is joined into a single
    string in the output alist. SEPARATOR is the string used to join
    these elements together; it defaults to a single space " ", but can
    usefully be "\n" or "\r\n".

    To rejoin a single body list, use scsh's JOIN-STRINGS procedure.

For the examples in the following definitions, let's use this set of
RFC 822 headers:

    From: shivers
    To: ziggy,
     newts
    To: gjs, tk

get-header-all headers name

    Returns all entries for NAME, or #f if there are none, e.g.
    (get-header-all hdrs 'to)  ->  ((" ziggy," " newts") (" gjs, tk"))

get-header-lines headers name

    Returns all lines of the first entry for NAME, or #f, e.g.
    (get-header-lines hdrs 'to)  ->  (" ziggy," " newts")

get-header headers name [separator]

    Returns the first entry for NAME with its lines joined together by
    SEPARATOR (a newline, "\n", by default), e.g.
    (get-header hdrs 'to)  ->  "ziggy,\n newts"

htab

    The horizontal tab character (ASCII code 9).

string->symbol-pref

    A procedure that takes a string and converts it to a symbol using
    the Scheme implementation's preferred case. The preferred case is
    determined by doing a symbol->string conversion of 'a.

DESIRABLE FUNCTIONALITIES

    - Unfolding long lines.
    - Lexing structured fields.
    - Unlexing structured fields into canonical form.
    - Parsing and unparsing dates.
    - Parsing and unparsing addresses.
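
EXAMPLE: A STRICT READ-LINE PROCEDURE

The note on line-terminators says that the % variants accept a
read-line procedure that recognises only cr/lf. The following is a
minimal sketch of such a procedure, not code from rfc822.scm. The name
READ-CRLF-LINE is hypothetical. The sketch assumes that the read-line
procedure is called with the port as its only argument, and that
INTEGER->CHAR maps 13 and 10 to cr and lf; on scsh you may need
ASCII->CHAR instead.

```scheme
;; Hypothetical helper, not part of rfc822.scm.
;; Reads one line from PORT, treating only the pair cr/lf as a line
;; terminator and trimming it. A bare cr or a bare lf is kept as
;; ordinary line content.
(define (read-crlf-line port)
  (let ((cr (integer->char 13))   ; assumption: code 13 is cr
        (lf (integer->char 10)))  ; assumption: code 10 is lf
    (let loop ((chars '()))
      (let ((c (read-char port)))
        (cond ((eof-object? c)
               ;; EOF with nothing read: return the EOF object itself,
               ;; otherwise return the unterminated last line.
               (if (null? chars)
                   c
                   (list->string (reverse chars))))
              ((and (char=? c cr)
                    (let ((next (peek-char port)))
                      (and (char? next) (char=? next lf))))
               (read-char port)   ; consume the lf of the cr/lf pair
               (list->string (reverse chars)))
              (else
               (loop (cons c chars))))))))
```

Under those assumptions it would be passed as the extra parameter, for
example (%read-rfc822-headers read-crlf-line port). Because the
terminator is trimmed, the empty line that ends the header section
comes back as the empty string, which is one of the forms the note
above requires.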