_XML_ Library
=============

Files: xml.ss xmlr.ss xmls.ss
Signature: xml^

Basic XML Data Types
====================

Document:
  This structure represents an XML document.  The only useful part is
  the document-element, which contains all the content.  The rest of
  the structure contains DTD information, which isn't supported,
  and processing-instructions.

Element:
  Each pair of start/end tags and everything in between is an element.
  It has the following pieces:
    a name
    attributes
    contents including sub-elements

Xexpr:
  S-expression representations of XML data.

The end of this document has more details.

Functions
=========

> read-xml : [Input-port] -> Document
  reads in an XML document from the given or current input port.
  An XML document contains exactly one root element; read-xml raises
  an xml-read:error if there is no element or more than one.

  Malformed XML is reported with source locations in
  the form `l.c/o', where l is the line number, c is
  the column number, and o is the number of characters
  from the beginning of the file.

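As a minimal sketch, a document can be read from a string port instead of a
file.  The require line is an assumption modeled on the form this document
gives for plist.ss, and the sample input is made up for illustration.

```scheme
;; Assumption: xml.ss loads the same way plist.ss does.
(require (lib "xml.ss" "xml"))

;; A hypothetical one-element document held in a string.
(define sample-port (open-input-string "<note>hello</note>"))

;; read-xml returns a Document; the root element is stored in its
;; element field, so the accessor is document-element.
(define doc (read-xml sample-port))
(define root (document-element doc))

;; Convert the root element to an Xexpr for easier inspection.
(define root-xexpr (xml->xexpr root))
```

The first item of root-xexpr is the tag name as a symbol.
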
> write-xml : Document [Output-port] -> Void
  writes a document to the given or current output port, currently
  ignoring everything except the document's root element.

> write-xml/content : Content [Output-port] -> Void
  writes a document's contents to the given or current output port

> display-xml : Document [Output-port] -> Void
  just like write-xml, but newlines and indentation make the output more
  readable, though less technically correct when white space is
  significant.

> display-xml/content : Content [Output-port] -> Void
  just like write-xml/content, but with indentation and newlines

> xml->xexpr : Content -> Xexpr
  converts the interesting part of an XML document into an Xexpression

> xexpr->xml : Xexpr -> Content
  converts an Xexpression into the interesting part of an XML document

> xexpr->string : Xexpr -> String
  converts an Xexpression into a string representation

> eliminate-whitespace : (listof Symbol) (Bool -> Bool) -> Element -> Element
  Some elements should contain only other tags and no text, yet they
  often contain whitespace for formatting purposes.  Given a list of tag
  names and the identity function, eliminate-whitespace produces a
  function that filters out pcdata consisting solely of whitespace from
  those elements and raises an error if any non-whitespace text appears.
  Passing in the function called "not" instead of the identity function
  filters all elements which are not named in the list.  Using void
  filters all elements regardless of the list.

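Because eliminate-whitespace returns a function, it is applied in two
steps.  The sketch below assumes the same require form as plist.ss; the
element names and the input string are hypothetical.

```scheme
;; Assumption: xml.ss loads the same way plist.ss does.
(require (lib "xml.ss" "xml"))

;; Hypothetical input: the whitespace between tags is only formatting.
(define doc
  (read-xml
   (open-input-string
    "<items>\n  <item>one</item>\n  <item>two</item>\n</items>")))

;; Step 1: build the filter.  With the identity function as the second
;; argument, only the elements named in the list are filtered.
(define strip-items-whitespace
  (eliminate-whitespace '(items) (lambda (x) x)))

;; Step 2: apply the filter to an element.
(define cleaned (strip-items-whitespace (document-element doc)))
(define cleaned-xexpr (xml->xexpr cleaned))
```

The text inside each item element is left alone, since item is not in the
list.
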
Parameters
==========

> empty-tag-shorthand : 'always | 'never | (listof Symbol)
  Default: 'always
  This determines if the output functions should use the <empty/> tag
  notation instead of writing <empty></empty>.  The first form is the
  preferred XML notation.  However, most browsers designed for HTML
  will only properly render XHTML if the document uses a mixture of the
  two formats.  _html-empty-tags_ contains the W3 consortium's
  recommended list of XHTML tags that should use the shorthand.

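One way to compare the two notations is to set the parameter with
parameterize, which limits the change to the body of the form.  This is a
sketch: the require form is assumed from the plist.ss one, and the fragment
is made up.

```scheme
;; Assumption: xml.ss loads the same way plist.ss does.
(require (lib "xml.ss" "xml"))

;; A hypothetical fragment containing one element with no content.
(define fragment (xexpr->xml '(p "line one" (br) "line two")))

;; With 'never, the empty br element is written as a start/end pair.
(parameterize ((empty-tag-shorthand 'never))
  (write-xml/content fragment))
(newline)

;; With 'always, the empty br element is written in the shorthand form.
(parameterize ((empty-tag-shorthand 'always))
  (write-xml/content fragment))
(newline)
```

Outside the parameterize forms the parameter keeps its previous value.
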
> collapse-whitespace : Bool
  Default: #f
  When set to #t, each run of consecutive whitespace is replaced by a
  single space.  CDATA sections are not affected.

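The effect described above can be modeled on a plain string.  The helper
collapse-runs is hypothetical and is not part of the library; it only
illustrates what replacing each whitespace run with one space means.

```scheme
;; Illustrative helper, not part of the XML library: replace each run
;; of whitespace characters with one space.
(define (collapse-runs str)
  (let loop ((chars (string->list str))
             (in-run? #f)     ; was the previous character whitespace?
             (acc '()))       ; output characters, in reverse order
    (cond
      ((null? chars) (list->string (reverse acc)))
      ((char-whitespace? (car chars))
       (loop (cdr chars) #t (if in-run? acc (cons #\space acc))))
      (else
       (loop (cdr chars) #f (cons (car chars) acc))))))

(collapse-runs "a  b\n\t c")   ; => "a b c"
```
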
> trim-whitespace : Bool
  This parameter no longer exists.  Consider using collapse-whitespace
  and eliminate-whitespace instead.

> read-comments : Bool
  Default: #f
  Comments, by definition, should be ignored by programs.  However,
  interoperating with ad hoc extensions to other languages sometimes
  requires processing comments anyway.

> xexpr-drop-empty-attributes : Bool
  Default: #f
  It's easier to write functions processing Xexpressions if they always
  have a list of attributes.  On the other hand, it's less cumbersome to
  write Xexpressions by hand without empty lists of attributes
  everywhere.  Normally xml->xexpr leaves in empty attribute lists.
  Setting this parameter to #t drops them, so further editing the
  Xexpression by hand is less annoying.

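To illustrate why a guaranteed attribute list simplifies processing, the
sketch below normalizes a hand-written Xexpression so that every element
carries one.  Both helpers are hypothetical and not part of the library.

```scheme
;; Illustrative helpers, not part of the XML library.

;; An attribute list is either empty or starts with a (Symbol String)
;; list; a child element starts with a symbol instead.
(define (attribute-list? x)
  (or (null? x)
      (and (pair? x) (pair? (car x)))))

;; Insert an empty attribute list into every element that lacks one.
(define (add-empty-attributes xexpr)
  (cond
    ((and (pair? xexpr) (symbol? (car xexpr)))
     (let* ((tail (cdr xexpr))
            (has-attrs? (and (pair? tail) (attribute-list? (car tail))))
            (attrs (if has-attrs? (car tail) '()))
            (body (if has-attrs? (cdr tail) tail)))
       (cons (car xexpr)
             (cons attrs (map add-empty-attributes body)))))
    (else xexpr)))   ; strings, symbols, numbers, Misc values

(add-empty-attributes '(p "hi" (b "there")))
;; => (p () "hi" (b () "there"))
```

After normalization, the attributes are always the second item and the
content always starts at the third.
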
Examples
========

Reading an Xexpression:
  (xml->xexpr (document-element (read-xml input-port)))

Writing an Xexpression:
  (empty-tag-shorthand html-empty-tags)
  (write-xml/content (xexpr->xml `(html (head (title ,banner))
                                        (body ((bgcolor "white"))
                                              ,text)))
                     output-port)

What this Library Doesn't Provide
=================================

  Document Type Declaration (DTD) processing
  Validation
  Expanding user-defined entities
  Reading user-defined entities in attributes
  Unicode support

XML Datatype Details
====================

Note: Users of the XML collection don't need to know most of these definitions.

Note: Xexpr is the only important one to understand.  Even then,
      Processing-instructions may be ignored.

> Xexpr = String
        | (list* Symbol (listof (list Symbol String)) (listof Xexpr))
        | (cons Symbol (listof Xexpr)) ;; an element with no attributes
        | Symbol ;; symbolic entities such as &nbsp;
        | Number ;; numeric entities like &#20;
        | Misc

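The grammar above can be walked with ordinary list operations.  As a
sketch, the hypothetical helper xexpr-strings below collects the strings
in an Xexpr, skipping the attribute list when one is present.  Neither
helper is part of the library.

```scheme
;; Illustrative helpers, not part of the XML library.

;; An attribute list is either empty or starts with a (Symbol String)
;; list; a child element starts with a symbol instead.
(define (attribute-list? x)
  (or (null? x)
      (and (pair? x) (pair? (car x)))))

;; Return the strings inside an Xexpr, in document order.
(define (xexpr-strings xexpr)
  (cond
    ((string? xexpr) (list xexpr))
    ((and (pair? xexpr) (symbol? (car xexpr)))
     (let* ((tail (cdr xexpr))
            (body (if (and (pair? tail) (attribute-list? (car tail)))
                      (cdr tail)
                      tail)))
       (apply append (map xexpr-strings body))))
    (else '())))   ; entities (symbols, numbers) and Misc values

(define page
  '(html (head (title "Hello"))
         (body ((bgcolor "white"))
               (p "one" nbsp "two"))))

(xexpr-strings page)   ; => ("Hello" "one" "two")
```
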
> Document = (make-document Prolog Element (listof Processing-instruction))
  (define-struct document (prolog element misc))

> Prolog = (make-prolog (listof Misc) Document-type [Misc ...])
  (define-struct prolog (misc dtd misc2))
  The last field is a (listof Misc), but the maker accepts optional
  arguments instead for backwards compatibility.

> Document-type = #f | (make-document-type Symbol External-dtd #f)
  (define-struct document-type (name external inlined))

> External-dtd = (make-external-dtd/public str str)
               | (make-external-dtd/system str)
               | #f
  (define-struct external-dtd (system))
  (define-struct (external-dtd/public external-dtd) (public))
  (define-struct (external-dtd/system external-dtd) ())

> Element = (make-element Location Location
                          Symbol
                          (listof Attribute)
                          (listof Content))
  (define-struct (element struct:source) (name attributes content))

> Attribute = (make-attribute Location Location Symbol String)
  (define-struct (attribute struct:source) (name value))

> Content = Pcdata
          | Element
          | Entity
          | Misc

  Misc = Comment
       | Processing-instruction

> Pcdata = (make-pcdata Location Location String)
  (define-struct (pcdata struct:source) (string))

> Entity = (make-entity (U Nat Symbol))
  (define-struct entity (text))

> Processing-instruction = (make-pi Location Location String (list String))
  (define-struct (pi struct:source) (target-name instruction))

> Comment = (make-comment String)
  (define-struct comment (text))

  Source = (make-source Location Location)
  (define-struct source (start stop))

  Location = Nat
           | Symbol


The PList Library
=================

Files: plist.ss

The PList library provides the ability to read and write XML documents which
conform to the "plist" DTD, used to store 'dictionaries' of string-value
associations.

To Load
=======

(require (lib "plist.ss" "xml"))

Functions
=========

> read-plist : Port -> PLDict
  reads a plist from a port, and produces a 'dict' x-expression

> write-plist : PLDict Port -> Void
  writes a plist to the given port.  May raise the exn:application:type
  exception if the plist is badly formed.

Datatypes
=========

NB: all of these are subtypes of x-expression:

> PLDict = (list 'dict PLAssoc-pair ...)

> PLAssoc-pair = (list 'assoc-pair String PLValue)

> PLValue = String
          | (list 'true)
          | (list 'false)
          | (list 'integer Integer)
          | (list 'real Real)
          | PLDict
          | PLArray

> PLArray = (list 'array PLValue ...)

In fact, the PList DTD also defines Data and Date types, but we're ignoring
these for the moment.

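The datatype definitions above translate directly into predicates.  The
sketch below is one possible reading of them; the names pl-value?,
pl-dict?, pl-assoc-pair?, and all? are hypothetical and are not provided
by plist.ss.

```scheme
;; Illustrative predicates, not part of plist.ss.

;; Does pred hold for every item of lst?
(define (all? pred lst)
  (or (null? lst)
      (and (pred (car lst)) (all? pred (cdr lst)))))

;; PLAssoc-pair = (list 'assoc-pair String PLValue)
(define (pl-assoc-pair? p)
  (and (list? p)
       (= (length p) 3)
       (eq? (car p) 'assoc-pair)
       (string? (cadr p))
       (pl-value? (caddr p))))

;; PLDict = (list 'dict PLAssoc-pair ...)
(define (pl-dict? d)
  (and (list? d)
       (pair? d)
       (eq? (car d) 'dict)
       (all? pl-assoc-pair? (cdr d))))

;; PLValue = String | (true) | (false) | (integer n) | (real r)
;;         | PLDict | PLArray
(define (pl-value? v)
  (or (string? v)
      (and (list? v)
           (pair? v)
           (case (car v)
             ((true false) (null? (cdr v)))
             ((integer) (and (= (length v) 2) (integer? (cadr v))))
             ((real) (and (= (length v) 2) (real? (cadr v))))
             ((dict) (pl-dict? v))
             ((array) (all? pl-value? (cdr v)))
             (else #f)))))

(pl-dict? '(dict (assoc-pair "k" (array (integer 14) "s" (true)))))
;; => #t
(pl-value? '(integer "14"))
;; => #f
```

A check like this could run before write-plist to catch a badly formed
plist early.
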
Examples
========

Here's a sample PLDict:

(define my-dict
  `(dict (assoc-pair "first-key"
                     "just a string
                      with some whitespace in it")
         (assoc-pair "second-key"
                     (false))
         (assoc-pair "third-key"
                     (dict ))
         (assoc-pair "fourth-key"
                     (dict (assoc-pair "inner-key"
                                       (real 3.432))))
         (assoc-pair "fifth-key"
                     (array (integer 14)
                            "another string"
                            (true)))
         (assoc-pair "sixth-key"
                     (array))))

Let's write it to disk:

(call-with-output-file "/Users/clements/tmp.plist"
  (lambda (port)
    (write-plist my-dict port))
  'truncate)

Let's read it back from the disk:

(define new-dict
  (call-with-input-file "/Users/clements/tmp.plist"
    (lambda (port)
      (read-plist port))))