This file documents names specified in uri.scm.


NOTES

URIs have the following syntax:

[scheme] : path [? search ] [# fragmentid]

Parts in [] may be omitted. The last part is usually referred to as
frag-id in this document.



DEFINITIONS AND DESCRIPTIONS


char-set
uri-reserved

A char-set of reserved characters: semicolon (;), slash (/), hash (#),
question mark (?), colon (:) and space.


procedure
parse-uri uri-string --> (scheme, path, search, frag-id)

Returns four values, in this order: scheme, path, search, frag-id.
scheme, search and frag-id are each either #f or a string. path is a
nonempty list of strings; an empty path is a list containing the empty
string. parse-uri tries to be tolerant of the various ways people build
broken URIs out there on the Net, so it does not conform strictly to
RFC 1630.
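
The decomposition described above can be illustrated with a small,
self-contained sketch. It is not the code from uri.scm: the names
index-of and sketch-parse-uri are hypothetical, the path is returned as
one unsplit string instead of a list, and the tolerance for broken URIs
is not reproduced.

```scheme
;; Index of the first character at or after START that satisfies PRED, or #f.
(define (index-of s pred start)
  (let loop ((i start))
    (cond ((>= i (string-length s)) #f)
          ((pred (string-ref s i)) i)
          (else (loop (+ i 1))))))

;; Hypothetical, simplified parser: returns four values
;; (scheme, path, search, frag-id). The path stays a single string here.
(define (sketch-parse-uri s)
  (let* ((len   (string-length s))
         (sharp (index-of s (lambda (c) (char=? c #\#)) 0))
         (end1  (or sharp len))               ; end of the part before the fragment
         (q     (index-of s (lambda (c) (char=? c #\?)) 0))
         (ques  (and q (< q end1) q))         ; a "?" only counts before the "#"
         (path-end (or ques end1))
         ;; A scheme is present only if ":" is the first of : / ? #
         (r     (index-of s (lambda (c) (memv c '(#\: #\/ #\? #\#))) 0))
         (colon (and r (char=? (string-ref s r) #\:) r))
         (path-start (if colon (+ colon 1) 0)))
    (values (and colon (substring s 0 colon))
            (substring s path-start path-end)
            (and ques (substring s (+ ques 1) end1))
            (and sharp (substring s (+ sharp 1) len)))))

;; Collect the four values into a list for inspection.
(call-with-values
  (lambda () (sketch-parse-uri "http://example.org/a/b?x=1#top"))
  list)
;; ==> ("http" "//example.org/a/b" "x=1" "top")
```

Parts that are missing come back as #f, which mirrors the convention
documented for parse-uri.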


procedure
unescape-uri string [start [end]] --> string

Unescapes a string. This procedure should only be used *after* the URI
has been parsed, since unescaping may introduce characters that blow
up the parse (that's why escape sequences are used in URIs ;).
Escape sequences have the form %hh, where each h is a hexadecimal
digit. E.g. %20 is a space (ASCII character 32).
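
As an illustration of the %hh scheme, here is a minimal sketch of a
decoder. The names hex-value and sketch-unescape are hypothetical, the
optional start and end arguments are left out, and copying malformed
sequences through unchanged is an assumption of this sketch, not
documented behaviour of unescape-uri.

```scheme
;; Value of a hexadecimal digit character, or #f for any other character.
;; Assumes the digits and the letters a-f / A-F have consecutive codes.
(define (hex-value c)
  (cond ((and (char>=? c #\0) (char<=? c #\9))
         (- (char->integer c) (char->integer #\0)))
        ((and (char>=? c #\a) (char<=? c #\f))
         (+ 10 (- (char->integer c) (char->integer #\a))))
        ((and (char>=? c #\A) (char<=? c #\F))
         (+ 10 (- (char->integer c) (char->integer #\A))))
        (else #f)))

;; Replace every well-formed %hh sequence by the character it encodes.
(define (sketch-unescape s)
  (let loop ((cs (string->list s)) (acc '()))
    (if (null? cs)
        (list->string (reverse acc))
        (let* ((c  (car cs))
               (hi (and (char=? c #\%) (pair? (cdr cs)) (hex-value (cadr cs))))
               (lo (and hi (pair? (cddr cs)) (hex-value (caddr cs)))))
          (if lo
              (loop (cdddr cs)
                    (cons (integer->char (+ (* 16 hi) lo)) acc))
              ;; Not an escape sequence: copy the character as it is.
              (loop (cdr cs) (cons c acc)))))))

(sketch-unescape "foo%20bar")   ; ==> "foo bar"
```

Decoding "%2F" yields "/", which is why a string should be split into
its URI parts before it is unescaped.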


procedure
hex-digit? character --> boolean

Returns #t if character is a hexadecimal digit (i.e., one of 0-9, a-f,
A-F), #f otherwise.


procedure
hexchar->int character --> number

Translates the given hexadecimal character to an integer, e.g.
(hexchar->int #\a) => 10.


procedure
int->hexchar integer --> character

Translates the given integer from the range 0-15 into a hexadecimal
character (uses uppercase letters), e.g. (int->hexchar 14) => #\E.
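
The three hexadecimal helpers can be sketched with a single lookup
string. The sketch- names are hypothetical, and returning #f for a
non-hexadecimal character in sketch-hexchar->int is an assumption of
this sketch; the document does not say what hexchar->int does in that
case.

```scheme
;; The sixteen hexadecimal digits, indexed by their value.
(define hex-digits "0123456789ABCDEF")

;; Integer 0..15 to an uppercase hexadecimal character.
(define (sketch-int->hexchar n)
  (string-ref hex-digits n))

;; Hexadecimal character (either case) to its value, or #f if C is no hex digit.
(define (sketch-hexchar->int c)
  (let loop ((i 0))
    (cond ((= i 16) #f)
          ((char=? (char-upcase c) (string-ref hex-digits i)) i)
          (else (loop (+ i 1))))))

;; A character is a hexadecimal digit exactly when it has a value.
(define (sketch-hex-digit? c)
  (if (sketch-hexchar->int c) #t #f))

(sketch-hexchar->int #\a)   ; ==> 10
(sketch-int->hexchar 14)    ; ==> #\E
(sketch-hex-digit? #\0)     ; ==> #t
(sketch-hex-digit? #\g)     ; ==> #f
```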


char-set
uri-escaped-chars

The set of characters that are escaped in URIs. These are the following
characters: dollar ($), minus (-), underscore (_), at (@), dot (.),
ampersand (&), exclamation mark (!), asterisk (*), backslash (\),
double quote ("), single quote ('), open parenthesis ((), close
parenthesis ()), comma (,), plus (+) and all other characters that are
neither letters nor digits (such as space and control characters).


procedure
escape-uri string [escaped-chars] --> string

Escapes the characters of string that are members of escaped-chars.
escaped-chars defaults to uri-escaped-chars. Be careful when applying
this procedure to chunks of text that contain syntactically meaningful
reserved characters (e.g., paths with URI slashes or colons): they
will be escaped and lose their special meaning. E.g. it would be a
mistake to apply escape-uri to "//lcs.mit.edu:8001/foo/bar.html"
because the slashes and colons would be escaped. Note that escape-uri
does not check for this, since such a check would defeat its purpose.
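
The warning above is easier to see with a small sketch of the escaping
step. The names sketch-escape and sketch-default-escape? are
hypothetical. The set of escaped characters is passed as a predicate
instead of a char-set, and the default used here (everything that is
neither a letter nor a digit) follows the description of
uri-escaped-chars in this document. Character codes below 256 are
assumed.

```scheme
;; Assumed default: escape everything that is neither a letter nor a digit.
(define (sketch-default-escape? c)
  (not (or (char-alphabetic? c) (char-numeric? c))))

;; Replace every character satisfying ESCAPE? by %hh (uppercase hex digits).
(define (sketch-escape s escape?)
  (define hex "0123456789ABCDEF")
  (define (escape-char c)
    (let ((n (char->integer c)))          ; assumed to be below 256
      (list #\%
            (string-ref hex (quotient n 16))
            (string-ref hex (remainder n 16)))))
  (let loop ((cs (string->list s)) (acc '()))
    (cond ((null? cs) (list->string (reverse acc)))
          ((escape? (car cs))
           ;; acc is built in reverse, so push the three characters reversed.
           (loop (cdr cs) (append (reverse (escape-char (car cs))) acc)))
          (else (loop (cdr cs) (cons (car cs) acc))))))

;; Fine: escaping a single path component.
(sketch-escape "my file" sketch-default-escape?)
;; ==> "my%20file"

;; The mistake described above: the slashes and the colon are escaped too.
(sketch-escape "//lcs.mit.edu:8001/foo/bar.html" sketch-default-escape?)
;; ==> "%2F%2Flcs%2Emit%2Eedu%3A8001%2Ffoo%2Fbar%2Ehtml"
```

The second call shows why escaping should be applied to the individual
parts of a URI and not to the assembled string.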


procedure
resolve-uri cscheme cp scheme p --> (scheme, path)

Sorry, I can't figure out what resolve-uri is intended to do. Perhaps
I will find out later.

The code seems to have a bug: in the body of receive, there is a
loop. According to the comment, j should count sequential /'s. But j
counts nothing in the body. Either zero is added ((lp (cdr cp-tail)
(cons (car cp-tail) rhead) (+ j 0))) or j is set to 1 ((lp (cdr
cp-tail) (cons (car cp-tail) rhead) 1))). Nevertheless, j is expected
to reach the value numsl, which can be larger than one. So what? I am
confused.


procedure
rev-append list-a list-b --> list

Performs (append (reverse list-a) list-b). The comment says it
should be defined in a list package, but I wonder how often it
will be used.
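
For reference, the operation can be written as a single tail-recursive
pass that avoids building the intermediate reversed list. This is a
sketch under the hypothetical name sketch-rev-append, not the
definition from uri.scm.

```scheme
;; Move the elements of A one by one onto the front of B.
;; The result equals (append (reverse a) b).
(define (sketch-rev-append a b)
  (if (null? a)
      b
      (sketch-rev-append (cdr a) (cons (car a) b))))

(sketch-rev-append '(3 2 1) '(4 5))   ; ==> (1 2 3 4 5)
```

Such a helper is typically handy when a loop accumulates results in
reverse order and the remaining input has to be attached at the end.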


procedure
split-uri-path uri start end --> list

Splits uri at /'s. Only the substring given by start (inclusive) and
end (exclusive) is considered. start and end - 1 have to be within the
range of the uri string; otherwise an index-out-of-range exception
will be raised. Example: (split-uri-path "foo/bar/colon" 4 11) ==>
'("bar" "col")
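
The splitting rule can be sketched as a backward scan over the
substring. The name sketch-split-path is hypothetical; the sketch only
reproduces the behaviour described above and is not the code from
uri.scm.

```scheme
;; Split the substring of S from START (inclusive) to END (exclusive) at
;; every "/". Scanning backwards builds the result list in order.
(define (sketch-split-path s start end)
  (let loop ((i end) (seg-end end) (acc '()))
    (cond ((= i start)
           ;; Reached the front: the first segment runs from START to SEG-END.
           (cons (substring s start seg-end) acc))
          ((char=? (string-ref s (- i 1)) #\/)
           ;; A slash sits just before I: close the segment I..SEG-END.
           (loop (- i 1) (- i 1) (cons (substring s i seg-end) acc)))
          (else (loop (- i 1) seg-end acc)))))

(sketch-split-path "foo/bar/colon" 4 11)    ; ==> ("bar" "col")
(sketch-split-path "/foo/bar/baz/.." 0 15)  ; ==> ("" "foo" "bar" "baz" "..")
(sketch-split-path "" 0 0)                  ; ==> ("")
```

A leading "/" produces an empty first element, and an empty range
produces a list containing the empty string, which matches the
description of empty paths given for parse-uri.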


procedure
simplify-uri-path path --> list

Removes "." and ".." entries from path. The result is a (possibly
empty) list representing a path that does not contain any "." or "..".
The list can only be empty if the path did not start with "/" (for the
rare occasion someone wants to simplify a relative path). The result
is #f if the path tries to back up past root, for example by "/.." or
"/foo/../.." or just "..". "//" may occur somewhere in the path; it
refers to root, and ".." cannot back up past it.

Examples:
(simplify-uri-path (split-uri-path "/foo/bar/baz/.." 0 15))
==> '("" "foo" "bar")

(simplify-uri-path (split-uri-path "foo/bar/baz/../../.." 0 20))
==> '()

(simplify-uri-path (split-uri-path "/foo/../.." 0 10))
==> #f ; tries to back up past root

(simplify-uri-path (split-uri-path "foo/bar//" 0 9))
==> '("") ; "//" refers to root

(simplify-uri-path (split-uri-path "foo/bar/" 0 8))
==> '("") ; last "/" also refers to root

(simplify-uri-path (split-uri-path "/foo/bar//baz/../.." 0 19))
==> #f ; tries to back up past root
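
The rules above can be sketched with a stack of path elements. The
procedure below, under the hypothetical name sketch-simplify-path, was
written to reproduce the six examples listed for simplify-uri-path; it
is not the code from uri.scm and may differ from it on other inputs.
It takes an already split path, i.e. a list of strings.

```scheme
;; STACK holds the simplified path in reverse order.
;;   "."  is dropped,
;;   ""   (an empty element, i.e. root) resets the stack to root,
;;   ".." pops one element, or yields #f when nothing is left above root.
(define (sketch-simplify-path path)
  (let loop ((rest path) (stack '()))
    (cond ((null? rest) (reverse stack))
          ((string=? (car rest) ".")
           (loop (cdr rest) stack))
          ((string=? (car rest) "")
           (loop (cdr rest) (list "")))
          ((string=? (car rest) "..")
           (if (or (null? stack) (string=? (car stack) ""))
               #f                               ; tries to back up past root
               (loop (cdr rest) (cdr stack))))
          (else (loop (cdr rest) (cons (car rest) stack))))))

(sketch-simplify-path '("" "foo" "bar" "baz" ".."))      ; ==> ("" "foo" "bar")
(sketch-simplify-path '("foo" "bar" "baz" ".." ".." "..")) ; ==> ()
(sketch-simplify-path '("" "foo" ".." ".."))             ; ==> #f
(sketch-simplify-path '("foo" "bar" "" ""))              ; ==> ("")
```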