The SRFI-1 list library -*- outline -*-
Olin Shivers
98/10/16
Last Update: 99/9/11
Todo: carefully proofread.
Netscape prints with insufficient space between proc specs --
see list=, for example. Mess about with css some more.
Emacs should display this document in outline mode. Say c-h m for
instructions on how to move through it by sections (e.g., c-c c-n, c-c c-p).
During the SRFI discussion period, the current draft may be found at
ftp://ftp.ai.mit.edu/people/shivers/srfi/srfi-1/srfi-1.txt
* Table of contents
-------------------
Abstract
Introduction
Procedure index
General discussion
"Linear update" procedures
Improper lists
Errors
Not included in this library
The procedures
Constructors
Predicates
Selectors
Miscellaneous: length, append, reverse, zip & count
Fold, unfold & map
Filtering & partitioning
Searching
Deletion
Association lists
Set operations on lists
Primitive side-effects
Acknowledgements
References & links
Copyright
* Abstract
----------
R5RS Scheme has an impoverished set of list-processing utilities, which is a
problem for authors of portable code. This SRFI proposes a coherent and
comprehensive set of list-processing procedures; it is accompanied by a
reference implementation of the spec. The reference implementation is
- portable
- efficient
- completely open, public-domain source
* Introduction
--------------
The set of basic list and pair operations provided by R4RS/R5RS Scheme is far
from satisfactory. Because this set is so small and basic, most
implementations provide additional utilities, such as a list-filtering
function, or a "left fold" operator, and so forth. But, of course, this
introduces incompatibilities -- different Scheme implementations provide
different sets of procedures.
I have designed a full-featured library of procedures for list processing.
While putting this library together, I checked as many Schemes as I could get
my hands on. (I have a fair amount of experience with several of these
already.) I missed Chez -- no on-line manual that I can find -- but I hit most
of the other big, full-featured Schemes. The complete list of list-processing
systems I checked is:
R4RS/R5RS Scheme, MIT Scheme, Gambit, RScheme, MzScheme, slib, Common
Lisp, Bigloo, guile, T, APL and the SML standard basis
As a result, the library I am proposing is fairly rich.
Following this initial design phase, this library went through several
months of discussion on the SRFI mailing lists, and was altered in light
of the ideas and suggestions put forth during this discussion.
In parallel with designing this API, I have also written a reference
implementation. I have placed this source on the Net with an unencumbered,
"open" copyright. A few notes about the reference implementation:
- Although I got procedure names and specs from many Schemes, I wrote this
code myself. Thus, there are *no* entanglements. Any Scheme implementor
can pick this library up with no worries about copyright problems -- both
commercial and non-commercial systems.
- The code is written for portability and should be trivial to port to
any Scheme. It has only four deviations from R4RS, clearly discussed
in the comments:
- Use of an ERROR procedure;
- Use of the R5RS VALUES and a simple RECEIVE macro for producing
and consuming multiple return values;
- Use of simple :OPTIONAL and LET-OPTIONALS macros for optional
argument parsing and defaulting;
- Use of a simple CHECK-ARG procedure for argument checking.
- It is written for clarity and well-commented. The current source is
1436 lines of text, of which 690 are source code; the rest are comments
and blank lines.
- It is written for efficiency. Fast paths are provided for common
cases. Side-effecting procedures such as FILTER! avoid unnecessary,
redundant SET-CDR!s which would thrash a generational GC's write barrier
and the store buffers of fast processors. Functions reuse longest common
tails from input parameters to construct their results where
possible. Constant-space iterations are used in preference to recursions;
local recursions are used in preference to consing temporary intermediate
data structures.
This is not to say that the implementation can't be tuned up for
a specific Scheme implementation. There are notes in comments addressing
ways implementors can tune the reference implementation for performance.
In short, I've written the reference implementation to make it as painless
as possible for an implementor -- or a regular programmer -- to adopt this
library and get good results with it.
* Procedure index
-----------------
Here is a short list of the procedures provided by the list-lib package.
"#" marks R5RS procedures; "+" marks extended R5RS procedures
Constructors
# cons list
xcons cons* make-list list-tabulate
list-copy circular-list iota
Predicates
# pair? null?
proper-list? circular-list? dotted-list?
not-pair? null-list?
list=
Selectors
# car cdr ... cdddar cddddr list-ref
first second third fourth fifth sixth seventh eighth ninth tenth
car+cdr
take drop
take-right drop-right
take! drop-right!
last last-pair
Miscellaneous: length, append, reverse, zip & count
# length
length+
# append reverse
append! reverse!
append-reverse append-reverse!
zip unzip1 unzip2 unzip3 unzip4 unzip5
count
Fold, unfold & map
+ map for-each
fold unfold pair-fold reduce
fold-right unfold-right pair-fold-right reduce-right
append-map append-map!
map! pair-for-each filter-map map-in-order
Filtering & partitioning
filter partition remove
filter! partition! remove!
Searching
+ member
# memq memv
find find-tail
any every
list-index
Deletion
delete delete-duplicates
delete! delete-duplicates!
Association lists
+ assoc
# assq assv
alist-cons alist-copy
alist-delete alist-delete!
Set operations on lists
lset<= lset= lset-adjoin
lset-union lset-union!
lset-intersection lset-intersection!
lset-difference lset-difference!
lset-xor lset-xor!
lset-diff+intersection lset-diff+intersection!
Primitive side effects
# set-car! set-cdr!
------
Four R4RS/R5RS list-processing procedures are extended by this library in
backwards-compatible ways:
map for-each (Extended to take lists of unequal length)
member assoc (Extended to take an optional comparison procedure)
The following R4RS/R5RS list- and pair-processing procedures are also part of
list-lib's exports, as defined by the R5RS report:
cons pair? null? list length append reverse
car cdr ... cdddar cddddr set-car! set-cdr! list-ref
memq memv assq assv
The remaining two R4RS/R5RS list-processing procedures are *not* part of
this library:
list-tail (renamed DROP)
list? (see PROPER-LIST?, CIRCULAR-LIST? and DOTTED-LIST?)
* General discussion
--------------------
A set of general criteria guided the design of this library.
I don't require "destructive" (what I call "linear update") procedures to
alter and recycle cons cells from the argument lists. They are allowed to, but
not required to. (The reference implementations I have written *do* recycle
the argument lists.) See below for further discussion.
List-filtering procedures such as FILTER or DELETE do not disorder
lists. Elements appear in the answer list in the same order as they appear in
the argument list. This constrains implementation, but seems like a desirable
feature, since in many uses of lists, order matters. (In particular,
disordering an alist is definitely a bad idea.)
Contrariwise, although the reference implementations of the list-filtering
procedures share longest common tails between argument and answer lists,
it is not part of the spec.
Because lists are an inherently sequential data structure (unlike, say,
vectors), list-inspection functions such as FIND, FIND-TAIL, FOR-EACH, ANY
and EVERY commit to a left-to-right traversal order of their argument list.
However, constructor functions, such as LIST-TABULATE and the mapping
procedures (APPEND-MAP, APPEND-MAP!, MAP!, PAIR-FOR-EACH, FILTER-MAP,
MAP-IN-ORDER) do *not* specify the dynamic order in which their
procedural argument is applied to its various values.
Predicates return useful true values wherever possible. Thus ANY must return
the true value produced by its predicate, and EVERY returns the final true
value produced by applying its predicate argument to the last element of its
argument list.
Functionality is provided both in pure and linear-update (potentially
destructive) forms wherever this makes sense.
No special status is accorded to Scheme's built-in equality functions.
Any functionality provided in terms of EQ?, EQV?, EQUAL? is also
available using a client-provided equality function.
Proper design counts for more than backwards compatibility, but I have tried,
ceteris paribus, to be as backwards-compatible as possible with existing
list-processing libraries, in order to facilitate porting old code to run as a
client of the procedures in this library. Name choices and semantics are, for
the most part, in agreement with existing practice in many current Scheme
systems. I have indicated some incompatibilities in the following text.
These procedures are *not* "sequence generic" -- i.e., procedures that
operate on either vectors or lists. They are list-specific. I prefer to
keep the library simple and focussed.
I have named these procedures without a qualifying initial "list-"
lexeme, which is in keeping with the existing set of list-processing
utilities in Scheme. I follow the general Scheme convention
(VECTOR-LENGTH, STRING-REF) of placing the type-name before the action
when naming procedures -- so we have LIST-COPY and PAIR-FOR-EACH rather
than the perhaps more fluid, but less consistent, COPY-LIST, or
FOR-EACH-PAIR.
I have generally followed a regular and consistent naming scheme, composing
procedure names from a set of basic lexemes.
** "Linear update" procedures
=============================
Many procedures in this library have "pure" and "linear update" variants. A
"pure" procedure has no side-effects, and in particular does not alter its
arguments in any way. A "linear update" procedure is allowed -- but *not*
required -- to side-effect its arguments in order to construct its
result. "Linear update" procedures are typically given names ending with an
exclamation point. So, for example, (APPEND! list1 list2) is allowed to
construct its result by simply using SET-CDR! to set the cdr of the last pair
of list1 to point to list2, and then returning list1 (unless list1 is the
empty list, in which case it would simply return list2). However, APPEND! may
also elect to perform a pure append operation -- this is a legal definition
of APPEND!:
(define append! append)
This is why we do not call these procedures "destructive" -- because they
aren't *required* to be destructive. They are *potentially* destructive.
What this means is that you may only apply linear-update procedures to
values that you know are "dead" -- values that will never be used again
in your program. This must be so, since you can't rely on the value passed
to a linear-update procedure after that procedure has been called. It
might be unchanged; it might be altered.
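In practice this means a caller should always continue with the value *returned* by a linear-update procedure and forget the argument it passed in. The following is only a sketch under stated assumptions: SKETCH-APPEND! is a hypothetical two-argument stand-in written in plain R5RS, not the reference implementation, and it shows just one of the behaviours the spec permits.

```scheme
;; Hypothetical two-argument sketch of a destructive APPEND!.
(define (sketch-append! lis1 lis2)
  (if (null? lis1)
      lis2                              ; () cannot be side-effected
      (let loop ((p lis1))
        (if (pair? (cdr p))
            (loop (cdr p))
            (begin (set-cdr! p lis2)    ; splice LIS2 onto the last pair
                   lis1)))))

(define head (list 1 2))                ; freshly allocated, so safe to alter
(define result (sketch-append! head (list 3 4)))
;; RESULT is (1 2 3 4). HEAD must now be treated as dead: with this sketch
;; it was altered, but under (define append! append) it would still be (1 2).

(define from-empty (sketch-append! '() (list 5)))
;; Only the return value carries the answer here: (5)
```

The argument lists are built with LIST rather than written as quoted constants, since altering a literal constant is itself an error in R5RS.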
The "linear" in "linear update" doesn't mean "linear time" or "linear space"
or any sort of multiple-of-n kind of meaning. It's a fancy term that type
theorists and pure functional programmers use to describe systems where you
are only allowed to have exactly one reference to each variable. This provides
a guarantee that the value bound to a variable is bound to no other
variable. So when you *use* a variable in a variable reference, you "use it
up." Knowing that no one else has a pointer to that value means the system
primitive is free to side-effect its arguments to produce what is,
observationally, a pure-functional result.
In the context of this library, "linear update" means you, the programmer,
know there are *no other* live references to the value passed to the
procedure -- after passing the value to one of these procedures, the
value of the old pointer is indeterminate. Basically, you are licensing
the Scheme implementation to alter the data structure if it feels like
it -- you have declared you don't care either way.
You get no help from Scheme in checking that the values you claim are "linear"
really are. So you better get it right. Or play it safe and use the non-!
procedures -- it doesn't do any good to compute quickly if you get the wrong
answer.
Why go to all this trouble to define the notion of "linear update" and use it
in a procedure spec, instead of the more common notion of a "destructive"
operation? First, note that destructive list-processing procedures are almost
always used in a linear-update fashion. This is in part required by the
special case of operating upon the empty list, which can't be side-effected.
This means that destructive operators are not pure side-effects -- they have
to return a result. Second, note that code written using linear-update
operators can be trivially ported to a pure, functional subset of Scheme by
simply providing pure implementations of the linear-update operators. Finally,
requiring destructive side-effects ruins opportunities to parallelise these
operations -- and the places where one has taken the trouble to spell out
destructive operations are usually exactly the code one would want a
parallelising compiler to parallelise: the efficiency-critical kernels of the
algorithm. Linear-update operations are easily parallelised. Going with a
linear-update spec doesn't close off these valuable alternative implementation
techniques. This list library is intended as a set of low-level, basic
operators, so we don't want to exclude these possible implementations.
The linear-update procedures in this library are
take! drop-right!
append! reverse! append-reverse!
append-map! map!
filter! partition! remove!
delete! alist-delete! delete-duplicates!
lset-adjoin! lset-union! lset-intersection! lset-difference! lset-xor!
lset-diff+intersection!
** Improper lists
=================
Scheme does not properly have a list type, just as C does not have a string
type. Rather, Scheme has a binary-tuple type, from which one can build binary
trees. There is an *interpretation* of Scheme values that allows one to treat
these trees as lists. Further complications ensue from the fact that Scheme
allows side-effects to these tuples, raising the possibility of lists of
unbounded length, and trees of unbounded depth (that is, circular data
structures).
However, there is a simple view of the world of Scheme values that considers
every value to be a list of some sort. That is, every value is either
- a "proper list" -- a finite, nil-terminated list, such as:
(a b c)
()
(32)
- a "dotted list" -- a finite, non-nil terminated list, such as
(a b c . d)
(x . y)
42
george
- or a "circular list" -- an infinite, unterminated list.
Note that the zero-length dotted lists are simply all the non-null, non-pair
values.
This view is captured by the predicates PROPER-LIST?, DOTTED-LIST?, and
CIRCULAR-LIST?. List-lib users should note that dotted lists are not commonly
used, and are considered by many Scheme programmers to be an ugly artifact of
Scheme's lack of a true list type. However, dotted lists do play a noticeable
role in the *syntax* of Scheme, in the "rest" parameters used by n-ary
lambdas: (lambda (x y . rest) ...).
Dotted lists are *not* fully supported by list-lib. Most procedures are
defined only on proper lists -- that is, finite, nil-terminated lists. The
procedures that will also handle circular or dotted lists are specifically
marked. While this design decision restricts the domain of possible arguments
one can pass to these procedures, it has the benefit of allowing the
procedures to catch the error cases where programmers inadvertently pass
scalar values to a list procedure, e.g. by switching the arguments
to a procedure call.
** Errors
=========
Note that statements of the form "it is an error" merely mean "don't
do that." They are not a guarantee that a conforming implementation will
"catch" such improper use by, for example, raising some kind of exception.
Regrettably, R5RS Scheme requires no firmer guarantee even for basic operators
such as CAR and CDR, so there's little point in requiring these procedures to
do more. Here is the relevant section of the R5RS report:
When speaking of an error situation, this report uses the phrase "an
error is signalled" to indicate that implementations must detect and
report the error. If such wording does not appear in the discussion
of an error, then implementations are not required to detect or
report the error, though they are encouraged to do so. An error
situation that implementations are not required to detect is usually
referred to simply as "an error."
For example, it is an error for a procedure to be passed an argument
that the procedure is not explicitly specified to handle, even though
such domain errors are seldom mentioned in this report.
Implementations may extend a procedure's domain of definition to
include such arguments.
** Not included in this library
===============================
The following items are not in this library:
- Sort routines
- Destructuring/pattern-matching macro
- Tree-processing routines
They should have their own SRFI specs.
* The procedures
----------------
In a Scheme system that has a module or package system, these procedures
should be contained in a module named "list-lib".
The templates given below obey the following conventions for procedure formals:
list A proper (finite, nil-terminated) list
clist A proper or circular list
flist A finite (proper or dotted) list
pair A pair
x, y, d, a Any value
object, value Any value
n, i A natural number (an integer >= 0)
proc A procedure
= A boolean procedure taking two arguments
pred A boolean procedure taking one argument
It is an error to pass a circular or dotted list to a procedure not
defined to accept such an argument. Such a procedure may either signal
an error or diverge when passed a circular list.
** Constructors
===============
cons a d -> pair R5RS
The primitive constructor. Returns a newly allocated pair whose car is A
and whose cdr is D. The pair is guaranteed to be different (in the sense
of EQV?) from every existing object.
(cons 'a '()) ==> (a)
(cons '(a) '(b c d)) ==> ((a) b c d)
(cons "a" '(b c)) ==> ("a" b c)
(cons 'a 3) ==> (a . 3)
(cons '(a b) 'c) ==> ((a b) . c)
list object ... -> list R5RS
Returns a newly allocated list of its arguments.
(list 'a (+ 3 4) 'c) ==> (a 7 c)
(list) ==> ()
xcons d a -> pair
(lambda (d a) (cons a d))
Of utility only as a value to be conveniently passed to higher-order
procedures.
(xcons '(b c) 'a) => (a b c)
The name stands for "eXchanged CONS."
cons* elt1 elt2 ... -> object
Like LIST, but the last argument provides the tail of the constructed
list, returning (cons elt1 (cons elt2 (cons ... eltn))).
This function is called LIST* in Common Lisp and about half of the
Schemes that provide it; and CONS* in the other half.
(cons* 1 2 3 4) => (1 2 3 . 4)
(cons* 1) => 1
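One way to read the definition above is as a right-nested chain of CONS calls whose innermost tail is the last argument. A minimal sketch, assuming nothing beyond R5RS; the name SKETCH-CONS* is hypothetical and this is not claimed to be how any particular Scheme implements it:

```scheme
;; Hypothetical sketch: the final argument becomes the tail of the result.
(define (sketch-cons* first . rest)
  (let recur ((x first) (rest rest))
    (if (pair? rest)
        (cons x (recur (car rest) (cdr rest)))
        x)))                            ; last argument is returned as-is

(sketch-cons* 1 2 3 4)                  ; => (1 2 3 . 4)
(sketch-cons* 1)                        ; => 1
(sketch-cons* 'a 'b '(c d))             ; => (a b c d)
```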
make-list n [fill] -> list
Returns an N-element list, whose elements are all the value FILL.
If the FILL argument is not given, the elements of the list may
be arbitrary values.
(make-list 4 'c) => (c c c c)
(make-list 10) => (2 3 5 7 11 13 17 19 23 29)
list-tabulate n init-proc -> list
Returns an N-element list. Element i of the list, where 0 <= i < N,
is produced by (INIT-PROC i). No guarantee is made about the dynamic
order in which INIT-PROC is applied to these indices.
(list-tabulate 4 values) => (0 1 2 3)
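Because the order of application is left open, an implementation is free to build the list from the back and so avoid a final reverse. The sketch below does exactly that; SKETCH-LIST-TABULATE is a hypothetical name, and the descending order it uses is only one of the orders the spec allows.

```scheme
;; Hypothetical sketch: applies INIT-PROC to N-1, N-2, ..., 0,
;; consing each result onto the front of the answer.
(define (sketch-list-tabulate n init-proc)
  (let loop ((i (- n 1)) (ans '()))
    (if (< i 0)
        ans
        (loop (- i 1) (cons (init-proc i) ans)))))

(sketch-list-tabulate 4 (lambda (i) (* i i)))   ; => (0 1 4 9)
```

Client code that needs a particular application order (for example, an INIT-PROC that reads from a port) should therefore not rely on LIST-TABULATE for it.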
list-copy flist -> flist
Copies the "spine" of the argument.
circular-list elt1 elt2 ... -> clist
Constructs a circular list of the elements.
(circular-list 'z 'q) => (z q z q z q ...)
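A circular list can be produced by building an ordinary list and then pointing the cdr of its last pair back at its first pair. The following is a sketch only; SKETCH-CIRCULAR-LIST is a hypothetical name, and the code assumes plain R5RS.

```scheme
;; Hypothetical sketch: close the spine of a fresh list into a cycle.
(define (sketch-circular-list first . rest)
  ;; REST is newly allocated by the procedure call, so it is safe to alter.
  (let ((ans (cons first rest)))
    (let loop ((p ans))
      (if (pair? (cdr p))
          (loop (cdr p))
          (set-cdr! p ans)))            ; last pair now points at the first
    ans))

(define zq (sketch-circular-list 'z 'q))
(car zq)                                ; => z
(cadr zq)                               ; => q
(caddr zq)                              ; => z
(eq? (cddr zq) zq)                      ; => #t
```

Printing such a value at a REPL may not terminate, so it is safer to inspect it with selectors as above.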
iota count [start step] -> list
Returns a list containing the elements
(start start+step ... start+(count-1)*step)
The START and STEP parameters default to 0 and 1, respectively.
This procedure takes its name from the APL primitive.
(iota 5) => (0 1 2 3 4)
(iota 5 0 -0.1) => (0 -0.1 -0.2 -0.3 -0.4)
** Predicates
=============
Note: the predicates PROPER-LIST?, CIRCULAR-LIST?, and DOTTED-LIST?
partition the entire universe of Scheme values.
proper-list? x -> boolean
Returns true iff X is a proper list -- a finite, nil-terminated list.
More carefully: The empty list is a proper list. A pair whose cdr is a
proper list is also a proper list:
<proper-list> ::= () (Empty proper list)
| (cons <x> <proper-list>) (Proper-list pair)
Note that this definition rules out circular lists. This
function is required to detect this case and return false.
Nil-terminated lists are called "proper" lists by R5RS and Common Lisp.
The opposite of proper is improper.
R5RS binds this function to the variable LIST?.
(not (proper-list? x)) = (or (dotted-list? x) (circular-list? x))
circular-list? x -> boolean
True if X is a circular list. A circular list is a value such that
for every n >= 0, cdr^n(x) is a pair.
Terminology: The opposite of circular is finite.
(not (circular-list? x)) = (or (proper-list? x) (dotted-list? x))
dotted-list? x -> boolean
True if X is a finite, non-nil-terminated list. That is, there exists
an n >= 0 such that cdr^n(x) is neither a pair nor (). This includes
non-pair, non-() values (e.g. symbols, numbers), which are considered to
be dotted lists of length 0.
(not (dotted-list? x)) = (or (proper-list? x) (circular-list? x))
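Since the three predicates partition all values, a single traversal can decide which class a value belongs to. The sketch below uses the familiar two-pointer technique, with one pointer advancing two pairs per step and the other one pair per step, so that a cycle is detected when they meet. CLASSIFY-LIST is a hypothetical helper that is not part of this library; it is shown only to make the three definitions concrete.

```scheme
;; Hypothetical helper: returns the symbol PROPER, DOTTED or CIRCULAR.
(define (classify-list x)
  (let loop ((slow x) (fast x))
    (cond ((null? fast) 'proper)
          ((not (pair? fast)) 'dotted)
          (else
           (let ((fast (cdr fast)))             ; first of two steps
             (cond ((null? fast) 'proper)
                   ((not (pair? fast)) 'dotted)
                   (else
                    (let ((fast (cdr fast))     ; second step
                          (slow (cdr slow)))    ; slow pointer: one step
                      (if (eq? fast slow)
                          'circular
                          (loop slow fast))))))))))

(classify-list '(a b c))        ; => proper
(classify-list '())             ; => proper
(classify-list '(a b . c))      ; => dotted
(classify-list 42)              ; => dotted
(let ((c (list 1 2 3)))
  (set-cdr! (cddr c) c)         ; close the list into a cycle
  (classify-list c))            ; => circular
```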
pair? object -> boolean R5RS
Returns #t if OBJECT is a pair; otherwise, #f.
(pair? '(a . b)) ==> #t
(pair? '(a b c)) ==> #t
(pair? '()) ==> #f
(pair? '#(a b)) ==> #f
(pair? 7) ==> #f
(pair? 'a) ==> #f
null? object -> boolean R5RS
Returns #t if OBJECT is the empty list; otherwise, #f.
null-list? list -> boolean
LIST is a proper or circular list. This procedure returns true if
the argument is the empty list (), and false otherwise. It is an
error to pass this procedure a value which is not a proper or
circular list.
This procedure is recommended as the termination condition for
list-processing procedures that are not defined on dotted lists.
not-pair? x -> boolean
(lambda (x) (not (pair? x)))
Provided as a procedure as it can be useful as the termination condition
for list-processing procedures that wish to handle all finite lists,
both proper and dotted.
list= elt= list1 ... -> boolean
Determines list equality, given an element-equality procedure.
Proper list A equals proper list B if they are of the same length,
and their corresponding elements are equal, as determined by ELT=.
If the element-comparison procedure's first argument is from LISTi,
then its second argument is from LISTi+1, i.e. it is always called as
(elt= a b)
for a an element of list A, and b an element of list B.
In the n-ary case, every LISTi is compared to LISTi+1 (as opposed,
for example, to comparing LIST1 to every LISTi, for i>1). If there
are no list arguments at all, LIST= simply returns true.
It is an error to apply LIST= to anything except proper lists. While
implementations may choose to extend it to circular lists, note that it
cannot reasonably be extended to dotted lists, as it provides no way to
specify an equality procedure for comparing the list terminators.
Note that the dynamic order in which the ELT= procedure is applied to
pairs of elements is not specified. For example, if LIST= is applied
to three lists, A, B, and C, it may first completely compare A to B,
then compare B to C, or it may compare the first elements of A and B,
then the first elements of B and C, then the second elements of A and
B, and so forth.
The equality procedure must be consistent with EQ?. That is,
it must be the case that
(eq? x y) => (elt= x y).
Note that this implies that two lists which are EQ? are always LIST=,
as well.
(list= eq?) => #t ; Trivial cases
(list= eq? '(a)) => #t
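The n-ary behaviour can be pictured as a chain of two-list comparisons between neighbours. The sketch below is one permitted evaluation strategy (it fully compares each adjacent pair of lists before moving on); the names SKETCH-LIST=2 and SKETCH-LIST= are hypothetical, and the code assumes proper-list arguments and plain R5RS.

```scheme
;; Hypothetical helper: compare exactly two proper lists.
(define (sketch-list=2 elt= a b)
  (cond ((null? a) (null? b))
        ((null? b) #f)
        (else (and (elt= (car a) (car b))       ; A's element comes first
                   (sketch-list=2 elt= (cdr a) (cdr b))))))

;; Hypothetical sketch: every LISTi is compared with LISTi+1.
(define (sketch-list= elt= . lists)
  (or (null? lists)
      (let loop ((a (car lists)) (others (cdr lists)))
        (or (null? others)
            (and (sketch-list=2 elt= a (car others))
                 (loop (car others) (cdr others)))))))

(sketch-list= eq?)                              ; => #t
(sketch-list= eq? '(a))                         ; => #t
(sketch-list= = '(1 2) '(1 2) '(1 2))           ; => #t
(sketch-list= = '(1 2) '(1 2 3))                ; => #f
```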
** Selectors
============
car pair -> value R5RS
cdr pair -> value R5RS
These procedures return the contents of the car and cdr field of
their argument, respectively. Note that it is an error to apply
them to the empty list.
(car '(a b c)) ==> a
(car '((a) b c d)) ==> (a)
(car '(1 . 2)) ==> 1
(car '()) ==> *error*
(cdr '(a b c)) ==> (b c)
(cdr '((a) b c d)) ==> (b c d)
(cdr '(1 . 2)) ==> 2
(cdr '()) ==> *error*
caar pair -> value R5RS
cadr pair -> value
:
cdddar pair -> value
cddddr pair -> value
These procedures are compositions of CAR and CDR, where for
example CADDR could be defined by
(define caddr (lambda (x) (car (cdr (cdr x))))).
Arbitrary compositions, up to four deep, are provided. There are
twenty-eight of these procedures in all.
list-ref clist i -> value R5RS
Returns the Ith element of CLIST. (This is the same as the car
of (DROP CLIST I).) It is an error if I >= N, where N is the length
of CLIST.
(list-ref '(a b c d) 2) ==> c
first second third fourth fifth
sixth seventh eighth ninth tenth: pair -> value
Synonyms for car, cadr, caddr, ...
(third '(a b c d e)) => c
car+cdr pair -> [x y]
The fundamental pair deconstructor:
(lambda (p) (values (car p) (cdr p)))
This can, of course, be implemented more efficiently by a compiler.
take x i -> list
drop x i -> object
TAKE returns the first I elements of list X.
DROP returns all but the first I elements of list X.
(take '(a b c d e) 2) => (a b)
(drop '(a b c d e) 2) => (c d e)
X may be any value -- a proper, circular, or dotted list:
(take '(1 2 3 . d) 2) => (1 2)
(drop '(1 2 3 . d) 2) => (3 . d)
(take '(1 2 3 . d) 3) => (1 2 3)
(drop '(1 2 3 . d) 3) => d
For a legal I, TAKE and DROP partition the list in a manner which
can be inverted with APPEND:
(append (take x i) (drop x i)) = x
DROP is exactly equivalent to performing I cdr operations on X;
the returned value shares a common tail with X.
If the argument is a list of non-zero length, TAKE is guaranteed to
return a freshly-allocated list, even in the case where the entire
list is taken, e.g. (TAKE LIS (LENGTH LIS)).
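The sharing and allocation guarantees above follow naturally from the obvious definitions: DROP only walks down the spine, while TAKE conses a fresh pair per element. A minimal sketch, assuming plain R5RS and a legal I; the SKETCH- names are hypothetical and no error checking is attempted.

```scheme
;; Hypothetical sketch: copy the first I pairs of X.
(define (sketch-take x i)
  (if (= i 0)
      '()
      (cons (car x) (sketch-take (cdr x) (- i 1)))))

;; Hypothetical sketch: I cdr operations; the result shares structure with X.
(define (sketch-drop x i)
  (if (= i 0)
      x
      (sketch-drop (cdr x) (- i 1))))

(define lis (list 'a 'b 'c 'd 'e))
(sketch-take lis 2)                             ; => (a b)
(sketch-drop lis 2)                             ; => (c d e)
(eq? (sketch-drop lis 2) (cddr lis))            ; => #t, a shared tail
(eq? (sketch-take lis 5) lis)                   ; => #f, a fresh copy
(sketch-drop '(1 2 3 . d) 3)                    ; => d
```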
take-right flist i -> object
drop-right flist i -> list
TAKE-RIGHT returns the last I elements of FLIST.
DROP-RIGHT returns all but the last I elements of FLIST.
The returned list may share a common tail with the argument list.
(take-right '(a b c d e) 2) => (d e)
(drop-right '(a b c d e) 2) => (a b c)
FLIST may be any finite list, either proper or dotted:
(take-right '(1 2 3 . d) 2) => (2 3 . d)
(drop-right '(1 2 3 . d) 2) => (1)
(take-right '(1 2 3 . d) 0) => d
(drop-right '(1 2 3 . d) 0) => (1 2 3)
For a legal I, TAKE-RIGHT and DROP-RIGHT partition the list in a manner
which can be inverted with APPEND:
(append (drop-right flist i) (take-right flist i)) = flist
TAKE-RIGHT's return value is guaranteed to share a common tail with FLIST.
If the argument is a list of non-zero length, DROP-RIGHT is guaranteed to
return a freshly-allocated list, even in the case where nothing is
dropped, e.g. (DROP-RIGHT LIS 0).
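Both procedures can be sketched with two pointers kept I pairs apart: when the leading pointer runs off the end of the list, the lagging pointer marks the split point. The code below is an illustration under assumptions (plain R5RS, a legal I, hypothetical names NTH-CDR, SKETCH-TAKE-RIGHT and SKETCH-DROP-RIGHT) and is not presented as the reference implementation.

```scheme
;; Hypothetical helper: apply CDR to X, I times.
(define (nth-cdr x i)
  (if (= i 0) x (nth-cdr (cdr x) (- i 1))))

;; Hypothetical sketch: returns a tail of FLIST, so the result is shared.
(define (sketch-take-right flist i)
  (let loop ((lag flist) (lead (nth-cdr flist i)))
    (if (pair? lead)
        (loop (cdr lag) (cdr lead))
        lag)))

;; Hypothetical sketch: copies the pairs in front of the split point.
(define (sketch-drop-right flist i)
  (let recur ((lag flist) (lead (nth-cdr flist i)))
    (if (pair? lead)
        (cons (car lag) (recur (cdr lag) (cdr lead)))
        '())))

(sketch-take-right '(a b c d e) 2)      ; => (d e)
(sketch-drop-right '(a b c d e) 2)      ; => (a b c)
(sketch-take-right '(1 2 3 . d) 2)      ; => (2 3 . d)
(sketch-drop-right '(1 2 3 . d) 2)      ; => (1)
```

With these definitions, appending the SKETCH-DROP-RIGHT result to the SKETCH-TAKE-RIGHT result rebuilds a list that is EQUAL? to FLIST, including its dotted terminator if it has one.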
take! x i -> list
drop-right! flist i -> list
TAKE! and DROP-RIGHT! are "linear-update" variants of TAKE and
DROP-RIGHT: the procedure is allowed, but not required, to alter the
argument list to produce the result.
If X is circular, TAKE! may return a shorter-than-expected list:
(take! (circular-list 1 3 5) 8) => (1 3)                ; one permitted result
(take! (circular-list 1 3 5) 8) => (1 3 5 1 3 5 1 3)    ; another permitted result
last pair -> object
last-pair pair -> pair
LAST returns the last element of the non-empty, finite list PAIR.
LAST-PAIR returns the last pair in the non-empty, finite list PAIR.
(last '(a b c)) => c
(last-pair '(a b c)) => (c)
(last-pair '(a b c . d)) => (c . d)
** Miscellaneous: length, append, reverse, zip & count
======================================================
length list -> integer R5RS
length+ clist -> integer or #f
Both LENGTH and LENGTH+ return the length of the argument.
It is an error to pass a value to LENGTH which is not a proper
list (finite and nil-terminated). In particular, this means an
implementation may diverge or signal an error when LENGTH is
applied to a circular list.
LENGTH+, on the other hand, returns #F when applied to a circular
list.
The length of a proper list is a non-negative integer N such that CDR
applied N times to the list produces the empty list.
(length '(a b c)) ==> 3
(length '(a (b) (c d e))) ==> 3
(length '()) ==> 0
append list1 ... -> value R5RS
append! list1 ... -> value
APPEND returns a list consisting of the elements of LIST1
followed by the elements of the other list parameters.
(append '(x) '(y)) ==> (x y)
(append '(a) '(b c d)) ==> (a b c d)
(append '(a (b)) '((c))) ==> (a (b) (c))
The resulting list is always newly allocated, except that it
shares structure with the final LISTi argument. This last argument
may be any value at all; an improper list results if it is not
a proper list. All other arguments must be proper lists.
(append '(a b) '(c . d)) ==> (a b c . d)
(append '() 'a) ==> a
APPEND! is the "linear-update" variant of APPEND -- it is allowed, but
not required, to alter cons cells in the argument lists to construct
the result list. The last argument is never altered; the result
list shares structure with this parameter.
reverse list -> list R5RS
reverse! list -> list
REVERSE returns a newly allocated list consisting of the elements of
LIST in reverse order.
(reverse '(a b c)) ==> (c b a)
(reverse '(a (b c) d (e (f))))
==> ((e (f)) d (b c) a)
REVERSE! is the linear-update variant of REVERSE. It is permitted,
but not required, to alter the argument's cons cells to produce the
reversed list.
append-reverse rev-head tail -> value
append-reverse! rev-head tail -> value
APPEND-REVERSE returns
(append (reverse rev-head) tail)
It is provided because it is a common operation -- a common
list-processing style calls for this exact operation to transfer values
accumulated in reverse order onto the front of another list, and because
the implementation is significantly more efficient than the simple
composition it replaces. (But note that this pattern of iterative
computation followed by a reverse can frequently be rewritten as a
recursion, dispensing with the REVERSE and APPEND-REVERSE steps, and
shifting temporary, intermediate storage from the heap to the stack,
which is typically a win for reasons of cache locality and eager storage
reclamation.)
APPEND-REVERSE! is just the linear-update variant -- it is allowed, but
not required, to alter REV-HEAD's cons cells to construct the result.
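The efficiency argument is easiest to see in code: the pure version needs a single pass over REV-HEAD and allocates one pair per element, with no intermediate reversed list. A minimal sketch, assuming plain R5RS; SKETCH-APPEND-REVERSE is a hypothetical name.

```scheme
;; Hypothetical sketch: move elements of REV-HEAD onto the front of TAIL.
(define (sketch-append-reverse rev-head tail)
  (let loop ((rev-head rev-head) (tail tail))
    (if (null? rev-head)
        tail
        (loop (cdr rev-head) (cons (car rev-head) tail)))))

(sketch-append-reverse '(3 2 1) '(4 5))         ; => (1 2 3 4 5)
(sketch-append-reverse '(c b a) '())            ; => (a b c)
```

The result shares structure with TAIL, just as (append (reverse rev-head) tail) would.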
zip clist1 clist2 ... -> list
(lambda lists (apply map list lists))
If ZIP is passed N lists, it returns a list as long as the shortest
of these lists, each element of which is an N-element list comprised
of the corresponding elements from the parameter lists.
(zip '(one two three) '(1 2 3) '(odd even odd even odd even odd even)) =>
((one 1 odd) (two 2 even) (three 3 odd))
(zip '(1 2 3)) => ((1) (2) (3))
At least one of the argument lists must be finite:
(zip '(3 1 4 1) (circular-list #f #t)) =>
((3 #f) (1 #t) (4 #f) (1 #t))
unzip1 list -> list
unzip2 list -> [list list]
unzip3 list -> [list list list]
unzip4 list -> [list list list list]
unzip5 list -> [list list list list list]
UNZIP1 takes a list of lists, where every list must contain at least one
element, and returns a list containing the initial element of each such
list. That is, it returns (MAP CAR LISTS). UNZIP2 takes a list of lists,
where every list must contain at least two elements, and returns two
values: a list of the first elements, and a list of the second
elements. UNZIP3 does the same for the first three elements of the lists,
and so forth.
(unzip2 '((1 one) (2 two) (3 three))) =>
(1 2 3)
(one two three)
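Since UNZIP2 returns two values, a caller needs CALL-WITH-VALUES (or a macro such as RECEIVE) to consume them. The sketch below shows both sides under stated assumptions: SKETCH-UNZIP2 is a hypothetical name, only the R5RS procedures VALUES and CALL-WITH-VALUES are used, and every element of the argument is assumed to have at least two elements.

```scheme
;; Hypothetical sketch: return the firsts and the seconds as two values.
(define (sketch-unzip2 lists)
  (let recur ((lists lists))
    (if (null? lists)
        (values '() '())
        (call-with-values
            (lambda () (recur (cdr lists)))
          (lambda (firsts seconds)
            (let ((elt (car lists)))
              (values (cons (car elt) firsts)
                      (cons (cadr elt) seconds))))))))

;; Consuming the two values:
(call-with-values
    (lambda () (sketch-unzip2 '((1 one) (2 two) (3 three))))
  (lambda (numbers names)
    (list numbers names)))              ; => ((1 2 3) (one two three))
```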
count pred clist1 clist2 ... -> integer
PRED is a procedure taking as many arguments as there are lists and
returning a single value. It is applied element-wise to the elements of
the LISTs, and a count is tallied of the number of elements that produce a
true value. This count is returned. COUNT is "iterative" in that it is
guaranteed to apply PRED to the LIST elements in a left-to-right order.
The counting stops when the shortest list expires.
(count even? '(3 1 4 1 5 9 2 5 6)) => 3
(count < '(1 2 4 8) '(2 4 6 8 10 12 14 16)) => 3
At least one of the argument lists must be finite:
(count < '(3 1 4 1) (circular-list 1 10)) => 2
** Fold, unfold & map
=====================
fold kons knil clist1 clist2 ... -> value
The fundamental list iterator.
First, consider the single list-parameter case. If CLIST1 = (e1 e2 ... en),
then this procedure returns
(kons en ... (kons e2 (kons e1 knil)) ... )
That is, it obeys the (tail) recursion
(fold kons knil lis) = (fold kons (kons (car lis) knil) (cdr lis))
(fold kons knil '()) = knil
Examples:
(fold + 0 lis) ; Add up the elements of LIS.
(fold cons '() lis) ; Reverse LIS.
(fold cons tail rev-head) ; See APPEND-REVERSE.
;; How many symbols in LIS?
(fold (lambda (x count) (if (symbol? x) (+ count 1) count))
0
lis)
;; Length of the longest string in LIS:
(fold (lambda (s max-len) (max max-len (string-length s)))
0
lis)
If N list arguments are provided, then the KONS function must take
N+1 parameters: one element from each list, and the "seed" or fold
state, which is initially KNIL. The fold operation terminates when
the shortest list runs out of values:
(fold cons* '() '(a b c) '(1 2 3 4 5)) => (c 3 b 2 a 1)
At least one of the list arguments must be finite.
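The single-list tail recursion given above can be transcribed directly into
a definition. This is a sketch for proper lists only, under the hypothetical
name MY-FOLD; it is not the reference implementation.

```scheme
;; Sketch only: single-list FOLD, following the tail recursion
;;   (fold kons knil lis) = (fold kons (kons (car lis) knil) (cdr lis))
;;   (fold kons knil '()) = knil
(define (my-fold kons knil lis)
  (if (null? lis)
      knil
      (my-fold kons (kons (car lis) knil) (cdr lis))))

;; (my-fold + 0 '(1 2 3 4))     => 10
;; (my-fold cons '() '(a b c))  => (c b a)
```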
fold-right kons knil clist1 clist2 ... -> value
The fundamental list recursion operator.
First, consider the single list-parameter case. If CLIST1 = (e1 e2 ... en),
then this procedure returns
(kons e1 (kons e2 ... (kons en knil)))
That is, it obeys the recursion
(fold-right kons knil lis) = (kons (car lis) (fold-right kons knil (cdr lis)))
(fold-right kons knil '()) = knil
Examples:
(fold-right cons '() lis) ; Copy LIS.
;; Keep only the even numbers of LIS.
(fold-right (lambda (x l) (if (even? x) (cons x l) l)) '() lis)
If N list arguments are provided, then the KONS function must take
N+1 parameters: one element from each list, and the "seed" or fold
state, which is initially KNIL. The fold operation terminates when
the shortest list runs out of values:
(fold-right cons* '() '(a b c) '(1 2 3 4 5)) => (a 1 b 2 c 3)
At least one of the list arguments must be finite.
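The single-list recursion for FOLD-RIGHT can likewise be written out
directly. MY-FOLD-RIGHT is a hypothetical name, and the sketch assumes a
proper list.

```scheme
;; Sketch only: single-list FOLD-RIGHT, following the recursion
;;   (fold-right kons knil lis) = (kons (car lis) (fold-right kons knil (cdr lis)))
;;   (fold-right kons knil '()) = knil
(define (my-fold-right kons knil lis)
  (if (null? lis)
      knil
      (kons (car lis) (my-fold-right kons knil (cdr lis)))))

;; (my-fold-right cons '() '(a b c)) => (a b c)
;; (my-fold-right (lambda (x l) (if (even? x) (cons x l) l)) '() '(1 2 3 4))
;;   => (2 4)
```

Unlike the FOLD recursion, this one is not a tail call, so it consumes stack
proportional to the length of the list.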
pair-fold kons knil clist1 clist2 ... -> value
Analogous to FOLD, but KONS is applied to successive sublists of the
lists, rather than successive elements -- that is, KONS is applied to the
pairs making up the lists, giving this (tail) recursion:
(pair-fold kons knil lis) = (let ((tail (cdr lis)))
(pair-fold kons (kons lis knil) tail))
(pair-fold kons knil '()) = knil
The KONS function may reliably apply SET-CDR! to the pairs it is given
without altering the sequence of execution.
Example:
;;; Destructively reverse a list.
(pair-fold (lambda (pair tail) (set-cdr! pair tail) pair) '() lis)
At least one of the list arguments must be finite.
pair-fold-right kons knil clist1 clist2 ... -> value
Holds the same relationship with FOLD-RIGHT that PAIR-FOLD holds with FOLD.
Obeys the recursion
(pair-fold-right kons knil lis) =
(kons lis (pair-fold-right kons knil (cdr lis)))
(pair-fold-right kons knil '()) = knil
Example:
(pair-fold-right cons '() '(a b c)) => ((a b c) (b c) (c))
At least one of the list arguments must be finite.
reduce f ridentity list -> value
REDUCE is a variant of FOLD.
RIDENTITY should be a "right identity" of the procedure F -- that is,
for any value X acceptable to F,
(f x ridentity) = x
REDUCE has the following definition:
If LIST = (), return RIDENTITY.
Otherwise, return (fold F (car LIST) (cdr LIST)).
...in other words, we compute (fold F RIDENTITY LIST).
Note that RIDENTITY is used *only* in the empty-list case. You
typically use REDUCE when applying F is expensive and you'd like
to avoid the extra application incurred when FOLD applies F to the
head of LIST and the identity value, redundantly producing the
same value passed in to F. For example, if F involves searching a
file directory or performing a database query, this can be
significant. In general, however, FOLD is useful in many contexts
where REDUCE is not (consider the examples given in the FOLD
definition -- only one of the five folds uses a function with a
right identity. The other four may not be performed with REDUCE).
Note: MIT Scheme and Haskell flip F's arg order for their REDUCE and
FOLD functions.
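The definition of REDUCE above can be sketched in terms of a single-list
fold. Both MY-FOLD and MY-REDUCE are hypothetical names used for
illustration; the sketch assumes a proper list.

```scheme
;; Sketch only: a single-list fold, as in the FOLD entry.
(define (my-fold kons knil lis)
  (if (null? lis)
      knil
      (my-fold kons (kons (car lis) knil) (cdr lis))))

;; RIDENTITY is consulted only when the list is empty; otherwise the
;; first element seeds the fold, saving one application of F.
(define (my-reduce f ridentity lis)
  (if (null? lis)
      ridentity
      (my-fold f (car lis) (cdr lis))))

;; (my-reduce + 0 '(1 2 3)) => 6
;; (my-reduce + 0 '())      => 0
```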
reduce-right f ridentity list -> value
REDUCE-RIGHT is the fold-right variant of REDUCE.
It obeys the following definition:
(reduce-right f ridentity '()) = ridentity
(reduce-right f ridentity '(e1)) = (f e1 ridentity) = e1
(reduce-right f ridentity '(e1 e2 ...)) =
(f e1 (reduce-right f ridentity '(e2 ...)))
...in other words, we compute (fold-right F RIDENTITY LIST).
unfold p f g seed [tail] -> value
UNFOLD constructs a list with the following loop:
(let lp ((seed seed) (lis tail))
(if (p seed) lis
(lp (g seed)
(cons (f seed) lis))))
P: Determines when to stop unfolding.
F: Maps each seed value to the corresponding list element.
G: Maps each seed value to next seed value.
SEED: The "state" value for the unfold.
TAIL: list terminator; defaults to '().
UNFOLD is the fundamental iterative list constructor, just as FOLD is the
fundamental iterative list consumer. While UNFOLD may seem a bit abstract
to novice functional programmers, it can be used in a number of ways:
(unfold zero? ; List of squares: 1^2 ... 10^2
(lambda (x) (* x x))
(lambda (x) (- x 1))
10)
(unfold null-list? car cdr lis) ; Reverse a proper list.
;; Read current input port into a list of values (in reverse order).
(unfold eof-object? values (lambda (x) (read)) (read))
;; (APPEND-REVERSE rev-head tail)
(unfold null-list? car cdr rev-head tail)
Interested functional programmers may enjoy noting that FOLD and UNFOLD
are in some sense inverses. That is, given operations KNULL?, KAR, KDR,
KONS, and KNIL satisfying
(kons (kar x) (kdr x)) = x and (knull? knil) = #t
then
(FOLD kons knil (UNFOLD knull? kar kdr x)) = x
and
(UNFOLD knull? kar kdr (FOLD kons knil x)) = x.
This combinator presumably has some pretentious mathematical name;
interested readers are invited to communicate it to the author.
unfold-right p f g seed [tail-gen] -> list
UNFOLD-RIGHT is best described by its basic recursion:
(unfold-right p f g seed) = (if (p seed) (tail-gen seed)
(cons (f seed)
(unfold-right p f g (g seed))))
P: Determines when to stop unfolding.
F: Maps each seed value to the corresponding list element.
G: Maps each seed value to next seed value.
SEED: The "state" value for the unfold.
TAIL-GEN: creates the tail of the list; defaults to (lambda (x) '())
UNFOLD-RIGHT is the fundamental recursive list constructor, just as
FOLD-RIGHT is the fundamental recursive list consumer. While UNFOLD-RIGHT
may seem a bit abstract to novice functional programmers, it can be used
in a number of ways:
(unfold-right (lambda (x) (> x 10)) ; List of squares: 1^2 ... 10^2.
(lambda (x) (* x x))
(lambda (x) (+ x 1))
1)
(unfold-right null-list? car cdr lis) ; Copy a proper list.
;; Read current input port into a list of values.
(unfold-right eof-object? values (lambda (x) (read)) (read))
;; Copy a possibly non-proper list:
(unfold-right not-pair? car cdr lis
values)
;; Append HEAD onto TAIL:
(unfold-right null-list? car cdr head
(lambda (x) tail))
Interested functional programmers may enjoy noting that FOLD-RIGHT and
UNFOLD-RIGHT are in some sense inverses. That is, given operations KNULL?,
KAR, KDR, KONS, and KNIL satisfying
(kons (kar x) (kdr x)) = x and (knull? knil) = #t
then
(FOLD-RIGHT kons knil (UNFOLD-RIGHT knull? kar kdr x)) = x
and
(UNFOLD-RIGHT knull? kar kdr (FOLD-RIGHT kons knil x)) = x.
This combinator sometimes is called an "anamorphism;" when an
explicit TAIL-GEN procedure is supplied, it is called an
"apomorphism."
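The recursion for UNFOLD-RIGHT, including the optional TAIL-GEN argument,
might be sketched as follows. MY-UNFOLD-RIGHT is a hypothetical name, and
handling the optional argument through a rest list is an assumption of this
sketch rather than something this document prescribes.

```scheme
;; Sketch only: recursive list construction with an optional TAIL-GEN.
(define (my-unfold-right p f g seed . maybe-tail-gen)
  (let ((tail-gen (if (pair? maybe-tail-gen)
                      (car maybe-tail-gen)
                      (lambda (x) '()))))   ; default tail is '()
    (let recur ((seed seed))
      (if (p seed)
          (tail-gen seed)
          (cons (f seed) (recur (g seed)))))))

;; (my-unfold-right (lambda (x) (> x 5))
;;                  (lambda (x) (* x x))
;;                  (lambda (x) (+ x 1))
;;                  1)
;;   => (1 4 9 16 25)
;;
;; Append (a b) onto (c d):
;; (my-unfold-right null? car cdr '(a b) (lambda (x) '(c d))) => (a b c d)
```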
map proc clist1 clist2 ... -> list R5RS+
PROC is a procedure taking as many arguments as there are list arguments
and returning a single value. MAP applies PROC element-wise to the
elements of the lists and returns a list of the results, in order. The
dynamic order in which PROC is applied to the elements of the lists is
unspecified.
(map cadr '((a b) (d e) (g h)))
==> (b e h)
(map (lambda (n) (expt n n))
'(1 2 3 4 5))
==> (1 4 27 256 3125)
(map + '(1 2 3) '(4 5 6)) ==> (5 7 9)
(let ((count 0))
(map (lambda (ignored)
(set! count (+ count 1))
count)
'(a b))) ==> (1 2) OR (2 1)
This procedure is extended from its R5RS specification
to allow the arguments to be of unequal length; it terminates
when the shortest list runs out.
At least one of the argument lists must be finite:
(map + '(3 1 4 1) (circular-list 1 0)) => (4 1 5 1)
for-each proc clist1 clist2 ... -> unspecified R5RS+
The arguments to FOR-EACH are like the arguments to MAP, but
FOR-EACH calls PROC for its side effects rather than for its
values. Unlike MAP, FOR-EACH is guaranteed to call PROC on
the elements of the CLISTs in order from the first element(s) to
the last, and the value returned by FOR-EACH is unspecified.
(let ((v (make-vector 5)))
(for-each (lambda (i)
(vector-set! v i (* i i)))
'(0 1 2 3 4))
v) ==> #(0 1 4 9 16)
This procedure is extended from its R5RS specification
to allow the arguments to be of unequal length; it terminates
when the shortest list runs out.
At least one of the argument lists must be finite.
append-map f clist1 clist2 ... -> value
append-map! f clist1 clist2 ... -> value
Equivalent to
(apply append (map f clist1 clist2 ...))
and
(apply append! (map f clist1 clist2 ...))
Map F over the elements of the lists, just as in the MAP function.
However, the results of the applications are appended together to
make the final result. APPEND-MAP uses APPEND to append the results
together; APPEND-MAP! uses APPEND!.
The dynamic order in which the various applications of F are made is
not specified.
Example:
(append-map! (lambda (x) (list x (- x))) '(1 3 8))
=> (1 -1 3 -3 8 -8)
At least one of the list arguments must be finite.
map! f list1 clist2 ... -> list
Linear-update variant of MAP -- MAP! is allowed, but not required, to
alter the cons cells of LIST1 to construct the result list.
The dynamic order in which the various applications of F are made is
not specified.
In the n-ary case, CLIST2, CLIST3, ... must have at least as many
elements as LIST1.
map-in-order f clist1 clist2 ... -> list
A variant of the MAP procedure that guarantees to apply F across
the elements of the LISTi arguments in a left-to-right order. This
is useful for mapping procedures that both have side effects and
return useful values.
At least one of the list arguments must be finite.
pair-for-each f clist1 clist2 ... -> unspecific
Like FOR-EACH, but F is applied to successive sublists of the argument
lists. That is, F is applied to the cons cells of the lists, rather
than the lists' elements. These applications occur in left-to-right
order.
The F procedure may reliably apply SET-CDR! to the pairs it is given
without altering the sequence of execution.
(pair-for-each (lambda (pair) (display pair) (newline)) '(a b c)) ==>
(a b c)
(b c)
(c)
At least one of the list arguments must be finite.
filter-map f clist1 clist2 ... -> list
Like MAP, but only true values are saved.
(filter-map (lambda (x) (and (number? x) (* x x))) '(a 1 b 3 c 7))
=> (1 9 49)
The dynamic order in which the various applications of F are made is
not specified.
At least one of the list arguments must be finite.
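For the single-list case, FILTER-MAP can be sketched as a recursion that
conses on a result only when it is true. MY-FILTER-MAP is a hypothetical
name; the sketch assumes a proper list and happens to apply F left to right,
which this document does not require.

```scheme
;; Sketch only: single-list FILTER-MAP.
(define (my-filter-map f lis)
  (let recur ((lis lis))
    (if (null? lis)
        '()
        (let* ((v (f (car lis)))           ; apply F to the head
               (tail (recur (cdr lis))))   ; then process the rest
          (if v (cons v tail) tail)))))    ; keep only true results

;; (my-filter-map (lambda (x) (and (number? x) (* x x))) '(a 1 b 3 c 7))
;;   => (1 9 49)
```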
** Filtering & partitioning
===========================
filter pred list -> list
Return all the elements of LIST that satisfy predicate PRED.
The list is not disordered -- elements that appear in the result list
occur in the same order as they occur in the argument list.
The returned list may share a common tail with the argument list.
The dynamic order in which the various applications of PRED are made is
not specified.
(filter even? '(0 7 8 8 43 -4)) => (0 8 8 -4)
partition pred list -> [list list]
Partitions the elements of LIST with predicate PRED, and returns two
values: the list of in-elements and the list of out-elements.
The list is not disordered -- elements occur in the result lists
in the same order as they occur in the argument list.
The dynamic order in which the various applications of PRED are made is
not specified. One of the returned lists may share a common tail with the
argument list.
(partition symbol? '(one 2 3 four five 6))
=> (one four five)
(2 3 6)
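One way to produce both result lists in a single pass is to recur on the
tail and receive its two values with CALL-WITH-VALUES. This is a sketch
under the hypothetical names MY-PARTITION and PARTITION-DEMO; it assumes a
proper list and applies PRED from right to left, which is permitted because
the order of application is not specified.

```scheme
;; Sketch only: order-preserving PARTITION returning two values.
(define (my-partition pred lis)
  (let recur ((lis lis))
    (if (null? lis)
        (values '() '())
        (call-with-values
          (lambda () (recur (cdr lis)))
          (lambda (in out)
            (if (pred (car lis))
                (values (cons (car lis) in) out)
                (values in (cons (car lis) out))))))))

;; Collect the two values into one list for display.
(define (partition-demo)
  (call-with-values
    (lambda () (my-partition symbol? '(one 2 3 four five 6)))
    list))

;; (partition-demo) => ((one four five) (2 3 6))
```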
remove pred list -> list
Returns LIST without the elements that satisfy predicate PRED:
(lambda (pred list) (filter (lambda (x) (not (pred x))) list))
The list is not disordered -- elements that appear in the result list
occur in the same order as they occur in the argument list.
The returned list may share a common tail with the argument list.
The dynamic order in which the various applications of PRED are made is
not specified.
(remove even? '(0 7 8 8 43 -4)) => (7 43)
filter! pred list -> list
partition! pred list -> [list list]
remove! pred list -> list
Linear-update variants of FILTER, PARTITION and REMOVE.
These procedures are allowed, but not required, to alter the cons cells
in the argument list to construct the result lists.
** Searching
============
The following procedures all search lists for a leftmost element satisfying
some criteria. This means they do not always examine the entire list; thus,
there is no efficient way for them to reliably detect and signal an error when
passed a dotted or circular list. Here are the general rules describing how
these procedures work when applied to different kinds of lists:
Proper lists: The standard, canonical behavior happens in this case.
Dotted lists: It is an error to pass these procedures a dotted list
that does not contain an element satisfying the search
criteria. That is, it is an error if the procedure has
to search all the way to the end of the dotted list.
However, this SRFI does *not* specify anything at all
about the behavior of these procedures when passed a
dotted list containing an element satisfying the search
criteria. It may finish successfully, signal an error,
or perform some third action. Different implementations
may provide different functionality in this case; code
which is compliant with this SRFI may not rely on any
particular behavior. Future SRFIs may refine SRFI-1
to define specific behavior in this case.
In brief, SRFI-1 compliant code may not pass a dotted
list argument to these procedures.
Circular lists: It is an error to pass these procedures a circular list
that does not contain an element satisfying the search
criteria. Note that the procedure is not required to
detect this case; it may simply diverge. It is, however,
acceptable to search a circular list *if the search is
successful* -- that is, if the list contains an element
satisfying the search criteria.
Here are some examples, using the FIND and ANY procedures as canonical
representatives:
;; Proper list -- success
(find even? '(1 2 3)) => 2
(any even? '(1 2 3)) => #t
;; proper list -- failure
(find even? '(1 7 3)) => #f
(any even? '(1 7 3)) => #f
;; Failure is error on a dotted list.
(find even? '(1 3 . x)) => error
(any even? '(1 3 . x)) => error
;; The dotted list contains an element satisfying the search.
;; This case is not specified -- it could be success, an error,
;; or some third possibility.
(find even? '(1 2 . x)) => error/undefined
(any even? '(1 2 . x)) => error/undefined ; success, error or other.
;; circular list -- success
(find even? (circular-list 1 6 3)) => 6
(any even? (circular-list 1 6 3)) => #t
;; circular list -- failure is error. Procedure may diverge.
(find even? (circular-list 1 3)) => error
(any even? (circular-list 1 3)) => error
find pred clist -> value
Return the first element of CLIST that satisfies predicate PRED;
false if no element does.
(find even? '(3 1 4 1 5 9)) => 4
Note that FIND has an ambiguity in its lookup semantics -- if FIND
returns #F, you cannot tell (in general) if it found a #F element
that satisfied PRED, or if it did not find any element at all. In
many situations, this ambiguity cannot arise -- either the list being
searched is known not to contain any #F elements, or the list is
guaranteed to have an element satisfying PRED. However, in cases
where this ambiguity can arise, you should use FIND-TAIL instead of
FIND -- FIND-TAIL has no such ambiguity:
(cond ((find-tail pred lis) => (lambda (pair) ...)) ; Handle (CAR PAIR)
(else ...)) ; Search failed.
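The ambiguity can be seen concretely with a predicate that is satisfied by
#F itself. The definitions below are sketches for proper lists under the
hypothetical names MY-FIND-TAIL and MY-FIND; they are not the reference
implementation.

```scheme
;; Sketch only: FIND-TAIL returns the pair, so success is never #f.
(define (my-find-tail pred lis)
  (let lp ((lis lis))
    (cond ((null? lis) #f)
          ((pred (car lis)) lis)
          (else (lp (cdr lis))))))

;; FIND returns the element, which may itself be #f.
(define (my-find pred lis)
  (cond ((my-find-tail pred lis) => car)
        (else #f)))

;; (my-find not '(1 #f 2))      => #f       ; found the element #f
;; (my-find not '(1 2 3))       => #f       ; found nothing
;; (my-find-tail not '(1 #f 2)) => (#f 2)   ; unambiguous success
;; (my-find-tail not '(1 2 3))  => #f       ; unambiguous failure
```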
find-tail pred clist -> pair or false
Return the first pair of CLIST whose car satisfies PRED. If no pair does,
return false.
FIND-TAIL can be viewed as a general-predicate variant of the MEMBER
function.
Examples:
(find-tail even? '(3 1 37 -8 -5 0 0)) => (-8 -5 0 0)
(find-tail even? '(3 1 37 -5)) => #f
;; MEMBER X LIS:
(find-tail (lambda (elt) (equal? x elt)) lis)
In the circular-list case, this procedure "rotates" the list.
any pred clist1 clist2 ... -> value
Applies the predicate across the lists, returning true if the predicate
returns true on any application.
If there are N list arguments CLIST1 ... CLISTn, then PRED must be a
procedure taking N arguments and returning a boolean result.
ANY applies PRED to the first elements of the CLISTi parameters. If this
application returns a true value, ANY immediately returns that value.
Otherwise, it iterates, applying PRED to the second elements of the CLISTi
parameters, then the third, and so forth. The iteration stops when a true
value is produced or one of the lists runs out of values; in the latter
case, ANY returns #F. The application of PRED to the last element of the
lists is a tail call.
Note the difference between FIND and ANY -- FIND returns the element
that satisfied the predicate; ANY returns the true value that the
predicate produced.
Like EVERY, ANY's name does not end with a question mark -- this is to
indicate that it does not return a simple boolean (#T or #F), but a
general value.
(any integer? '(a 3 b 2.7)) => #T
(any integer? '(a 3.1 b 2.7)) => #F
(any < '(3 1 4 1 5)
'(2 7 1 8 2)) => #T
every pred clist1 clist2 ... -> value
Applies the predicate across the lists, returning true if the predicate
returns true on every application.
If there are N list arguments CLIST1 ... CLISTn, then PRED must be a
procedure taking N arguments and returning a boolean result.
EVERY applies PRED to the first elements of the CLISTi parameters. If
this application returns false, EVERY immediately returns false.
Otherwise, it iterates, applying PRED to the second elements of the CLISTi
parameters, then the third, and so forth. The iteration stops when a false
value is produced or one of the lists runs out of values. In the latter
case, EVERY returns the true value produced by its final application of
PRED. The application of PRED to the last element of the lists is a tail
call.
If one of the CLISTi has no elements, EVERY simply returns #T.
Like ANY, EVERY's name does not end with a question mark -- this is to
indicate that it does not return a simple boolean (#T or #F), but a
general value.
list-index pred clist1 clist2 ... -> integer or false
Return the index of the leftmost element that satisfies PRED.
If there are N list arguments CLIST1 ... CLISTn, then PRED must be a
function taking N arguments and returning a boolean result.
LIST-INDEX applies PRED to the first elements of the CLISTi parameters.
If this application returns true, LIST-INDEX immediately returns zero.
Otherwise, it iterates, applying PRED to the second elements of the
CLISTi parameters, then the third, and so forth. When it finds a tuple of
list elements that cause PRED to return true, it stops and returns the
zero-based index of that position in the lists.
The iteration stops when one of the lists runs out of values; in this
case, LIST-INDEX returns #F.
(list-index even? '(3 1 4 1 5 9)) => 2
(list-index < '(3 1 4 1 5 9 2 5 6) '(2 7 1 8 2)) => 1
(list-index = '(3 1 4 1 5 9 2 5 6) '(2 7 1 8 2)) => #f
member x list [=] -> list or #f R5RS+
memq x list -> list or #f R5RS
memv x list -> list or #f R5RS
These procedures return the first sublist of LIST whose car is
X, where the sublists of LIST are the non-empty lists returned
by (DROP LIST I) for I less than the length of LIST. If X does
not occur in LIST, then #f is returned. MEMQ uses EQ? to compare X
with the elements of LIST, while MEMV uses EQV? and MEMBER uses EQUAL?.
(memq 'a '(a b c)) ==> (a b c)
(memq 'b '(a b c)) ==> (b c)
(memq 'a '(b c d)) ==> #f
(memq (list 'a) '(b (a) c)) ==> #f
(member (list 'a)
'(b (a) c)) ==> ((a) c)
(memq 101 '(100 101 102)) ==> *unspecified*
(memv 101 '(100 101 102)) ==> (101 102)
MEMBER is extended from its R5RS definition to allow the client to pass
in an optional equality procedure = used to compare keys.
The comparison procedure is used to compare the elements Ei of LIST
to the key X in this way:
(= X Ei) ; list is (E1 ... En)
That is, the first argument is always X, and the second argument is
one of the list elements. Thus one can reliably find the first element
of LIST that is greater than five with
(member 5 LIST <)
Note that fully general list searching may be performed with
the FIND-TAIL and FIND procedures, e.g.
(find-tail even? list) ; Find the first pair whose car is even.
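The argument order of the comparison, (= X Ei), can be illustrated with a
sketch of the extended MEMBER. MY-MEMBER is a hypothetical name, and taking
the optional comparison procedure through a rest list is an assumption of
this sketch.

```scheme
;; Sketch only: MEMBER with an optional comparison procedure.
(define (my-member x lis . maybe-=)
  (let ((elt= (if (pair? maybe-=) (car maybe-=) equal?)))
    (let lp ((lis lis))
      (cond ((null? lis) #f)
            ((elt= x (car lis)) lis)   ; X is always the first argument
            (else (lp (cdr lis)))))))

;; (my-member 5 '(1 3 7 9 2) <)        => (7 9 2)
;; (my-member (list 'a) '(b (a) c))    => ((a) c)
```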
** Deletion
===========
delete x list [=] -> list
delete! x list [=] -> list
DELETE uses the comparison procedure =, which defaults to EQUAL?, to find
all elements of LIST that are equal to X, and deletes them from LIST. The
dynamic order in which the various applications of = are made is not
specified.
The list is not disordered -- elements that appear in the result list
occur in the same order as they occur in the argument list.
The result may share a common tail with the argument list.
Note that fully general element deletion can be performed with the REMOVE
and REMOVE! procedures, e.g.:
;; Delete all the even elements from LIS:
(remove even? lis)
The comparison procedure is used in this way:
(= X Ei)
That is, X is always the first argument, and a list element is always the
second argument. The comparison procedure will be used to compare each
element of LIST exactly once; the order in which it is applied to the
various Ei is not specified. Thus, one can reliably remove all the
numbers greater than five from a list with
(delete 5 list <)
DELETE! is the linear-update variant of DELETE. It is allowed, but not
required, to alter the cons cells in its argument list to construct the
result.
delete-duplicates list [=] -> list
delete-duplicates! list [=] -> list
DELETE-DUPLICATES removes duplicate elements from the list argument.
If there are multiple equal elements in the argument list, the result list
contains only the first (leftmost) of these elements.
The order of these surviving elements is the same as in the original
list -- DELETE-DUPLICATES does not disorder the list (hence it is useful
for "cleaning up" association lists).
The = parameter is used to compare the elements of the list; it defaults
to EQUAL?. If X comes before Y in LIST, then the comparison is performed
(= X Y)
The comparison procedure will be used to compare each pair of
elements in LIST no more than once; the order in which it is
applied to the various pairs is not specified.
Implementations of DELETE-DUPLICATES are allowed to share common tails
between argument and result lists -- for example, if the list argument
contains only unique elements, it may simply return exactly this list.
Be aware that, in general, DELETE-DUPLICATES runs in time O(n^2)
for N-element lists. Uniquifying long lists can be accomplished
in O(n lg n) time by sorting the list to bring equal elements
together, then using a linear-time algorithm to remove equal
elements. Alternatively, one can use algorithms based on
element-marking, with linear-time results.
DELETE-DUPLICATES! is the linear-update variant of DELETE-DUPLICATES; it
is allowed, but not required, to alter the cons cells in its argument
list to construct the result.
(delete-duplicates '(a b a c a b c z)) => (a b c z)
;; Clean up an alist:
(delete-duplicates '((a . 3) (b . 7) (a . 9) (c . 1))
(lambda (x y) (eq? (car x) (car y))))
=> ((a . 3) (b . 7) (c . 1))
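The quadratic behavior and the (= X Y) argument order can both be seen in a
straightforward sketch: keep the head, delete everything equal to it from
the tail, and recur. MY-DELETE and MY-DELETE-DUPLICATES are hypothetical
names; the sketch always builds a fresh list and assumes a proper list.

```scheme
;; Sketch only: remove every element Y of LIS for which (elt= x Y) holds.
(define (my-delete x lis elt=)
  (cond ((null? lis) '())
        ((elt= x (car lis)) (my-delete x (cdr lis) elt=))
        (else (cons (car lis) (my-delete x (cdr lis) elt=)))))

;; Sketch only: O(n^2) duplicate removal that keeps the leftmost element.
(define (my-delete-duplicates lis . maybe-=)
  (let ((elt= (if (pair? maybe-=) (car maybe-=) equal?)))
    (let recur ((lis lis))
      (if (null? lis)
          '()
          (let ((x (car lis)))
            ;; X precedes every element it is compared with.
            (cons x (recur (my-delete x (cdr lis) elt=))))))))

;; (my-delete-duplicates '(a b a c a b c z)) => (a b c z)
```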
** Association lists
====================
An "association list" (or "alist") is a list of pairs. The car of each pair
contains a key value, and the cdr contains the associated data value. They can
be used to construct simple look-up tables in Scheme. Note that association
lists are probably inappropriate for performance-critical use on large data;
in these cases, hash tables or some other alternative should be employed.
assoc key alist [=] -> pair or #f R5RS+
assq key alist -> pair or #f R5RS
assv key alist -> pair or #f R5RS
ALIST must be an association list -- a list of pairs. These procedures
find the first pair in ALIST whose car field is KEY, and return that
pair. If no pair in ALIST has KEY as its car, then #f is returned. ASSQ
uses EQ? to compare KEY with the car fields of the pairs in ALIST, while
ASSV uses EQV? and ASSOC uses EQUAL?.
(define e '((a 1) (b 2) (c 3)))
(assq 'a e) ==> (a 1)
(assq 'b e) ==> (b 2)
(assq 'd e) ==> #f
(assq (list 'a) '(((a)) ((b)) ((c)))) ==> #f
(assoc (list 'a) '(((a)) ((b)) ((c)))) ==> ((a))
(assq 5 '((2 3) (5 7) (11 13))) ==> *unspecified*
(assv 5 '((2 3) (5 7) (11 13))) ==> (5 7)
ASSOC is extended from its R5RS definition to allow the client to pass in
an optional equality procedure = used to compare keys.
The comparison procedure is used to compare the keys of the entries Ei of
ALIST to the KEY parameter in this way:
(= KEY (CAR Ei)) ; alist is (E1 ... En)
That is, the first argument is always KEY, and the second argument is
the key of one of the alist entries. Thus one can reliably find the first entry
of ALIST whose key is greater than five with
(assoc 5 ALIST <)
Note that fully general alist searching may be performed with
the FIND-TAIL and FIND procedures, e.g.
;; Look up the first association in ALIST with an even key:
(find (lambda (a) (even? (car a))) alist)
alist-cons key datum alist -> alist
(lambda (key datum alist) (cons (cons key datum) alist))
Cons a new alist entry mapping KEY -> DATUM onto ALIST.
alist-copy alist -> alist
Make a fresh copy of ALIST. This means copying each pair that
forms an association as well as the spine of the list, i.e.
(lambda (a) (map (lambda (elt) (cons (car elt) (cdr elt))) a))
alist-delete key alist [=] -> alist
alist-delete! key alist [=] -> alist
ALIST-DELETE deletes all associations from ALIST with the given
KEY, using key-comparison procedure =, which defaults to EQUAL?.
The dynamic order in which the various applications of = are made
is not specified.
Return values may share common tails with the ALIST argument.
The alist is not disordered -- elements that appear in the result alist
occur in the same order as they occur in the argument alist.
The comparison procedure is used to compare the element keys Ki of ALIST's
entries to the KEY parameter in this way:
(= KEY Ki)
Thus, one can reliably remove all entries of ALIST whose key is greater
than five with
(alist-delete 5 alist <)
ALIST-DELETE! is the linear-update variant of ALIST-DELETE. It
is allowed, but not required, to alter the cons cells from the ALIST
parameter to construct the result.
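A non-destructive ALIST-DELETE can be sketched as a filter over the entries
that compares KEY against each entry's car. MY-ALIST-DELETE is a
hypothetical name; the sketch always copies the spine and assumes a proper
list of pairs.

```scheme
;; Sketch only: drop every entry whose key Ki satisfies (key= KEY Ki).
(define (my-alist-delete key alist . maybe-=)
  (let ((key= (if (pair? maybe-=) (car maybe-=) equal?)))
    (let recur ((alist alist))
      (cond ((null? alist) '())
            ((key= key (car (car alist))) (recur (cdr alist)))
            (else (cons (car alist) (recur (cdr alist))))))))

;; Remove all entries whose key is greater than five:
;; (my-alist-delete 5 '((3 . a) (7 . b) (5 . c) (9 . d)) <)
;;   => ((3 . a) (5 . c))
```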
** Set operations on lists
==========================
These procedures implement operations on sets represented as lists of
elements. They all take an = argument used to compare elements of
lists. This equality procedure is required to be consistent with
EQ?. That is, it must be the case that
(eq? x y) => (= x y).
Note that this implies, in turn, that two lists that are EQ? are
also set-equal by any legal comparison procedure. This allows for
constant-time determination of set operations on EQ? lists.
Be aware that these procedures typically run in time O(n * m) for N-
and M-element list arguments. Performance-critical applications
operating upon large sets will probably wish to use other data
structures and algorithms.
lset<= = list1 ... -> boolean
Returns true iff every LISTi is a subset of LISTi+1, using = for the
element-equality procedure. List A is a subset of list B if every
element in A is equal to some element of B. When performing an
element comparison, the = procedure's first argument is an element
of A; its second, an element of B.
(lset<= eq? '(a) '(a b a) '(a b c c)) => #t
(lset<= eq?) => #t ; Trivial cases
(lset<= eq? '(a)) => #t
lset= = list1 ... -> boolean
Returns true iff every LISTi is set-equal to LISTi+1, using = for
the element-equality procedure. "Set-equal" simply means that
LISTi is a subset of LISTi+1, and LISTi+1 is a subset of LISTi.
(lset= eq? '(b e a) '(a e b) '(e e b a)) => #t
(lset= eq?) => #t ; Trivial cases
(lset= eq? '(a)) => #t
lset-adjoin = list elt1 ... -> list
Adds the ELTi elements not already in the list parameter to the
result list. The result shares a common tail with the list parameter.
The new elements are added to the front of the list, but no guarantees
are made about their order. The = parameter is an equality procedure
used to determine if an ELTi is already a member of LIST. Its first
argument is an element of LIST; its second is one of the ELTi.
The list parameter is always a suffix of the result -- even if the list
parameter contains repeated elements, these are not reduced.
(lset-adjoin eq? '(a b c d c e) 'a 'e 'i 'o 'u) => (u o i a b c d c e)
lset-union = list1 ... -> list
Returns the union of the lists, using = for the element-equality
procedure.
The union of lists A and B is constructed as follows:
- If A is the empty list, the answer is B (or a copy of B).
- Otherwise, the result is initialised to be list A (or a copy of A).
- Proceed through the elements of list B in a left-to-right order.
If b is such an element of B, compare every element r of the current
result list to b: (= r b). If all comparisons fail, b is consed
onto the front of the result.
However, there is no guarantee that = will be applied to every pair
of arguments from A and B. In particular, if A is EQ? to B, the operation
may immediately terminate.
In the n-ary case, the two-argument list-union operation is simply
folded across the argument lists.
(lset-union eq? '(a b c d e) '(a e i o u)) => (u o i a b c d e)
;; Repeated elements in LIST1 are preserved.
(lset-union eq? '(a a c) '(x a x)) => (x a a c)
(lset-union eq?) => () ; Trivial cases
(lset-union eq? '(a b c)) => (a b c)
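The two-list construction described above can be transcribed into a sketch.
ANY-MATCH? and MY-LSET-UNION2 are hypothetical names; the sketch handles
exactly two proper lists and omits the EQ? shortcut.

```scheme
;; Sketch only: true if (elt= r x) holds for some element r of LIS.
(define (any-match? elt= lis x)
  (let lp ((lis lis))
    (cond ((null? lis) #f)
          ((elt= (car lis) x) #t)
          (else (lp (cdr lis))))))

;; Sketch only: two-list union following the steps in the text.
(define (my-lset-union2 elt= a b)
  (if (null? a)
      b                                   ; empty A: the answer is B
      (let lp ((b b) (result a))          ; result starts out as A
        (cond ((null? b) result)
              ((any-match? elt= result (car b))
               (lp (cdr b) result))       ; already present: skip it
              (else
               (lp (cdr b) (cons (car b) result)))))))

;; (my-lset-union2 eq? '(a b c d e) '(a e i o u)) => (u o i a b c d e)
;; (my-lset-union2 eq? '(a a c) '(x a x))         => (x a a c)
```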
lset-intersection = list1 list2 ... -> list
Returns the intersection of the lists, using = for the element-equality
procedure.
The intersection of lists A and B is comprised of every element of A
that is = to some element of B: (= a b), for a in A, and b in B.
Note this implies that an element which appears in B and multiple times
in list A will also appear multiple times in the result.
The order in which elements appear in the result is the same as
they appear in LIST1 -- that is, LSET-INTERSECTION essentially
filters LIST1, without disarranging element order. The result may
share a common tail with LIST1.
In the n-ary case, the two-argument list-intersection operation is simply
folded across the argument lists. However, the dynamic order in which the
applications of = are made is not specified. The procedure may check an
element of LIST1 for membership in every other list before proceeding to
consider the next element of LIST1, or it may completely intersect LIST1
and LIST2 before proceeding to LIST3, or it may go about its work in some
third order.
(lset-intersection eq? '(a b c d e) '(a e i o u)) => (a e)
;; Repeated elements in LIST1 are preserved.
(lset-intersection eq? '(a x y a) '(x a x z)) => (a x a)
(lset-intersection eq? '(a b c)) => (a b c) ; Trivial case
lset-difference = list1 list2 ... -> list
Returns the difference of the lists, using = for the element-equality
procedure -- all the elements of LIST1 that are not = to any element from
one of the other LISTi parameters.
The = procedure's first argument is always an element of LIST1; its second
is an element of one of the other LISTi. Elements that are repeated
multiple times in the LIST1 parameter will occur multiple times in the
result.
The order in which elements appear in the result is the same as
they appear in LIST1 -- that is, LSET-DIFFERENCE essentially
filters LIST1, without disarranging element order. The result may
share a common tail with LIST1.
The dynamic order in which the applications of = are made is not
specified. The procedure may check an element of LIST1 for membership in
every other list before proceeding to consider the next element of LIST1,
or it may completely compute the difference of LIST1 and LIST2 before
proceeding to LIST3, or it may go about its work in some third order.
(lset-difference eq? '(a b c d e) '(a e i o u)) => (b c d)
(lset-difference eq? '(a b c)) => (a b c) ; Trivial case
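The same filtering idea can be sketched for the difference. This is an
illustrative R5RS version only; MY-LSET-DIFFERENCE and its internal helper
IN-SOME-LIST? are hypothetical names, and the order of = applications shown
here is just one of the orders the text permits.

    (define (my-lset-difference = lis1 . lists)
      ;; True if X is = to an element of any list in LISTS.
      ;; = is applied as (= x elt), with x taken from LIS1.
      (define (in-some-list? x)
        (let outer ((ls lists))
          (and (pair? ls)
               (or (let inner ((l (car ls)))
                     (and (pair? l)
                          (or (= x (car l)) (inner (cdr l)))))
                   (outer (cdr ls))))))
      ;; Walk LIS1, dropping every element found in another list.
      (let recur ((l lis1))
        (cond ((null? l) '())
              ((in-some-list? (car l)) (recur (cdr l)))
              (else (cons (car l) (recur (cdr l)))))))

With this definition:

    (my-lset-difference eq? '(a b c d e) '(a e i o u)) => (b c d)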
lset-xor = list1 ... -> list
Returns the exclusive-or of the sets, using = for the element-equality
procedure. If there are exactly two lists, this is all the elements
that appear in exactly one of the two lists. The operation is associative,
and thus extends to the n-ary case -- the elements that appear in an
odd number of the lists. The result may share a common tail with any of
the LISTi parameters.
More precisely, for two lists A and B, A xor B is a list of
- every element a of A such that there is no element b of B
such that (= a b)
- every element b of B such that there is no element a of A
such that (= b a)
However, an implementation is allowed to assume that = is
symmetric -- that is, that
(= a b) => (= b a).
This means, for example, that if a comparison (= a b) produces
true for some a in A and b in B, both a and b may be removed from
inclusion in the result.
In the n-ary case, the binary-xor operation is simply folded across
the lists.
(lset-xor eq? '(a b c d e) '(a e i o u)) => (d c b i o u)
;; Trivial cases.
(lset-xor eq?) => ()
(lset-xor eq? '(a b c d e)) => (a b c d e)
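The two-list definition and the n-ary fold can be sketched as follows, again
in plain R5RS. MY-LSET-XOR and its internal helpers are hypothetical names
used only for this illustration. The sketch does not exploit the symmetry
assumption, so it performs more comparisons than an implementation needs to.

    (define (my-lset-xor = . lists)
      ;; True if X is = to some element of LIS; calls (= x elt).
      (define (matches-some? x lis)
        (and (pair? lis)
             (or (= x (car lis))
                 (matches-some? x (cdr lis)))))
      ;; Elements of FROM that match no element of OTHER.
      (define (unmatched from other)
        (cond ((null? from) '())
              ((matches-some? (car from) other)
               (unmatched (cdr from) other))
              (else (cons (car from)
                          (unmatched (cdr from) other)))))
      ;; Binary xor: what is only in A, followed by what is only in B.
      (define (xor2 a b)
        (append (unmatched a b) (unmatched b a)))
      ;; Fold XOR2 across the argument lists, starting from ().
      (let lp ((acc '()) (ls lists))
        (if (null? ls)
            acc
            (lp (xor2 acc (car ls)) (cdr ls)))))

With this definition:

    (my-lset-xor eq? '(a b c d e) '(a e i o u)) => (b c d i o u)
    (my-lset-xor eq? '(a b) '(b c) '(c d))      => (a d)

The first result holds the same elements as the example above but in a
different order; the text here does not fix an element order for LSET-XOR.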
lset-diff+intersection = list1 list2 ... -> [list list]
Returns two values -- the difference and the intersection of the lists.
It is equivalent to
(values (lset-difference = list1 list2 ...)
        (lset-intersection = list1
                           (lset-union = list2 ...)))
but can be implemented more efficiently.
The = procedure's first argument is an element of LIST1; its second is
an element of one of the other LISTi.
Either of the answer lists may share a common tail with LIST1.
This operation essentially partitions LIST1.
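One way to see the possible efficiency gain is a single pass over LIST1 that
sorts each element into one of the two answers. The sketch below assumes
only R5RS (VALUES and CALL-WITH-VALUES); MY-LSET-DIFF+INTERSECTION and
IN-SOME-LIST? are hypothetical names, not part of this library.

    (define (my-lset-diff+intersection = lis1 . lists)
      ;; True if X is = to an element of any of the other lists.
      (define (in-some-list? x)
        (let outer ((ls lists))
          (and (pair? ls)
               (or (let inner ((l (car ls)))
                     (and (pair? l)
                          (or (= x (car l)) (inner (cdr l)))))
                   (outer (cdr ls))))))
      ;; Partition LIS1 in one traversal, testing each element once.
      (let recur ((l lis1))
        (if (null? l)
            (values '() '())
            (call-with-values
              (lambda () (recur (cdr l)))
              (lambda (diff int)
                (if (in-some-list? (car l))
                    (values diff (cons (car l) int))
                    (values (cons (car l) diff) int)))))))

Collecting both return values into a list for display:

    (call-with-values
      (lambda ()
        (my-lset-diff+intersection eq? '(a b c d e) '(a e i o u)))
      list)
    => ((b c d) (a e))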
lset-union! = list1 ... -> list
lset-intersection! = list1 list2 ... -> list
lset-difference! = list1 list2 ... -> list
lset-xor! = list1 ... -> list
lset-diff+intersection! = list1 list2 ... -> [list list]
These are linear-update variants. They are allowed, but not required,
to use the cons cells in their first list parameter to construct their
answer. LSET-UNION! is permitted to recycle cons cells from *any* of its
list arguments.
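To make the recycling of cons cells concrete, here is a hedged two-list
sketch of a destructive difference. MY-LSET-DIFFERENCE! and DROP? are
hypothetical names; a real implementation is free to behave differently,
including not altering its argument at all.

    (define (my-lset-difference! = lis1 lis2)
      ;; True if X is = to some element of LIS2.
      (define (drop? x)
        (let lp ((l lis2))
          (and (pair? l)
               (or (= x (car l)) (lp (cdr l))))))
      ;; Skip leading cells whose elements must go.
      (let skip ((head lis1))
        (cond ((null? head) '())
              ((drop? (car head)) (skip (cdr head)))
              (else
               ;; HEAD is the first kept cell.  Splice unwanted cells
               ;; out of the rest of the list with SET-CDR!.
               (let lp ((prev head))
                 (let ((next (cdr prev)))
                   (cond ((null? next) head)
                         ((drop? (car next))
                          (set-cdr! prev (cdr next))
                          (lp prev))
                         (else (lp next)))))))))

Because the first list may be altered, callers should pass a freshly
allocated list and rely only on the returned value:

    (define lis (list 'a 'b 'c 'd 'e))      ; fresh list, not a constant
    (set! lis (my-lset-difference! eq? lis '(a e i o u)))
    lis => (b c d)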
** Primitive side-effects
=========================
These two procedures are the primitive, R5RS side-effect operations on pairs.
set-car! pair object -> unspecified R5RS
set-cdr! pair object -> unspecified R5RS
These procedures store OBJECT in the car and cdr fields of PAIR,
respectively. The value returned is unspecified.
(define (f) (list 'not-a-constant-list))
(define (g) '(constant-list))
(set-car! (f) 3) ==> *unspecified*
(set-car! (g) 3) ==> *error*
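A further small illustration, using both procedures on a freshly allocated
list; the variable P is introduced only for this example.

    (define p (list 1 2 3))     ; fresh list, safe to mutate
    (set-car! p 'one)           ; replace the first element
    (set-cdr! (cdr p) '())      ; cut the list off after its second cell
    p => (one 2)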
* Acknowledgements
------------------
The design of this library benefited greatly from the feedback provided during
the SRFI discussion phase. Among those contributing thoughtful commentary and
suggestions, both on the mailing list and by private discussion, were Mike
Ashley, Darius Bacon, Alan Bawden, Phil Bewig, Jim Blandy, Dan Bornstein, Per
Bothner, Anthony Carrico, Doug Currie, Kent Dybvig, Sergei Egorov, Doug Evans,
Marc Feeley, Matthias Felleisen, Will Fitzgerald, Matthew Flatt, Dan Friedman,
Lars Thomas Hansen, Brian Harvey, Erik Hilsdale, Wolfgang Hukriede, Richard
Kelsey, Donovan Kolbly, Shriram Krishnamurthi, Dave Mason, Jussi Piitulainen,
David Pokorny, Duncan Smith, Mike Sperber, Maciej Stachowiak, Harvey J. Stein,
John David Stone, and Joerg F. Wittenberger. I am grateful to them for their
assistance.
I am also grateful to the authors, implementors and documentors of all the
systems mentioned in the introduction. Aubrey Jaffer and Kent Pitman should be
noted for their work in producing Web-accessible versions of the R5RS and
Common Lisp specs, which were a tremendous aid.
This is not to imply that these individuals necessarily endorse the final
results, of course.
* References & Links
--------------------
This document, in HTML:
http://srfi.schemers.org/srfi-1/srfi-1.html
ftp://ftp.ai.mit.edu/people/shivers/srfi/srfi-1/srfi-1.html (draft)
This document, in simple text format:
http://srfi.schemers.org/srfi-1/srfi-1.txt
ftp://ftp.ai.mit.edu/people/shivers/srfi/srfi-1/srfi-1.txt (draft)
Source code for the reference implementation:
http://srfi.schemers.org/srfi-1/srfi-1-reference.scm
ftp://ftp.ai.mit.edu/people/shivers/srfi/srfi-1/srfi-1-reference.scm (draft)
Archive of SRFI-1 discussion-list email:
http://srfi.schemers.org/srfi-1/mail-archive/maillist.html
SRFI web site:
http://srfi.schemers.org/
[CLtL2]
Common Lisp: the Language
Guy L. Steele Jr. (editor).
Digital Press, Maynard, Mass., second edition 1990.
Available at http://www.harlequin.com/education/books/HyperSpec/
[R5RS]
Revised^5 Report on the Algorithmic Language Scheme,
R. Kelsey, W. Clinger, J. Rees (editors).
Higher-Order and Symbolic Computation, Vol. 11, No. 1, September, 1998.
and ACM SIGPLAN Notices, Vol. 33, No. 9, October, 1998.
Available at http://www.schemers.org/Documents/Standards/
* Copyright
-----------
Certain portions of this document -- the specific, marked segments of text
describing the R5RS procedures -- were adapted with permission from the R5RS
report.
All other text is copyright (C) Olin Shivers (1998, 1999).
All Rights Reserved.
This document and translations of it may be copied and furnished to others,
and derivative works that comment on or otherwise explain it or assist in its
implementation may be prepared, copied, published and distributed, in whole or
in part, without restriction of any kind, provided that the above copyright
notice and this paragraph are included on all such copies and derivative
works. However, this document itself may not be modified in any way, such as
by removing the copyright notice or references to the Scheme Request For
Implementation process or editors, except as needed for the purpose of
developing SRFIs in which case the procedures for copyrights defined in the
SRFI process must be followed, or as required to translate it into languages
other than English.
The limited permissions granted above are perpetual and will not be revoked by
the authors or their successors or assigns.
This document and the information contained herein is provided on an "AS IS"
basis and THE AUTHORS AND THE SRFI EDITORS DISCLAIM ALL WARRANTIES, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE
INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED WARRANTIES OF
MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.