The SRFI-32 sort libraries					-*- outline -*-
Olin Shivers
First draft: 1998/10/19
Last update: 2002/7/21

[Todo: del-list-neighbor-dups!
       vector-copy -> subvector
       use srfi-23 for reporting errors
       use srfi-16 for n-aries?]

Emacs should display this document in outline mode. Say C-h m for
instructions on how to move through it by sections (e.g., C-c C-n, C-c C-p).

* Table of contents
-------------------
Abstract
Procedure index
Introduction
What's wrong with the current state of affairs?
Design rules
  What vs. how
  Consistency across function signatures
  Less-than parameter first, data parameter after
  Ordering, comparison functions & stability
  All vector operations accept optional subrange parameters
  Required vs. allowed side-effects
Procedure specification
  Procedure naming and functionality
  Types of parameters and return values
  sort-lib - general sorting package
  Algorithm-specific sorting packages
Algorithmic properties
Porting and optimisation
References & Links
Acknowledgements
Copyright


* Abstract
----------
Current Scheme sorting packages are, every one of them, surprisingly bad. I've
designed the API for a full-featured sort toolkit, which I propose as a SRFI.

The spec comes with 1200 lines of high-quality reference code: tightly
written, highly commented, portable code, available for free. Implementors
want this code. It's better than what you have.

-------------------------------------------------------------------------------
* Procedure index
-----------------
list-sorted?			vector-sorted?

list-merge			vector-merge
list-sort			vector-sort
list-stable-sort		vector-stable-sort
list-delete-neighbor-dups	vector-delete-neighbor-dups

list-merge!			vector-merge!
list-sort!			vector-sort!
list-stable-sort!		vector-stable-sort!
list-delete-neighbor-dups!	vector-delete-neighbor-dups!

quick-sort    heap-sort   insert-sort   list-merge-sort   vector-merge-sort
quick-sort!   heap-sort!  insert-sort!  list-merge-sort!  vector-merge-sort!
quick-sort3!             

vector-binary-search
vector-binary-search3

-------------------------------------------------------------------------------
* Introduction
--------------
As I'll detail below, I wasn't very happy with the state of the Scheme
world for sorting and merging lists and vectors. So I have designed and 
written a fairly comprehensive sorting & merging toolkit. It is

    - very portable,

    - much better code than what is currently in Elk, Gambit, Bigloo, 
      Scheme->C, MzScheme, RScheme, Scheme48, MIT Scheme, or slib, and 

    - priced to move: free code.

The package includes
    - Vector insert sort (stable)
    - Vector heap sort
    - Vector quick sort (with median-of-3 pivot picking)
    - Vector merge sort (stable)
    - Pure and destructive list merge sort (stable)
    - Stable vector and list merge
    - Miscellaneous sort-related procedures: Vector and list merging, 
      sorted? predicates, vector binary search, vector and list 
      delete-equal-neighbor procedures.
    - A general, non-algorithmic set of procedure names for sorting and
      merging.

Scheme programmers may want to adopt this package. I'd like Scheme
implementors to adopt this code and its API -- in fact, the code is a bribe to
make it easy for implementors to converge on the suggested API. I mean, you'd
really have to be a boor to take this free code I wrote and mutate its
interface over to your incompatible, unportable API, wouldn't you? But you
could, of course -- it's freely available. More in the spirit of the offering,
you could make this API available, and then also write a little module
providing your old interface that is defined in terms of this API. "Scheme
implementors," in this context, includes slib, which is not a standalone
implementation of Scheme, but rather an influential collection of API's and
code.

The code is tightly bummed. It is clearly written, and commented in my usual
voluminous style. This includes notes on porting and implementation-specific
optimisations.


-------------------------------------------------------------------------------
* What's wrong with the current state of affairs?
-------------------------------------------------

It's just amazing to me that in 2002, sorting and merging hasn't been
completely put to bed. These are well-understood algorithms, each of them well
under a page of code. The straightforward algorithms are basic, core stuff --
sophomore-level. But if you tour the major Scheme implementations out there on
the Net, you find badly written code that provides extremely spotty coverage
of the algorithm space. One system even has a buggy sort that has been in
use for about 20 years. Another has an O(n^2) algorithm...
implemented in C for speed.

Open source code is a wonderful thing. In a couple of hours, I was able to
download and check the sources of 9 Scheme systems. Here are my notes from the
systems I checked. You can skip to the next section if you aren't morbidly
curious.

slib
  sorted? vector-or-list <
  merge  list1 list2 <
  merge! list1 list2 <
  sort  vector-or-list <
  sort! vector-or-list <

  Richard O'Keefe's stable list merge sort is the right idea, but implemented
  using gratuitous variable side effects. It also does redundant SET-CDR!s.
  The vector sort converts to list, merge sorts, then reconverts
  to vector. This is a bad idea -- non-local pointer chasing bad; vector 
  shuffling good. If you must allocate temp storage, might as well allocate
  a temp vector and use vector merge sort.

MIT Scheme
  sort!       vector <
  merge-sort! vector <
  quick-sort! vector <

  sort       vector-or-list <
  merge-sort vector-or-list <
  quick-sort vector-or-list <

  Naive vector quicksort: loser, for worst-case performance reasons.
  List sort by "list->vector; quicksort; vector->list," hence also loser.
  A clever stable vector merge sort, albeit not very bummed.

Scheme 48 & T
  sort-list  list <
  sort-list! list <
  list-merge! list1 list2 <

  Bob Nix's implementation of online merge-sort, written in the early 80's. 
  Conses extra bookkeeping structure that a proper recursive formulation
  doesn't need, and does redundant SET-CDR!s. No vector sort. Worse, it has
  a bug -- it is claimed to be a stable sort, but isn't! To see this, get
  the S48 code, and try
    (define (my< x y) (< (abs x) (abs y)))
    (list-merge! (list 0 2) (list   -2) my<)		; -> (0  2 -2)
    (list-merge! (list   2) (list 0 -2) my<)		; -> (0 -2  2)
  This could be fixed very easily, but it isn't worth it given the 
  other problems with the algorithm.

RScheme
  vector-sort! vector <
  sort collection <

  Good basic implementation of vector heapsort, which has O(n lg n)
  worst-case time. Code ugly, needs tuning. List sort by "list->vector; 
  sort; vector->list." Nothing for stable sorting.

MzScheme
  quicksort lis <
  mergesort alox <

  Sorts lists with (list->vector; quicksort; vector->list) -- but the core
  quicksort is not available for vector sorting. Nothing for stable sorting.
  Quicksort picks its pivot naively, inducing O(n^2) worst-case behaviour on a
  fairly common case: an already-sorted list.

Bigloo, STK
  sort vector-or-list <
  Uses an O(n^2) algorithm... implemented in C for speed. Hmm.
  (See runtime/Ieee/vector.scm and runtime/Clib/cvector.c)

Gambit
  sort-list list <
  Nothing for vectors. Simple, slow, unstable merge sort for lists.

Elk
  Another naive quicksort. Lists handled by converting to vector.
  sort  vector-or-list <
  sort! vector-or-list <

Chez Scheme
  merge  < list1 list2
  merge! < list1 list2
  sort  < list
  sort! < list

  These are stable. I have not seen the source code.

Common Lisp
  sort        sequence < [key]
  stable-sort sequence < [key]
  merge result-type sequence1 sequence2 < [key]

  The sort procedures are allowed, but not required, to be destructive.

SML/NJ
  sort: ('a*'a -> bool) -> 'a list -> 'a list
  "Smooth applicative merge sort," which is stable.
  There is also a highly bummed quicksort for vectors.

The right solution: Implement a full toolbox of carefully written standard sort
routines.

Having the source of all the above-cited Schemes available for study made
writing this code a lot easier. I appreciate the authors making their
source available under such open terms.


-------------------------------------------------------------------------------
* Design rules
--------------

** What vs. how
===============
There are two different interfaces: "what" (simple) & "how" (detailed).

    - Simple: you specify semantics: datatype (list or vector), 
      mutability, and stability.

    - Detailed: you specify the actual algorithm (quick, heap,
      insert, merge). Different algorithms have different properties,
      both semantic & pragmatic, so these exports are necessary.

      It is necessarily the case that the specifications of these procedures
      make statements about execution "pragmatics." For example, the sole
      distinction between heap sort and quick sort -- both of which are
      provided by this library -- is one of execution time, which is not a
      "semantic" distinction. Similar resource-use statements are made about
      "iterative" procedures, meaning that they can execute on input of
      arbitrary size in a constant number of stack frames.

** Consistency across function signatures
=========================================
The two interfaces share common function signatures wherever
possible, to facilitate switching a given call from one procedure
to another.
	
** Less-than parameter first, data parameter after
==================================================
These procedures uniformly observe the following parameter order:
the data to be sorted comes after the comparison function.
That is, we write
  (sort < lis)
not
  (sort lis <).

With the sole exception of Chez Scheme, this is the exact opposite of
every sort function out there in current use in the Scheme world. (See
the summary of related APIs above.) However, it is consistent with common
practice across Scheme libraries in general to put the ordering function
first -- the "operation currying" convention. (E.g., consider FOR-EACH or
MAP or FIND.) 

The original draft of this SRFI used the data-first/comparison-last convention
for backwards compatibility -- a decision I made with internal misgivings.
Happily, however, the overwhelming response from the discussion phase
supported "cleaning up" this issue and re-converging the parameter order with
the general Scheme "op currying" convention. So the original decision was
inverted in favor of the comparison-first/data-last convention.

** Ordering, comparison functions & stability
=============================================
These routines take a < comparison function, not a <= comparison
function, and they sort into increasing order. The difference between
a < spec and a <= spec comes up in three places: 
  - the definition of an ordered or sorted data set,
  - the definition of a stable sorting algorithm, and
  - correctness of quicksort.

+ We say that a data set (a list or vector) is *sorted* or *ordered*
  if it contains no adjacent pair of values ... X Y ... such that Y < X. 

  In other words, scanning across the data never takes a "downwards" step.

  If you use a <= procedure where these algorithms expect a <
  procedure, you may not get the answers you expect. For example,
  the LIST-SORTED? function will return false if you pass it a <= comparison
  function and an ordered list containing adjacent equal elements.
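
  For example (an illustration, not part of the spec):
    (list-sorted? <  '(1 2 2 3))   ; => #t
    (list-sorted? <= '(1 2 2 3))   ; => #f -- the adjacent 2 2 pair trips the test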

+ A "stable" sort is one that preserves the pre-existing order of equal
  elements. Suppose, for example, that we sort a list of numbers by 
  comparing their absolute values, i.e., using comparison function
    (lambda (x y) (< (abs x) (abs y)))
  If we sort a list that contains both 3 and -3:
    ... 3 ... -3 ...
  then a stable sort is an algorithm that will not swap the order
  of these two elements; that is, the answer is guaranteed to look like
    ... 3 -3 ...
  not
    ... -3 3 ...

  Choosing < for the comparison function instead of <= affects how stability
  is coded. Given an adjacent pair X Y, (< y x) means "Y should be moved in
  front of X" -- otherwise, leave things as they are. So using a <= function
  where a < function is expected will *invert* stability.

  This is due to the definition of equality, given a < comparator:
    (and (not (< x y))
         (not (< y x)))
  The definition is rather different, given a <= comparator:
    (and (<= x y)
         (<= y x))
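
  For illustration (not part of the spec), a stable sort under the
  absolute-value comparison above keeps each equal-comparing pair in its
  original order; passing <= instead of < would be expected to reverse
  those pairs, per the preceding paragraph:
    (list-stable-sort (lambda (x y) (< (abs x) (abs y)))
                      '(3 1 -3 -1))
      ; => (1 -1 3 -3)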

+ A "stable" merge is one that reliably favors one of its data sets
  when equal items appear in both data sets. *All merge operations in
  this library are stable*, breaking ties between data sets in favor
  of the first data set -- elements of the first list come before equal 
  elements in the second list.

  So, if we are merging two lists of numbers ordered by absolute value,
  the stable merge operation LIST-MERGE
    (list-merge (lambda (x y) (< (abs x) (abs y)))
                '(0 -2 4 8 -10) '(-1 3 -4 7))
  reliably places the 4 of the first list before the equal-comparing -4
  of the second list:
    (0 -1 -2 3 4 -4 7 8 -10)

+ Some sort algorithms will *not work correctly* if given a <= when they
  expect a < comparison (or vice-versa). For example, violating quicksort's
  spec may cause it to produce wrong answers, diverge, raise an error, or do
  some fourth thing. To see why, consider the left-scan part of the standard
  quicksort partition step:
	    (let ((i (let scan ((i i)) (if (elt< (vector-ref v i) pivot)
					   (scan (+ i 1))
					   i))))
	      ...)
  Consider applying this loop to a vector of all zeroes (hence, PIVOT, as
  well, is zero), but erroneously using <= for the ELT< function. The loop
  will scan right off the end of the vector, producing a vector-index error.
  The guarantee that the scan loop will terminate before running off the end
  of the vector depends critically upon ELT< performing as a true, irreflexive
  < relation. Running off the end of the vector is only one of a variety of 
  possible ways to lose -- other, variant implementations of quicksort can,
  instead, loop forever on some data sets if ELT< is a <= predicate.

In short, if your comparison function F answers true to (F x x), then 
  - using a stable sorting or merging algorithm will not give you a 
    stable sort or merge, 
  - LIST-SORTED? may surprise you, and 
  - quicksort may fail in a variety of possible ways.
Note that you can synthesize a < function from a <= function with
    (lambda (x y) (not (<= y x)))
if need be. 

Precise definitions give sharp edges to tools, but require care in use. 
"Measure twice, cut once."

I have adopted the choice of < from Common Lisp. One would assume the definers
of Common Lisp had a good reason for adopting < instead of <=, but canvassing
several of the principal actors in the definition process has turned up no
better reason than "an arbitrary but consistent choice." At minimum, then,
this SRFI extends the coverage of that consistent choice.

** All vector operations accept optional subrange parameters
============================================================
The vector operations specified below all take optional START/END arguments
indicating a selected subrange of a vector's elements. If a START parameter or
START/END parameter pair is given to such a procedure, they must be exact,
non-negative integers, such that
    0 <= START <= END <= (VECTOR-LENGTH V)
where V is the related vector parameter. If not specified, they default to 0
and the length of the vector, respectively. They are interpreted to select the
range [START,END), that is, all elements from index START (inclusive) up to,
but not including, index END.
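
For example (illustrative only), sorting just the middle of a vector in place:
    (define v (vector 9 8 7 6 5 4))
    (vector-sort! < v 1 4)         ; sorts the range v[1,4)
    v                              ; => #(9 6 7 8 5 4)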

** Required vs. allowed side-effects
====================================
LIST-SORT! and LIST-STABLE-SORT! are allowed, but not required,
to alter their arguments' cons cells to construct the result list. This is
consistent with the what-not-how character of the group of procedures
to which they belong (the "sort-lib" package).

The LIST-DELETE-NEIGHBOR-DUPS!, LIST-MERGE! and LIST-MERGE-SORT! procedures,
on the other hand, provide specific algorithms, and, as such, explicitly
commit to the use of side-effects on their input lists in order to guarantee
their key algorithmic properties (e.g., linear-time operation, constant-space
stack use).

-------------------------------------------------------------------------------
* Procedure specification
-------------------------
The procedures are split into several packages. In a Scheme system that has a
module or package system, these procedures should be contained in modules
named as follows:
   Package name			Functionality
   ------------			-------------
   sort-lib			General sorting for lists & vectors
   sorted?-lib			Sorted predicates for lists & vectors
   list-merge-sort-lib		List merge sort
   vector-merge-sort-lib	Vector merge sort
   vector-heap-sort-lib		Vector heap sort
   vector-quick-sort-lib	Vector quick sort
   vector-insert-sort-lib	Vector insertion sort
   delndup-lib			List and vector delete neighbor duplicates
   binsearch-lib		Vector binary search

A Scheme system without a module system should provide all of the bindings
defined in all of these modules as components of the "SRFI-32" package.

Note that there is no "list insert sort" package, as you might as well always
use list merge sort. The reference implementation's destructive list merge
sort will do fewer SET-CDR!s than a destructive insert sort.

** Procedure naming and functionality
=====================================
Almost all of the procedures described below are variants of two basic
operations: sorting and merging. These procedures are consistently named
by composing a set of basic lexemes to indicate what they do.

Lexeme	  Meaning
------	  -------
"sort"	  The procedure sorts its input data set by some < comparison function.

"merge"	  The procedure merges two ordered data sets into a single ordered
	  result.

"stable"  This lexeme indicates that the sort is a stable one.

"vector"  The procedure operates upon vectors.

"list"	  The procedure operates upon lists.

"!"	  Procedures that end in "!" are allowed, and sometimes required, 
	  to reuse their input storage to construct their answer.

** Types of parameters and return values
========================================
In the procedures specified below,
  - A LIS parameter is a list;
      
  - A V parameter is a vector;
      
  - A < or = parameter is a procedure accepting two arguments taken from the
    specified procedure's data set(s), and returning a boolean;
      
  - START and END parameters are exact, non-negative integers that 
    serve as vector indices selecting a subrange of some associated vector.
    When specified, they must satisfy the relation
        0 <= start <= end <= (vector-length v)
    where V is the associated vector.

Passing values to procedures with these parameters that do not satisfy these
types is an error.

If a procedure is said to return "unspecified," this means that nothing at all
is said about what the procedure returns, not even the number of return
values. Such a procedure is not even required to be consistent from call to
call in the nature or number of its return values. It is simply required to
return a value (or values) that may be passed to a command continuation, e.g.
as the value of an expression appearing as a non-terminal subform of a BEGIN
expression. Note that in R5RS, this restricts such a procedure to returning a
single value; non-R5RS systems may not even provide this restriction.

** sort-lib - general sorting package
=====================================
This library provides basic sorting and merging functionality suitable for
general programming. The procedures are named by their semantic properties,
i.e., what they do to the data (sort, stable sort, merge, and so forth).

    Procedure						Suggested algorithm
    -------------------------------------------------------------------------
    list-sorted? < lis -> boolean
    list-merge  < lis1 lis2 -> list
    list-merge! < lis1 lis2 -> list
    list-sort  < lis -> list				(vector heap or quick)
    list-sort! < lis -> list				(list merge sort)
    list-stable-sort  < lis -> list			(vector merge sort)
    list-stable-sort! < lis -> list			(list merge sort)
    list-delete-neighbor-dups  = lis -> list
    list-delete-neighbor-dups! = lis -> list

    vector-sorted? < v [start end]	-> boolean
    vector-merge  < v1 v2 [start1 end1 start2 end2] -> vector
    vector-merge! < v v1 v2 [start start1 end1 start2 end2] -> unspecified
    vector-sort  < v [start end] 	-> vector	(heap or quick sort)
    vector-sort! < v [start end]	-> unspecified	(heap or quick sort)
    vector-stable-sort  < v [start end] -> vector	(vector merge sort)
    vector-stable-sort! < v [start end] -> unspecified	(vector merge sort)
    vector-delete-neighbor-dups  = v [start end] -> vector
    vector-delete-neighbor-dups! = v [start end] -> end'

    LIST-SORTED? and VECTOR-SORTED? return true if their input list or vector
    is in sorted order, as determined by their < comparison parameter.

    All four merge operations are stable: an element of the initial list LIS1
    or vector V1 will come before an equal-comparing element in the second
    list LIS2 or vector V2 in the result.

    The procedures
      LIST-MERGE
      LIST-SORT
      LIST-STABLE-SORT
      LIST-DELETE-NEIGHBOR-DUPS
    do not alter their inputs and are allowed to return a value that shares 
    a common tail with a list argument.

    The procedures
      LIST-SORT!
      LIST-STABLE-SORT!
    are "linear update" operators -- they are allowed, but not required, to
    alter the cons cells of their arguments to produce their results. 
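
    The safe way to call a linear-update procedure is to use only its
    returned value and to make no further use of the argument, e.g.
    (assuming LIS is a variable holding a list):
        (set! lis (list-sort! < lis))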

    On the other hand, the procedures
      LIST-DELETE-NEIGHBOR-DUPS!
      LIST-MERGE!
    make only a single, iterative, linear-time pass over their argument lists,
    using SET-CDR!s to rearrange the cells of the lists into the final result
    -- they work "in place." Hence, any cons cell appearing in the result must
    have originally appeared in an input. The intent of this
    iterative-algorithm commitment is to allow the programmer to be sure that
    if, for example, LIST-MERGE! is asked to merge two ten-million-element
    lists, the operation will complete without performing some extremely
    (possibly twenty-million) deep recursion.
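
    For illustration (not part of the spec) -- the result below is built
    entirely from the cells of the two argument lists, which should not be
    used again afterwards:
        (list-merge! < (list 1 3 5) (list 2 4 6))   ; => (1 2 3 4 5 6)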

    The vector procedures
      VECTOR-SORT
      VECTOR-STABLE-SORT
      VECTOR-DELETE-NEIGHBOR-DUPS
    do not alter their inputs, but allocate a fresh vector for their result,
    of length END - START. 

    The vector procedures
      VECTOR-SORT!
      VECTOR-STABLE-SORT!
    sort their data in-place. (But note that VECTOR-STABLE-SORT! may 
    allocate temporary storage proportional to the size of the input --
    I am not aware of O(n lg n) stable vector-sorting algorithms that
    run in constant space.)

    VECTOR-MERGE returns a vector of length (END1-START1)+(END2-START2).
    
    VECTOR-MERGE! writes its result into vector V, beginning at index START,
    for indices less than END = START + (END1-START1) + (END2-START2). The 
    target subvector 
      V[start,end)
    may not overlap either source subvector 
      V1[start1,end1)
      V2[start2,end2).
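
    For illustration (not part of the spec):
        (vector-merge < '#(1 4 7) '#(2 3 9))        ; => #(1 2 3 4 7 9)

        ;; Merge into a target vector, starting at index 1:
        (let ((v (make-vector 8 #f)))
          (vector-merge! < v '#(1 4 7) '#(2 3 9) 1)
          v)                                        ; => #(#f 1 2 3 4 7 9 #f)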

    The ...-DELETE-NEIGHBOR-DUPS-... procedures:
    These procedures delete adjacent duplicate elements from a list or a
    vector, using a given element-equality procedure. The first/leftmost
    element of a run of equal elements is the one that survives. The list or
    vector is not otherwise disordered.
    
    These procedures are linear time -- much faster than the O(n^2) general
    duplicate-element deletors that do not assume any "bunching" of elements
    (such as the ones provided by SRFI-1). If you want to delete duplicate
    elements from a large list or vector, you can sort the elements to bring
    equal items together, then use one of these procedures, for a total time
    of O(n lg n).
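
    For example, a sketch (not part of the spec) of deleting *all* duplicates
    from a list of numbers in O(n lg n) total time:
        (list-delete-neighbor-dups = (list-sort < '(7 1 7 3 1 7)))
          ; => (1 3 7)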
    
    The comparison function = passed to these procedures is always applied
      (= x y)
    where X comes before Y in the containing list or vector.

    - LIST-DELETE-NEIGHBOR-DUPS does not alter its input list; its answer
      may share storage with the input list.

    - VECTOR-DELETE-NEIGHBOR-DUPS does not alter its input vector, but
      rather allocates a fresh vector to hold the result.

    - LIST-DELETE-NEIGHBOR-DUPS! is permitted, but not required, to
      mutate its input list in order to construct its answer.

    - VECTOR-DELETE-NEIGHBOR-DUPS! reuses its input vector to hold the
      answer, packing it into the index range [start,end'), where END' is
      the non-negative exact integer that it returns as its value. The
      vector is not altered outside the range [start,end').

    - Examples:
	(list-delete-neighbor-dups = '(1 1 2 7 7 7 0 -2 -2))
          => (1 2 7 0 -2)

	(vector-delete-neighbor-dups = '#(1 1 2 7 7 7 0 -2 -2))
          => #(1 2 7 0 -2)

	(vector-delete-neighbor-dups = '#(1 1 2 7 7 7 0 -2 -2) 3)
          => #(7 0 -2)

        ;; Result left in v[3,9):
        (let ((v (vector 0 0 0 1 1 2 2 3 3 4 4 5 5 6 6)))
          (cons (vector-delete-neighbor-dups! = v 3)
                v))
           => (9 . #(0 0 0 1 2 3 4 5 6 4 4 5 5 6 6))


** Algorithm-specific sorting packages
======================================
These packages provide more specific sorting functionality, that is, a
specific commitment to particular algorithms that have particular pragmatic
consequences (such as memory locality or asymptotic running time) beyond
their semantic behaviour (sorting, stable sorting, merging, etc.).
Programmers who need a particular algorithm can use one of these packages.

sorted?-lib - sorted predicates
    list-sorted?   < lis	   -> boolean
    vector-sorted? < v [start end] -> boolean

    Return #f iff there is an adjacent pair ... X Y ... in the input
    list or vector such that Y < X. The optional START/END range 
    arguments restrict VECTOR-SORTED? to the indicated subvector.
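
    For illustration (not part of the spec):
        (vector-sorted? < '#(5 1 2 3 0))       ; => #f
        (vector-sorted? < '#(5 1 2 3 0) 1 4)   ; => #t -- only v[1,4) is examined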

list-merge-sort-lib - list merge sort
    list-merge-sort  < lis	-> list
    list-merge-sort! < lis	-> list
    list-merge  < lis1 lis2	-> list
    list-merge! < lis1 lis2	-> list

    The sort procedures sort their data using a list merge sort, which is
    stable. (The reference implementation is, additionally, a "natural" sort.
    See below for the properties of this algorithm.)

    The ! procedures are destructive -- they use SET-CDR!s to rearrange the
    cells of the lists into the proper order. As such, they do not allocate
    any extra cons cells -- they are "in place" sorts. Additionally,
    LIST-MERGE! is iterative -- it can operate on arguments of arbitrary size
    with a constant number of stack frames.

    The merge operations are stable: an element of LIS1 will come before an
    equal-comparing element in LIS2 in the result list.

vector-merge-sort-lib - vector merge sort
    vector-merge-sort  < v [start end temp]			-> vector
    vector-merge-sort! < v [start end temp]			-> unspecified
    vector-merge  < v1 v2 [start1 end1 start2 end2]		-> vector
    vector-merge! < v v1 v2 [start start1 end1 start2 end2]	-> unspecified

    The sort procedures sort their data using vector merge sort, which is
    stable. (The reference implementation is, additionally, a "natural" sort.
    See below for the properties of this algorithm.)

    The optional START/END arguments provide for sorting of subranges, and
    default to 0 and the length of the corresponding vector.
    
    Merge-sorting a vector requires the allocation of a temporary "scratch"
    work vector for the duration of the sort. This scratch vector can be
    passed in by the client as the optional TEMP argument; if so, the supplied
    vector must be of size >= END, and will not be altered outside the range
    [start,end). If not supplied, the sort routines allocate one themselves.
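
    For illustration (not part of the spec), a client doing several sorts
    might allocate one scratch vector and reuse it for each call; the names
    below are hypothetical:
        (define scratch (make-vector 8))
        (define v1 (vector 6 2 9 4 1))
        (define v2 (vector 3 3 1 7))
        (vector-merge-sort! < v1 0 (vector-length v1) scratch)  ; v1 => #(1 2 4 6 9)
        (vector-merge-sort! < v2 0 (vector-length v2) scratch)  ; v2 => #(1 3 3 7)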

    The merge operations are stable: an element of V1 will come before an
    equal-comparing element in V2 in the result vector.

    VECTOR-MERGE-SORT! leaves its result in V[start,end).

    VECTOR-MERGE-SORT returns a vector of length END-START.
    
    VECTOR-MERGE returns a vector of length (END1-START1)+(END2-START2).
    
    VECTOR-MERGE! writes its result into vector V, beginning at index START,
    for indices less than END = START + (END1-START1) + (END2-START2). The 
    target subvector 
      V[start,end)
    may not overlap either source subvector 
      V1[start1,end1)
      V2[start2,end2).

vector-heap-sort-lib - vector heap sort
    heap-sort  < v [start end] -> vector
    heap-sort! < v [start end] -> unspecified

    These procedures sort their data using heap sort, 
    which is not a stable sorting algorithm.
    
    HEAP-SORT returns a vector of length END-START. 
    HEAP-SORT! is in-place, leaving its result in V[start,end).

vector-quick-sort-lib - vector quick sort
    quick-sort   < v [start end] -> vector
    quick-sort!  < v [start end] -> unspecified
    quick-sort3! c v [start end] -> unspecified

    These procedures sort their data using quick sort, 
    which is not a stable sorting algorithm.
    
    QUICK-SORT returns a vector of length END-START. 
    QUICK-SORT! is in-place, leaving its result in V[start,end).

    QUICK-SORT3! is a variant of quick-sort that takes a three-way
    comparison function C. C compares a pair of elements and returns
    an exact integer whose sign indicates their relationship:
      (c x y) < 0   =>   x<y
      (c x y) = 0   =>   x=y
      (c x y) > 0   =>   x>y
    To help remember the relationship between the sign of the result and
    the relation, use the function - as the model for C: (- x y) < 0
    means that x < y; (- x y) > 0 means that x > y.

    The extra discrimination provided by the three-way comparison can
    provide significant speedups when sorting data sets with many duplicates,
    especially when the comparison function is relatively expensive (e.g.,
    comparing long strings).
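
    For illustration (not part of the spec), a three-way string comparison
    built from the standard STRING<? and STRING=? procedures:
        (define (string-cmp3 x y)
          (cond ((string<? x y) -1)
                ((string=? x y)  0)
                (else            1)))

        (define v (vector "cherry" "apple" "banana" "apple"))
        (quick-sort3! string-cmp3 v)
        v    ; => #("apple" "apple" "banana" "cherry")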

    WARNING: Some sort algorithms, such as insertion sort, merge sort, or
    heap sort, can tolerate being passed a <= comparison function when they
    expect a < function -- insertion and merge sort may simply invert
    stability, and heap sort will run a bit slower, but otherwise produce a
    correct answer.

    Quicksort, however, is much more critically sensitive to the distinction
    between a < and a <= comparison. If QUICK-SORT or QUICK-SORT! expect a <
    comparison function, and are erroneously given a <= function, they may,
    depending on implementation, produce an unsorted result, go into an 
    infinite loop, cause a run-time error, occasionally produce a correct
    result, or do some fifth thing.

    Implementors may wish to write QUICK-SORT3! so that it (a) sanity-checks
    the comparison function (e.g., by checking that (c v[start] v[start])
    returns zero), or (b) is tolerant of an ill-behaved comparison function,
    or (c) both. Clients of this function, however, should not count on this.

vector-insert-sort-lib - vector insertion sort
    insert-sort  < v [start end] -> vector
    insert-sort! < v [start end] -> unspecified
    
    These procedures stably sort their data using insertion sort.
    
    INSERT-SORT returns a vector of length END-START.
    INSERT-SORT! is in-place, leaving its result in V[start,end).

delndup-lib - list and vector delete neighbor duplicates
    list-delete-neighbor-dups  = lis -> list
    list-delete-neighbor-dups! = lis -> list
    
    vector-delete-neighbor-dups  = v [start end] -> vector
    vector-delete-neighbor-dups! = v [start end] -> end'

    These procedures delete adjacent duplicate elements from a list or
    a vector, using a given element-equality procedure =. The first/leftmost
    element of a run of equal elements is the one that survives. The list
    or vector is not otherwise disordered.

    These procedures are linear time -- much faster than the O(n^2) general
    duplicate-element deletors that do not assume any "bunching" of elements
    (such as the ones provided by SRFI-1). If you want to delete duplicate
    elements from a large list or vector, you can sort the elements to bring
    equal items together, then use one of these procedures, for a total time
    of O(n lg n).
    
    The comparison function = passed to these procedures is always applied
      (= x y)
    where X comes before Y in the containing list or vector.

    LIST-DELETE-NEIGHBOR-DUPS does not alter its input list; its answer
    may share storage with the input list.

    VECTOR-DELETE-NEIGHBOR-DUPS does not alter its input vector, but
    rather allocates a fresh vector to hold the result.

    LIST-DELETE-NEIGHBOR-DUPS! is permitted, but not required, to
    mutate its input list in order to construct its answer.

    VECTOR-DELETE-NEIGHBOR-DUPS! reuses its input vector to hold the
    answer, packing it into the index range [start,end'), where END' is
    the non-negative exact integer that it returns as its value. The vector
    is not altered outside the range [start,end').

    Examples:
	(list-delete-neighbor-dups = '(1 1 2 7 7 7 0 -2 -2))
          => (1 2 7 0 -2)

	(vector-delete-neighbor-dups = '#(1 1 2 7 7 7 0 -2 -2))
          => #(1 2 7 0 -2)

	(vector-delete-neighbor-dups = '#(1 1 2 7 7 7 0 -2 -2) 3)
          => #(7 0 -2)

        ;; Result left in v[3,9):
        (let ((v (vector 0 0 0 1 1 2 2 3 3 4 4 5 5 6 6)))
          (cons (vector-delete-neighbor-dups! = v 3)
                v))
           => (9 . #(0 0 0 1 2 3 4 5 6 4 4 5 5 6 6))

binsearch-lib - vector binary search lib
    vector-binary-search  elt< elt->key key v [start end] -> integer-or-false
    vector-binary-search3 c                 v [start end] -> integer-or-false

    VECTOR-BINARY-SEARCH searches vector V in range [START,END) (which
    default to 0 and the length of V, respectively) for an element whose
    associated key is equal to KEY. The procedure ELT->KEY is used to map
    an element to its associated key. The elements of the vector are assumed
    to be ordered by the ELT< relation on these keys. That is, 
      (vector-sorted? (lambda (x y) (elt< (elt->key x) (elt->key y)))
                      v start end) => true
    An element E of V is a match for KEY if it's neither less nor greater
    than the key:
      (and (not (elt< (elt->key e) key))
           (not (elt< key (elt->key e))))
    If there is such an element, the procedure returns its index in the
    vector as an exact integer. If there is no such element in the searched 
    range, the procedure returns false.

    (vector-binary-search < car 4 '#((1 . one) (3 . three)
				     (4 . four) (25 . twenty-five)))
    => 2

    (vector-binary-search < car 7 '#((1 . one) (3 . three)
				     (4 . four) (25 . twenty-five)))
    => #f
    
    VECTOR-BINARY-SEARCH3 is a variant that uses a three-way comparison
    function C. C compares its parameter to the search key, and returns an
    exact integer whose sign indicates its relationship to the search key.
      (c x) < 0   =>   x < search-key
      (c x) = 0   =>   x = search-key
      (c x) > 0   =>   x > search-key

    (vector-binary-search3 (lambda (elt) (- (car elt) 4))
                           '#((1 . one) (3 . three)
			      (4 . four) (25 . twenty-five)))
    => 2
    
    Rationale:
    - Why isn't VECTOR-BINARY-SEARCH's ELT->KEY computation simply absorbed
      into the < function? It is separated out because the < function is
      applied twice inside the binary-search inner loop: once with the search
      key as the first argument and the element key as the second, and once
      with the arguments in the reverse order. This is not necessary for
      VECTOR-BINARY-SEARCH3.
  
    - When a comparison operation is able to produce a three-way
      discrimination, the inner loop of the binary search can trim the number
      of per-iteration comparisons from an average of 1.5 to a guaranteed
      single comparison per iteration. This can be a significant savings when
      searching with an expensive comparison operation (e.g., one that
      uses string compare, sends email, references a database, or queries
      a network service such as a web server).

    - Failure is signaled by false (rather than, say, -1) so that searches
      can be used in conditional forms such as
        (or (vector-binary-search ...) ...)
      or
        (cond ((vector-binary-search ...) => index-consumer)
              ...)

-------------------------------------------------------------------------------
* Algorithmic properties
------------------------
Different sort and merge algorithms have different properties.
Choose the algorithm that matches your needs:

Vector insert sort
    Stable, but only suitable for small vectors -- O(n^2).

Vector quick sort
    Not stable. Is fast on average -- O(n lg n) -- but has bad worst-case
    behaviour. Has good memory locality for big vectors (unlike heap sort). 
    A clever pivot-picking trick (median of three samples) helps avoid 
    worst-case behaviour, but pathological cases can still blow up.

Vector heap sort
    Not stable. Guaranteed fast -- O(n lg n) *worst* case. Poor locality
    on large vectors. A very reliable workhorse.

Vector merge sort
    Stable. Not in-place -- requires a temporary buffer of equal size. 
    Fast -- O(n lg n) -- and has good memory locality for large vectors.
    
    The implementation of vector merge sort provided by this SRFI's reference
    implementation is, additionally, a "natural" sort, meaning that it
    exploits existing order in the input data, providing O(n) best case.

Destructive list merge sort
    Stable, fast and in-place (i.e., allocates no new cons cells). "Fast"
    means O(n lg n) worst-case, and substantially better if the data
    is already mostly ordered, all the way down to linear time for
    a completely-ordered input list (i.e., it is a "natural" sort).

    Note that sorting lists involves chasing pointers through memory, which
    can be a loser on modern machine architectures because of poor cache &
    page locality. Pointer *writing*, which is what the SET-CDR!s of a
    destructive list-sort algorithm do, is even worse, especially if your
    Scheme has a generational GC -- the writes will thrash the write-barrier.
    Sorting vectors has inherently better locality.

    This SRFI's destructive list merge and merge sort implementations are
    opportunistic -- they avoid redundant SET-CDR!s, and try to take long
    already-ordered runs of list structure as-is when doing the merges.

Pure list merge sort
    Stable and fast -- O(n lg n) worst-case, and possibly O(n), depending
    upon the input list (see discussion above).


Algorithm     Stable?  Worst case  Average case  In-place
---------------------------------------------------------
Vector insert Yes      O(n^2)      O(n^2)        Yes
Vector quick  No       O(n^2)      O(n lg n)     Yes
Vector heap   No       O(n lg n)   O(n lg n)     Yes
Vector merge  Yes      O(n lg n)   O(n lg n)     No
List merge    Yes      O(n lg n)   O(n lg n)     Either


-------------------------------------------------------------------------------
* Porting and optimisation
--------------------------
This package should be trivial to port.

This code is tightly bummed, as far as I can go in portable Scheme.

You could speed up the vector code a lot by error-checking the procedure
parameters and then shifting over to fixnum-specific arithmetic and dangerous
vector-indexing and vector-setting primitives. The comments in the code
indicate where the initial error checks would have to be added. There are
several (QUOTIENT N 2)'s that could be changed to a fixnum right-shift, as
well, in both the list and vector code (SRFI 33 provides such an operator).
The code is designed to enable this -- each file usually exports one or two
"safe" procedures that end up calling an internal "dangerous" primitive. The
little exported cover procedures are where you move the error checks.

This should provide *big* speedups. In fact, all the code bumming I've done
pretty much disappears in the noise unless you have a good compiler and also
can dump the vector-index checks and generic arithmetic -- so I've really just
set things up for you to exploit.

The optional-arg parsing, defaulting, and error checking is done with a
portable R4RS macro. But if your Scheme has a faster mechanism (e.g., Chez),
you should definitely port over to it. Note that argument defaulting and
error-checking are interleaved -- you don't have to error-check defaulted
START/END args to see if they are fixnums that are legal vector indices for
the corresponding vector, etc.


-------------------------------------------------------------------------------
* References & Links
--------------------

This document, in HTML:
    http://srfi.schemers.org/srfi-32/srfi-32.html
    [This link may not be valid while the SRFI is in draft form.]

This document, in simple text format:
    http://srfi.schemers.org/srfi-32/srfi-32.txt

Archive of SRFI-32 discussion-list email:
    http://srfi.schemers.org/srfi-32/mail-archive/maillist.html

SRFI web site:
    http://srfi.schemers.org/

[CommonLisp]
    Common Lisp: the Language
    Guy L. Steele Jr. (editor).
    Digital Press, Maynard, Mass., second edition 1990.
    Available at http://www.elwood.com/alu/table/references.htm#cltl2

    The Common Lisp "HyperSpec," produced by Kent Pitman, is essentially
    the ANSI spec for Common Lisp:
    http://www.xanalys.com/software_tools/reference/HyperSpec/

[R5RS]
    Revised^5 Report on the Algorithmic Language Scheme, 
    R. Kelsey, W. Clinger, J. Rees (editors).
    Higher-Order and Symbolic Computation, Vol. 11, No. 1, September, 1998.
    and ACM SIGPLAN Notices, Vol. 33, No. 9, October, 1998.

    Available at http://www.schemers.org/Documents/Standards/


-------------------------------------------------------------------------------
* Acknowledgements
------------------

I thank the authors of the open source I consulted when designing this
library, particularly Richard O'Keefe, Donovan Kolbly, and the MIT Scheme Team.


-------------------------------------------------------------------------------
* Copyright
-----------

** SRFI text
============
This document is copyright (C) Olin Shivers (1998, 1999). 
All Rights Reserved. 

This document and translations of it may be copied and furnished to others,
and derivative works that comment on or otherwise explain it or assist in its
implementation may be prepared, copied, published and distributed, in whole or
in part, without restriction of any kind, provided that the above copyright
notice and this paragraph are included on all such copies and derivative
works. However, this document itself may not be modified in any way, such as
by removing the copyright notice or references to the Scheme Request For
Implementation process or editors, except as needed for the purpose of
developing SRFIs in which case the procedures for copyrights defined in the
SRFI process must be followed, or as required to translate it into languages
other than English.

The limited permissions granted above are perpetual and will not be revoked by
the authors or their successors or assigns.

This document and the information contained herein is provided on an "AS IS"
basis and THE AUTHORS AND THE SRFI EDITORS DISCLAIM ALL WARRANTIES, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE
INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED WARRANTIES OF
MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.

** Reference implementation
===========================
Short summary: no restrictions.

While I wrote all of this code myself, I read a lot of code before I began
writing. However, all such code is, itself, either open source or public
domain, rendering irrelevant any issue of "copyright taint."

The natural merge sorts (pure list, destructive list, and vector) are not only
my own code, but are implementations of an algorithm of my own devising. They
run in O(n lg n) worst case, O(n) best case, and require only a logarithmic
number of stack frames. And they are stable. And the destructive-list variant
allocates zero cons cells; it simply rearranges the cells of the input list.

Hence the reference implementation is
    Copyright (c) 1998 by Olin Shivers.
and made available under the same copyright as the SRFI text (see above).