diff --git a/scsh/treaps/AUTHORS b/scsh/treaps/AUTHORS new file mode 100644 index 0000000..eae3d05 --- /dev/null +++ b/scsh/treaps/AUTHORS @@ -0,0 +1 @@ +Martin Gasbichler, Oleg Kiselyov \ No newline at end of file diff --git a/scsh/treaps/BLURB b/scsh/treaps/BLURB new file mode 100644 index 0000000..bba5bd4 --- /dev/null +++ b/scsh/treaps/BLURB @@ -0,0 +1,2 @@ +treaps: An ordered dictionary data structure, based on randomized +search trees (treaps). diff --git a/scsh/treaps/README b/scsh/treaps/README new file mode 100644 index 0000000..8da39c0 --- /dev/null +++ b/scsh/treaps/README @@ -0,0 +1,261 @@ +[This file and the accompanying source code were derived from + Oleg's code for Gambit available from + + http://okmij.org/ftp/Scheme/lib/treap.scm + +] + + +An implementation of an ordered dictionary data structure, based +on randomized search trees (treaps) by Seidel and Aragon: + + R. Seidel and C. R. Aragon. Randomized Search Trees. + Algorithmica, 16(4/5):464-497, 1996. + + +This code defines a treap object that implements an ordered dictionary +mapping of keys to values. The object responds to a variety of query and +update messages, including efficient methods for finding the minimum and +maximum keys and their associated values as well as traversing of a +treap in an ascending or descending order of keys. Looking up an arbitrary +or the min/max keys, and deleting the min/max keys require no more +key comparisons than the depth of the treap, which is O(log n) where +n is the total number of keys in the treap. Arbitrary key deletion and +insertions run in O(log n) _amortized_ time. + +This code is inspired by a Stefan Nilsson's article "Treaps in Java" +(Dr.Dobb's Journal, July 1997, p.40-44) and by the Java implementation +of treaps described in the article. Yet this Scheme code has been +developed completely from scratch, using the description of the algorithm +given in the article, and insight gained from examining the Java source +code. As a matter of fact, treap insertion and deletion algorithms +implemented in this code differ from the ones described in the article +and implemented in the Java code; this Scheme implementation uses fewer +assignments and comparisons (see below for details). Some insight as +to a generic tree interface gleaned from wttree.scm, "Weight balanced +trees" by Stephen Adams (a part of The Scheme Library, slib2b1). + +A treap is a regular binary search tree, with one extension. The extension +is that every node in a tree, beside a key, a value, and references to +its left and right children, has an additional constant field, a priority. +The value of this field is set (to a random integer number) when the node +is constructed, and is not changed afterwards. At any given moment, +the priority of every non-leaf node never exceeds the priorities of its +children. When a new node is inserted, we check that this invariant holds; +otherwise, we perform a right or left rotation that swaps a parent and +its kid, keeping the ordering of keys intact; the changes may need to be +propagated recursively up. The priority property, and the fact they are +chosen at random, makes a treap look like a binary search tree built by +a random sequence of insertions. As the article shows, this makes a treap +a balanced tree: the expected length of an average search path is roughly +1.4log2(n)-1, and the expected length of the longest search path is about +4.3log2(n). See the Stefan Nilsson's article for more details. + + +(make-treap key-compare) -> treap + +Creates a treap object. Here KEY-COMPARE-PROC is a user-supplied +function + + KEY-COMPARE-PROC key1 key2 + +that takes two keys and returns a negative, positive, or zero number +depending on how the first key compares to the second. The treap uses +SRFI-27's DEFAULT-RANDOM-SOURCE so you may want to initialize this +once with (random-source-randomize! default-random-source). + + +(treap? thing) -> boolean + +Type predicate for treaps. + + +(treap-get treap key [default-clause]) -> key-value pair or value of default-clause + +Searches the treap for an association with a given KEY, and returns a +(key . value) pair of the found association. If an association with +the KEY cannot be located in the treap, the PROC returns the result of +evaluating the DEFAULT-CLAUSE. If the default clause is omitted, an +error is signalled. The KEY must be comparable to the keys in the +treap by a key-compare predicate (which has been specified when the +treap was created) + + +(treap-get-min treap) -> key-value pair +(treap-get-max treap) -> key-value pair + +return a (key . value) pair for an association in the treap with the +smallest/largest key. If the treap is empty, an error is signalled. + + +(treap-delete-min! treap) -> key-value pair +(treap-delete-max! treap) -> key-value pair + +remove the min/max key and the corresponding association from the +treap. Return a (key . value) pair of the removed association. If +the treap is empty, an error is signalled. + + +(treap-empty? treap) -> boolean + +returns #t if the treap is empty. + + +(treap-size treap) -> integer + +returns the size (the number of associations) in the treap. + + +(treap-depth treap) -> integer + +returns the depth of the tree. It requires the complete traversal of +the tree, so use sparingly. + + +(treap-clear! treap) -> unspecific + +removes all associations from the treap (thus making it empty). + + +(treap-put! treap key value) -> key-value pair or #f + +adds the corresponding association to the treap. If an association +with the same KEY already exists, its value is replaced with the VALUE +(and the old (key . value) association is returned). Otherwise, the +return value is #f. + + +(treap-delete! treap key [default-clause]) -> key-value pair + +searches the treap for an association with a given KEY, deletes it, +and returns a (key . value) pair of the found and deleted association. +If an association with the KEY cannot be located in the treap, the +PROC returns the result of evaluating the DEFAULT-CLAUSE. If the +default clause is omitted, an error is signalled. + + +(treap-for-each-ascending treap proc) -> unspecific + +applies the given procedure PROC to each (key . value) association of +the treap, from the one with the smallest key all the way to the one +with the max key, in an ascending order of keys. The treap must not +be empty. + + +(treap-for-each-descending treap proc) -> unspecific + +applies the given procedure PROC to each (key . value) association of +the treap, in the descending order of keys. The treap must not be +empty. + + +(treap-debugprint treap) -> unspecific + +prints out all the nodes in the treap, for debug purposes. + + + +Notes on the algorithm +As the DDJ paper shows, insertion of a node into a treap is a simple +recursive algorithm, Example 1 of the paper. This algorithm is implemented +in the accompanying source [Java] code as +
+ private Tree insert(Tree node, Tree tree) { + if (tree == null) return node; + int comp = node.key.compareTo(tree.key); + if (comp < 0) { + tree.left = insert(node, tree.left); + if (tree.prio > tree.left.prio) + tree = tree.rotateRight(); + } else if (comp > 0) { + tree.right = insert(node, tree.right); + if (tree.prio > tree.right.prio) + tree = tree.rotateLeft(); + } else { + keyFound = true; + prevValue = tree.value; + tree.value = node.value; + } + return tree; + } ++ +This algorithm, however, is not as efficient as it could be. Suppose we +try to insert a node which turns out to already exist in the tree, +at a depth D. The algorithm above would descend into this node in the +winding phase of the algorithm, replace the node's value, and, in the +unwinding phase of the recursion, would perform D assignments of the kind + tree.left = insert(node, tree.left); +and D comparisons of nodes' priorities. None of these priority checks and +assignments are actually necessary: as we haven't added any new node, +the tree structure hasn't changed. + +Therefore, the present Scheme code implements a different insertion +algorithm, which avoids doing unnecessary operations. The idea is simple: +if we insert a new node into some particular branch of the treap and verify +that this branch conforms to the treap priority invariant, we are certain +the invariant holds throughout the entire treap, and no further checks +(up the tree to the root) are necessary. In more detail: + - Starting from the root, we recursively descend until we find + a node with a given key, or a place a new node can be inserted. + - We insert the node and check to see if its priority is less than + that of its parent. If this is the case, we left- or right-rotate + the tree to make the old parent a child of the new node, and the + new node a new root of this particular branch. We return this new + parent as an indication that further checks up the tree are + necessary. If the new node conforms to the parent's priority, we + insert it and return #f + - On the way up, we check the priorities again and rotate the tree + to restore the priority invariant at the current level if needed. + - If no changes are made at the current level, we return a flag #f + meaning that no further changes or checks are necessary at the + higher levels. +Thus, if a new node was originally inserted at a level D in the tree (level +0 being the root) but then migrated up by L levels (because of its priority), +the original insertion algorithm would perform (D-1) assignments, +(D-1) priority checks, plus L rotations (at a cost of 2 assignments in the +treap each). Our algorithm does only (L+1) node priority checks and +max(2(L-1)+2,1) assignments. +Note if priorities are really (uniformly) random, L is uniformly distributed +over [0,D], so the average cost of our algorithm is + D/2 +1 checks and D assignments +compared to + D-1 checks and 2D-1 assignments +for the original algorithm described in the DDJ paper. + +The similar gripe applies to the Java implementation of a node deletion +algorithm: +
+ private Tree delete(Ordered key, Tree t) { + if (t == null) return null; + int comp = key.compareTo(t.key); + if (comp < 0) + t.left = delete(key, t.left); + else if (comp > 0) + t.right = delete(key, t.right); + else { + keyFound = true; + prevValue = t.value; + t = t.deleteRoot(); + } + return t; + } ++ +The algorithm as implemented looks fully-recursive. Furthermore, deletion +of a node at a level D in the treap involves at least D assignments, most +of them being unnecessary. Indeed, if a node being deleted is a leaf, only +one change to the tree is needed to detach the node. Deleting a node +obviously requires a left- or a right-kid field of the node's parent be +modified (cleared). This change, however does NOT need to be propagated up: +deleting of a node does not violate neither ordering nor priority invariants +of the treap; all changes are confined to a branch rooted at the +parent of the deleted node. +This Scheme code implements node deletion algorithm in the optimal way, +performing only those assignments which are absolutely necessary, and +replacing full recursions with tail recursions (which are simply iterations). +Our implementation is also simpler and clearer, making use of a helper +procedure join! to join two treap branches (while keeping both treap +invariants intact). The deletion algorithm can then be expressed as +replacing a node with a join of its two kids; compare this explanation +to the one given in the DDJ paper! diff --git a/scsh/treaps/interfaces.scm b/scsh/treaps/interfaces.scm new file mode 100644 index 0000000..9e76191 --- /dev/null +++ b/scsh/treaps/interfaces.scm @@ -0,0 +1,15 @@ +(define-interface treaps-interface + (export make-treap + treap-get + treap-delete! + treap-get-min + treap-get-max + treap-delete-min! + treap-delete-max! + treap-empty? + treap-depth + treap-clear! + treap-put! + treap-for-each-ascending + treap-for-each-descending + treap-debugprint debugprint)) diff --git a/scsh/treaps/packages.scm b/scsh/treaps/packages.scm new file mode 100644 index 0000000..53d1312 --- /dev/null +++ b/scsh/treaps/packages.scm @@ -0,0 +1,6 @@ +(define-structure treaps treaps-interface + (open scheme + srfi-9 + srfi-23 + srfi-27) + (files treap)) diff --git a/scsh/treaps/treap.scm b/scsh/treaps/treap.scm new file mode 100644 index 0000000..d1689d0 --- /dev/null +++ b/scsh/treaps/treap.scm @@ -0,0 +1,359 @@ +(program + (requires srfi-9 srfi-23 srfi-27) +(code +;;; This file and the accompanying README were derived from +;;; Oleg's code for Gambit available from +;;; +;;; http://okmij.org/ftp/Scheme/lib/treap.scm +;;; + +(define-record-type treap + (really-make-treap key-compare size root) + treap? + (key-compare treap-key-compare) + (size treap-size set-treap-size!) + (root treap-root set-treap-root!)) + +(define (make-treap key-compare) + (really-make-treap key-compare 0 #f)) + +;; a node of a tree, a record of +;; key, anything that key-compare could be applied to +;; value, any object associated with the key +;; left-kid, #f if absent +;; right-kid +;; prio, a priority of the node (a FIXNUM random number) + + +(define-record-type node + (make-node key value left-kid right-kid priority) + node? + (key node:key) + (value node:value node:value-set!) + (left-kid node:left-kid node:left-kid-set!) + (right-kid node:right-kid node:right-kid-set!) + (priority node:priority)) + +(define random + (let ((max (- (expt 2 15) 1))) + (lambda () + (random-integer max)))) + +(define (new-leaf key value) + (make-node key value #f #f (random))) + +(define (node:key-value node) + (cons (node:key node) + (node:value node))) + +(define (node:unsubordination? parent kid) + (> (node:priority parent) (node:priority kid))) + +(define-syntax node:dispatch-on-key + (syntax-rules () + ((node:dispatch-on-key treap node key on-less on-equal on-greater) + (let ((result ((treap-key-compare treap) key (node:key node)))) + (cond + ((zero? result) on-equal) + ((positive? result) on-greater) + (else on-less)))))) + +(define (n-display . args) + (for-each display args)) + +(define (node:debugprint node) + (n-display " " (node:key-value node) ", kids " + (cons (not (not (node:left-kid node))) + (not (not (node:right-kid node)))) + ", prio " (node:priority node) #\newline)) + + + + +;; Looking up assocaitions in a treap: just like in any search tree +;; Given a key, return the corresponding (key . value) association +;; in the treap, or #f if the treap does not contain an association +;; with that key +;; This procedure takes as many comparisons (evaluations of the +;; key-compare procedure) as the depth of the found node +(define (locate-assoc treap key) + (let loop ((node (treap-root treap))) + (and node + (node:dispatch-on-key treap node key + (loop (node:left-kid node)) + (node:key-value node) + (loop (node:right-kid node)))))) + +(define (locate-extremum-node treap branch-selector) + (let ((root (treap-root treap))) + (if (not root) (error "empty tree") + (let loop ((node root) (parent #f)) + (if node (loop (branch-selector node) node) + (node:key-value parent)))))) + + ; in-order traversal of the treap +(define (for-each-inorder treap primary-branch-selector secondary-branch-selector) + (let ((root (treap-root treap))) + (lambda (proc) + (if (not root) (error "empty tree") + (let loop ((node root)) + (if node + (begin + (loop (primary-branch-selector node)) + (proc (node:key-value node)) + (loop (secondary-branch-selector node))))))))) + +(define (get-depth treap) + (let ((root (treap-root treap))) + (let loop ((node root) (level 0)) + (if (not node) level + (max (loop (node:left-kid node) (+ 1 level)) + (loop (node:right-kid node) (+ 1 level))))))) + +;; debug printing of all nodes of the tree in-order +;; in an ascending order of keys +(define (debugprint treap) + (let ((root (treap-root treap))) + (n-display #\newline + "The treap contains " (treap-size treap) " nodes" + #\newline) + (let loop ((node root) (level 0)) + (if node + (begin + (loop (node:left-kid node) (+ 1 level)) + (n-display " level " level) + (node:debugprint node) + (loop (node:right-kid node) (+ 1 level)))) + (newline)))) + +;; Adding a new association to the treap (or replacing the old one +;; if existed). Return the (key . value) pair of an old (existed +;; and replaced association), or #f if a new association was really +;; added +(define (insert! treap key value) + (let ((root (treap-root treap))) + (letrec ((new-node (new-leaf key value)) + (old-key-value #f) + ;; If the left branch of parent is empty, insert the + ;; new node there, check priorities + ;; Otherwise, descend recursively + ;; If the parent got inverted due to a right rotation, + ;; return the new parent of the branch; otherwise, + ;; return #f (indicating no further checks are necessary) + (insert-into-left-branch + (lambda (key parent) + (let ((old-left-kid (node:left-kid parent))) + ;; Found a place to insert the 'new-node': as the left + ;; leaf of the parent + (if (not old-left-kid) + (cond + ((node:unsubordination? parent new-node) + ;; Right rotation over the new-leaf + (node:right-kid-set! new-node parent) + new-node) ;; becomes a new parent + (else + (node:left-kid-set! parent new-node) + #f)) + ;; Insert the new-leaf into a branch rooted + ;; on old-left-kid + (let ((new-left-kid + (node:dispatch-on-key treap old-left-kid key + (insert-into-left-branch key old-left-kid) + (update-existing-node old-left-kid) + (insert-into-right-branch key old-left-kid)))) + (and new-left-kid + ;; That branch got a new root + (cond + ((node:unsubordination? parent new-left-kid) + ;; Right rotation over the new-left-kid + (node:left-kid-set! parent + (node:right-kid new-left-kid)) + (node:right-kid-set! new-left-kid parent) + new-left-kid) ;; becomes a new parent + (else + (node:left-kid-set! parent new-left-kid) + #f)))) + )))) + + ;; If the right branch of parent is empty, insert the + ;; new node there, check priorities + ; Otherwise, descend recursively + ;; If the parent got inverted due to a left rotation, + ;; return the new parent of the branch; otherwise, + ;; return #f (indicating no further checks are necessary) + (insert-into-right-branch + (lambda (key parent) + (let ((old-right-kid (node:right-kid parent))) + ;; Found a place to insert the 'new-node': as the right + ;; leaf of the parent + (if (not old-right-kid) + (cond + ((node:unsubordination? parent new-node) + ;; Left rotation over the new-leaf + (node:left-kid-set! new-node parent) + new-node) ; becomes a new parent + (else + (node:right-kid-set! parent new-node) + #f)) + ;; Insert the new-leaf into a branch rooted + ;; on old-right-kid + (let ((new-right-kid + (node:dispatch-on-key treap old-right-kid key + (insert-into-left-branch key old-right-kid) + (update-existing-node old-right-kid) + (insert-into-right-branch key old-right-kid)))) + (and new-right-kid + ;; That branch got a new root + (cond + ((node:unsubordination? parent new-right-kid) + ;; Left rotation over the new-right-kid + (node:right-kid-set! parent + (node:left-kid new-right-kid)) + (node:left-kid-set! new-right-kid parent) + new-right-kid) ; becomes a new parent + (else + (node:right-kid-set! parent new-right-kid) + #f)))) + )))) + + (update-existing-node + (lambda (node) + (set! old-key-value (node:key-value node)) + (node:value-set! node value) + #f)) + ) ; end of letrec + + ;; insert's body + (cond + ;; insert into an empty tree + ((not root) (set-treap-root! treap new-node)) + + (else + (let ((new-root + (node:dispatch-on-key treap root key + (insert-into-left-branch key root) + (update-existing-node root) + (insert-into-right-branch key root)))) + (if new-root + (set-treap-root! treap new-root))))) + (if (not old-key-value) + (set-treap-size! treap (+ (treap-size treap) 1))) ; if the insertion has really occurred + old-key-value))) + + +;; Deleting existing associations from the treap + +(define (delete-extremum-node! treap branch-selector + branch-setter the-other-branch-selector) + (let ((root (treap-root treap))) + (cond + ((not root) (error "empty tree")) + ((not (branch-selector root)) ; root is the extreme node + (let ((result (node:key-value root))) + (set-treap-root! treap (the-other-branch-selector root)) + (set-treap-size! treap (- (treap-size treap) 1)) + result)) + (else + (let loop ((node (branch-selector root)) (parent root)) + (let ((kid (branch-selector node))) + (if kid (loop kid node) + (let ((result (node:key-value node))) + (branch-setter parent (the-other-branch-selector node)) + (set-treap-size! treap (- (treap-size treap) 1)) + result)))))))) + +;; Given two treap branches (both of which could be empty) +;; which satisfy both the order invariant and the priority invariant +;; (all keys of all the nodes in the right branch are strictly bigger +;; than the keys of left branch nodes), join them +;; while keeping the sorted and priority orders intact +(define (join! treap left-branch right-branch) + (cond + ((not left-branch) right-branch) ; left-branch was empty + ((not right-branch) left-branch) ; right-branch was empty + ((node:unsubordination? left-branch right-branch) + ;; the root of the right-branch should be the new root + (node:left-kid-set! right-branch + (join! treap left-branch (node:left-kid right-branch))) + right-branch) + (else + ;; the root of the left-branch should be the new root + (node:right-kid-set! left-branch + (join! treap (node:right-kid left-branch) right-branch)) + left-branch))) + + +;; Find an association with a given KEY, and delete it. +;; Return the (key . value) pair of the deleted association, or +;; #f if it couldn't be found +(define (delete! treap key) + (define (delete-node! node parent from-left?) + (let ((old-assoc (node:key-value node)) + (new-kid (join! treap (node:left-kid node) (node:right-kid node)))) + (set-treap-size! treap (- (treap-size treap) 1)) + (if parent + (if from-left? + (node:left-kid-set! parent new-kid) + (node:right-kid-set! parent new-kid)) + ;; Deleting of the root node + (set-treap-root! treap new-kid)) + old-assoc)) + + (let loop ((node (treap-root treap)) (parent #f) (from-left? #t)) + (and node + (node:dispatch-on-key treap node key + (loop (node:left-kid node) node #t) + (delete-node! node parent from-left?) + (loop (node:right-kid node) node #f))))) + +(define (apply-default-clause key default-clause) + (cond + ((null? default-clause) + (error "key " key " was not found in the treap ")) + ((pair? (cdr default-clause)) + (error "default argument must be a single clause")) + ((procedure? (car default-clause)) ((car default-clause))) + (else (car default-clause)))) + +(define (treap-get treap key . default-clause) + (or (locate-assoc treap key) (apply-default-clause key default-clause))) + +(define (treap-delete! treap key . default-clause) + (or (delete! treap key) (apply-default-clause key default-clause))) + +(define (treap-get-min treap) + (locate-extremum-node treap node:left-kid)) + +(define (treap-get-max treap) + (locate-extremum-node treap node:right-kid)) + +(define (treap-delete-min! treap) + (delete-extremum-node! treap + node:left-kid node:left-kid-set! + node:right-kid)) + +(define (treap-delete-max! treap) + (delete-extremum-node! treap + node:right-kid node:right-kid-set! + node:left-kid)) + +(define (treap-empty? treap) + (not (treap-root treap))) + +(define (treap-depth treap) + (get-depth treap)) + +(define (treap-clear! treap) + (set-treap-root! treap #f) + (set-treap-size! treap 0)) + +(define treap-put! insert!) + +(define (treap-for-each-ascending treap proc) + ((for-each-inorder treap node:left-kid node:right-kid) proc)) + +(define (treap-for-each-descending treap proc) + ((for-each-inorder treap node:right-kid node:left-kid) proc)) + +(define treap-debugprint debugprint) +)) \ No newline at end of file