; Date: 15 Nov 88 23:03:24 GMT
; From: uoregon!markv@beaver.cs.washington.edu  (Mark VandeWettering)
; Organization: University of Oregon, Computer Science, Eugene OR
; Subject: The Paradoxical Combinator -- Y (LONG)
; 
; Alternatively entitled:
; 	"Y?  Why Not?" :-)
; 
; The discussion that has been going on regarding the Y combinator as
; the basic operation for implementing recursive functions is
; interesting.  The practical tests that people have made show that the
; Y combinator is orders of magnitude slower at implementing recursion
; than compiling the recursion directly.
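; 
; (One way to reproduce such measurements, assuming an implementation
; that provides a (time expr) form, such as Chez or Racket -- the form
; is nonstandard, so the sketch below is left in a comment.  Y and fact
; are the definitions given further on, and fact-direct is a
; hypothetical directly-recursive version:)
; 
;   (define fact-direct
;     (lambda (n) (if (= n 1) 1 (* n (fact-direct (- n 1))))))
;   (time ((Y fact) 10000))
;   (time (fact-direct 10000))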
; 
; This is true for Scheme.  I hold that for an interesting set of
; languages (the lazy languages), this result will not necessarily hold.
; 
; The problem with Y isn't its complexity; it is the fact that Y is an
; inherently lazy operation.  Any implementation in Scheme is clouded by
; the fact that Scheme is an applicative-order evaluator, while Y
; prefers to be evaluated in normal order.
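; 
; To see the problem concretely, here is a minimal sketch (my addition;
; the name Y-normal is hypothetical) of the normal-order fixpoint
; combinator.  Under Scheme's applicative-order rule, (Y-normal g)
; tries to reduce (h h) before ever calling g, so any application of it
; loops forever; it is safe to define, but not to apply.

(define Y-normal
  (lambda (g)
    ((lambda (h) (g (h h)))
     (lambda (h) (g (h h))))))

; The definition of Y below escapes this by eta-expanding (h h) into
; (lambda (x) ((h h) x)), which suspends the self-application until an
; argument actually arrives.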
; 
; 
(define Y
  (lambda (g)
    ((lambda (h) (g (lambda (x) ((h h) x))))
     (lambda (h) (g (lambda (x) ((h h) x)))))))
; 
(define fact
  (lambda (f)
    (lambda (n)
      (if (= n 1)
          1
          (* n (f (- n 1)))))))
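; 
; Note that fact is not itself recursive: it maps an approximation f of
; the factorial function to a slightly better one, and (Y fact) is its
; fixed point.  For example, ((Y fact) 5) evaluates to 120.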
; 
; 
; Evaluating ((Y fact) 2) results in the following operations in
; Scheme:
; 
; The argument is (trivially) evaluated, and yields 2.
; (Y fact) must be evaluated.  What is it?  Y and fact each evaluate
; to closures.  When applied, Y binds g to fact, and executes the
; body.
; 
; The body is an application of a closure to another closure.  The
; operator binds h to the operand, and executes its body which....
; 
; Evaluates (g (lambda (x) ((h h) x))).  The operand is a closure,
; which gets built and then returns.  g evaluates to fact.  We
; substitute the closure (lambda (x) ((h h) x)) in for the function
; f in the definition of fact, giving...
; 
; (lambda (n)
;   (if (= n 1) 
;       1
;     (* n ((lambda (x) ((h h) x)) (- n 1)))))
; 
; Which we return as the value of (Y fact).  When we apply this to 2, we get
; 
; (* 2 ((lambda (x) ((h h) x)) 1))
; 
; We then have to evaluate
; ((lambda (x) ((h h) x)) 1)
; 
; or 
; ((h h) 1)
; 
; But remembering that h was (lambda (h) (g (lambda (x) ((h h) x)))), 
; we have 
; 
; (((lambda (h) (g (lambda (x) ((h h) x))))
;   (lambda (h) (g (lambda (x) ((h h) x)))))
;  1) ....
; 
; So, we rebind h to be the right stuff, and evaluate the body, which is
; 
; ((g (lambda (x) ((h h) x))) 1)
; 
; Since g is still fact, this applies the factorial body to 1, which
; hits the base case and is just 1.
; 
; (* 2 1) = 2.
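; 
; The hand evaluation above can be checked mechanically:

(display ((Y fact) 2))   ; prints 2
(newline)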
; 
; ########################################################################
; 
; Summary:  If you didn't follow all of that, don't worry: performing
; this evaluation is cumbersome at best.  As far as a compiler or
; interpreter is concerned, the high cost of evaluating this function
; comes from two sources:
; 
; It is necessary to create "suspended" values.  These suspended
; values are represented as closures, which are in general heap
; allocated and expensive.
; 
; For every level of recursion, new closures are created (h gets
; rebound above).  While this could probably be optimized out by a
; smart compiler, it does seem like the representation of suspended
; evaluation by lambdas is inefficient.
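; 
; A sketch of that optimization (my own addition, not part of the
; argument above; the name Y-memo is hypothetical): tie the knot
; through a standard R5RS promise, so the closure standing for the
; recursive reference is built once, rather than once per level of
; recursion.

(define (Y-memo g)
  ;; delay/force memoize: p is evaluated at most once, and every
  ;; recursive call through the stub reuses the same closure.
  (define p (delay (g (lambda (x) ((force p) x)))))
  (force p))

; ((Y-memo fact) 5) also evaluates to 120.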
; 
; ########################################################################
; 
; You can try to figure out how all this works.  It is complicated, but
; I believe I understand it.  The point of the derivation above is that
; in Scheme, to understand how the implementation of Y works, you have
; to fall back on Scheme's own evaluation mechanism.  Suspended values
; must be represented as closures, and it is the creation of these
; closures that causes the Scheme implementation to be slow.
; 
; If one wishes to abandon Scheme (or at least applicative-order
; evaluators of Scheme), one can typically do much better.  My thesis
; work is in graph reduction, and in trying to better understand the
; implementation issues involved.
; 
; In graph reduction, all data items (evaluated and unevaluated) have the
; same representation: as graphs in the heap.  We choose to evaluate using
; an outermost, leftmost strategy.  This allows the natural definition of
; (Y h) = (h (Y h)) to be used.  An application node of the form:
; 
;                 @
;                / \
;               /   \
;              Y     h
; 
; can be rewritten in the obvious way:
; 
;                 @
;                / \
;               /   \
;              h     @
;                   / \
;                  /   \
;                 Y     h
; 
; costing one heap allocation per level of recursion, which is
; certainly cheaper than the multiple allocations of Scheme closures
; above.  More efficiently, we might choose to implement it using a
; "knot-tying" version:
; 
; 
;                 +---+
;                 |   |
;                 v   |
;                 @   |
;                / \  |
;               h   +-+
; 
; Which also works quite well: the argument edge simply points back at
; the application node itself.  Y has been eliminated, and will cause
; no more reductions.
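; 
; A concrete sketch of both strategies (my own illustration: the node
; representation -- a mutable pair (fun . arg) -- and the names
; unwind-y! and tie-knot! are assumptions, not taken from any real
; reducer):

;; The rule (Y h) -> (h (Y h)) on a node (Y . h): one fresh cons per
;; level of recursion.
(define (unwind-y! node)
  (let ((h (cdr node)))
    (set-cdr! node (cons (car node) h))   ; arg := a new (Y h) node
    (set-car! node h)))                   ; fun := h

;; The knot-tying rule: point the argument edge back at the node
;; itself.  No allocation at all, and Y is gone for good.
(define (tie-knot! node)
  (set-car! node (cdr node))              ; fun := h
  (set-cdr! node node))                   ; arg := the node itself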
; 
; The basic idea is that recursion in functional languages is analogous
; to cycles in the graph of a graph reduction engine.  The Y combinator
; is then just a specific "textual" indicator of a cycle in the graph.
; 
; The G-machine (excellently described in Peyton Jones' book "The
; Implementation of Functional Programming Languages") also treats this
; form of recursion as efficient.  He chose letrec as a primitive in
; the extended lambda calculus, and his method of compiling these
; recursive definitions was basically to emit fixed code which builds
; the cyclic structures directly, rather than having repeated
; reductions of Y build them at runtime.
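; 
; In Scheme terms, letrec plays the same role: the recursive reference
; is wired directly into the closure's environment, and no Y combinator
; appears at runtime.  (The name fact-letrec is mine, for illustration.)

(define fact-letrec
  (letrec ((f (lambda (n)
                (if (= n 1)
                    1
                    (* n (f (- n 1)))))))
    f))

; (fact-letrec 5) again evaluates to 120.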
; 
; I think (and my thesis work is evolving into this kind of argument)
; that Y is overlooked for trivial reasons.  Partial evaluation and
; smarter code generation could make an SK-based compiler generate code
; equal in quality to that produced by supercombinator-based
; compilation.
; 
; 
; This is too long already, ciao for now.
; 
; Mark VandeWettering

(display ((Y fact) 10))
(newline)