I would be grateful for additional comments.
> This article will show the design of the fastest ever interpreted
> subsets function. Sorry I'm too excited about this. Again, we start
> with the mathematical definition of the problem, which leads to a
> simple, correct and stunningly efficient solution. The final, fastest
> ever solution is still pure functional.
> We start with the definition of the problem: given a set 'l' and a
> number 'n', return the set of all subsets of 'l' of cardinality
> 'n'. Sets are represented by lists.
> The key was to choose the right definition.
> Let ps(L) is a powerset of L that does not include {empty set} (that
> is, a singular set whose element is the empty set).
> length(ps(L)) = (- (expt 2 (length L)) 1)
> Let L be a non-empty set and let L = A U B where A and B are two
> disjoint subsets of L.
> Obviously,
> ps(L) = ps(A U B) =
> ps(A) U ps(B) U { y U x | x <- ps(A), y <- ps(B) }
> Let (subsets L n) = (filter (lambda (el) (= n (length el))) ps(L) )
> Thus the desired function 'subsets' is a filtered powerset ps. This
> seems to be a stupid definition. Let's not be hasty however. Note that
> filter and union commute: the filter of a union of sets is the union
> of filtered sets. Therefore, from the previous expression
> (subsets L n) = (subsets (union A B) n)
> = (subsets A n) U (subsets B n)
> Union{ y U x | x <- (subsets A k), y <- (subsets B (- n k)),
> k=1,n-1 }
> Well, this is it. Note, we didn't say how to split L into two disjoint
> subsets A and B. We can do as we wish. For example, we can choose to
> split in such a way so that (length B) is n. In the most difficult
> case where n = (/ (length L) 2), this corresponds to a "divide and
> conquer" strategy, so to speak (at least in the first stages).
> The Scheme code below implements this idea verbatim. It uses the
> accumulator-passing style that was expounded earlier.
> (define (subsets-v5 l n)
> ; The initialization function. Check the boundary conditions
> (define (loop l ln n accum)
> (cond
> ((<= n 0) (cons '() accum))
> ((< ln n) accum)
> ((= ln n) (cons l accum))
> ((= n 1)
> (let fold ((l l) (accum accum))
> (if (null? l) accum
> (fold (cdr l) (cons (cons (car l) '()) accum)))))
> (else
> (split l ln n accum))))
> ; split l in two parts a and b so that (length b) is n
> ; Invariant: (equal? (append a b) l)
> ; ln is the length of l
> (define (split l ln n accum)
> (let loop ((a '()) (b l) (lna 0) (lnb ln))
> (if (= lnb n) (cont a lna b lnb n accum)
> (loop (cons (car b) a) (cdr b) (+ 1 lna) (- lnb 1)))))
> ; The main body of the algorithm
> (define (cont a lna b lnb n accum)
> (let* ((accum
> (loop a lna n accum))
> (accum ; this is actually (loop b lnb n accum)
> (cons b accum))
> )
> (let inner ((k 1) (accum accum))
> (if (> k (min lna (- n 1))) ; don't loop past meaningful boundaries
> accum
> (let ((as (loop a lna k '()))
> (bs (loop b lnb (- n k) '())))
> (inner (+ 1 k)
> ; compute the cross-product of as and bs
> ; and append it to accum
> (let fold ((bs bs) (accum accum))
> (if (null? bs) accum
> (fold (cdr bs)
> (append
> (map (lambda (lst) (append lst (car bs))) as)
> accum))))))))))
> (loop l (length l) n '()))
> The benchmark runs in 993 ms of user time and allocates only 36.5 MB
> of memory, on Gambit-C interpreter. This is the absolute, incredible
> record. Under SCM:
> subsets-v3 (called combos by John David Stone)
> ;Evaluation took 1596 mSec (98 in gc) 657662 cells work, 4721364 env, 97
bytes other
> subsets-v5:
> ;Evaluation took 700 mSec (322 in gc) 1708112 cells work, 610264 env, 105
bytes
> other
> That is, more than twice as fast.
> Continuing the table from the previous post:
> Procedure Gambit-C Bigloo 2.4b Bigloo 2.4b
> interpreter, s interpreter, s compiler, s
> subsets-v0 285.0 11.59 5.62 3.14
> subsets-v1 6.3 5.45 2.22 0.34
> subsets-v3 8.1 4.78 0.96 0.27
> subsets-v20 14.1 5.53 0.96 0.26
> subsets-v21 7.7 4.88 0.66 0.26
> subsets-v22 5.0 3.18 0.62 0.25
> subsets-v23 4.1 2.86 0.82 0.25
> subsets-v5 0.9 1.56 1.10 0.76
> Well, compiled code isn't that fast -- but that because subsets-v5
> isn't too optimal. The append and map ought to be converted into the
> accumulation-passing style. That will reduce the amount of garbage as
> well. But it's past midnight.
> The conclusion of this Friday night exercise is astonishingly trite.
> What a Math teacher told us is true: we have to attack the algorithm
> if we want to really big improvements. And Math rules!