Google Groups Home
Help | Sign in
Message from discussion The FASTEST subsets function [Was: Subsets of a list]
The group you are posting to is a Usenet group. Messages posted to this group will make your email address visible to anyone on the Internet.
Your reply message has not been sent.
Your post was successful
Ji-Yong D. Chung  
View profile
 More options Jan 12 2002, 2:38 pm
Newsgroups: comp.lang.scheme
From: "Ji-Yong D. Chung" <virtualcy...@erols.com>
Date: Sat, 12 Jan 2002 14:43:41 -0500
Local: Sat, Jan 12 2002 2:43 pm
Subject: Re: The FASTEST subsets function [Was: Subsets of a list]
    I did not google on ths topic -- it is probably
too esoteric.

    But what _is_ subsets function?  What does it
compute, and what is it used for?

    I would be grateful for additional comments.

<o...@pobox.com> wrote in message

news:7eb8ac3e.0201120056.3fc231c8@posting.google.com...
> This article will show the design of the fastest ever interpreted
> subsets function. Sorry I'm too excited about this. Again, we start
> with the mathematical definition of the problem, which leads to a
> simple, correct and stunningly efficient solution. The final, fastest
> ever solution is still pure functional.

> We start with the definition of the problem: given a set 'l' and a
> number 'n', return the set of all subsets of 'l' of cardinality
> 'n'. Sets are represented by lists.

> The key was to choose the right definition.

> Let ps(L) is a powerset of L that does not include {empty set} (that
> is, a singular set whose element is the empty set).
> length(ps(L)) = (- (expt 2 (length L)) 1)

> Let L be a non-empty set and let L = A U B where A and B are two
> disjoint subsets of L.

> Obviously,
> ps(L) = ps(A U B) =
> ps(A) U ps(B) U { y U x | x <- ps(A), y <- ps(B) }
> Let (subsets L n) = (filter (lambda (el) (= n (length el))) ps(L) )

> Thus the desired function 'subsets' is a filtered powerset ps. This
> seems to be a stupid definition. Let's not be hasty however. Note that
> filter and union commute: the filter of a union of sets is the union
> of filtered sets. Therefore, from the previous expression

> (subsets L n) = (subsets (union A B) n)
> = (subsets A n) U (subsets B n)
> Union{ y U x | x <- (subsets A k), y <- (subsets B (- n k)),
>        k=1,n-1 }

> Well, this is it. Note, we didn't say how to split L into two disjoint
> subsets A and B. We can do as we wish. For example, we can choose to
> split in such a way so that (length B) is n. In the most difficult
> case where n = (/ (length L) 2), this corresponds to a "divide and
> conquer" strategy, so to speak (at least in the first stages).

> The Scheme code below implements this idea verbatim. It uses the
> accumulator-passing style that was expounded earlier.

> (define (subsets-v5 l n)

>   ; The initialization function. Check the boundary conditions
>   (define (loop l ln n accum)
>     (cond
>      ((<= n 0) (cons '() accum))
>      ((< ln n) accum)
>      ((= ln n) (cons l accum))
>      ((= n 1)
>       (let fold ((l l) (accum accum))
> (if (null? l) accum
>     (fold (cdr l) (cons (cons (car l) '()) accum)))))
>      (else
>       (split l ln n accum))))

>   ; split l in two parts a and b so that (length b) is n
>   ; Invariant: (equal? (append a b) l)
>   ; ln is the length of l
>   (define (split l ln n accum)
>     (let loop  ((a '()) (b l) (lna 0) (lnb ln))
>       (if (= lnb n) (cont a lna b lnb n accum)
>   (loop (cons (car b) a) (cdr b) (+ 1 lna) (- lnb 1)))))

>   ; The main body of the algorithm
>   (define (cont a lna b lnb n accum)
>     (let* ((accum
>     (loop a lna n accum))
>    (accum ; this is actually (loop b lnb n accum)
>     (cons b accum))
>    )
>       (let inner ((k 1) (accum accum))
> (if (> k (min lna (- n 1))) ; don't loop past meaningful boundaries
>     accum
>     (let ((as (loop a lna k '()))
>   (bs (loop b lnb (- n k) '())))
>       (inner (+ 1 k)
>      ; compute the cross-product of as and bs
>      ; and append it to accum
>      (let fold ((bs bs) (accum accum))
>        (if (null? bs) accum
>    (fold (cdr bs)
> (append
>   (map (lambda (lst) (append lst (car bs))) as)
>   accum))))))))))

>   (loop l (length l) n '()))

> The benchmark runs in 993 ms of user time and allocates only 36.5 MB
> of memory, on Gambit-C interpreter. This is the absolute, incredible
> record. Under SCM:

> subsets-v3 (called combos by John David Stone)
> ;Evaluation took 1596 mSec (98 in gc) 657662 cells work, 4721364 env, 97
bytes other

> subsets-v5:
> ;Evaluation took 700 mSec (322 in gc) 1708112 cells work, 610264 env, 105
bytes
> other
> That is, more than twice as fast.

> Continuing the table from the previous post:

> Procedure       Gambit-C               Bigloo 2.4b      Bigloo 2.4b
>                 interpreter, s         interpreter, s   compiler, s
> subsets-v0        285.0                   11.59          5.62    3.14
> subsets-v1     6.3                    5.45          2.22    0.34
> subsets-v3     8.1                    4.78          0.96    0.27
> subsets-v20        14.1                    5.53          0.96    0.26
> subsets-v21         7.7                    4.88          0.66    0.26
> subsets-v22         5.0                    3.18          0.62    0.25
> subsets-v23         4.1                    2.86          0.82    0.25
> subsets-v5     0.9                    1.56          1.10 0.76

> Well, compiled code isn't that fast -- but that because subsets-v5
> isn't too optimal. The append and map ought to be converted into the
> accumulation-passing style. That will reduce the amount of garbage as
> well. But it's past midnight.

> The conclusion of this Friday night exercise is astonishingly trite.
> What a Math teacher told us is true: we have to attack the algorithm
> if we want to really big improvements. And Math rules!


    Reply to author    Forward  
You must Sign in before you can post messages.
To post a message you must first join this group.
Please update your nickname on the subscription settings page before posting.
You do not have the permission required to post.

Create a group - Google Groups - Google Home - Terms of Service - Privacy Policy
©2008 Google