#| Coupon collecter's problem, or card collecter's problem in Racket Gamble. There are N different collecter's cards hidden in a package, but we don't know which card there is in the package we buy. We want to collect all of them, how many packages must one buy to collect all the different cards? See https://en.wikipedia.org/wiki/Coupon_collector%27s_problem """ In probability theory, the coupon collector's problem describes 'collect all coupons and win' contests. It asks the following question: If each box of a brand of cereals contains a coupon, and there are n different types of coupons, what is the probability that more than t boxes need to be bought to collect all n coupons? An alternative statement is: Given n coupons, how many coupons do you expect you need to draw with replacement before having drawn each coupon at least once? The mathematical analysis of the problem reveals that the expected number of trials needed grows as Θ(n log(n). For example, when n = 50 it takes about 225[b] trials on average to collect all 50 coupons. ... [b]: E(50) = 50(1 + 1/2 + 1/3 + ... + 1/50) = 224.9603, the expected number of trials to collect all 50 coupons. The approximation n*log(n) + γ*n + 1/2 for this expected number gives in this case ≈ 195.6011 + 28.8608 + 0.5 ≈ 224.9619. [log is the natural logarithm] """ This is a port of my WebPPL model coupon_collector2.wppl This version draw random integers until all integers 0..N-1 have been selected. We only study the length of the generated array. It's fairly good, but is slow for larger n. * n=10 (theoretical 29.28968253968254) var : len 22: 0.048900000000000034 24: 0.046400000000000025 25: 0.044000000000000025 21: 0.04320000000000003 19: 0.04130000000000003 ... 92: 0.00010000000000000007 98: 0.00010000000000000007 102: 0.00010000000000000007 104: 0.00010000000000000007 109: 0.00010000000000000007 mean: 29.049100000000013 * n=50 (9.0s) (theoretical 224.96026691647126) var : len 214: 0.01699999999999999 189: 0.013999999999999992 181: 0.01299999999999999 206: 0.01299999999999999 217: 0.01299999999999999 ... 455: 0.0009999999999999994 478: 0.0009999999999999994 484: 0.0009999999999999994 235: 0.0009999999999999994 240: 0.0009999999999999994 mean: 224.70099999999982 * n=100 (31.5s) (theoretical 518.737751763962) var : len 415: 0.008999999999999994 471: 0.008999999999999994 435: 0.007999999999999995 475: 0.007999999999999995 484: 0.007999999999999995 ... 1014: 0.0009999999999999994 503: 0.0009999999999999994 1019: 0.0009999999999999994 506: 0.0009999999999999994 509: 0.0009999999999999994 mean: 518.8399999999997 Cf some of these models: * gamble_coupon_collectors_problem.rkt * gamble_geometric_cereal_box.rkt * (and gamble_coupon_collector.rkt but it's not a especially good model) This program was created by Hakan Kjellerstrand, hakank@gmail.com See also my Racket page: http://www.hakank.org/racket/ |# #lang gamble ; (require gamble/viz) (require racket) (require "gamble_utils.rkt") ;"Exact" probability ; from https://en.wikipedia.org/wiki/Coupon_collector%27s_problem (footnote [b]) (define (cc-theoretical n) (* n (for/sum ([i (range n)]) (/ (+ i 1)))) ) (define (coupon-collector2 n) (displayln (list "theoretical" (exact->inexact (cc-theoretical n)))) (; enumerate ; rejection-sampler importance-sampler ; mh-sampler (define (collect lst) (if (= (length (remove-duplicates lst)) n) lst (collect (cons (random-integer n) lst)))) (define len (length (collect '()))) (list len) ) ) (show-marginals (coupon-collector2 100) (list "len" ) #:num-samples 1000 #:truncate-output 5 )