#| Order statistics without replacement in Racket/Gamble From Siegrist "Probability Mathematical Statisics and Stochastic Processes", Chapter "12.4: Order Statistics" This is for order statistics without replacement. (Note: This is not the same as Mathematica's OrderDistribution which is for order statistics with replacement. See gamble_order_statistics_with_replacement_dist.rkt) This program was created by Hakan Kjellerstrand, hakank@gmail.com See also my Racket page: http://www.hakank.org/racket/ |# #lang gamble ; (require gamble/viz) (require racket) (require "gamble_utils.rkt") (require "gamble_distributions.rkt") (order_statistics_without_replacement_pdf 25 5 3 4) (* 1.0 (order_statistics_without_replacement_pdf 25 5 3 4)) #| """ Suppose that in a lottery, tickets numbered from 1 to 25 are placed in a bowl. Five tickets are chosen at random and without replacement. 1. Find the probability density function of X(3) 2. Find E [ X(3) ] . 3. Find var [ X(3) ] . Answer 1. P[X(3)=x] = binomial(x-1,2) * binomial(25-x,2) / binomial(25,5) for x in 3..23 2. E[X(3)] = 13 3. var[X(3)] = 130/7 """ |# (let ([m 25] [n 5] [i 3] ; [m 6] ; [n 4] ; [i 1] ) (displayln (format "order_statistics_without_replacement m:~a n:~a i:~a" m n i)) (displayln "Samples") (displayln (repeat (lambda () (order_statistics_without_replacement_dist m n i)) 30)) (newline) (displayln "PDF") (show "sum" (sum (for/list ([x (range i (add1 (+ (- m n) i)))]) (let ([pdf (order_statistics_without_replacement_pdf m n i x)]) (show x (* 1.0 pdf)) pdf) ))) (newline) (displayln "CDF") (for/list ([x (range i (add1 (+ (- m n) i)))]) (let ([cdf (order_statistics_without_replacement_cdf m n i x)]) (show x (* 1.0 cdf)) cdf) ) (newline) (displayln "Quantile") (define ps '(0.0001 0.001 0.01 0.05 0.25 0.5 0.75 0.9 0.95 0.99 0.999 0.9999 0.99999)) (for ([q ps]) (show q (order_statistics_without_replacement_quantile m n i q)) ) (newline) (show2 "Mean: " (order_statistics_without_replacement_mean m n i) (* 1.0 (order_statistics_without_replacement_mean m n i))) (show2 "Variance: " (order_statistics_without_replacement_variance m n i) (* 1.0 (order_statistics_without_replacement_variance m n i))) (newline) ) (displayln "Model") #| The mean should be 13. * Without replacement (enumerate, 38s) variable : v 12: 9/115 (0.0782608695652174) 13: 9/115 (0.0782608695652174) 11: 7/92 (0.07608695652173914) 14: 7/92 (0.07608695652173914) 10: 945/13156 (0.071830343569474) 15: 945/13156 (0.071830343569474) 9: 216/3289 (0.0656734569778048) 16: 216/3289 (0.0656734569778048) 8: 952/16445 (0.05788993615080572) 17: 952/16445 (0.05788993615080572) 7: 3213/65780 (0.048844633627242326) 18: 3213/65780 (0.048844633627242326) 6: 513/13156 (0.0389936150805716) 19: 513/13156 (0.0389936150805716) 5: 95/3289 (0.028884159318941928) 20: 95/3289 (0.028884159318941928) 4: 63/3289 (0.019154758285193068) 21: 63/3289 (0.019154758285193068) 3: 63/5980 (0.010535117056856187) 22: 63/5980 (0.010535117056856187) 2: 1/260 (0.0038461538461538464) 23: 1/260 (0.0038461538461538464) mean: 25/2 (12.5) HPD interval (0.5): 8..14 HPD interval (0.84): 4..17 HPD interval (0.9): 5..19 HPD interval (0.95): 4..20 HPD interval (0.99): 3..22 * With replacement (enumerate 36s) The mean is still 13, but the probabilities are slighly different. (This is the probability that OrderDistribution[{DiscreteUniformDistribution[{1, 25}], 5}, 3] gives) See gamble_order_statistics_with_replacement_discrete_dist.rkt variable : v 13: 731641/9765625 (0.0749200384) 12: 722311/9765625 (0.0739646464) 14: 722311/9765625 (0.0739646464) 11: 694681/9765625 (0.0711353344) 15: 694681/9765625 (0.0711353344) 10: 649831/9765625 (0.0665426944) 16: 649831/9765625 (0.0665426944) 9: 589561/9765625 (0.0603710464) 17: 589561/9765625 (0.0603710464) 8: 516391/9765625 (0.0528784384) 18: 516391/9765625 (0.0528784384) 7: 433561/9765625 (0.0443966464) 19: 433561/9765625 (0.0443966464) 6: 345031/9765625 (0.0353311744) 20: 345031/9765625 (0.0353311744) 5: 255481/9765625 (0.0261612544) 21: 255481/9765625 (0.0261612544) 4: 170311/9765625 (0.0174398464) 22: 170311/9765625 (0.0174398464) 3: 95641/9765625 (0.0097936384) 23: 95641/9765625 (0.0097936384) 2: 38311/9765625 (0.0039230464) 24: 38311/9765625 (0.0039230464) 1: 5881/9765625 (0.0006022144) 25: 5881/9765625 (0.0006022144) mean: 13 (13.0) HPD interval (0.5): 7..14 HPD interval (0.84): 6..19 HPD interval (0.9): 5..20 HPD interval (0.95): 4..21 HPD interval (0.99): 2..23 A simpler problem: 4 dice are rolled, what is the min value (i=1) or max value (i=4)) * without replacement variable : v 1: 2/3 (0.6666666666666666) 2: 4/15 (0.26666666666666666) 3: 1/15 (0.06666666666666667) mean: 7/5 (1.4) HPD interval (0.5): 1..1 HPD interval (0.84): 1..2 HPD interval (0.9): 1..2 HPD interval (0.95): 1..3 HPD interval (0.99): 1..3 * with replacement variable : v 1: 671/1296 (0.5177469135802469) 2: 41/144 (0.2847222222222222) 3: 175/1296 (0.13503086419753085) 4: 65/1296 (0.05015432098765432) 5: 5/432 (0.011574074074074073) 6: 1/1296 (0.0007716049382716049) mean: 2275/1296 (1.7554012345679013) HPD interval (0.5): 1..1 HPD interval (0.84): 1..3 HPD interval (0.9): 1..3 HPD interval (0.95): 1..4 HPD interval (0.99): 1..5 |# (define (model) (enumerate ; rejection-sampler ; importance-sampler ; mh-sampler ;; (define m 25) ;; (define n 5) ;; ; Note: In the context of order stastistics i is 1-based ;; ; adjusted below ;; (define k 3) ; 4 dice are rolled, what is the min (i=1) or max (i=4)) (define m 6) (define n 4) (define k 1) ; The distributions are without replacement (define x (draw-without-replacement n (range 1 (add1 m)))) ; (define x (for/list ([i n]) (add1 (random-integer m)))) ; with replacement (define x_sorted (sort x <)) ; (show "x" x_sorted) ; Order statistics is 1-based based (define v (ith-smallest x_sorted (sub1 k) #t)) (list v ; x ) ) ) (show-marginals (model) (list "v" "x" ) #:num-samples 10000 ; #:truncate-output 5 ; #:skip-marginals? #t ; #:show-stats? #t ; #:credible-interval 0.84 #:hpd-interval (list 0.5 0.84 0.9 0.95 0.99) ; #:show-histogram? #t ; #:show-percentiles? #t ; #:burn 0 ; #:thin 0 )