/* Mailing in Picat. From "Resampling Stats Illustrations" (http://www.statistics101.net/PeterBruce_05-illus.pdf) Page 34 """ A confidence interval for net profit (program “mailing”) To test potential response to a new book offer, a mail order com- pany sends a mailing to 10,000 potential customers, randomly selected from lists of millions. If the offering is successful, it will be mailed to the much larger lists of recipients. The results of the test are categorized into “no” (negative re- sponse), “silent” (no response), “order/return,” “order/bad debt,” and “order/pay.” The value of a customer who orders the title is $45 (this includes the immediate profit from the book, plus the net present value of possible future orders). The customers who order and return, and those who order and never pay, are worth $8.50 and $9.50 respectively, reflecting the processing costs of the customer transactions and correspondence, and the value of material shipped. The silent customer costs the firm just the outgoing mailing costs — $.40. The customer who responds “no” costs costs the firm the outgoing mailing costs, plus the cost of the postpaid reply. Action number proportion profit rate profit No 500 0.05 -$0.95 -$ 475 Silent 9200 0.47 -$0.40 -$3680 Order/return 90 0.009 -$8.50 -$ 765 Order/bad debt 30 0.003 -$9.50 -$ 285 Order/pay 180 0.018 $45.00 $8100 Profit $2895 The profit from the test mailing is $2895. How reliable an estimate is this? How different might it be with a different sample from the same population? We answer this question by constituting a hypothetical population of outcomes, and drawing samples of 10,000 from it so we can see how those samples behave. Our best guess about what the popu- lation of outcomes looks like is the sample itself, so we could replicate the sample many times and constitute a hypothetical larger universe. Alternatively, we can sample with replacement from the sample itself (effectively the same thing as replicating the sample an infinite number of times and sampling without replacement). -> INT = 1920 3896 """ This is slow with a resample of 10_000, so just 100 runs are done. Using categorical_n (in go2/0) is much faster Cf my Gamble model gamble_mailing.rkt This program was created by Hakan Kjellerstrand, hakank@gmail.com See also my Picat page: http://www.hakank.org/picat/ */ import ppl_distributions, ppl_utils. import util. % import ordset. main => go. /* This model using resample/2 is too slow. Use go2/0 instead. var : profit Probabilities (truncated): 4479.199999999718784: 0.0100000000000000 4176.949999999396823: 0.0100000000000000 4145.049999999422653: 0.0100000000000000 4052.499999999423835: 0.0100000000000000 ......... 1786.949999999679449: 0.0100000000000000 1682.549999999858073: 0.0100000000000000 1666.649999999797728: 0.0100000000000000 1638.499999999690772: 0.0100000000000000 mean = 2863.69 [len = 100,min = 1638.5,mean = 2863.69,median = 2809.77,max = 4479.2,variance = 392613.0,stdev = 626.588] HPD intervals: HPD interval (0.84): 1901.54999999956613..3650.29999999953679 HPD interval (0.9): 1638.49999999969077..3650.29999999953679 HPD interval (0.94): 1786.94999999967945..4052.49999999942384 HPD interval (0.99): 1638.49999999969077..4176.94999999939682 HPD interval (0.99999): 1638.49999999969077..4479.19999999971878 var : diff Probabilities (truncated): 1584.199999999718784: 0.0100000000000000 1281.949999999396823: 0.0100000000000000 1250.049999999422653: 0.0100000000000000 1157.499999999423835: 0.0100000000000000 ......... -1108.050000000320551: 0.0100000000000000 -1212.450000000141927: 0.0100000000000000 -1228.350000000202272: 0.0100000000000000 -1256.500000000309228: 0.0100000000000000 mean = -31.309 [len = 100,min = -1256.5,mean = -31.309,median = -85.225,max = 1584.2,variance = 392613.0,stdev = 626.588] HPD intervals: HPD interval (0.84): -993.45000000043387..755.29999999953679 HPD interval (0.9): -1256.50000000030923..755.29999999953679 HPD interval (0.94): -1108.05000000032055..1157.49999999942384 HPD interval (0.99): -1256.50000000030923..1281.94999999939682 HPD interval (0.99999): -1256.50000000030923..1584.19999999971878 */ go ?=> Orders = rep(500,-0.95) ++ rep(9200,-0.4) ++ rep(90,-8.5) ++ rep(30,-9.5) ++ rep(180,45), reset_store, run_model(100,$model(Orders),[show_probs_trunc,mean, show_simple_stats, % show_percentiles, show_hpd_intervals,hpd_intervals=[0.84,0.9,0.94,0.99,0.99999] % show_histogram, % min_accepted_samples=1000,show_accepted_samples=true ]), nl, % show_store_lengths,nl, % fail, nl. go => true. model(Orders) => SampleOrders = resample(10_000,Orders), Profit = sum(SampleOrders), Diff = Profit - 2895, add("profit",Profit), add("diff",Diff). /* Using categorical for 100 runs is much faster (9.9s vs 2min04s). So we use 1000 runs (1min17s). At HPD 0.9 the intervals are 1881.54999999962115..3826.39999999940846, similar to the text. var : profit Probabilities (truncated): 4978.250000000251021: 0.0010000000000000 4795.649999999934153: 0.0010000000000000 4562.599999999744796: 0.0010000000000000 4471.699999999604188: 0.0010000000000000 ......... 1334.249999999917236: 0.0010000000000000 1197.450000000007776: 0.0010000000000000 1148.000000000034561: 0.0010000000000000 1112.749999999859483: 0.0010000000000000 mean = 2876.57 [len = 1000,min = 1112.75,mean = 2876.57,median = 2887.05,max = 4978.25,variance = 367628.0,stdev = 606.323] HPD intervals: HPD interval (0.84): 2056.79999999968186..3756.39999999944257 HPD interval (0.9): 1881.54999999962115..3826.39999999940846 HPD interval (0.94): 1723.94999999993934..3918.09999999937963 HPD interval (0.99): 1334.24999999991724..4280.89999999988413 HPD interval (0.99999): 1112.74999999985948..4978.25000000025102 var : diff Probabilities (truncated): 2083.250000000251021: 0.0010000000000000 1900.649999999934153: 0.0010000000000000 1667.599999999744796: 0.0010000000000000 1576.699999999604188: 0.0010000000000000 ......... -1560.750000000082764: 0.0010000000000000 -1697.549999999992224: 0.0010000000000000 -1746.999999999965439: 0.0010000000000000 -1782.250000000140517: 0.0010000000000000 mean = -18.433 [len = 1000,min = -1782.25,mean = -18.433,median = -7.95,max = 2083.25,variance = 367628.0,stdev = 606.323] HPD intervals: HPD interval (0.84): -838.20000000031814..861.39999999944257 HPD interval (0.9): -1013.45000000037885..931.39999999940846 HPD interval (0.94): -1171.05000000006066..1023.09999999937963 HPD interval (0.99): -1560.75000000008276..1385.89999999988413 HPD interval (0.99999): -1782.25000000014052..2083.25000000025102 */ go2 ?=> reset_store, run_model(1000,$model2,[show_probs_trunc,mean, show_simple_stats, % show_percentiles, show_hpd_intervals,hpd_intervals=[0.84,0.9,0.94,0.99,0.99999] % show_histogram, % min_accepted_samples=1000,show_accepted_samples=true ]), nl, % show_store_lengths,nl, % fail, nl. go2 => true. model2() => SampleOrders = categorical_n([500,9200,90,30,180],[-0.95,-0.4,-8.5,-9.5,45],10_000), Profit = sum(SampleOrders), Diff = Profit - 2895, add("profit",Profit), add("diff",Diff).