/* 

  Mailing in Picat.

  From "Resampling Stats Illustrations" (http://www.statistics101.net/PeterBruce_05-illus.pdf)
  Page 34
  """
  A confidence interval for net profit
  (program “mailing”)
  To test potential response to a new book offer, a mail order com-
  pany sends a mailing to 10,000 potential customers, randomly
  selected from lists of millions. If the offering is successful, it will
  be mailed to the much larger lists of recipients.
  The results of the test are categorized into “no” (negative re-
  sponse), “silent” (no response), “order/return,” “order/bad debt,”
  and “order/pay.” The value of a customer who orders the title is
  $45 (this includes the immediate profit from the book, plus the net
  present value of possible future orders). The customers who order
  and return, and those who order and never pay, are worth $8.50
  and $9.50 respectively, reflecting the processing costs of the
  customer transactions and correspondence, and the value of
  material shipped. The silent customer costs the firm just the
  outgoing mailing costs — $.40. The customer who responds “no”
  costs costs the firm the outgoing mailing costs, plus the cost of the
  postpaid reply.
  Action        number   proportion  profit rate   profit
    No              500      0.05     -$0.95        -$ 475
    Silent         9200      0.47     -$0.40        -$3680
    Order/return     90      0.009    -$8.50        -$ 765
    Order/bad debt   30      0.003    -$9.50        -$ 285
    Order/pay       180      0.018    $45.00         $8100
    Profit                                           $2895

  The profit from the test mailing is $2895. How reliable an estimate
  is this? How different might it be with a different sample from the
  same population?
  We answer this question by constituting a hypothetical population
  of outcomes, and drawing samples of 10,000 from it so we can see
  how those samples behave. Our best guess about what the popu-
  lation of outcomes looks like is the sample itself, so we could
  replicate the sample many times and constitute a hypothetical
  larger universe. Alternatively, we can sample with replacement
  from the sample itself (effectively the same thing as replicating the
  sample an infinite number of times and sampling without replacement).  
  -> INT = 1920 3896
  """

  This is slow with a resample of 10_000, so just 100 runs are done.
  Using categorical_n (in go2/0) is much faster


  Cf my Gamble model gamble_mailing.rkt

  This program was created by Hakan Kjellerstrand, hakank@gmail.com
  See also my Picat page: http://www.hakank.org/picat/

*/

import ppl_distributions, ppl_utils.
import util.
% import ordset.

main => go.

/*
  This model using resample/2 is too slow. Use go2/0 instead.

  var : profit
  Probabilities (truncated):
  4479.199999999718784: 0.0100000000000000
  4176.949999999396823: 0.0100000000000000
  4145.049999999422653: 0.0100000000000000
  4052.499999999423835: 0.0100000000000000
  .........
  1786.949999999679449: 0.0100000000000000
  1682.549999999858073: 0.0100000000000000
  1666.649999999797728: 0.0100000000000000
  1638.499999999690772: 0.0100000000000000
  mean = 2863.69
  [len = 100,min = 1638.5,mean = 2863.69,median = 2809.77,max = 4479.2,variance = 392613.0,stdev = 626.588]
  HPD intervals:
  HPD interval (0.84): 1901.54999999956613..3650.29999999953679
  HPD interval (0.9): 1638.49999999969077..3650.29999999953679
  HPD interval (0.94): 1786.94999999967945..4052.49999999942384
  HPD interval (0.99): 1638.49999999969077..4176.94999999939682
  HPD interval (0.99999): 1638.49999999969077..4479.19999999971878

  var : diff
  Probabilities (truncated):
  1584.199999999718784: 0.0100000000000000
  1281.949999999396823: 0.0100000000000000
  1250.049999999422653: 0.0100000000000000
  1157.499999999423835: 0.0100000000000000
  .........
  -1108.050000000320551: 0.0100000000000000
  -1212.450000000141927: 0.0100000000000000
  -1228.350000000202272: 0.0100000000000000
  -1256.500000000309228: 0.0100000000000000
  mean = -31.309
  [len = 100,min = -1256.5,mean = -31.309,median = -85.225,max = 1584.2,variance = 392613.0,stdev = 626.588]
  HPD intervals:
  HPD interval (0.84): -993.45000000043387..755.29999999953679
  HPD interval (0.9): -1256.50000000030923..755.29999999953679
  HPD interval (0.94): -1108.05000000032055..1157.49999999942384
  HPD interval (0.99): -1256.50000000030923..1281.94999999939682
  HPD interval (0.99999): -1256.50000000030923..1584.19999999971878


*/
go ?=>
  Orders =  rep(500,-0.95) ++ rep(9200,-0.4) ++ rep(90,-8.5) ++ rep(30,-9.5) ++ rep(180,45),
  reset_store,
  run_model(100,$model(Orders),[show_probs_trunc,mean,
                           show_simple_stats,
                           % show_percentiles,
                           show_hpd_intervals,hpd_intervals=[0.84,0.9,0.94,0.99,0.99999]
                           % show_histogram,
                           % min_accepted_samples=1000,show_accepted_samples=true
                          ]),
  nl,
  % show_store_lengths,nl,
  % fail,
  nl.
go => true.
  
  
model(Orders) =>
  SampleOrders = resample(10_000,Orders),
  Profit = sum(SampleOrders),
  Diff = Profit - 2895,

  add("profit",Profit),
  add("diff",Diff).


/*
  Using categorical for 100 runs is much faster (9.9s vs 2min04s).
  So we use 1000 runs (1min17s).

  At HPD 0.9 the intervals are 1881.54999999962115..3826.39999999940846,
  similar to the text.

  var : profit
  Probabilities (truncated):
  4978.250000000251021: 0.0010000000000000
  4795.649999999934153: 0.0010000000000000
  4562.599999999744796: 0.0010000000000000
  4471.699999999604188: 0.0010000000000000
  .........
  1334.249999999917236: 0.0010000000000000
  1197.450000000007776: 0.0010000000000000
  1148.000000000034561: 0.0010000000000000
  1112.749999999859483: 0.0010000000000000
  mean = 2876.57
  [len = 1000,min = 1112.75,mean = 2876.57,median = 2887.05,max = 4978.25,variance = 367628.0,stdev = 606.323]
  HPD intervals:
  HPD interval (0.84): 2056.79999999968186..3756.39999999944257
  HPD interval (0.9): 1881.54999999962115..3826.39999999940846
  HPD interval (0.94): 1723.94999999993934..3918.09999999937963
  HPD interval (0.99): 1334.24999999991724..4280.89999999988413
  HPD interval (0.99999): 1112.74999999985948..4978.25000000025102

  var : diff
  Probabilities (truncated):
  2083.250000000251021: 0.0010000000000000
  1900.649999999934153: 0.0010000000000000
  1667.599999999744796: 0.0010000000000000
  1576.699999999604188: 0.0010000000000000
  .........
  -1560.750000000082764: 0.0010000000000000
  -1697.549999999992224: 0.0010000000000000
  -1746.999999999965439: 0.0010000000000000
  -1782.250000000140517: 0.0010000000000000
  mean = -18.433
  [len = 1000,min = -1782.25,mean = -18.433,median = -7.95,max = 2083.25,variance = 367628.0,stdev = 606.323]
  HPD intervals:
  HPD interval (0.84): -838.20000000031814..861.39999999944257
  HPD interval (0.9): -1013.45000000037885..931.39999999940846
  HPD interval (0.94): -1171.05000000006066..1023.09999999937963
  HPD interval (0.99): -1560.75000000008276..1385.89999999988413
  HPD interval (0.99999): -1782.25000000014052..2083.25000000025102

*/
go2 ?=>
  reset_store,
  run_model(1000,$model2,[show_probs_trunc,mean,
                           show_simple_stats,
                           % show_percentiles,
                           show_hpd_intervals,hpd_intervals=[0.84,0.9,0.94,0.99,0.99999]
                           % show_histogram,
                           % min_accepted_samples=1000,show_accepted_samples=true
                          ]),
  nl,
  % show_store_lengths,nl,
  % fail,
  nl.
go2 => true.
  
  
model2() =>
  SampleOrders =  categorical_n([500,9200,90,30,180],[-0.95,-0.4,-8.5,-9.5,45],10_000),
  Profit = sum(SampleOrders),
  Diff = Profit - 2895,

  add("profit",Profit),
  add("diff",Diff).