#| 

  Holmes clock problem in Racket.Gamble 

  From "Introduction to Probability" av Grinstead & Snell, page 92
  """
  Mr. Wimply Dimple, one of London's most prestigious watch makers, has come to 
  Sherlock Holmes in a panic, having discovered that someone has been producing and 
  selling crude counterfeits of his best selling watch. The 16 counterfeits so far 
  discovered bear stamped numbers, all of which fall between 1 and 56, and Dimple is 
  anxious to know the extent of the forger's work. All present agree that it seems 
  reasonable to assume that the counterfeits thus far produced bear consecutive 
  numbers from 1 to whatever the total number is. "Chin up, Dimple," opines Dr. Watson.
  "I shouldn't worry overly much if I were you; the Maximum Likelihood Principle, 
  which estimates the total number as precisely that which gives the highest 
  probability for the series of numbers found, suggests that we guess 56 itself as the
  total. Thus, your forgers are not a big operation, and we shall have them safely 
  behind bars before your business suffers significantly." "Stuff, nonsense, and 
  bother your fancy principles, Watson," counters Holmes. "Anyone can see that, of 
  course, there must be quite a few more than 56 watches - why the odds of our having 
  discovered precisely the highest numbered watch made are laughably negligible. A 
  much better guess would be twice 56." 
  """

  Let's simulate this by an urn model were we have two urns:
  - urn 1 which have 56 balls numbered 1..56
  - urn 2 which have 2*56 = 112 balls numbered 1..112

  The probability of selecting each of the urns are 1/2.

  Using Bayes formula:

      p(ball from urn 1) = 0.5
      p(ball from urn 2) = 0.5

      p(number 56 is drawn|urn=urn 1) = 1/56
      p(nummer 56 is drawn|urn=urn 2) = 1/112

  What are the probabilities:
    p(urn 1|number 56 was drawn)
    p(urn 2|number 56 was drawn)

  p(urn 1|number 56 was drawn) = p(urn 1 /\ number 56) / p(number 56)

                          = p(number 56|urn 1) * p(urn 1)
                            ------------------------------
                            p(number 56|urn 1) * p(urn 1) + p(number 56|urn 2)*p(urn 2)

                          = 1/56 * 0.5
                            ----------
                            1/56 * 0.5 + 1/112*0.5

                          = (1/56 * 0.5)/(1/56 * 0.5 + 1/112*0.5) 
                          = 0.6666667

  p(urn 2|number 56 was drawn) = p(urn 2 /\ number 56) / p(number 56)

                          = p(number 56|urn 2) * p(urn 2)
                            ------------------------------
                            p(number 56|urn 1) * p(urn 1) + p(number 56|urn 2)*p(urn 2)

                          = 1/112 * 0.5
                            ----------
                            1/56 * 0.5 + 1/112*0.5

                          = (1/112 * 0.5)/(1/56 * 0.5 + 1/112*0.5) 
                          = 0.3333333


  This model gives the following result which confirms the calculations (or the other way around :-)

  Distributions of variable urn
    urn1       =>     668  (0.668000)
    urn2       =>     332  (0.332000)
 
  I.e. if we draw ball number 56 it's more likely that it was from urn number 1 (with balls 1..56) than 
  from urn 2 (with balls 1..112), 66% vs 33%

  Here are some more experiments were we increments the number of balls in urn2:

  (num_balls1 56 num_balls2 112 ball_obs 56)
  var : urn
  urn1: 2/3 (0.6666666666666666)
  urn2: 1/3 (0.3333333333333333)

  (num_balls1 56 num_balls2 168 ball_obs 56)
  var : urn
  urn1: 3/4 (0.75)
  urn2: 1/4 (0.25)

  (num_balls1 56 num_balls2 224 ball_obs 56)
  var : urn
  urn1: 4/5 (0.8)
  urn2: 1/5 (0.2)

  (num_balls1 56 num_balls2 280 ball_obs 56)
  var : urn
  urn1: 5/6 (0.8333333333333334)
  urn2: 1/6 (0.16666666666666666)

  (num_balls1 56 num_balls2 336 ball_obs 56)
  var : urn
  urn1: 6/7 (0.8571428571428571)
  urn2: 1/7 (0.14285714285714285)

  (num_balls1 56 num_balls2 392 ball_obs 56)
  var : urn
  urn1: 7/8 (0.875)
  urn2: 1/8 (0.125)

  (num_balls1 56 num_balls2 448 ball_obs 56)
  var : urn
  urn1: 8/9 (0.8888888888888888)
  urn2: 1/9 (0.1111111111111111)

  (num_balls1 56 num_balls2 504 ball_obs 56)
  var : urn
  urn1: 9/10 (0.9)
  urn2: 1/10 (0.1)

  (num_balls1 56 num_balls2 560 ball_obs 56)
  var : urn
  urn1: 10/11 (0.9090909090909091)
  urn2: 1/11 (0.09090909090909091)


  This is a port of my WebPPL model holmes_clock_problem.wppl

  But cf with Mosteller's Locomitive problem (gamble_locomotive_problem.rkt) with 56 as the locomotive number observed:
  Max-int  Estimate 
  -----------------
   100      75.40107023621401
   200     112.69364171265156 
   500     202.54329682025934 
  1000     327.4619016746249
  2000     544.0112652172352

  (Mosteller's estimate would be 2*(56-1)+1 = 111)


  This program was created by Hakan Kjellerstrand, hakank@gmail.com
  See also my Racket page: http://www.hakank.org/racket/

|#

#lang gamble

; (require gamble/viz)
(require racket)
(require "gamble_utils.rkt")
; (require "gamble_distributions.rkt")


(define (model num_balls1 num_balls2 ball_obs)
  (show2 "num_balls1" num_balls1 "num_balls2" num_balls2 "ball_obs" ball_obs)
  (enumerate
   ; rejection-sampler
   ; importance-sampler
   ; mh-sampler

   (define urn1 0)
   (define urn2 1)
        
   (define urn (categorical-vw2 (vector 1/2 1/2) (vector "urn1" "urn2")))
   (define ball (if (eq? urn "urn1") (add1 (random-integer num_balls1)) (add1 (random-integer num_balls2))))
        
   (observe/fail (= ball ball_obs))
        
   (list urn
        )

   )
)

(define num_balls1 56)
(define num_balls2 112)
(define ball_obs 56)
(show-marginals (model num_balls1 num_balls2 ball_obs)
                  (list  "urn"
                         )
                  #:num-samples 1000
                  #:truncate-output 5
                  ; #:skip-marginals? #t
                  ; #:show-stats? #t
                  ; #:credible-interval 0.84
                  ; #:show-histogram? #t
                  ; #:show-percentiles? #t
                  )

(for ([num_balls2 8])
  (show-marginals (model num_balls1 (* 56 (+ num_balls2 3)) ball_obs)
                  (list  "urn"
                         )
                  #:num-samples 1000
                  #:truncate-output 5
                  ; #:skip-marginals? #t
                  ; #:show-stats? #t
                  ; #:credible-interval 0.84
                  ; #:show-histogram? #t
                  ; #:show-percentiles? #t
                  )
  )