/* Five qualities in Picat. From https://www.reddit.com/r/probabilitytheory/comments/1ezoejz/probability_calculation_help_needed/ """ Probability calculation help needed .. Imagine a world where people have at most (not all person has all qualities) 10 qualities (q1, q2, …, q10) with corresponding 10 probabilities (p1, p2, …, p10; sum(p1 + p2 + … + p10) = 1). What is the probability that a randomly selected person with 5 qualities would have q2? Is it something like .. 1 - ((1-p2)^5) """ Let's use a flat Dirichlet prior for the distribution of the 10 probabilities. According to this model, the probability of quality P2 is 0.5. This program was created by Hakan Kjellerstrand, hakank@gmail.com See also my Picat page: http://www.hakank.org/picat/ */ import ppl_distributions, ppl_utils. import util. main => go. /* We observe between 0..6 qualities: obs = 0 var : p2 Probabilities: false: 1.0000000000000000 obs = 1 var : p2 Probabilities: false: 0.8991910996672092 true: 0.1008089003327908 obs = 2 var : p2 Probabilities: false: 0.8013478023883851 true: 0.1986521976116149 obs = 3 var : p2 Probabilities: false: 0.7032219331598959 true: 0.2967780668401040 obs = 4 var : p2 Probabilities: false: 0.5984251968503937 true: 0.4015748031496063 obs = 5 var : p2 Probabilities: false: 0.5066666666666667 true: 0.4933333333333333 obs = 6 var : p2 Probabilities: true: 0.8571428571428571 false: 0.1428571428571428 */ go ?=> member(Obs,0..10), Obs = 5, println(obs=Obs), reset_store, run_model(1_000_000,$model(Obs),[show_probs_trunc]), nl, show_store_lengths,nl, fail, nl. go => true. model(Obs) => % There are 10 possible qualities N = 10, % A person has quality I which probability ProbQ[I] % Let's assume a flat Dirichlet prior Alpha Alpha = ones(N,1), ProbQ = dirichlet_dist(Alpha), % For a fixed distribution (generated by dirichet_dist) % ProbQ = [0.088149915189071,0.138381174420785,0.023157794093553,0.03793977377946,0.084279282873386,0.143668859927565,0.003733968335208,0.136927737903613,0.067691377496496,0.276070115980863], Qs = [bern(ProbQ[I]) : I in 1..N], % Qs[I] 1: has this quantile, 0: has not % Qs = [bern(1/2) : I in 1..N], % Qs[I] 1: has this quantile, 0: has not. This is much faster % How many qualities has this person? NumQualities = Qs.sum, % We observe 5 (Obs) qualities for this person observe(NumQualities == Obs), % What is the probability that this person has quality 2? P2 = check(Qs[2] == 1), % add("q2 (unobserved)",Qs[2]), if observed_ok then add("p2",P2), % add("q2",Qs[2]), end. /* Using a common and fixed distribution (generated by dirichlet dist): [0.088149915189071,0.138381174420785,0.023157794093553,0.03793977377946,0.084279282873386,0.143668859927565,0.003733968335208,0.136927737903613,0.067691377496496,0.276070115980863] Here we also experiment with the option min_accepted_samples which is an alternative to experimenting with different vales of the number of runs parameter to run_model. With parameter of 100_000 runs, the number of accepted samples is about 80: var : p2 Probabilities: true: 0.6578947368421053 false: 0.3421052631578947 Variable lengths: p2 = 76 (This took about 1.2s) Let's instead require 1000 accepted samples, and thus ignore the number of runs paramete: ... Num accepted samples: 995 Total samples: 1059771 (0.001%) Num accepted samples: 996 Total samples: 1060442 (0.001%) Num accepted samples: 997 Total samples: 1060834 (0.001%) Num accepted samples: 998 Total samples: 1061575 (0.001%) Num accepted samples: 999 Total samples: 1061680 (0.001%) Num accepted samples: 1000 Total samples: 1063108 (0.001%) var : p2 Probabilities: true: 0.6780000000000000 false: 0.3220000000000000 Variable lengths: p2 = 1000 (This took about 11s). A second run: .... Num accepted samples: 998 Total samples: 1060737 (0.001%) Num accepted samples: 999 Total samples: 1062164 (0.001%) Num accepted samples: 1000 Total samples: 1062652 (0.001%) var : p2 Probabilities: true: 0.6879999999999999 false: 0.3120000000000000 */ go2 ?=> Obs = 5, reset_store, run_model(100_000,$model2(Obs),[show_probs_trunc ,min_accepted_samples=1000,show_accepted_samples=true ]), nl, show_store_lengths,nl, fail, nl. go2 => true. model2(Obs) => % There are 10 possible qualities N = 10, % A person has quality I which probability ProbQ[I] % Let's assume a flat Dirichlet prior Alpha % Alpha = ones(N,1), % ProbQ = dirichlet_dist(Alpha), % For a fixed distribution (generated by dirichet_dist) ProbQ = [0.088149915189071,0.138381174420785,0.023157794093553,0.03793977377946,0.084279282873386,0.143668859927565,0.003733968335208,0.136927737903613,0.067691377496496,0.276070115980863], Qs = [bern(ProbQ[I]) : I in 1..N], % Qs[I] 1: has this quantile, 0: has not % Qs = [bern(1/2) : I in 1..N], % Qs[I] 1: has this quantile, 0: has not. This is much faster % How many qualities has this person? NumQualities = Qs.sum, % We observe 5 (Obs) qualities for this person observe(NumQualities == Obs), % What is the probability that this person has quality 2? P2 = check(Qs[2] == 1), if observed_ok then add("p2",P2), end. /* Here we fixed probability 1/10 for each quality, (and thus violating the requirement of sum(Probs) == 1). This is a much simpler (and faster) problem than the two versions above. Getting 1000 accepted samples is fast (0.2s), so let's beef it up to 10_000 accepted samples instead: ... Num accepted samples: 9997 Total samples: 40778 (0.245%) Num accepted samples: 9998 Total samples: 40781 (0.245%) Num accepted samples: 9999 Total samples: 40789 (0.245%) Num accepted samples: 10000 Total samples: 40793 (0.245%) var : p2 Probabilities: false: 0.5065000000000000 true: 0.4935000000000000 This took about 2.8s, 100_000 accepted samples takes some time (about 1min52s): ... Num accepted samples: 99997 Total samples: 406957 (0.246%) Num accepted samples: 99998 Total samples: 406959 (0.246%) Num accepted samples: 99999 Total samples: 406974 (0.246%) Num accepted samples: 100000 Total samples: 406983 (0.246%) var : p2 Probabilities: true: 0.5001900000000000 false: 0.4998100000000000 But now this is quite a simple problem, so really just 1000 samples might have worked (in 0.06s). We got 231 accepted samples. var : p2 Probabilities: false: 0.5064935064935064 true: 0.4935064935064935 Variable lengths: p2 = 231 */ go3 ?=> Obs = 5, reset_store, run_model(10_000,$model3(Obs),[show_probs_trunc ,min_accepted_samples=10_000,show_accepted_samples=true ]), nl, show_store_lengths,nl, fail, nl. go3 => true. model3(Obs) => % There are 10 possible qualities N = 10, % A person has quality I which probability ProbQ[I] % Let's assume a flat Dirichlet prior Alpha % Alpha = ones(N,1), % ProbQ = dirichlet_dist(Alpha), % For a fixed distribution (generated by dirichet_dist) % ProbQ = [0.088149915189071,0.138381174420785,0.023157794093553,0.03793977377946,0.084279282873386,0.143668859927565,0.003733968335208,0.136927737903613,0.067691377496496,0.276070115980863], % Qs = [bern(ProbQ[I]) : I in 1..N], % Qs[I] 1: has this quantile, 0: has not Qs = [bern(1/2) : I in 1..N], % Qs[I] 1: has this quantile, 0: has not. This is much faster % How many qualities has this person? NumQualities = Qs.sum, % We observe 5 (Obs) qualities for this person observe(NumQualities == Obs), % What is the probability that this person has quality 2? P2 = check(Qs[2] == 1), if observed_ok then add("p2",P2), end.