/* Chinese restaurant process in Picat. https://en.wikipedia.org/wiki/Chinese_restaurant_process """ ... An equivalent, but subtly different way to define the Chinese restaurant process, is to let new customers choose companions rather than tables. Customer n+1 chooses to sit at the same table as any one of the n seated customers with probability 1/(n+theta) or chooses to sit at a new, unoccupied table with probability theta/(n + theta) Notice that in this formulation, the customer chooses a table without having to count table occupancies- we don't need |b|. ... It can be understood as the sum of n independent Bernoulli random variables, each with a different parameter: K = sum(bi,i=1..n) bi ~ Bernoulli theta / (i - 1 + theta) PMF: gamma(theta) ------------ * |s(n,k)*theta^k k = 1..N gamma(n + theta) where s denotes Stirling number of the first kind """ Also, see ppl gamble_chinese_restaurant_process.pi for an alternative model but that model has different probability "dynamics". Cf my Gamble model gamble_chinese_restaurant_process2.rkt This program was created by Hakan Kjellerstrand, hakank@gmail.com See also my Picat page: http://www.hakank.org/picat/ */ import ppl_distributions, ppl_utils. import util. % import ordset. main => go. /* Probability distrubution crp_dist. Samples: [3,2,3,3,3,1,2,3,1,2,2,2,2,2,4,4,2,3,1,2,1,1,1,3,1,2,1,1,6,1] mean = 2.13326 PDF: 1 0.28377319275152 2 0.401392676810632 3 0.229263684372043 4 0.070739977447924 5 0.013163307280954 6 0.001546239564815 7 0.000115467607728 8 0.000005315175594 9 0.000000137461438 CDF: 1 0.28377319275152 2 0.685165869562153 3 0.914429553934195 4 0.985169531382119 5 0.998332838663073 6 0.999879078227889 7 0.999994545835616 8 0.99999986101121 9 0.999999998472648 Quantiles: 0.000001 = 1 0.00001 = 1 0.001 = 1 0.01 = 1 0.025 = 1 0.05 = 1 0.1 = 1 0.25 = 1 0.5 = 2 0.75 = 3 0.84 = 3 0.9 = 3 0.975 = 4 0.99 = 5 0.999 = 6 0.99999 = 7 0.999999 = 8 The probability (CDF) of 9 and 10 filled tables are quite unusual: Picat> X=crp_dist_cdf(1/2,10,9) X = 0.999999998472648 Picat> X=crp_dist_cdf(1/2,10,10) X = 0.999999999999997 */ go ?=> Theta = 1/2, N = 10, println("Samples:"), println(crp_dist_n(Theta,N,30)), println(mean=crp_dist_mean(Theta,N)), nl, println("PDF:"), pdf_all($crp_dist(Theta,N),0.01,0.9999999).printf_list, nl, println("CDF:"), cdf_all($crp_dist(Theta,N),0.01,0.9999999).printf_list, nl, println("Quantiles:"), foreach(Q in quantile_qs()) println(Q=crp_dist_quantile(Theta,N,Q)), end, nl. go => true. /* The model below supports two variants for showing the tables: - sort-tables: true : This means that the tables are sorted to break symmetries. It make sense if one interpreted this as the customer is seated by an already occupied table (1 / (n + theta)). Using this option, the theoretical value of K is seem directly as probability on K filled tables (b). - sort-tables: false: Here the tables are not sorted. For this option, it's harder to related the tables (b) and the probabilities. Note that this only change the presentaion of the b variable. The probabiities k is not changed [max_quests = 10,theta = 0.5,sort_tables = true] var : b Probabilities: [1,1,0,0,0,0,0,0,0,0]: 0.3872000000000000 [1,0,0,0,0,0,0,0,0,0]: 0.2836000000000000 [1,1,1,0,0,0,0,0,0,0]: 0.2431000000000000 [1,1,1,1,0,0,0,0,0,0]: 0.0707000000000000 [1,1,1,1,1,0,0,0,0,0]: 0.0131000000000000 [1,1,1,1,1,1,0,0,0,0]: 0.0021000000000000 [1,1,1,1,1,1,1,0,0,0]: 0.0002000000000000 mean = [] var : k Probabilities: 2: 0.3872000000000000 1: 0.2836000000000000 3: 0.2431000000000000 4: 0.0707000000000000 5: 0.0131000000000000 6: 0.0021000000000000 7: 0.0002000000000000 mean = 2.1496 theoretical_mean = 2.13326 PDF: 2 0.401392676810632 1 0.28377319275152 3 0.229263684372043 4 0.070739977447924 5 0.013163307280954 6 0.001546239564815 7 0.000115467607728 8 0.000005315175594 9 0.000000137461438 [max_quests = 10,theta = 0.5,sort_tables = false] var : b Probabilities (truncated): [1,0,0,0,0,0,0,0,0,0]: 0.2866000000000000 [1,1,0,0,0,0,0,0,0,0]: 0.1448000000000000 [1,0,1,0,0,0,0,0,0,0]: 0.0681000000000000 [1,0,0,1,0,0,0,0,0,0]: 0.0482000000000000 ......... [1,0,0,0,1,0,0,1,0,1]: 0.0001000000000000 [1,0,0,0,0,1,0,1,1,0]: 0.0001000000000000 [1,0,0,0,0,1,0,1,0,1]: 0.0001000000000000 [1,0,0,0,0,0,0,1,1,1]: 0.0001000000000000 mean = [] var : k Probabilities: 2: 0.4019000000000000 1: 0.2866000000000000 3: 0.2283000000000000 4: 0.0683000000000000 5: 0.0136000000000000 6: 0.0012000000000000 7: 0.0001000000000000 mean = 2.1244 theoretical_mean = 2.13326 PDF: 2 0.401392676810632 1 0.28377319275152 3 0.229263684372043 4 0.070739977447924 5 0.013163307280954 6 0.001546239564815 7 0.000115467607728 8 0.000005315175594 9 0.000000137461438 [max_quests = 100,theta = 0.5,sort_tables = true] var : b Probabilities (truncated): [1,1,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0]: 0.2800000000000000 [1,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0]: 0.2317000000000000 [1,1,1,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0]: 0.2116000000000000 [1,1,1,1,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0]: 0.1195000000000000 ......... [1,1,1,1,1,1,1,1,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0]: 0.0013000000000000 [1,1,1,1,1,1,1,1,1,1,1,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0]: 0.0001000000000000 [1,1,1,1,1,1,1,1,1,1,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0]: 0.0001000000000000 [1,1,1,1,1,1,1,1,1,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0]: 0.0001000000000000 mean = [] var : k Probabilities (truncated): 3: 0.2800000000000000 2: 0.2317000000000000 4: 0.2116000000000000 5: 0.1195000000000000 ......... 9: 0.0013000000000000 12: 0.0001000000000000 11: 0.0001000000000000 10: 0.0001000000000000 mean = 3.2666 theoretical_mean = 3.28434 PDF: 3 0.279181826450048 2 0.229703516788347 4 0.214054165260315 5 0.117130720973547 1 0.088733539714164 6 0.049002166265233 7 0.016381147140072 8 0.004512998691993 9 0.001048321130806 10 0.000208964181523 11 0.0000362466743 12 0.000005533804576 13 0.000000750650872 14 0.000000091193925 15 0.000000009989839 */ go2 ?=> Theta = 1/2, MaxGuests = 10, % MaxGuests = 100, % SortTables = true, SortTables = false, println([max_quests=MaxGuests,theta=Theta,sort_tables=SortTables]), reset_store, run_model(10_000,$model2(MaxGuests,Theta,SortTables),[show_probs_trunc,mean]), println(theoretical_mean=crp_dist_mean(Theta,MaxGuests)), println("PDF:"), pdf_all($crp_dist(Theta,MaxGuests),0.01,0.99999999).sort_down(2).printf_list, nl, % show_store_lengths,nl, % fail, nl. go2 => true. model2(MaxGuests,Theta,SortTables) => T = [bern( Theta/(I - 1 + Theta)) : I in 1..MaxGuests], B = cond(SortTables == true,T.sort_down,T), % Number of filled tables K = B.sum, add("b",B), add("k",K).