/* Kth record for permutations in Picat. What is the (index of) the k'th record for a permutation of m values? Note: The things that's studied are the _indices_ of the records, not the _values_. Also, it is assumed that it is a permutation, i.e. different values. This means that for discrete distributions with some probability of duplicates, this will not work. E.g. it might work quite well for 10 values of random_integer(100), but not for 10 values of random_integer(10). Cf my Gamble model gamble_record_kth_record.rkt This program was created by Hakan Kjellerstrand, hakank@gmail.com See also my Picat page: http://www.hakank.org/picat/ */ import ppl_distributions, ppl_utils. import util. % import ordset. main => go. /* Probability that N is the K'th record. For K=2..4. Note: 1 (one) cannot be the K'th (K >= 2) record index, since 1 is always the first record (index). K=2: 1 0.0 (0 / 1) 2 0.5 (1 / 2) 3 0.166666666666667 (1 / 6) 4 0.083333333333333 (1 / 12) 5 0.05 (1 / 20) 6 0.033333333333333 (1 / 30) 7 0.023809523809524 (1 / 42) 8 0.017857142857143 (1 / 56) 9 0.013888888888889 (1 / 72) 10 0.011111111111111 (1 / 90) mean = 2.82897 K=3: 1 0.0 (0 / 1) 2 0.0 (0 / 1) 3 0.166666666666667 (1 / 6) 4 0.125 (1 / 8) 5 0.091666666666667 (11 / 120) 6 0.069444444444444 (5 / 72) 7 0.054365079365079 (137 / 2520) 8 0.04375 (7 / 160) 9 0.036011904761905 (121 / 3360) 10 0.030198412698413 (761 / 25200) mean = 3.23165 K=4: 1 0.0 (0 / 1) 2 0.0 (0 / 1) 3 0.0 (0 / 1) 4 0.041666666666667 (1 / 24) 5 0.05 (1 / 20) 6 0.048611111111111 (7 / 144) 7 0.044642857142857 (5 / 112) 8 0.040277777777778 (29 / 720) 9 0.036188271604938 (469 / 12960) 10 0.032551807760141 (5691 / 174829) mean = 1.99427 Quantile: 0.000001 = 2 0.00001 = 2 0.001 = 2 0.01 = 2 0.025 = 2 0.05 = 2 0.1 = 2 0.25 = 2 0.5 = 2 0.75 = 4 0.84 = 7 0.9 = 10 0.975 = 41 0.99 = 101 */ go ?=> N = 10, println("K=2:"), [V=k_record_dist_pdf(2,V).and_rat : V in 1..N].printf_list, println(mean=k_record_dist_mean(2,N)), nl, println("K=3:"), [V=k_record_dist_pdf(3,V).and_rat : V in 1..N].printf_list, println(mean=k_record_dist_mean(3,N)), nl, println("K=4:"), [V=k_record_dist_pdf(4,V).and_rat : V in 1..N].printf_list, println(mean=k_record_dist_mean(4,N)), nl, println("Quantile:"), Qs = quantile_qs(), foreach(Q in Qs, Q <= 0.99) println(Q=k_record_dist_quantile(2,Q)), end, nl. go => true. /* Here are some experiments of this. For permutation-near distributions, including continuous distributuins, this is fairly accurate. But for distributions with some (higher) probability of duplicates, this does not work (see random_integer(10) below. The value of 0 represents the cases when the first element in the generated list is the max value. * draw_without_replacement(10,1..10) (permutation) [n = 10,k = 2] var : record ix k Probabilities: 2: 0.4996000000000000 3: 0.1679000000000000 0: 0.1007000000000000 4: 0.0823000000000000 5: 0.0472000000000000 6: 0.0341000000000000 7: 0.0230000000000000 8: 0.0181000000000000 9: 0.0152000000000000 10: 0.0119000000000000 mean = 2.8343 HPD intervals: HPD interval (0.94): 0.00000000000000..7.00000000000000 HPD interval (0.99): 0.00000000000000..10.00000000000000 HPD interval (0.99999): 0.00000000000000..10.00000000000000 * random_integer1(100) [n = 10,k = 2] var : record ix k Probabilities: 2: 0.4979000000000000 3: 0.1691000000000000 0: 0.1048000000000000 4: 0.0800000000000000 5: 0.0482000000000000 6: 0.0353000000000000 7: 0.0239000000000000 8: 0.0162000000000000 9: 0.0140000000000000 10: 0.0106000000000000 mean = 2.8048 HPD intervals: HPD interval (0.94): 0.00000000000000..7.00000000000000 HPD interval (0.99): 0.00000000000000..10.00000000000000 HPD interval (0.99999): 0.00000000000000..10.00000000000000 * normal(100,15) [n = 10,k = 2] var : record ix k Probabilities: 2: 0.4969000000000000 3: 0.1583000000000000 0: 0.1057000000000000 4: 0.0875000000000000 5: 0.0500000000000000 6: 0.0361000000000000 7: 0.0254000000000000 8: 0.0150000000000000 9: 0.0137000000000000 10: 0.0114000000000000 mean = 2.8204 HPD intervals: HPD interval (0.94): 0.00000000000000..7.00000000000000 HPD interval (0.99): 0.00000000000000..10.00000000000000 HPD interval (0.99999): 0.00000000000000..10.00000000000000 * uniform(0,1) [n = 10,k = 2] var : record ix k Probabilities: 2: 0.4923000000000000 3: 0.1684000000000000 0: 0.1062000000000000 4: 0.0825000000000000 5: 0.0499000000000000 6: 0.0324000000000000 7: 0.0238000000000000 8: 0.0177000000000000 9: 0.0153000000000000 10: 0.0115000000000000 mean = 2.8246 HPD intervals: HPD interval (0.94): 0.00000000000000..7.00000000000000 HPD interval (0.99): 0.00000000000000..10.00000000000000 HPD interval (0.99999): 0.00000000000000..10.00000000000000 * extreme_value(10,10) [n = 10,k = 2] var : record ix k Probabilities: 2: 0.5027000000000000 3: 0.1596000000000000 0: 0.1026000000000000 4: 0.0849000000000000 5: 0.0488000000000000 6: 0.0339000000000000 7: 0.0255000000000000 8: 0.0162000000000000 9: 0.0142000000000000 10: 0.0116000000000000 mean = 2.8231 HPD intervals: HPD interval (0.94): 0.00000000000000..7.00000000000000 HPD interval (0.99): 0.00000000000000..10.00000000000000 HPD interval (0.99999): 0.00000000000000..10.00000000000000 * random_integer1(10) Note: This is NOT (guaranteed to b) a permutation. We can note that the mean is lower than for the permutation-like distributions (around 2.7 vs around 2.8) [n = 10,k = 2] var : record ix k Probabilities: 2: 0.4553000000000000 3: 0.1588000000000000 0: 0.1567000000000000 4: 0.0845000000000000 5: 0.0483000000000000 6: 0.0313000000000000 7: 0.0236000000000000 8: 0.0191000000000000 9: 0.0133000000000000 10: 0.0091000000000000 mean = 2.683 HPD intervals: HPD interval (0.94): 0.00000000000000..7.00000000000000 HPD interval (0.99): 0.00000000000000..9.00000000000000 HPD interval (0.99999): 0.00000000000000..10.00000000000000 * For K = 3 draw_without_replacement(10,1..10) [n = 10,k = 3] var : record ix k Probabilities: 0: 0.3830000000000000 3: 0.1598000000000000 4: 0.1263000000000000 5: 0.0911000000000000 6: 0.0739000000000000 7: 0.0535000000000000 8: 0.0442000000000000 9: 0.0383000000000000 10: 0.0299000000000000 mean = 3.2553 HPD intervals: HPD interval (0.94): 0.00000000000000..9.00000000000000 HPD interval (0.99): 0.00000000000000..10.00000000000000 HPD interval (0.99999): 0.00000000000000..10.00000000000000 * For K = 4 draw_without_replacement(10,1..10) [n = 10,k = 2] var : record ix k Probabilities: 0: 0.7140000000000000 6: 0.0513000000000000 5: 0.0485000000000000 7: 0.0440000000000000 4: 0.0407000000000000 8: 0.0379000000000000 9: 0.0326000000000000 10: 0.0310000000000000 mean = 1.9277 HPD intervals: HPD interval (0.94): 0.00000000000000..9.00000000000000 HPD interval (0.99): 0.00000000000000..10.00000000000000 HPD interval (0.99999): 0.00000000000000..10.00000000000000 */ go2 ?=> N = 10, K = 4, println([n=10,k=K]), reset_store, run_model(10_000,$model2(N,K),[show_probs_trunc,mean, % show_percentiles, show_hpd_intervals,hpd_intervals=[0.94,0.99,0.99999] % show_histogram, % min_accepted_samples=1000,show_accepted_samples=true ]), nl, % show_store_lengths,nl, % fail, nl. go2 => true. model2(N,K) => X = draw_without_replacement(N,1..N), % X = random_integer1_n(100,N), % X = random_integer1_n(10,N), % X = normal_dist_n(100,15,N), % X = uniform_dist_n(0,1,N), % X = extreme_value_dist_n(10,11,N), RecordsIx = get_records_ix(X), NumRecords = RecordsIx.len, % The K'th record index if NumRecords >= K then RecordIxK = RecordsIx[K] else RecordIxK = 0 end, % add("num records",NumRecords), add("record ix k",RecordIxK).