/* Records: kth difference in Picat. What are the probabilities of the kth differences in records? Note: Here we check the indices of the records, not the record values. The theoretical kth_diff_p(K,M) for K=1 seems to be correct, but not for K > 1. For K = 1, the probability only depends on the second value, since the first value (index) is always 1. For K > 1, then it depends on both the second and the third value, which the formula kth_diff_p(K,M) does not cover. Also, the probability of the Kth difference depends on that we actually have K+1 records. For K > 1, then the value of 0 (i.e. when the number of records is < K + 1), dominates. Note: I tried to "observe away" these cases but the formula is still not correct. For permutations and continous distributions the distribution of kth difference seems to be quite stable for K = 1..3, * K = 1 As mentioned above, it follows fairöly well the theoretical formula. The mean is around 1.9 * K = 2 The mean is around 1.6 * K = 3 The mean is around 0.7 However, for discrete distributions, this does not hold, e.g poisson(1), random_integer(10, i.e. when there are large possibilities of duplicates. random_integer(100) behaves more like the one above. Cf my Gamble model gamble_record_kth_difference.rkt This program was created by Hakan Kjellerstrand, hakank@gmail.com See also my Picat page: http://www.hakank.org/picat/ */ import ppl_distributions, ppl_utils, ppl_common_utils. import util. % import ordset. main => go. /* I asked ChatGPT about the distribution of the (first) difference of the records. Distribution of General Record Time Differences P(DiffT(k) = m) For k=1 (the first difference): 1/((m+1)*m) But for k > 1, this is not correct. */ kth_diff_p(K,M) = Res => if K == 1 then Res = 1 / ((M + 1) * M) elseif (( (K + M - 2) * (K + M - 3)) == 0) then printf("Division by 0: K: %d M: %d\n",K,M), Res = 0 else Res = 1 / ( (K + M - 2) * (K + M - 3)) end. /* The first difference (K=1) seems to be correct, but not the second or third. The second and third differences are (just) the first shifted by K-1 steps. First difference: 1 0.5 (1 / 2) 2 0.166666666666667 (1 / 6) 3 0.083333333333333 (1 / 12) 4 0.05 (1 / 20) 5 0.033333333333333 (1 / 30) 6 0.023809523809524 (1 / 42) 7 0.017857142857143 (1 / 56) 8 0.013888888888889 (1 / 72) 9 0.011111111111111 (1 / 90) 10 0.009090909090909 (1 / 110) 11 0.007575757575758 (1 / 132) 12 0.006410256410256 (1 / 156) 13 0.005494505494505 (1 / 182) 14 0.004761904761905 (1 / 210) 15 0.004166666666667 (1 / 240) 16 0.003676470588235 (1 / 272) 17 0.003267973856209 (1 / 306) 18 0.002923976608187 (1 / 342) 19 0.002631578947368 (1 / 380) 20 0.002380952380952 (1 / 420) Second difference: Division by 0: K: 2 M: 1 1 0 (0 / 1) 2 0.5 (1 / 2) 3 0.166666666666667 (1 / 6) 4 0.083333333333333 (1 / 12) 5 0.05 (1 / 20) 6 0.033333333333333 (1 / 30) 7 0.023809523809524 (1 / 42) 8 0.017857142857143 (1 / 56) 9 0.013888888888889 (1 / 72) 10 0.011111111111111 (1 / 90) Third difference: 1 0.5 (1 / 2) 2 0.166666666666667 (1 / 6) 3 0.083333333333333 (1 / 12) 4 0.05 (1 / 20) 5 0.033333333333333 (1 / 30) 6 0.023809523809524 (1 / 42) 7 0.017857142857143 (1 / 56) 8 0.013888888888889 (1 / 72) 9 0.011111111111111 (1 / 90) 10 0.009090909090909 (1 / 110) */ go ?=> println("First difference:"), [M=kth_diff_p(1,M).and_rat : M in 1..20].printf_list, nl, println("Second difference:"), [M=kth_diff_p(2,M).and_rat : M in 1..10].printf_list, nl, println("Third difference:"), [M=kth_diff_p(3,M).and_rat : M in 1..10].printf_list, nl, println("Fourth difference:"), [M=kth_diff_p(4,M).and_rat : M in 1..10].printf_list, nl. go => true. /* The first difference (K=1) seems to be OK, but not the second or third. * For N=10 draw_without_replacement(N). [n = 10,k = 1] var : kth diff Probabilities: 1: 0.4928000000000000 2: 0.1672000000000000 0: 0.0993000000000000 3: 0.0881000000000000 4: 0.0535000000000000 5: 0.0323000000000000 6: 0.0241000000000000 7: 0.0179000000000000 8: 0.0141000000000000 9: 0.0107000000000000 mean = 1.946 Theoretical kth diff prob: 1 0.5 (1 / 2) 2 0.166666666666667 (1 / 6) 3 0.083333333333333 (1 / 12) 4 0.05 (1 / 20) 5 0.033333333333333 (1 / 30) 6 0.023809523809524 (1 / 42) 7 0.017857142857143 (1 / 56) 8 0.013888888888889 (1 / 72) 9 0.011111111111111 (1 / 90) [n = 10,k = 2] var : kth diff Probabilities: 0: 0.3757000000000000 1: 0.2442000000000000 2: 0.1363000000000000 3: 0.0861000000000000 4: 0.0616000000000000 5: 0.0393000000000000 6: 0.0262000000000000 7: 0.0196000000000000 8: 0.0110000000000000 mean = 1.6004 Theoretical kth diff prob: Division by 0: K: 2 M: 1 1 0 (0 / 1) 2 0.5 (1 / 2) 3 0.166666666666667 (1 / 6) 4 0.083333333333333 (1 / 12) 5 0.05 (1 / 20) 6 0.033333333333333 (1 / 30) 7 0.023809523809524 (1 / 42) 8 0.017857142857143 (1 / 56) 9 0.013888888888889 (1 / 72) [n = 10,k = 3] var : kth diff Probabilities: 0: 0.7029000000000000 1: 0.1089000000000000 2: 0.0699000000000000 3: 0.0467000000000000 4: 0.0331000000000000 5: 0.0190000000000000 6: 0.0149000000000000 7: 0.0046000000000000 mean = 0.7378 Theoretical kth diff prob: 1 0.5 (1 / 2) 2 0.166666666666667 (1 / 6) 3 0.083333333333333 (1 / 12) 4 0.05 (1 / 20) 5 0.033333333333333 (1 / 30) 6 0.023809523809524 (1 / 42) 7 0.017857142857143 (1 / 56) 8 0.013888888888889 (1 / 72) 9 0.011111111111111 (1 / 90) * normal_dist(100,15) N = 10 [n = 10,k = 1] var : kth diff Probabilities: 1: 0.4946000000000000 2: 0.1698000000000000 0: 0.0998000000000000 3: 0.0876000000000000 4: 0.0495000000000000 5: 0.0332000000000000 6: 0.0205000000000000 7: 0.0179000000000000 8: 0.0139000000000000 9: 0.0132000000000000 mean = 1.9393 [n = 10,k = 2] var : kth diff Probabilities: 0: 0.3870000000000000 1: 0.2496000000000000 2: 0.1272000000000000 3: 0.0804000000000000 4: 0.0528000000000000 5: 0.0415000000000000 6: 0.0321000000000000 7: 0.0186000000000000 8: 0.0108000000000000 mean = 1.5731 [n = 10,k = 3] var : kth diff Probabilities: 0: 0.7027000000000000 1: 0.1097000000000000 2: 0.0666000000000000 3: 0.0453000000000000 4: 0.0343000000000000 5: 0.0234000000000000 6: 0.0126000000000000 7: 0.0054000000000000 mean = 0.7464 * exponential_dist(1) [n = 10,k = 1] var : kth diff Probabilities: 1: 0.4977000000000000 2: 0.1596000000000000 0: 0.1036000000000000 3: 0.0787000000000000 4: 0.0500000000000000 5: 0.0373000000000000 6: 0.0273000000000000 7: 0.0186000000000000 8: 0.0151000000000000 9: 0.0121000000000000 mean = 1.9632 [n = 10,k = 2] var : kth diff Probabilities: 0: 0.3802000000000000 1: 0.2450000000000000 2: 0.1310000000000000 3: 0.0890000000000000 4: 0.0548000000000000 5: 0.0402000000000000 6: 0.0284000000000000 7: 0.0202000000000000 8: 0.0112000000000000 mean = 1.5956 [n = 10,k = 3] var : kth diff Probabilities: 0: 0.7094000000000000 1: 0.1072000000000000 2: 0.0688000000000000 3: 0.0451000000000000 4: 0.0307000000000000 5: 0.0201000000000000 6: 0.0122000000000000 7: 0.0065000000000000 mean = 0.7221 * cauchy_dist(2,2) [n = 10,k = 1] var : kth diff Probabilities: 1: 0.5040000000000000 2: 0.1639000000000000 0: 0.0963000000000000 3: 0.0858000000000000 4: 0.0530000000000000 5: 0.0344000000000000 6: 0.0252000000000000 7: 0.0148000000000000 8: 0.0134000000000000 9: 0.0092000000000000 mean = 1.918 [n = 10,k = 2] var : kth diff Probabilities: 0: 0.3843000000000000 1: 0.2387000000000000 2: 0.1366000000000000 3: 0.0852000000000000 4: 0.0576000000000000 5: 0.0390000000000000 6: 0.0274000000000000 7: 0.0194000000000000 8: 0.0118000000000000 mean = 1.5875 [n = 10,k = 3] var : kth diff Probabilities: 0: 0.7036000000000000 1: 0.1068000000000000 2: 0.0668000000000000 3: 0.0500000000000000 4: 0.0318000000000000 5: 0.0209000000000000 6: 0.0129000000000000 7: 0.0072000000000000 mean = 0.7499 * poisson dist (10) [n = 10,k = 1] var : kth diff Probabilities: 1: 0.4586000000000000 2: 0.1697000000000000 0: 0.1221000000000000 3: 0.0843000000000000 4: 0.0531000000000000 5: 0.0367000000000000 6: 0.0253000000000000 7: 0.0219000000000000 8: 0.0163000000000000 9: 0.0120000000000000 mean = 1.9903 [n = 10,k = 2] var : kth diff Probabilities: 0: 0.4659000000000000 1: 0.1920000000000000 2: 0.1127000000000000 3: 0.0771000000000000 4: 0.0525000000000000 5: 0.0394000000000000 6: 0.0287000000000000 7: 0.0206000000000000 8: 0.0111000000000000 mean = 1.4609 [n = 10,k = 3] var : kth diff Probabilities: 0: 0.8071000000000000 1: 0.0604000000000000 2: 0.0423000000000000 3: 0.0330000000000000 4: 0.0239000000000000 5: 0.0198000000000000 6: 0.0089000000000000 7: 0.0046000000000000 mean = 0.5242 * random_integer(100) [n = 10,k = 1] var : kth diff Probabilities: 1: 0.4964000000000000 2: 0.1644000000000000 0: 0.1076000000000000 3: 0.0846000000000000 4: 0.0501000000000000 5: 0.0311000000000000 6: 0.0222000000000000 7: 0.0176000000000000 8: 0.0144000000000000 9: 0.0116000000000000 mean = 1.9109 [n = 10,k = 2] var : kth diff Probabilities: 0: 0.3963000000000000 1: 0.2330000000000000 2: 0.1258000000000000 3: 0.0871000000000000 4: 0.0581000000000000 5: 0.0424000000000000 6: 0.0270000000000000 7: 0.0179000000000000 8: 0.0124000000000000 mean = 1.5768 [n = 10,k = 3] var : kth diff Probabilities: 0: 0.7216000000000000 1: 0.0945000000000000 2: 0.0650000000000000 3: 0.0481000000000000 4: 0.0328000000000000 5: 0.0203000000000000 6: 0.0122000000000000 7: 0.0055000000000000 mean = 0.7132 * random_integer(10) [n = 10,k = 1] var : kth diff Probabilities: 1: 0.4469000000000000 0: 0.1607000000000000 2: 0.1602000000000000 3: 0.0856000000000000 4: 0.0486000000000000 5: 0.0305000000000000 6: 0.0233000000000000 7: 0.0168000000000000 8: 0.0139000000000000 9: 0.0135000000000000 mean = 1.8611 [n = 10,k = 2] var : kth diff Probabilities: 0: 0.5232000000000000 1: 0.1841000000000000 2: 0.1019000000000000 3: 0.0626000000000000 4: 0.0473000000000000 5: 0.0323000000000000 6: 0.0257000000000000 7: 0.0141000000000000 8: 0.0088000000000000 mean = 1.2497 [n = 10,k = 3] var : kth diff Probabilities: 0: 0.8376000000000000 1: 0.0551000000000000 2: 0.0372000000000000 3: 0.0274000000000000 4: 0.0198000000000000 5: 0.0131000000000000 6: 0.0067000000000000 7: 0.0031000000000000 mean = 0.4183 */ go2 ?=> N = 10, member(K,1..3), println([n=N,k=K]), reset_store, run_model(10_000,$model2(N,K),[show_probs_trunc,mean % , % show_percentiles, % show_hpd_intervals,hpd_intervals=[0.94], % show_histogram, % min_accepted_samples=1000,show_accepted_samples=true ]), nl, % show_store_lengths,nl, fail, nl. go2 => true. model2(N,K) => % Permutations X = draw_without_replacement(N,1..N), % Discrete % X = poisson_dist_n(10,N), % X = random_integer1_n(10,N), % However this is fairly according the the formula % X = random_integer1_n(100,N), % Continuous % X = normal_dist_n(100,15,N), % X = exponential_dist_n(1,N), % X = cauchy_dist_n(2,2,N), RecordsIx = get_records_ix(X), % RecordsIx = get_records_ix_geq(X), % Using >= instead of > NumRecords = RecordsIx.len, % KthDiff is 0 when the first element is N if NumRecords >= K+1 then % DiffsMean = differences(RecordsIx).mean, KthDiff = RecordsIx[K+1]-RecordsIx[K], % KValK = RecordsIx[K+1], else % DiffsMean = 0, KthDiff = 0, % KValK = 0 end, % add("num records",NumRecords), % add("k val k",KValK), add("kth diff",KthDiff). % add("diffs mean",DiffsMean).