Generalised Game Show Problem

We generalise a result in Professor David Mackay’s book on inference. Bayes theorem also plays a crucial role in decision-making:

P(A|D)=P(D|A)P(A)P(D|A)P(A)+P(D|¬A)P(¬A)

Let us consider a worked example which demonstrates how unintuitive results following from this 260-year-old theorem can be.

In the Game Show example, we can simplify Bayes’ Theorem by using a form that expands out all individual doors, and then cancelling off the unconditional probability of each door in numerator and denominator as they are each P(any door)=1n, where there are n doors. We consider the two distinct representative cases:

P(prize behind my chosen doorm)=P(mprize behind my chosen door)P

P(prize behind another doorm)=P(mprize behind another door)P

where here

P=1/(n1m)+(n1)/(n2m)

and where m means that any m doors were removed at the usual intermediate stage after I chose a door. When we start with n doors, and one has been chosen, and that door happens to have the prize behind it, then the Game Show Host is free to remove m doors from the set of n1 doors and so there are (n1m) available ways for the host to do so.

The probability P(anym1) is therefore 1/(n1m), since we equivocate between all the host’s options. For the other case, there are only n2 doors for the host to choose m from, so the number of ways is (n2m), and the probability P(anymnot 1) is therefore 1/(n2m).

after some algebra I find that

P(prize behind my chosen doorm)=n1mn1m+(n1)2

and

P(prize behind another doorm)=n1n1m+(n1)2

Thus, our probability factor is given by: P(prize behind another doorm)P(prize behind my chosen doorm)=n1n1m

This expression gives us back, from the original game, our game strategy factor of 2 times better if we shift door when n=3 and m=1. The factor rises to n1 better for shifting to another door for any n2 and m=n2 doors (all but one other than the one the player chose), and for the shift strategy a probability factor that tends to unity from above when n and m=1, i.e. only one door is removed.

An informal survey by Student-b has shown that quite often, intuition among a random sample of people asked, is lacking. Some will believe it is better to stick, and some say in the standard three-door game that it is slightly better to shift. The mathematics show that for positive n and m: it is always better to shift!

Homage to Bazball Strategy

We are selectors for a cricket team who wish to make decisions based on data around the success or otherwise of our players’ strategies or relative performances that we trust and believe to be comparable.
If we follow Bayes’ procedure to calculate the probability that the “real” average runs scored, b, by our batsman as player B (batting as “Bazball”) is greater than that of the same batsman as player A, a (batting as “Anodyne”) for data set averages x¯ and y¯respectively for the player, and with n and m data points, then setting s=nx¯+12 and t=my¯+12, we find that:

Prob (b>a)=mtΓ(s)Γ(t)0at1emaΓ(s,na)da

where we have assumed a Poisson distribution to parametrise the average numbers of runs scored in the two strategies, and the Jeffreys’ uninformative prior distribution for this model.

For example, if there are m=6 innings at y¯=30.0 for A, and n=8 innings at x¯=32.0 for B, then we find that the above equation =0.7407 and the odds are almost 3:1 on that the strategy as B has a greater average than that of the player as A. If we have a rule that the odds on a strategy for at least 5 innings must be better than 4:1 on, then our decision in this case would be to ask the player to continue to bat as A, pending more information. If we only required 2.5:1 then we would be inclined to try strategy B for this player.

Another angle on this is to look at the average number of balls `survived’ by the player in an innings, with the two strategies. That is, I am interested in the number of balls faced at the point the player is out. To model this, I start from the Pascal distribution with an uninformative (uniform) prior, since this distribution is a discrete one which models a future number n of independent trials, which contain r failures. I have r=1 for this case, as cricket is unforgiving! The end of the trials is on the nth ball, when the player fails, i.e. is out. This simplifies things:

f(x)=P(X=x)=(x1r1)pr(1p)xr=pr(1p)xr

where p is the probability of getting out on any given ball and X is the random quantity realised as x in a given trial (innings), the number of balls received up to and including getting out. The mean of the quantity x is 1p. This makes sense, as the number of balls faced in an innings is greater than or equal to one. Our prior distribution (density) for the unknown value of p was taken to be B(1,1). The Beta family is conjugate with respect to the Pascal distribution (negative binomial distribution), in such a way that for prior B(α,β), the posterior is also a Beta density with parameters α+nr and β+nx¯nr, where x¯ is the mean of the data x and there were n trials (innings). Since r=1 the conditional probability density is:

f(px)=B1pn(1p)s

where s=n(x¯1) and B=B(1+n,1+s). This means that if we compare two sets of innings, n and m, with average balls faced until getting out, x¯1 and y¯1respectively, then, relabelling the respective unknowns we find the probability of B getting out sooner being greater than that of A is:

Prob(β>α)=1CnCm01dαα1αm(1α)tβn(1β)sdβ

where t=m(y¯1) and Cn=B(1+n,1+s)) and Cm=B(1+m,1+t) are constants calculated from the data, which normalise the integrals. From this probability we can deduce the odds that one strategy is longer-lasting than the other, ignoring run rates this time.

We can then also simplify by converting the discrete geometric distribution to the continuous, and computationally slightly easier exponential distribution f(x)=pepx. The resulting relative probability, corresponding to and in good agreement with the equation above, is:

Prob(β>α)=tm+1Γ(n+1)Γ(m+1)01αmeαtΓ(n+1,sα,s)dα

where α and β are the real probabilities of getting out on the next or any given delivery, given by the inverse of the respective real average numbers of balls faced up to the point of failure, i.e. getting out. The Γ(a,b,c) in the integrand is the generalised, incomplete Gamma function arising from the first integration over all cases in the joint distribution where β>α. Note that the integral this time is over all cases from zero to one since the variables α (and β) are probabilities.

One can compare a set of innings in A and B modes, this time ignoring runs scored but focusing on how long the innings were and again perhaps having a rule for deciding which strategy is optimal and how to apply the rule to make the judgement.

If in the first strategy A the player stays at the crease for an average of y¯=25 balls, in m=4 innings and in strategy B, x¯=20 balls, in n=5 innings, I find that the probability that the unknown parameter β, representing the probability of getting out next ball in strategy B is higher than the unknown parameter α, representing the probability of getting out next ball in strategy A is 0.623 in the exponential distribution calculation. The Pascal calculation gave 0.626, under 0.5% off.

The applications for this kind of Bayesian joint-probability A/B comparison to get simple odds for or against are miriad and go far beyond the tip of the iceberg which is sport strategy; they are numerous in business and governmental strategy.

Student-b: Probability Tables

Student-b

Dear Sir/Madam,

In this letter, I derive and tabulate the maximum entropy values for the probabilities of each side of biased n-sided dice, for n=3,4,6,8,10,12,15, and 20. These probabilities for each of the n options (sides), are those which have the least input information beyond what we know, which is nothing more than the bias or average score on the n-sided die. I generalise the “Brandeis dice” problem from E T Jaynes’ 1963 lectures, to an n-sided die, from the 6-sided case. To calculate these probabilities, I obtain the solution of an n+1-order polynomial equation, derived using a power series identity, for the value of the Lagrange multiplier, λ. The resulting maximally-equivocated prior probabilities at the 5th, 17th, 34th, 50th (fair), 66th, 83rd, and 95th percentiles of the range from 1 up to will aid in decision-making, where the options are the conditions we cannot influence, but across which we may have a non-linear payoff.

We use the standard variational principle in order to maximise the entropy in the system.

i=1npifk(xi)=Fk

i=1npi=1

where the index k is not summed over in the first equation, and where the pi are the probabilities of the n options, e.g. sides of an n-dice. Fk are the numbers given in the problem statement (constraints or biases), and fk(xi) are functions of the Lagrange multipliers λi. The second equation is just the probability axiom requiring the probabilities to sum to one. This set of constraints is solved by using Lagrange multipliers. The formal solution is

pi=1Zexp[λ1f1(x1)...λmfm(xi)]

where Z(λ1,...,λm)=i=1nexp[λ1f1(x1)...λmfm(xi)]is the partition function and λk are the set of multipliers, of which for a solution to the problem there need to be fewer than n, though in our current problem as we shall see, there is only one. The constraints are satisfied if:

Fk=λklogeZ

for k ranging from 1 to m.

Our measure of entropy is given by S=i=1npilogepi and in terms of our constraints, i.e. the data, this function is:

S(F1,...,Fm)=logeZ+k=1mλkFk

The solution for the maximum of S is:

λk=SFk

For k in same range up to m. For our set of n-sided dice, m=1 and so I can simplify Fkto F. The fk(xi) are simply the set of i the values on the n sides of our die.

For the problem at hand of the biased die, I introduce the quantity q which I define as the tested, trusted average score on the given n-sided die in hand. That is, I set F=q here, our bias constraint number, which can range from the lowest die value 1 through to the highest value, n.

q=q0:=12(n+1)

i.e. 3.5 on a 6-sided die, then the die is fair, otherwise, it has a bias and therefore an additional constraint. I assume this is all I know and believe about the die, other than the number of sides, n.

We see that λk becomes just λ and the equation for Fk reduces to

F=λlogeZ

and the equation for Sk reduces to

S(F)=logeZ+λF

and its solution is

λ=SF

After a little algebra, I found that the partition function Z is given by

Z=x(xn1)(x1)

and after some further algebra, I found that in order to determine the value of x, where x=eλ, corresponding to the maximum entropy (least input information) set of probabilities, we must find the positive, real root of the following equation, which is not unity:

(nq)xn+1+(q(n+1))xn1+qx+(1q)=0

By inspection, this equation is always satisfied by the real solution x = 1, which corresponds to the fair or unbiased die, with all probabilities equal to 1/n for n sides. We need the other real root, and we obtain this by simple numerical calculation. From the solution x=xq for the given value of bias q, the set of probabilities corresponding to maximum entropy for each side of the relevant n-sided die are easily generated.

The following tables may be of use in decisionmaking in business and other contexts, especially where the agent (the organisation or individual making a decision) has a non-linear desirability or utility function over the outcomes (i.e. the values of the discrete set of possible options), does not have perfect intuition and does not wish to put any more information into the decision that is not within the agent’s state of knowledge.

I present tables for n = 3, 4, 6, 8, 10, 12, 15, and 20 here, each at 7 bias values of q for each n, corresponding to percentages of the range from 1 to n of 5%, 17, 34, 50, 66, 83 and 95%. There is transformation group symmetry in this problem. If i represents the side with i spots up, then when we reflect from in+1i and transform x1x we obtain the same probability, e.g. the probability of a 1 on a six sided die at 5th percentile bias is the same as a 6 at 95th percentile bias. This is why in our tables we can observe the corresponding symmetry in the values of the probabilities and in the entropy, which is maximal of all biases when there is no bias and thus no constraint. Readers may wish arbitrarily to adjust any of the probabilities in the tables in the appendix and recalculate the entropy S=i=1npilogepi, which will be lower than the maximum entropy value in the table.

APPENDIX Student-b Maximum Entropy Probability Tables

n=3
q05 q17 q34 q0 q66 q83 q95
q-vals 1.1 1.34 1.68 2 2.32 2.66 2.9
Score Probabilities
1 0.9078 0.7232 0.5064 0.3333 0.1864 0.0632 0.0078
2 0.0843 0.2137 0.3072 0.3333 0.3072 0.2137 0.0843
3 0.0078 0.0632 0.1864 0.3333 0.5064 0.7232 0.9078
entropy 0.3343 0.7386 1.0203 1.0986 1.0203 0.7386 0.3343
n=4
q05 q17 q34 q0 q66 q83 q95
q-vals 1.15 1.51 2.02 2.5 2.98 3.49 3.85
Score Probabilities
1 0.8689 0.6425 0.4136 0.25 0.1241 0.0324 0.002
2 0.1141 0.2374 0.2769 0.25 0.1854 0.0877 0.015
3 0.015 0.0877 0.1854 0.25 0.2769 0.2374 0.1141
4 0.002 0.0324 0.1241 0.25 0.4136 0.6425 0.8689
Entropy 0.445 0.9502 1.2921 1.3863 1.2921 0.9502 0.445
n=6
q05 q17 q34 q0 q66 q83 q95
q-vals 1.250 1.85 2.70 3.5 4.3 5.15 5.75
Score Probabilities
1 0.7998 0.5260 0.3043 0.1666 0.072 0.0134 0.0003
2 0.1602 0.2527 0.2282 0.1667 0.0961 0.028 0.0013
3 0.0321 0.1214 0.1711 0.1667 0.1282 0.0583 0.0064
4 0.0064 0.0583 0.1282 0.1667 0.1711 0.1214 0.0321
5 0.0013 0.028 0.0961 0.1667 0.2282 0.2527 0.1602
6 0.0003 0.0135 0.0721 0.1667 0.3043 0.526 0.7998
Entropy 0.6254 1.2655 1.6794 1.7918 1.6794 1.2655 0.6254

APPENDIX Student-b Maximum Entropy Probability Tables (ctd)

n=8
q05 q17 q34 q0 q66 q83 q95
q-vals 1.35 2.19 3.38 4.5 5.62 6.81 7.65
Score Probabilities
1 0.7407 0.4454 0.2412 0.125 0.05 0.0076 0.0001
2 0.1921 0.2489 0.1927 0.125 0.0626 0.0136 0.0002
3 0.0498 0.1391 0.1539 0.125 0.0784 0.0243 0.0009
4 0.0129 0.0777 0.1229 0.125 0.0982 0.0434 0.0034
5 0.0034 0.0434 0.0982 0.125 0.1229 0.0777 0.0129
6 0.0009 0.0243 0.0784 0.125 0.1539 0.1391 0.0498
7 0.0002 0.0136 0.0626 0.125 0.1927 0.2489 0.1921
8 0.0001 0.0076 0.05 0.125 0.2412 0.4454 0.7407
Entropy 0.7726 1.5012 1.9569 2.0794 1.9569 1.5012 0.7726
n=10
q05 q17 q34 q0 q66 q83 q95
q-vals 1.45 2.53 4.06 5.5 6.94 8.47 9.55
Score Probabilities
1 0.6896 0.3862 0.2 0.1 0.0381 0.005 0.0000
2 0.214 0.2382 0.1663 0.1 0.0458 0.0081 0.0001
3 0.0664 0.147 0.1383 0.1 0.055 0.0131 0.0002
4 0.0206 0.0907 0.115 0.1 0.0662 0.0213 0.0006
5 0.0064 0.0559 0.0957 0.1 0.0796 0.0345 0.002
6 0.002 0.0345 0.0796 0.1 0.0957 0.0559 0.0064
7 0.0006 0.0213 0.0662 0.1 0.115 0.0907 0.0206
8 0.0002 0.0131 0.055 0.1 0.1383 0.147 0.0664
9 0.0001 0.0081 0.0458 0.1 0.1663 0.2382 0.214
10 0.0000 0.005 0.0381 0.1 0.2 0.3862 0.6896
Entropy 0.8981 1.6905 2.1735 2.3026 2.1735 1.6905 0.8981

APPENDIX Student-b Maximum Entropy Probability Tables (ctd)

n=12
q05 q17 q34 q0 q66 q83 q95
q-vals 1.55 2.87 4.74 6.5 8.26 10.13 11.45
Score Probabilities
1 0.6451 0.3408 0.1708 0.0833 0.0306 0.0036 0.0000
2 0.2289 0.2255 0.1461 0.0833 0.0358 0.0055 0.0000
3 0.0812 0.1492 0.125 0.0833 0.0419 0.0083 0.0001
4 0.0288 0.0987 0.1069 0.0833 0.049 0.0125 0.0002
5 0.0102 0.0653 0.0915 0.0833 0.0572 0.0189 0.0005
6 0.0036 0.0432 0.0782 0.0833 0.0669 0.0286 0.0013
7 0.0013 0.0286 0.0669 0.0833 0.0782 0.0432 0.0036
8 0.0005 0.0189 0.0572 0.0833 0.0915 0.0653 0.0102
9 0.0002 0.0125 0.049 0.0833 0.1069 0.0987 0.0288
10 0.0001 0.0083 0.0419 0.0833 0.125 0.1492 0.0812
11 0.0000 0.0055 0.0358 0.0833 0.1461 0.2255 0.2289
12 0.0000 0.0036 0.0306 0.0833 0.1708 0.3408 0.6451
Entropy 1.0078 1.8001 2.1252 2.4849 1.7684 1.1462 1.0081
n=15
q05 q17 q34 q0 q66 q83 q95
q-vals 1.7 3.38 5.76 8 10.24 12.62 14.3
Score Probabilities
1 0.5882 0.2898 0.1402 0.0667 0.0236 0.0025 0.0000
2 0.2422 0.2063 0.1235 0.0667 0.0269 0.0035 0.0000
3 0.0997 0.1469 0.1087 0.0667 0.0305 0.0049 0.0000
4 0.0411 0.1046 0.0957 0.0667 0.0346 0.0069 0.0000
5 0.0169 0.0745 0.0843 0.0667 0.0393 0.0097 0.0001
6 0.007 0.053 0.0743 0.0667 0.0447 0.0136 0.0002
7 0.0029 0.0378 0.0654 0.0667 0.0507 0.0191 0.0005
8 0.0012 0.0269 0.0576 0.0667 0.0576 0.0269 0.0012
9 0.0005 0.0191 0.0507 0.0667 0.0654 0.0378 0.0029
10 0.0002 0.0136 0.0447 0.0667 0.0743 0.053 0.007
11 0.0001 0.0097 0.0393 0.0667 0.0843 0.0745 0.0169
12 0.0000 0.0069 0.0346 0.0667 0.0957 0.1046 0.0411
13 0.0000 0.0049 0.0305 0.0667 0.1087 0.1469 0.0997
14 0.0000 0.0035 0.0269 0.0667 0.1235 0.2063 0.2422
15 0.0000 0.0025 0.0236 0.0667 0.1402 0.2898 0.5882
Entropy 1.1517 2.0471 2.5698 2.7081 2.5698 2.0471 1.1517

APPENDIX Student-b Maximum Entropy Probability Tables (ctd)

n=20
q05 q17 q34 q0 q66 q83 q95
q-vals 1.95 4.23 7.46 10.5 13.54 16.77 19.05
Score Probabilities
1 0.5128 0.2318 0.108 0.05 0.0171 0.0016 0.0000
2 0.2498 0.1784 0.098 0.05 0.0188 0.0021 0.0000
3 0.1217 0.1372 0.0889 0.05 0.0207 0.0027 0.0000
4 0.0593 0.1056 0.0807 0.05 0.0229 0.0035 0.0000
5 0.0289 0.0812 0.0732 0.05 0.0252 0.0045 0.0000
6 0.0141 0.0625 0.0665 0.05 0.0278 0.0059 0.0000
7 0.0069 0.0481 0.0603 0.05 0.0306 0.0077 0.0000
8 0.0033 0.037 0.0547 0.05 0.0337 0.01 0.0001
9 0.0016 0.0285 0.0497 0.05 0.0371 0.013 0.0002
10 0.0008 0.0219 0.0451 0.05 0.0409 0.0169 0.0004
11 0.0004 0.0169 0.0409 0.05 0.0451 0.0219 0.0008
12 0.0002 0.013 0.0371 0.05 0.0497 0.0285 0.0016
13 0.0001 0.01 0.0337 0.05 0.0547 0.037 0.0033
14 0.0000 0.0077 0.0306 0.05 0.0603 0.0481 0.0069
15 0.0000 0.0059 0.0278 0.05 0.0665 0.0625 0.0141
16 0.0000 0.0045 0.0252 0.05 0.0732 0.0812 0.0289
17 0.0000 0.0035 0.0229 0.05 0.0807 0.1056 0.0593
18 0.0000 0.0027 0.0207 0.05 0.0889 0.1372 0.1217
19 0.0000 0.0021 0.0188 0.05 0.098 0.1784 0.2498
20 0.0000 0.0016 0.0171 0.05 0.108 0.2318 0.5128
Entropy 1.351 2.3085 2.8526 2.9957 2.8526 2.3085 1.351