[Can't Stop](https://en.wikipedia.org/wiki/Can%27t_Stop_(board_game);) is a board game with these rules. In this notebook I will explore rational plays in the game. (Note: I use a term that is not in the rule book: what they call a "marker" or "neutral marker," I call a "sherpa": a guide for leading you to the top.)
The first thing I'll want to model is rolls of the dice. A player rolls four dice, and let's say gets 2-3-5-6. The player must arrange the four dice into two pairs. In this case there are three ways to do it:
I'll define the function Roll
to return a set of all possible ways of pairing up four dice. (I'll make it a hashable set, because I'll need that later.)
from collections import Counter
from itertools import combinations, product
from functools import lru_cache
def Roll(a, b, c, d):
"A set of all the pairs of numbers that can be made with these 4 dice."
return Set((Pair(a+b, c+d), Pair(a+c, b+d), Pair(a+d, b+c)))
def Pair(a, b): return (a, b) if a < b else (b, a)
class Set(set):
"A hashable set. Caveat mutator!"
def __hash__(self): return sum(map(hash, self))
For example:
Roll(2, 3, 5, 6) # You get 3 choices with 4 distinct numbers
{(5, 11), (7, 9), (8, 8)}
Roll(2, 3, 3, 6) # You get 2 choices with 3 distinct numbers
{(5, 9), (6, 8)}
Roll(2, 2, 2, 6) # You get 1 choice with 1 or 2 distinct numbers
{(4, 8)}
Now I want to consider all possible dice rolls. In this game, the roll 2-3-5-6 is the same as 6-5-3-2; there are 24 permutations of 4 numbers. So rather than a list
of rolls, I'll make a Counter
:
die = (1, 2, 3, 4, 5, 6)
all_rolls = Counter(Roll(*dice) for dice in product(die, repeat=4))
all_rolls
Counter({{(2, 2)}: 1, {(2, 3)}: 4, {(2, 4)}: 4, {(2, 5)}: 4, {(2, 6)}: 4, {(2, 7)}: 4, {(2, 4), (3, 3)}: 6, {(2, 5), (3, 4)}: 12, {(2, 6), (3, 5)}: 12, {(2, 7), (3, 6)}: 12, {(2, 8), (3, 7)}: 12, {(2, 6), (4, 4)}: 6, {(2, 7), (4, 5)}: 12, {(2, 8), (4, 6)}: 12, {(2, 9), (4, 7)}: 12, {(2, 8), (5, 5)}: 6, {(2, 9), (5, 6)}: 12, {(2, 10), (5, 7)}: 12, {(2, 10), (6, 6)}: 6, {(2, 11), (6, 7)}: 12, {(2, 12), (7, 7)}: 6, {(3, 4)}: 4, {(3, 5), (4, 4)}: 12, {(3, 6), (4, 5)}: 24, {(4, 6)}: 8, {(3, 7), (4, 6)}: 12, {(3, 8), (4, 7)}: 12, {(3, 7), (4, 6), (5, 5)}: 24, {(3, 8), (4, 7), (5, 6)}: 24, {(4, 8), (5, 7)}: 24, {(3, 9), (4, 8), (5, 7)}: 24, {(3, 8), (5, 6)}: 12, {(3, 9), (5, 7), (6, 6)}: 24, {(3, 10), (6, 7)}: 12, {(5, 8)}: 4, {(3, 10), (5, 8), (6, 7)}: 24, {(3, 11), (6, 8), (7, 7)}: 24, {(3, 12), (7, 8)}: 12, {(4, 7), (5, 6)}: 24, {(4, 8), (6, 6)}: 18, {(4, 9), (6, 7)}: 24, {(4, 9), (5, 8), (6, 7)}: 24, {(4, 10), (5, 9), (7, 7)}: 24, {(4, 10), (6, 8)}: 24, {(4, 11), (6, 9), (7, 8)}: 24, {(4, 12), (7, 9)}: 12, {(5, 9), (6, 8)}: 24, {(5, 10), (7, 8)}: 24, {(5, 10), (6, 9)}: 12, {(5, 11), (6, 10), (7, 9)}: 24, {(5, 12), (7, 10)}: 12, {(6, 10)}: 4, {(6, 11), (7, 10)}: 12, {(6, 12), (7, 11)}: 12, {(7, 12)}: 4, {(4, 4)}: 1, {(4, 5)}: 4, {(4, 7)}: 4, {(4, 8)}: 4, {(4, 6), (5, 5)}: 6, {(4, 9), (5, 8)}: 12, {(4, 10), (7, 7)}: 6, {(4, 11), (7, 8)}: 12, {(4, 12), (8, 8)}: 6, {(5, 6)}: 4, {(5, 7), (6, 6)}: 12, {(5, 8), (6, 7)}: 24, {(5, 9), (6, 8), (7, 7)}: 24, {(5, 10), (6, 9), (7, 8)}: 24, {(5, 11), (7, 9), (8, 8)}: 24, {(5, 12), (8, 9)}: 12, {(6, 8)}: 8, {(6, 9), (7, 8)}: 24, {(6, 10), (8, 8)}: 18, {(6, 10), (7, 9)}: 24, {(6, 11), (7, 10), (8, 9)}: 24, {(6, 12), (8, 10)}: 12, {(7, 10)}: 4, {(7, 11), (8, 10)}: 12, {(7, 12), (8, 11)}: 12, {(8, 12)}: 4, {(6, 6)}: 1, {(6, 7)}: 4, {(6, 9)}: 4, {(6, 8), (7, 7)}: 6, {(6, 11), (8, 9)}: 12, {(6, 12), (9, 9)}: 6, {(7, 8)}: 4, {(7, 9), (8, 8)}: 12, {(7, 10), (8, 9)}: 24, {(7, 11), (8, 10), (9, 9)}: 24, {(7, 12), (9, 10)}: 12, {(8, 10)}: 8, {(8, 11), (9, 10)}: 24, {(8, 12), (9, 11)}: 12, {(9, 12)}: 4, {(8, 8)}: 1, {(8, 9)}: 4, {(8, 10), (9, 9)}: 6, {(8, 12), (10, 10)}: 6, {(9, 10)}: 4, {(9, 11), (10, 10)}: 12, {(9, 12), (10, 11)}: 12, {(10, 12)}: 4, {(10, 10)}: 1, {(10, 11)}: 4, {(10, 12), (11, 11)}: 6, {(11, 12)}: 4, {(12, 12)}: 1})
# The number of different rolls, and the total number of rolls
len(all_rolls), sum(all_rolls.values())
(109, 1296)
I'll define the function best_pairing
to return the maximum worth that a roll can make, given the columns we are trying to make, and a function, worth
, that says how much it is worth to advance a space in a column. (The default is that each space advanced is worth one point, regardless of column.)
def one(col): "One space advanced is worth 1 point."; return 1
def best_pairing(columns, roll, worth=one):
"The maximum total worth of a roll, considering all ways of pairing up the roll."
return max((a in columns) * worth(a) + (b in columns) * worth(b)
for (a, b) in roll)
As an example, consider again the roll (2, 3, 5, 6)
:
roll = Roll(2, 3, 5, 6)
roll
{(5, 11), (7, 9), (8, 8)}
What can we do with that roll?
best_pairing([7], roll) # We can get 1 point for making 7
1
best_pairing([6, 10, 12], roll) # We can't get any of 6-10-12; we've "blown it"
0
best_pairing([5, 7, 12], roll) # We can get 5, or we can get 7, but we can't get both at once
1
best_pairing([5, 7, 11], roll) # We can get 5 and 11 (which is more than just getting 7)
2
best_pairing([8], roll) # We can get 8 twice
2
The function best_pairing
gives us an answer for one roll, but what about over all possible rolls?
The function E(columns, worth)
gives the expectation (that is, the mean value), of the total worth
averaged over all possible rolls.
The function P(columns)
gives the probability of advancing in at least one of the columns (that is, the probability of not blowing it).
@lru_cache(None)
def E(columns, worth):
"The expected value over all rolls of the pairing that maximizes the sum of `worth(col)`."
return sum(best_pairing(columns, roll, worth) * all_rolls[roll]
for roll in all_rolls) / sum(all_rolls.values())
@lru_cache(None)
def P(columns):
"The probability of not blowing it, given these columns."
return sum(bool(best_pairing(columns, roll)) * all_rolls[roll]
for roll in all_rolls) / sum(all_rolls.values())
Note that because I use lru_cache
, from now on I'll represent a collection of columns as a tuple, not a list.
First, let's ask what's the probability of being able to make at least one 7? And whats the expected value of the number of 7s we can make?
P((7,))
0.6435185185185185
E((7,), one)
0.7129629629629629
How about column 2?
P((2,))
0.13194444444444445
E((2,), one)
0.13271604938271606
Another important concept is progress towards claiming a column. While we have seen that it is easier to make a 7 than a 2, advancing a space in the 2 column is more valuable in the sense that it gets you 1/3 of the way to claiming the column, while advancing in the 7 column gets you only 1/13 of the way. We will implement progress
:
col_length = [0, 0, 3, 5, 7, 9, 11, 13, 11, 9, 7, 5, 3]
def progress(col):
"The amount of progress as a fraction of the column's length."
return 1 / col_length[col]
E((7,), progress)
0.05484330484330481
E((2,), progress)
0.04423868312757201
Now, let's create a table that, for each column, answers these questions:
progress
.)advance in the column in this roll, how many additional rolls would it take, on average, to claim the column?
print(' c P(c) E(c) progr len rolls remaining')
print(' == ==== ==== ===== === ===== ====')
for c in range(2, 13):
Ec, Ep = E((c,), one), E((c,), progress)
print(' {:2} {:2.0%} {:4.2f} {:5.3f} {:2} {:4.1f} {:4.1f}'
.format(c, P((c,)), Ec, Ep, col_length[c], col_length[c] / Ec, (col_length[c] - 1) / Ec))
c P(c) E(c) progr len rolls remaining == ==== ==== ===== === ===== ==== 2 13% 0.13 0.044 3 22.6 15.1 3 23% 0.24 0.048 5 21.0 16.8 4 36% 0.37 0.053 7 18.9 16.2 5 45% 0.48 0.053 9 18.9 16.8 6 56% 0.61 0.055 11 18.1 16.4 7 64% 0.71 0.055 13 18.2 16.8 8 56% 0.61 0.055 11 18.1 16.4 9 45% 0.48 0.053 9 18.9 16.8 10 36% 0.37 0.053 7 18.9 16.2 11 23% 0.24 0.048 5 21.0 16.8 12 13% 0.13 0.044 3 22.6 15.1
So, while 7 is the easiest column to make (at 64%), 6 and 8 are the easiest columns to claim (taking 18.1 rolls to claim, on average; this means that 6 and 8 give you the highest expected progress per roll). Columns 2 and 12 are the hardest to claim.
But, if you happen to roll a combination that can make a 2 or a 12, it may be a good idea to take it, because then you only need an expected 15.1 more rolls to claim the column, which is better than any of the other columns.
Overall, this tells me that Sid Sackson did an amazingly good job of making the game balanced!
Intuitively, you are unlikely to blow it with sherpas on 6, 7, and 8, and more likely with, say, 2, 11, and 12. Let's quantify that intuition:
P((6, 7, 8))
0.9197530864197531
P((2, 11, 12))
0.4382716049382716
So there's a 91.97% chance of not blowing it if you have 6-7-8, but only 43.8% if you have 2-11-12.
Let's see which combinations of sherpas are best overall, under a variety of criteria:
# All possible combinations of three sherpas
def top(n, fn, fmt='{:6.3f} {}'.format):
"Print best n triples of sherpas, in order, along with their fn(sherpas) values."
C = Counter({sherpas: fn(sherpas)
for sherpas in combinations(range(2, 13), 3)})
for (sherpas, val) in C.most_common(n):
print(fmt(val, sherpas))
# The combinations of sherpas that are best for not blowing it:
top(30, P)
0.920 (6, 7, 8) 0.914 (5, 7, 8) 0.914 (6, 7, 9) 0.911 (4, 6, 8) 0.911 (6, 8, 10) 0.903 (4, 7, 8) 0.903 (6, 7, 10) 0.895 (5, 6, 8) 0.895 (6, 8, 9) 0.893 (3, 7, 8) 0.893 (4, 7, 9) 0.893 (5, 7, 10) 0.893 (6, 7, 11) 0.890 (2, 7, 8) 0.890 (6, 7, 12) 0.887 (5, 6, 7) 0.887 (7, 8, 9) 0.886 (4, 6, 7) 0.886 (7, 8, 10) 0.883 (2, 6, 8) 0.883 (4, 6, 10) 0.883 (4, 8, 10) 0.883 (6, 8, 12) 0.877 (4, 7, 10) 0.867 (5, 6, 9) 0.867 (5, 8, 9) 0.865 (3, 6, 7) 0.865 (7, 8, 11) 0.864 (2, 6, 7) 0.864 (4, 6, 9)
# The combinations of sherpas that advance the most number of spaces in one roll:
top(30, lambda s: E(s, one))
1.318 (6, 7, 8) 1.296 (5, 7, 8) 1.296 (6, 7, 9) 1.242 (4, 7, 8) 1.242 (6, 7, 10) 1.231 (5, 6, 7) 1.231 (7, 8, 9) 1.228 (5, 6, 8) 1.228 (6, 8, 9) 1.219 (4, 6, 7) 1.219 (7, 8, 10) 1.193 (4, 6, 8) 1.193 (6, 8, 10) 1.184 (3, 7, 8) 1.184 (4, 7, 9) 1.184 (5, 7, 10) 1.184 (6, 7, 11) 1.151 (5, 6, 9) 1.151 (5, 8, 9) 1.148 (2, 7, 8) 1.148 (6, 7, 12) 1.147 (3, 6, 7) 1.147 (7, 8, 11) 1.145 (5, 7, 9) 1.123 (4, 5, 7) 1.123 (7, 9, 10) 1.116 (2, 6, 7) 1.116 (4, 6, 9) 1.116 (5, 8, 10) 1.116 (7, 8, 12)
# The combinations of sherpas that give the most progress towards claiming columns in one roll:
top(30, lambda s: E(s, progress))
0.143 (2, 4, 10) 0.143 (4, 10, 12) 0.140 (2, 8, 12) 0.140 (2, 6, 12) 0.139 (2, 5, 11) 0.139 (3, 9, 12) 0.138 (2, 8, 10) 0.138 (4, 6, 12) 0.138 (2, 6, 10) 0.138 (4, 8, 12) 0.138 (2, 4, 11) 0.138 (3, 10, 12) 0.138 (4, 6, 10) 0.138 (4, 8, 10) 0.138 (4, 9, 12) 0.138 (2, 5, 10) 0.138 (3, 4, 10) 0.138 (4, 10, 11) 0.137 (2, 6, 11) 0.137 (3, 8, 12) 0.137 (2, 7, 12) 0.136 (4, 6, 11) 0.136 (3, 8, 10) 0.136 (2, 9, 12) 0.136 (2, 4, 9) 0.136 (2, 5, 12) 0.136 (5, 10, 12) 0.136 (2, 4, 8) 0.136 (6, 10, 12) 0.136 (2, 3, 10)
Interesting. The results for progress
are quite different from the others: either 2 or 12 shows up in all of the top dozen spots for progress
, but none of the top dozen for the other two.
You may have noticed that even numbers are more common at the top of these lists than odd numbers. That is in part because it is always possible to make some even column, no matter what roll you get, while rolls that do not contain an odd number (or that contain 4 odd numbers) cannot make odd columns:
P((2, 4, 6, 8, 10, 12))
1.0
P((3, 5, 7, 9, 11))
0.875
Now we're ready for the key question: when should I roll and when should I stop? Clearly, if you have fewer than 3 sherpas on the board, and if no column has been claimed, then you should always roll, because you will always be able to make some column. But once you have three sherpas, at some point you will want to stop; every time you roll you risk losing the progress you have made so far. Let's figure out under what conditions you should stop. Here are the points I considered:
Ultimately, you want to optimize the chance of winning the game, but that is too hard a goal for right now. Instead we'll try to maximize total progress (which is a good proxy for winning, at least in the opening phases of the game).
We'll define optimal(sherpas, worth)
to return the expected total worth of an entire turn, assuming optimal play in pairing the dice and in deciding when to stop.
The function optimal
calls the recursive function optimal_1
, which has some additional parameters: sofar
is the total worth accumlated on this turn; maxrolls
is the maximum number of rolls in a turn (otherwise we would recurse forever); and cache
is a cache of intermediate results.
Why did I arrange things that way? Because I know that this is a dynamic programming problem, where the cache will be important in making the function run faster. But @lru_cache
won't do the job properly: if the correct move is to stop when you have accumulated a certain amount of worth sofar
, then stopping should be the correct move regardless of what the maxdepth
is. But @lru_cache
would compute a separate result for each value of maxdepth
; that would be doing more computation than necessary. So I manage my own cache, which (for a given set of sherpas and worth function) depends only on sofar
, not on maxdepth
.
The overall approach for optimal
/ optimal_1
is to stop (and return the accumlated worth sofar
) when maxdepth
hits zero, and otherwise to consider the weighted mean of optimal play over all possible rolls, and compare that to the result you would get from stopping (namely, sofar
).
It is unfortunate that the way I defined the expectation function, E
, I can't reuse it here; I need to define a new function, weighted_mean
.
I also define decision
to return 'ROLL'
or 'STOP'
, whichever is optimal.
I also define stopping_point
to say at what point (in terms of total worth) you should stop.
Why do I define optimal_1
to have a maxrolls
of 16? By empirical tests (not shown here), I determined that any larger value gives the same results (and just takes longer to compute), but any smaller value gives different (sub-optimal) results.
@lru_cache(1024)
def optimal(sherpas, worth=progress):
"The expected total worth, assuming optimal play."
return optimal_1(sherpas, worth, {})
@lru_cache(1024)
def stopping_point(sherpas, worth=progress):
"At what total worth should you stop rather than roll?"
cache = {}
optimal_1(sherpas, worth, cache)
return min(sofar for sofar in cache if sofar == cache[sofar])
def optimal_1(sherpas, worth, cache, sofar=None, maxrolls=16):
"The expected total worth, with a cache that depends only on `sofar`."
if sofar is None:
sofar = sum(map(worth, sherpas))
if sofar not in cache:
if maxrolls == 0:
cache[sofar] = sofar
else:
def roll_it(roll):
this_turn = best_pairing(sherpas, roll, worth)
return (0 if this_turn == 0 else
optimal_1(sherpas, worth, cache, sofar + this_turn, maxrolls - 1))
cache[sofar] = max(sofar, weighted_mean(roll_it, all_rolls))
return cache[sofar]
def weighted_mean(func, counter):
"The weighted mean of func(x), for each x is counter."
return sum(func(x) * counter[x]
for x in counter) / sum(counter.values())
def decision(sherpas, worth=progress, sofar=None):
"Return 'STOP' or 'ROLL', whichever gives a higher score."
if sofar is None:
sofar = sum(map(worth, sherpas))
return ('STOP' if optimal(sherpas, worth) == sofar else 'ROLL')
Some examples. First consider sherpas on 2-3-12, using the one
worth function:
optimal((2, 3, 12), one)
3
decision((2, 3, 12), one)
'STOP'
stopping_point((2, 3, 12), one)
3
Now let's look at 6-7-8, using the (default) progress
worth function:
optimal((6, 7, 8))
0.6579805166909175
decision((6, 7, 8))
'ROLL'
stopping_point((6, 7, 8))
1.44055944055944
This might look like a contradiction: the expected value is 0.65798 (that is, total progress equivalent to about 2/3 towards claiming a column), but the stopping point is 1.44 (almost one and a half columns of progress, spread over the 3 columns). How can that be? The answer is that with optimal play, more than half the time you will blow it, but if you don't blow it, you're almost half way to winning the game. In other words, since there is a 92% chance of not blowing it on any one roll, the optimal decision is to keep rolling and rolling until you either blow it or accumulate 1.44 columns worth of progress.
Here's one more example:
optimal((2, 7, 10))
0.6134923445225983
decision((2, 7, 10))
'ROLL'
stopping_point((2, 7, 10))
0.8388278388278387
# What do we need to reach the stopping point?
{x: sum(map(progress, x))
for x in [(2, 7, 10, 2), (2, 7, 10, 10, 10), (2, 7, 10, 7, 7)]}
{(2, 7, 10, 2): 0.8864468864468864, (2, 7, 10, 7, 7): 0.7069597069597069, (2, 7, 10, 10, 10): 0.8388278388278387}
In other words, if you have sherpas that have advanced one space in each of the 2-7-10 column, you should roll again, and if you get another 2 you should immediately stop; if you get two 7s you should roll again; and if you get two 10s you should stop.
Now let's see what triples of sherpas are most valuable under optimal play:
%time top(30, optimal)
0.867 (2, 3, 12) 0.867 (2, 11, 12) 0.810 (2, 4, 12) 0.810 (2, 10, 12) 0.778 (2, 5, 12) 0.778 (2, 9, 12) 0.758 (2, 6, 12) 0.758 (2, 8, 12) 0.744 (2, 7, 12) 0.733 (2, 3, 11) 0.733 (3, 11, 12) 0.698 (6, 7, 12) 0.698 (2, 7, 8) 0.689 (2, 6, 8) 0.689 (6, 8, 12) 0.686 (6, 8, 10) 0.686 (4, 6, 8) 0.676 (2, 3, 4) 0.676 (2, 3, 10) 0.676 (2, 4, 11) 0.676 (2, 10, 11) 0.676 (3, 4, 12) 0.676 (3, 10, 12) 0.676 (4, 11, 12) 0.676 (10, 11, 12) 0.658 (6, 7, 8) 0.652 (6, 7, 9) 0.652 (5, 7, 8) 0.644 (2, 3, 5) 0.644 (2, 3, 9) CPU times: user 3min 8s, sys: 1.71 s, total: 3min 9s Wall time: 3min 23s
And here are the triples that will have you rolling the longest:
%time top(30, stopping_point)
1.441 (6, 7, 8) 1.429 (4, 6, 8) 1.429 (6, 8, 10) 1.395 (5, 7, 8) 1.395 (6, 7, 9) 1.267 (4, 7, 8) 1.267 (6, 7, 10) 1.195 (4, 6, 10) 1.195 (4, 8, 10) 1.186 (2, 7, 8) 1.186 (6, 7, 12) 1.183 (4, 7, 9) 1.183 (5, 7, 10) 1.172 (3, 7, 8) 1.172 (6, 7, 11) 1.152 (2, 6, 8) 1.152 (6, 8, 12) 1.141 (5, 6, 8) 1.141 (6, 8, 9) 1.088 (4, 7, 10) 1.050 (4, 6, 7) 1.050 (7, 8, 10) 1.002 (5, 6, 7) 1.002 (7, 8, 9) 0.942 (4, 6, 9) 0.942 (5, 8, 10) 0.931 (4, 8, 9) 0.931 (5, 6, 10) 0.928 (2, 6, 7) 0.928 (7, 8, 12) CPU times: user 3min 7s, sys: 1.79 s, total: 3min 9s Wall time: 3min 34s
You can use the stopping_point
or decision
functions to decide how to maximize your progress in any given situation. But you can't quite use it to maximize your chances of winning the game. That would require some more work:
How should I pair up the four dice?
opponent their chance of claiming the column.
Should I stop or should I roll?
None of these are insurmountable, but we won't attempt to solve them here.