On April 28, Gary Antonik had another Numberplay column that quotes my friend Bill Gosper. (Gosper often presents more advanced puzzles in the math-fun mailing list.) This puzzle was:
For the expression
N U M + B E R = P L A Y
,
- Which distinct numerals (each different) can be substituted for letters to make a valid expression?
- How many solutions are there?
I tackled this type of problem (known as a cryptarithmetic or alphametic problem) in my Udacity class CS 212.
My initial approach was simple: when in doubt, use brute force. I try all permutations of digits replacing letters (that should be quick and easy—there can be at most 10 factorial or 3.6 million permutations), then for each one, I use Python's eval
function to see if the resulting string is a valid expression. The basic idea is simple, but there are a few complications to worry about:
=
and Python uses ==
for equality; we'll allow either one in formulas.012
) is illegal (but 0
by itself is ok).2 * BE or not 2
, the or
and not
are Python keywords.implementing my function solve
to return an iterator, which yields solutions one at a time; you can get the first one with next
or all of them with set
.
solve
¶Below we see that solve
works by generating every way to replace letters in the formula with numbers,
and then filtering them to keep only valid strings (ones that evaluate to true and have no leading zero).
The str.translate
method is used to do the replacements.
import itertools
import re
def solve(formula):
"""Given a formula like 'NUM + BER = PLAY', fill in digits to solve it.
Generate all valid digit-filled-in strings."""
return filter(valid, letter_replacements(formula))
def letter_replacements(formula):
"""All possible replacements of letters with digits in formula."""
formula = formula.replace(' = ', ' == ') # Allow = or ==
letters = cat(set(re.findall('[A-Z]', formula)))
for digits in itertools.permutations('1234567890', len(letters)):
yield formula.translate(str.maketrans(letters, cat(digits)))
def valid(exp):
"""Expression is valid iff it has no leading zero, and evaluates to true."""
try:
return not leading_zero(exp) and eval(exp) is True
except ArithmeticError:
return False
cat = ''.join # Function to concatenate strings
leading_zero = re.compile(r'\b0[0-9]').search # Function to check for illegal number
next(solve('NUM + BER = PLAY'))
'489 + 537 == 1026'
%time len(set(solve('NUM + BER = PLAY')))
CPU times: user 17.2 s, sys: 61.3 ms, total: 17.3 s Wall time: 17.4 s
96
faster_solve
¶Depending on your computer, that probably took 15 or 20 seconds. Can we make it faster? To answer the question, I start by profiling to see where the time is spent. I can use the magic function %prun
to profile:
%prun next(solve('NUM + BER = PLAY'))
We see that about 2/3 of the time is spent in eval
. So let's eliminate the calls to eval
. That should be doable, because the expression we are evaluating is basically the same each time, but with different permutations of digits filled in. We could save a lot of work if we convert the expression into a Python function, compile that function just once, and then call the function for each of the 3.6 million permutations of digits. We want to take an expression such as:
"NUM + BER == PLAY"
and transform it into the Python function:
(lambda A,B,E,L,M,N,P,R,U,Y:
(100*N+10*U+M) + (100*B+10*E+R) == (1000*P+100*L+10*A+Y))
Actually that's not quite right. The rules say that "N", "B", and "P" cannot be zero. So the function should be:
(lambda A,B,E,L,M,N,P,R,U,Y:
B and N and P and ((100*N+10*U+M) + (100*B+10*E+R) == (1000*P+100*L+10*A+Y)))
Here is the code to compile a formula into a Python function:
def compile_formula(formula, verbose=False):
"""Compile formula into a function. Also return letters found, as a str,
in same order as parms of function. For example, 'YOU == ME**2' returns
(lambda E,M,O,U,Y: M and Y and ((100*Y+10*O+U) == (10*M+E)**2), 'YMEUO'"""
formula = formula.replace(' = ', ' == ')
letters = cat(sorted(set(re.findall('[A-Z]', formula))))
firstletters = sorted(set(re.findall(r'\b([A-Z])[A-Z]', formula)))
body = re.sub('[A-Z]+', compile_word, formula)
body = ' and '.join(firstletters + [body])
fn = 'lambda {}: {}'.format(','.join(letters), body)
if verbose: print(fn)
assert len(letters) <= 10
return eval(fn), letters
def compile_word(matchobj):
"Compile the word 'YOU' as '(100*Y + 10*O + U)'."
word = matchobj.group()
terms = reversed([mult(10**i, L) for (i, L) in enumerate(reversed(word))])
return '(' + '+'.join(terms) + ')'
def mult(factor, var): return var if factor == 1 else str(factor) + '*' + var
compile_formula("YOU == ME**2", verbose=True)
lambda E,M,O,U,Y: M and Y and (100*Y+10*O+U) == (10*M+E)**2
(<function __main__.<lambda>(E, M, O, U, Y)>, 'EMOUY')
compile_formula("NUM + BER = PLAY", verbose=True)
lambda A,B,E,L,M,N,P,R,U,Y: B and N and P and (100*N+10*U+M) + (100*B+10*E+R) == (1000*P+100*L+10*A+Y)
(<function __main__.<lambda>(A, B, E, L, M, N, P, R, U, Y)>, 'ABELMNPRUY')
Now we're ready for the faster version of solve
:
def faster_solve(formula):
"""Given a formula like 'NUM + BER == PLAY', fill in digits to solve it.
This version precompiles the formula and generates all digit-filled-in strings."""
fn, letters = compile_formula(formula)
for digits in itertools.permutations((1,2,3,4,5,6,7,8,9,0), len(letters)):
try:
if fn(*digits):
yield formula.translate(str.maketrans(letters, cat(map(str, digits))))
except ArithmeticError:
pass
next(faster_solve('NUM + BER = PLAY'))
'587 + 439 = 1026'
%time len(list(faster_solve('NUM + BER = PLAY')))
CPU times: user 1.32 s, sys: 8.41 ms, total: 1.33 s Wall time: 1.33 s
96
We get the same answer, but faster_solve
is 15 times faster than solve
.
examples = """
NUM + BER = PLAY
YOU = ME ** 2
SEND + MORE = MONEY
PI * R**2 = AREA
WRONG + WRONG = RIGHT
WRIGHT + WRIGHT = TO * FLY + FLIGHT
TWO + TWENTY = TWELVE + TEN
A**2 + B**2 = C**2
AYH**2 + BEE**2 = SEE**2
RAMN = R**3 + RM**3 = N**3 + RX**3
MON-EY = EVIL**(1/2)
ALPHABET + LETTERS = SCRABBLE
SIN**2 + COS**2 = UNITY
POTATO + TOMATO = PUMPKIN
ODD * ODD = FREAKY
DOUBLE + DOUBLE + TOIL = TROUBLE
WASH + YOUR = HANDS
SPEED + LIMIT = FIFTY
TERRIBLE + NUMBER = THIRTEEN
SEVEN + SEVEN + SIX = TWENTY
EIGHT + EIGHT + TWO + ONE + ONE = TWENTY
ELEVEN + NINE + FIVE + FIVE = THIRTY
NINE + SEVEN + SEVEN + SEVEN = THIRTY
FOURTEEN + TEN + TEN + SEVEN = FORTYONE
HAWAII + IDAHO + IOWA + OHIO = STATES
VIOLIN * 2 + VIOLA = TRIO + SONATA
SEND + A + TAD + MORE = MONEY
ZEROES + ONES = BINARY
DCLIZ + DLXVI = MCCXXV
COUPLE + COUPLE = QUARTET
FISH + N + CHIPS = SUPPER
SATURN + URANUS + NEPTUNE + PLUTO = PLANETS
PLUTO not in {PLANETS}
EARTH + AIR + FIRE + WATER = NATURE
TWO * TWO = SQUARE
HIP * HIP = HURRAY
NORTH / SOUTH = EAST / WEST
NAUGHT ** 2 = ZERO ** 3
I + THINK + IT + BE + THINE = INDEED
DO + YOU + FEEL = LUCKY
WELL - DO + YOU = PUNK
NOW + WE + KNOW + THE = TRUTH
SORRY + TO + BE + A + PARTY = POOPER
SORRY + TO + BUST + YOUR = BUBBLE
STEEL + BELTED = RADIALS
ABRA + CADABRA + ABRA + CADABRA = HOUDINI
I + GUESS + THE + TRUTH = HURTS
LETS + CUT + TO + THE = CHASE
THATS + THE + THEORY = ANYWAY
TWO + TWO = FOUR
X / X = X
A**N + B**N = C**N and N > 1
ATOM**0.5 = A + TO + M
GLITTERS is not GOLD
ONE < TWO and FOUR < FIVE
ONE < TWO < THREE < TWO * TWO
(2 * BE or not 2 * BE) = THE + QUES-TION
sum(range(AA)) = BB
sum(range(POP)) = BOBO
ODD + ODD = EVEN
ROMANS+ALSO+MORE+OR+LESS+ADDED = LETTERS
FOUR+ONE==FIVE and ONE+ONE+ONE+ONE+ONE==FIVE
TEN+SEVEN+SEVEN+SEVEN+FOUR+FOUR+ONE = FORTY
NINETEEN+THIRTEEN+THREE+2*TWO+3*ONE = FORTYTWO
IN + ARCTIC + TERRAIN + AN + ANCIENT + EERIE + ICE + TRACT + I + ENTER + A + TRANCE = FLATIANA
ONE < TWO < THREE < SEVEN - THREE < THREE + TWO < THREE + THREE < SEVEN < SEVEN + ONE < THREE * THREE
AN + ACCELERATING + INFERENTIAL + ENGINEERING + TALE + ELITE + GRANT + FEE + ET + CETERA = ARTIFICIAL + INTELLIGENCE
""".strip().splitlines()
def run(examples):
for e in examples:
print('{:45}| {}'.format(e, next(faster_solve(e))))
return len(examples)
%time run(examples)
NUM + BER = PLAY | 587 + 439 = 1026 YOU = ME ** 2 | 576 = 24 ** 2 SEND + MORE = MONEY | 9567 + 1085 = 10652 PI * R**2 = AREA | 96 * 7**2 = 4704 WRONG + WRONG = RIGHT | 37081 + 37081 = 74162 WRIGHT + WRIGHT = TO * FLY + FLIGHT | 413058 + 413058 = 82 * 769 + 763058 TWO + TWENTY = TWELVE + TEN | 784 + 781279 = 781351 + 712 A**2 + B**2 = C**2 | 3**2 + 4**2 = 5**2 AYH**2 + BEE**2 = SEE**2 | 760**2 + 522**2 = 922**2 RAMN = R**3 + RM**3 = N**3 + RX**3 | 1729 = 1**3 + 12**3 = 9**3 + 10**3 MON-EY = EVIL**(1/2) | 108-42 = 4356**(1/2) ALPHABET + LETTERS = SCRABBLE | 17531908 + 7088062 = 24619970 SIN**2 + COS**2 = UNITY | 235**2 + 142**2 = 75389 POTATO + TOMATO = PUMPKIN | 168486 + 863486 = 1031972 ODD * ODD = FREAKY | 677 * 677 = 458329 DOUBLE + DOUBLE + TOIL = TROUBLE | 798064 + 798064 + 1936 = 1598064 WASH + YOUR = HANDS | 5291 + 6748 = 12039 SPEED + LIMIT = FIFTY | 40221 + 36568 = 76789 TERRIBLE + NUMBER = THIRTEEN | 45881795 + 302758 = 46184553 SEVEN + SEVEN + SIX = TWENTY | 68782 + 68782 + 650 = 138214 EIGHT + EIGHT + TWO + ONE + ONE = TWENTY | 52371 + 52371 + 104 + 485 + 485 = 105816 ELEVEN + NINE + FIVE + FIVE = THIRTY | 797275 + 5057 + 4027 + 4027 = 810386 NINE + SEVEN + SEVEN + SEVEN = THIRTY | 3239 + 49793 + 49793 + 49793 = 152618 FOURTEEN + TEN + TEN + SEVEN = FORTYONE | 19564882 + 482 + 482 + 78082 = 19643928 HAWAII + IDAHO + IOWA + OHIO = STATES | 510199 + 98153 + 9301 + 3593 = 621246 VIOLIN * 2 + VIOLA = TRIO + SONATA | 176478 * 2 + 17645 = 2076 + 368525 SEND + A + TAD + MORE = MONEY | 9283 + 7 + 473 + 1062 = 10825 ZEROES + ONES = BINARY | 698392 + 3192 = 701584 DCLIZ + DLXVI = MCCXXV | 62049 + 60834 = 122883 COUPLE + COUPLE = QUARTET | 653924 + 653924 = 1307848 FISH + N + CHIPS = SUPPER | 5718 + 3 + 98741 = 104462 SATURN + URANUS + NEPTUNE + PLUTO = PLANETS | 127503 + 502351 + 3947539 + 46578 = 4623971 PLUTO not in {PLANETS} | 63985 not in {6314287} EARTH + AIR + FIRE + WATER = NATURE | 67432 + 704 + 8046 + 97364 = 173546 TWO * TWO = SQUARE | 807 * 807 = 651249 HIP * HIP = HURRAY | 958 * 958 = 917764 NORTH / SOUTH = EAST / WEST | 51304 / 61904 = 7260 / 8760 NAUGHT ** 2 = ZERO ** 3 | 328509 ** 2 = 4761 ** 3 I + THINK + IT + BE + THINE = INDEED | 1 + 64125 + 16 + 73 + 64123 = 128338 DO + YOU + FEEL = LUCKY | 57 + 870 + 9441 = 10368 WELL - DO + YOU = PUNK | 5277 - 13 + 830 = 6094 NOW + WE + KNOW + THE = TRUTH | 673 + 38 + 9673 + 128 = 10512 SORRY + TO + BE + A + PARTY = POOPER | 80556 + 20 + 34 + 9 + 19526 = 100145 SORRY + TO + BUST + YOUR = BUBBLE | 94665 + 24 + 1092 + 5406 = 101187 STEEL + BELTED = RADIALS | 87336 + 936732 = 1024068 ABRA + CADABRA + ABRA + CADABRA = HOUDINI | 7457 + 1797457 + 7457 + 1797457 = 3609828 I + GUESS + THE + TRUTH = HURTS | 5 + 26811 + 478 + 49647 = 76941 LETS + CUT + TO + THE = CHASE | 9478 + 127 + 75 + 704 = 10384 THATS + THE + THEORY = ANYWAY | 86987 + 863 + 863241 = 951091 TWO + TWO = FOUR | 734 + 734 = 1468 X / X = X | 1 / 1 = 1 A**N + B**N = C**N and N > 1 | 3**2 + 4**2 = 5**2 and 2 > 1 ATOM**0.5 = A + TO + M | 1296**0.5 = 1 + 29 + 6 GLITTERS is not GOLD | 35499278 is not 3651 ONE < TWO and FOUR < FIVE | 351 < 703 and 2386 < 2491 ONE < TWO < THREE < TWO * TWO | 431 < 674 < 62511 < 674 * 674 (2 * BE or not 2 * BE) = THE + QUES-TION | (2 * 13 or not 2 * 13) = 843 + 7239-8056 sum(range(AA)) = BB | sum(range(11)) = 55 sum(range(POP)) = BOBO | sum(range(101)) = 5050 ODD + ODD = EVEN | 655 + 655 = 1310 ROMANS+ALSO+MORE+OR+LESS+ADDED = LETTERS | 975348+3187+5790+79+1088+36606 = 1022098 FOUR+ONE==FIVE and ONE+ONE+ONE+ONE+ONE==FIVE | 1380+345==1725 and 345+345+345+345+345==1725 TEN+SEVEN+SEVEN+SEVEN+FOUR+FOUR+ONE = FORTY | 520+12820+12820+12820+4937+4937+902 = 49756 NINETEEN+THIRTEEN+THREE+2*TWO+3*ONE = FORTYTWO| 42415114+56275114+56711+2*538+3*841 = 98750538 IN + ARCTIC + TERRAIN + AN + ANCIENT + EERIE + ICE + TRACT + I + ENTER + A + TRANCE = FLATIANA| 42 + 379549 + 5877342 + 32 + 3294825 + 88748 + 498 + 57395 + 4 + 82587 + 3 + 573298 = 10354323 ONE < TWO < THREE < SEVEN - THREE < THREE + TWO < THREE + THREE < SEVEN < SEVEN + ONE < THREE * THREE| 321 < 483 < 45711 < 91612 - 45711 < 45711 + 483 < 45711 + 45711 < 91612 < 91612 + 321 < 45711 * 45711 AN + ACCELERATING + INFERENTIAL + ENGINEERING + TALE + ELITE + GRANT + FEE + ET + CETERA = ARTIFICIAL + INTELLIGENCE| 59 + 577404251698 + 69342491650 + 49869442698 + 1504 + 40614 + 82591 + 344 + 41 + 741425 = 5216367650 + 691400684974 CPU times: user 51.2 s, sys: 246 ms, total: 51.4 s Wall time: 51.7 s
67