The word game Boggle (rules here) is a fun game to play with friends, but not so fun to play against a computer. It is all too simple for a computer to find all the valid words in a Boggle board (such as the one shown above) in less than a second. We'll see how to do that as a straightforward programming exercise. Then we'll consider a more challenging problem:
The goal of Inverse Boggle is to find an arrangement of letters on a board that scores the most points. It is called Inverse Boggle because, instead of "give me a board and I'll find the words" it is "give me the word list and I'll find a board with lots of words."
It is not feasible to try all possible boards ($26^{5 \times 5} \approx 10^{35}$ possible $5 \times 5$ boards), so we will use hill-climbing, a heuristic search technique that does not guarantee finding an optimal solution.
I'll represent a board as a dict where the keys are (x, y)
coordinates of squares, and the values are individual letters (except that QU
occupies a single square). A board will also hold:
board.squares
: a list of squares in row-major order.board.neighbors
: a dict of {square: neighbors}
.board.format
: a string format method to print the board nicely using __repr__
.import random
import itertools
class Board(dict):
"""A board is a dict of {(x, y): L}."""
def __init__(self, letters=None):
letters = [('QU' if L == 'Q' else L) for L in letters if 'A' <= L <= 'Z']
n = int(len(letters) ** 0.5)
self.squares = [(x, y) for y in range(n) for x in range(n)]
self.neighbors = {s: neighbors(s, n) for s in self.squares}
self.format = ('\n'.join(['{:2s}' * n] * n)).format
self.update(zip(self.squares, letters))
def __repr__(self):
return self.format(*(self[s].capitalize() for s in self.squares))
def neighbors(square, n) -> list:
"""All the squares that neighbor a square."""
(x, y) = square
return [(x+dx, y+dy) for dx in (-1, 0, 1) for dy in (-1, 0, 1)
if 0 <= x+dx < n and 0 <= y+dy < n and not (dx == dy == 0)]
board = Board('PUZZL WORDE BOGGL SEARC FINDH')
board
P U Z Z L W O R D E B O G G L S E A R C F I N D H
We'll read in a word list, convert to uppercase, and keep all the words that are three letters or more. The variable WORDS
holds the set of valid words. The function total_score
computes the total score of a set of words. It does not take a board as an argument, and so it does not check if the words can actually be made on the board. It does check that the words are in the WORDS
list of valid words.
! [ -e enable1.txt ] || curl -O http://norvig.com/ngrams/enable1.txt
! wc -w enable1.txt
172820 enable1.txt
def total_score(found, scores=[0, 0, 0, 1, 1, 2, 3, 5] + [11] * 99) -> int:
"""The total score for the words found, according to the rules."""
return sum([scores[len(w)] for w in found & WORDS])
WORDS = {w for w in open('enable1.txt').read().upper().split() if len(w) >= 3}
len(WORDS)
172724
The strategy for finding words:
[(0, 0)]
) and a prefix string consisting of the letters visited in the path; for this one-square path on board
(the one with PUZZLE
in it), the prefix string would be 'P'
. On a 5 by 5 board we would have 25 initial paths of length one.found
.board
, the path [(0, 0)]
with prefix 'P'
would be continued to three neighbors:[(0, 0), (1, 0)]
, 'PU'
: here 'PU'
is a prefix of a word so continue this path.[(0, 0), (0, 1)]
, 'PW'
: here 'PW'
is not a prefix of any word so do not continue this path.[(0, 0), (1, 1)]
, 'PO'
: here 'PO'
is a prefix of a word so continue this path.'PO'
is as follows:[(0, 0), (1, 1), (0, 1)]
, 'POW'
: here 'POW'
is both a word and a prefix, so record it and continue.PREFIXES
of all words in the word list.def word_prefixes(words) -> set:
"The set of all non-empty, non-full-word prefixes of each word in a word list."
return {word[:i] for word in words for i in range(1, len(word))}
PREFIXES = word_prefixes(WORDS) # Precompute this once.
def find_words(board, words=WORDS, prefixes=PREFIXES):
"""Find all words in a Boggle board by recursively continuing paths that are prefixes of words."""
found = set() # Accumulate the words found in the set `found`
def continue_path(path, prefix):
if prefix in words:
found.add(prefix)
if prefix in prefixes:
for s in board.neighbors[path[-1]]:
if s not in path:
continue_path(path + [s], prefix + board[s])
# Body of find_words: Start paths from each of the squares on the board
for s in board.squares:
continue_path([s], board[s])
return found
How many words can we find? And how long does it take?
%time len(find_words(board))
CPU times: user 4.39 ms, sys: 43 µs, total: 4.43 ms Wall time: 4.5 ms
252
Can we see the words and summarize them?
import textwrap
def report(board, verbose=True):
found = find_words(board, WORDS, PREFIXES)
if verbose:
print(f'{len(found)} words found:')
print(textwrap.fill(' '.join(sorted(found)), width=90))
print(f'Score of {total_score(found)} for {len(found)} words on board:\n{board}')
report(board)
252 words found: AGE AGED AGES AGGRO AGGROS AGO AIN AIS AND ANE ANES ANI ANIS ANISE ARC ARCH ARGLE ARGLED BEAD BEAGLE BEAN BEAR BEARD BEG BEGAN BEGGAR BEGGED BEGROAN BEN BEND BOA BOAR BOARD BOG BOGAN BOGGED BOGGLE BOGGLED BOO BOOR BOOS BOP BORDEL BOS BOURG BOW CRAG CRAGGED CRANE CRANES DAG DAGGLE DAGGLED DAGO DAGOES DAGOS DAIS DARN DEGAGE DEL DRAG DRAGGED DRAGGLE DRAGGLED DRAIN DROOP DROP EAGLE EAR EARL EARN EDGE EDGES EFS EGAD EGG EGGAR EGGED EGO EGOS ELD END ENDARCH ENRAGE ENRAGED EOSIN FEAR FEN FENAGLE FENAGLED FEND FIAR FIE FIEND FIN FINAGLE FINAGLED FIND FINE FINES GAD GAE GAEN GAES GAG GAGE GAGED GAGES GAIN GAN GANE GANEF GANEFS GAR GARGLE GARGLED GARNI GEAR GED GEL GELD GEN GLED GOA GOAD GOB GOBO GOBOES GOBOS GOBS GOES GOO GOOP GOOS GOOSE GOR GORGE GORGED GOURD GOURDE GRAD GRAIN GRAN GRAND GROAN GROG GROUP GROW IFS INARCH LED LEDGE LEDGES LEG LEZ NAE NAG NAGGED NAIF NAIFS NAOS NARC NARD NEAR NEB NEBS NEIF NEIFS OAR OBE OBES OBOE OBOES OES ORGAN ORGANISE ORZO OSE OUR POOR POROSE POUR POW PUR PURGE PURGED PURGES PUZZLE PUZZLED RAD RAG RAGE RAGED RAGES RAGGED RAGGLE RAIN RAISE RAN RAND RANI RANIS ROAD ROAN ROAR ROB ROBE ROBES ROBS ROE ROES ROOSE ROSE ROSIN ROUP ROW SEA SEAR SEARCH SEG SEGGAR SEGO SEI SEIF SEN SEND SIN SINE SOAR SOB SOGGED SORD SORGO SOW UPO URD URGE URGED URGES WOAD WOE WOES WOG WOO WOOS WOP WORD WOS WURZEL ZED ZOO ZOOS Score of 449 for 252 words on board: P U Z Z L W O R D E B O G G L S E A R C F I N D H
Let's test that the Q
works, and that a 4x4 board works:
report(Board('QEST ANSI XWEO YRSN'))
125 words found: ANE ANES ANEW ANSWER ANSWERS AWE AWES AWN AWNS AWRY ENOSIS ENS EON EONS ERS ESES ESS ESSES ION IONS ITS NAE NAW NEIST NENE NEON NEONS NESS NESSES NEST NESTS NEW NEWNESS NEWS NOES NOESIS NOISE NOISES NOS NOSE NOSES NOSIER OES ONE ONENESS ONERY ONES ONS OSE OSES OSIER OSIERS QUA QUEAN QUEANS QUEST QUESTION QUESTIONER QUESTIONERS QUESTIONS QUESTS REI REIS RENEST RENESTS RES RESIST REST RESTS REWAN REWAX SEA SEI SEIS SEISE SEISES SEN SENE SENSE SENSES SER SERS SESSION SEW SEWAN SEWANS SEWN SEWS SIS SISES SIT SITS SNAW SNAWS SON SONE SONES SONS SOS STIES SWAN SWANS TIE TIER TIERS TIES TIS WAE WAENESS WAES WAN WANE WANES WANS WAX WAXY WEN WENS WEST WESTS WREN WRENS WREST WRESTS WRY Score of 245 for 125 words on board: QuE S T A N S I X W E O Y R S N
I'll tackle the Inverse Boggle problem with a hill-climbing approach:
def boggle_hill_climbing(board, words=WORDS, prefixes=PREFIXES, repeat=500):
"""Solve Inverse Boggle by hill-climbing: find a high-scoring board by
starting with one and changing it."""
board = Board(board[s] for s in board.squares) # Copy board, so we don't mutate original
best_score = total_score(find_words(board, words, prefixes))
for _ in range(repeat):
s, old_letter = mutate_boggle(board)
new_score = total_score(find_words(board, words, prefixes))
if new_score >= best_score:
best_score = new_score
else:
board[s] = old_letter # Change back
return board
# The distribution of letters in the 16-cube version of the game
LETTERS = 3 * 'AABCDEEEGHIILMNOOPRSTUY' + 'AADEFFIJKKLLNNQRSSTTUVVWWXZ'
def mutate_boggle(board, letters=LETTERS):
"""Make a random change to one letter in the board"""
s = random.choice(board.squares)
old_letter = board[s]
board[s] = random.choice(letters)
return s, old_letter
def random_board(n=5, letters=LETTERS) -> Board:
"""Return a random Boggle board."""
return Board(random.sample(letters, n * n))
Let's generate a random board and see if we can improve it:
board2 = random_board(5)
report(board2)
177 words found: ABIOTIC ABO ALE ALOOF ALT ALTO ATT ATTIC AUK BABE BABOO BABOOL BABU BAKE BAL BALE BALK BAT BATT BAUBEE BEE BEEP BIO BIOTIC BIPOD BOA BOAT BOD BOLA BOLE BOLT BOO BOOT BOP BOT BOTA BOTT BOTTLE BOY BUB BUBAL BUBALE BUBO BUOY CITOLA CITOLE DIOL DIT DITA DITTO DOPY DOT DOTTLE EAT EAU ELK FIB FOB FOOL FOOT FOOTBOY FOOTLE FOP HEP HIP HOB HOBO HOE HOOD HOOF HOOP HOOT HOP HOPE HOY HYP HYPO IODIC IOTA KAB KAE KALE KAT KEA KELOID KLOOF KUE LAB LAKE LAT LATI LEA LEAK LEK LEKU LOB LOBO LOO LOOF LOOP LOOPY LOOT LOT LOTA LOTI LOTIC LOTTO LOUP LOUPE OAK OAT OBI OBOE OBOL OBOLE ODIC OFT OLE OLEA OOH OOT OOTID OPE OTIC OTTO OUPH OUPHE PEE PEH PHI PIBAL POD POH POI POOD POOF POOH POOL POOP POT POTBOY POTTLE POTTO PUB PUKE TAB TABOO TABU TAE TAEL TAKE TALE TALK TAO TAU TAUPE TIC TIT TITLE TOD TOIT TOLA TOLE TOO TOOL TOOT TOOTLE TOP TOPI TOT TOTAL TOUPEE UKE UPO YIP YOB YOU Score of 234 for 177 words on board: C I T L E D T O A K F O B U B P I O P E I Y H E U
report(boggle_hill_climbing(board2), False)
Score of 5782 for 1577 words on board: V I S E N D E T T S R A L A R S I M E C E D N S H
Impressive! We got roughly a ten-fold improvement in score after 500 repetitions.
We can do the same with our original board
:
report(board)
252 words found: AGE AGED AGES AGGRO AGGROS AGO AIN AIS AND ANE ANES ANI ANIS ANISE ARC ARCH ARGLE ARGLED BEAD BEAGLE BEAN BEAR BEARD BEG BEGAN BEGGAR BEGGED BEGROAN BEN BEND BOA BOAR BOARD BOG BOGAN BOGGED BOGGLE BOGGLED BOO BOOR BOOS BOP BORDEL BOS BOURG BOW CRAG CRAGGED CRANE CRANES DAG DAGGLE DAGGLED DAGO DAGOES DAGOS DAIS DARN DEGAGE DEL DRAG DRAGGED DRAGGLE DRAGGLED DRAIN DROOP DROP EAGLE EAR EARL EARN EDGE EDGES EFS EGAD EGG EGGAR EGGED EGO EGOS ELD END ENDARCH ENRAGE ENRAGED EOSIN FEAR FEN FENAGLE FENAGLED FEND FIAR FIE FIEND FIN FINAGLE FINAGLED FIND FINE FINES GAD GAE GAEN GAES GAG GAGE GAGED GAGES GAIN GAN GANE GANEF GANEFS GAR GARGLE GARGLED GARNI GEAR GED GEL GELD GEN GLED GOA GOAD GOB GOBO GOBOES GOBOS GOBS GOES GOO GOOP GOOS GOOSE GOR GORGE GORGED GOURD GOURDE GRAD GRAIN GRAN GRAND GROAN GROG GROUP GROW IFS INARCH LED LEDGE LEDGES LEG LEZ NAE NAG NAGGED NAIF NAIFS NAOS NARC NARD NEAR NEB NEBS NEIF NEIFS OAR OBE OBES OBOE OBOES OES ORGAN ORGANISE ORZO OSE OUR POOR POROSE POUR POW PUR PURGE PURGED PURGES PUZZLE PUZZLED RAD RAG RAGE RAGED RAGES RAGGED RAGGLE RAIN RAISE RAN RAND RANI RANIS ROAD ROAN ROAR ROB ROBE ROBES ROBS ROE ROES ROOSE ROSE ROSIN ROUP ROW SEA SEAR SEARCH SEG SEGGAR SEGO SEI SEIF SEN SEND SIN SINE SOAR SOB SOGGED SORD SORGO SOW UPO URD URGE URGED URGES WOAD WOE WOES WOG WOO WOOS WOP WORD WOS WURZEL ZED ZOO ZOOS Score of 449 for 252 words on board: P U Z Z L W O R D E B O G G L S E A R C F I N D H
report(boggle_hill_climbing(board), False)
Score of 5690 for 1511 words on board: L P C A P N A R O M D E T I L S E S E N R T N R G
Again, roughly a ten-fold improvement.
Now, let's start from a very high-scoring board, identified by Justin Boyan in his Ph.D. thesis, and see if we can improve it:
boyan = Board('RSTCS DEIAE GNLRP EATES MSSID')
report(boyan, False)
Score of 10112 for 2290 words on board: R S T C S D E I A E G N L R P E A T E S M S S I D
report(boggle_hill_climbing(boyan), False)
Score of 10112 for 2290 words on board: R S T C S D E I A E G N L R P E A T E S M S S I D
Sadly, we were not able to make an improvement in 500 repetitions. But that certainly is no guarantee that boyan
is the best possible board. Here are some things to try to find a better board; maybe you can implement some of them, or try some ideas of your own: