According to [Wikipedia](https://en.wikipedia.org/wiki/Ghost_(game);):
Ghost is a written or spoken word game in which players take turns adding letters to a growing word fragment, trying not to be the one to complete a valid word. Each fragment must be the beginning of an actual word, and usually some minimum is set on the length of a word that counts, such as three or four letters. The player who completes a word loses.
I'd like to create a program to allow any two players (human or computer) to play the game, and I'd like to figure out who wins if both players play optimally. The concepts I will need to define, and my implementation choices, are as follows:
enable1
, and make a set of all the words of sufficient length.str
of letters, such as 'gho'
.ghost
it is {'', g, gh, gho, ghos, ghost}
. "Prefix" is a synonym of "beginning".Vocabulary
object holds a set of all the words
in a dictionary, as well as all the valid fragments
(beginnings) of the words.0
; the second player 1
.enable1.legal_plays('gho') = {'ghos, 'ghou'}
.strategy(vocab, fragment) -> play
.play_game(vocab, *strategies)
plays a game between two (or more) player strategies.enable1
¶Vocabulary(words)
takes a collection of words as input, stores the words as a set, and also stores all the legal fragments of those words. legal_plays(fragments)
gives a set of all plays that can be formed by adding a letter to create a legal word fragment. I also define the function words
to split any string into component words.
class Vocabulary:
"Holds a set of legal words and a set of legal prefix fragments of those words."
def __init__(self, words, minlength=3):
self.words = {word for word in words if len(word) >= minlength}
self.fragments = {word[:i] for word in self.words for i in range(len(word) + 1)}
def legal_plays(self, fragment):
return {fragment + L for L in alphabet} & self.fragments
alphabet = 'abcdefghijklmnopqrstuvwxyz'
words = str.split # Function to split a str into words
Here is a small example with a vocabulary of just three words:
v = Vocabulary(words('game ghost ghoul'))
v.words
{'game', 'ghost', 'ghoul'}
v.fragments
{'', 'g', 'ga', 'gam', 'game', 'gh', 'gho', 'ghos', 'ghost', 'ghou', 'ghoul'}
And here is a large vocabulary, from a standard online Scrabble™ word list known as enable1
:
! [ -e enable1.txt ] || curl -O http://norvig.com/ngrams/enable1.txt
enable1 = Vocabulary(words(open('enable1.txt').read()))
Let's explore enable1
:
len(enable1.words), len(enable1.fragments), max(enable1.words, key=len)
(172724, 387878, 'ethylenediaminetetraacetates')
enable1.legal_plays('gho')
{'ghos', 'ghou'}
enable1.legal_plays('ew')
{'ewe'}
enable1.legal_plays('th')
{'tha', 'the', 'thi', 'tho', 'thr', 'thu', 'thw', 'thy'}
The first player is 0
and the second player is 1
. These names are convenient because:
to_play = winner = lambda fragment: len(fragment) % 2
Who wins a game if both players play rationally? To answer that, consider the general situation at some point during the game where a player is presented with a fragment. That player can win if either:
The function win(vocab, fragment)
implements this. It returns a winning fragment if there is one, otherwise False
. In particular, it returns fragment
if the current player has already won (because fragment
is
a word or illegal fragment), and it returns one of the legal plays if that play leads to a position from
which the opponent cannot win
.
def win(vocab, fragment=''):
"""Does the current player have a forced win?
If so, return a play (or the current fragment) that leads to a win."""
if fragment in vocab.words or fragment not in vocab.fragments:
return fragment
for play in vocab.legal_plays(fragment):
if not win(vocab, play):
return play
return False
Let's test win
to gain some confidence that we got it right:
# No winning play because all words have odd number of letters.
win(Vocabulary(words('cat camel gecko')))
False
# 'g' is a winning play (but 'c' would be a loser)
win(Vocabulary(words('cat camel goat gerbil')))
'g'
# No winning play; doomed to 'camel' or 'gar'
win(Vocabulary(words('cat camel goat gecko gerbil gar')))
False
# 'g' wins because 'ga' can be answered with 'gan' and 'ge' with 'ge'
win(Vocabulary(words('cat camel goat gecko gerbil gar gannet')))
'g'
Can the first player win with the enable1
vocabulary?
win(enable1)
False
No. The game is a win for the second player, not the first. This agrees with xkcd's Randall Monroe, who says "I hear if you use the Scrabble wordlist, it’s always a win for the second player."
But ... Wikipedia says that the minimum word length can be "three or four letters." In enable1
the limit was three; let's try again with a limit of four:
enable1_4 = Vocabulary(enable1.words, 4)
win(enable1_4)
'n'
Yes. The first player can win with this vocabulary, by playing 'n'
first (and there might be other first plays that also force a win). It makes sense that it is easier for the first player to win, because we've eliminated a bunch of three-letter words from the vocabulary, all of which are losers for the first player. So here's a good meta-strategy: Say "Hey, let's play a game of Ghost. We can use the enable1
word list. Would you like the limit to be 3 or 4 letters?" Then if your opponent says three (or four) you can say "OK, since you decided that, I'll decide to go second (or first)."
We define a strategy as a function that is given a vocabulary and a fragment as arguments, and returns a legal play. Below we define rational
, a strategy that wins whenever it is possible to do so and plays randomly otherwise, and ask
(a strategy factory that returns a strategy that, when called, will ask the named person to input a fragment).
import random
def rational(vocab, fragment):
"Select a play that makes opponent not win (if possible), otherwise a random play."
return win(vocab, fragment) or random.choice(list(vocab.legal_plays(fragment)))
def ask(name):
"Return a strategy that asks for the next letter."
return lambda _, fragment: input('Player ' + name + ' given "' + fragment + '" plays? ')
Here is a function to play a game. You give it a vocabulary, two (or possibly more) strategies, and optionally a verbose
keyword to say if you want a line printed for each play or not.
from itertools import cycle
def play(vocab, *strategies, verbose=False):
"Return (winner, final_fragment) for a game of Ghost between these strategies."
fragment = ''
for (p, strategy) in cycle(enumerate(strategies)): # p is the player number
play = strategy(vocab, fragment)
if verbose:
print('Player {}, given "{}", plays "{}".'.format(p, fragment, play))
if play not in vocab.legal_plays(fragment):
return (winner(fragment + '?'), play) # Player loses for making an illegal play
elif play in vocab.words:
return (winner(play), play) # Player loses for making a word
else:
fragment = play # Keep playing
play(enable1, rational, rational)
(1, 'odyssey')
# Does player 1 win every time?
[play(enable1, rational, rational, verbose=False) for _ in range(20)]
[(1, 'truancies'), (1, 'xebec'), (1, 'ods'), (1, 'vying'), (1, 'ewe'), (1, 'bwana'), (1, 'vying'), (1, 'knell'), (1, 'zloty'), (1, 'flirt'), (1, 'zlote'), (1, 'bwana'), (1, 'gjetost'), (1, 'pliancies'), (1, 'flounce'), (1, 'ufologist'), (1, 'flour'), (1, 'dweeb'), (1, 'flirt'), (1, 'ignatia')]
play(enable1_4, rational, rational)
(0, 'nyctalopia')
play(enable1, ask('P'), rational)
Player P given "" plays? d Player P given "dw" plays? dwa Player P given "dwar" plays? dwarv Player P given "dwarve" plays? dwarves
(1, 'dwarves')
Now we know how to play perfectly, if we have a computer handy to execute the strategy.
But can we summarize the strategy into a form that is small enough that a human can memorize it? I will define the function outcomes(vocab, fragment, player)
to return a set of all the words that are possible outcomes of a game, where the opponent can use any strategy whatsoever, but player
uses a strategy that is:
import textwrap
def outcomes(vocab, fragment, player):
"The smallest set of outcome words, if player tries to win, and make the set small."
if fragment in vocab.words:
return {fragment}
else:
cases = [outcomes(vocab, play, player) for play in vocab.legal_plays(fragment)]
if to_play(fragment) == player: # Player picks the top priority case
return min(cases, key=lambda words: priority(words, player))
else: # Oher player could pick anything
return set.union(*cases)
def priority(words, player):
"""Return (lossiness, number_of_words, total_number_of_letters),
where lossiness is 0 if no losses, 1 if mixed losses/wins, 2 if all losses.
The idea is to find the list of outcome words that minimizes this triple."""
lossiness = (0 if all(winner(word) == player for word in words) else
1 if any(winner(word) == player for word in words) else
2)
return (lossiness, len(words), sum(map(len, words)))
def report_outcomes(vocab, player):
"Find minimal outcomes for player; print info."
oc = outcomes(vocab, '', player)
winners = {w for w in oc if winner(w) == player}
def fill(label, words):
if words:
text = '{} ({}): {}'.format(label, len(words), ' '.join(sorted(words)))
print(textwrap.fill(text))
print('Outcomes ({}) for player {}:'.format(len(oc), player))
fill('Losers', oc - winners)
fill('Winners', winners)
Let's see what minimal set of words player 0 can force the game into (with both vocabularies):
report_outcomes(enable1, 0)
Outcomes (6) for player 0: Losers (1): qursh Winners (5): qaid qiviut qoph qurush qwerty
Interesting! There are only 6 words; it wouldn't be hard for a human to memorize these. Then, when you are playing as player 0, pick 'q'
first, and then try to steer the game to one of the 5 words with an even number of letters. Unfortunately, one word, 'qursh'
(a monetary unit of Saudi Arabia), has an odd number of letters, which means that if the opponent replies to 'q'
with 'qu'
and to 'qur'
with 'qurs'
, then player 0 will lose. But if the opponent makes any other responses, player 0 will win.
report_outcomes(enable1_4, 0)
Outcomes (7) for player 0: Winners (7): naan nene ngultrum nirvanic nolo null nyctalopia
Neat! Only 7 words, and the first player can always win by forcing the opponent to one of these words.
Since player 0 can pick any letter, the minimal outcomes
set for player 1 must be at least 26 words. Let's see how much bigger it turns out to be.
With enable1
we already know that player 1 can force a win, so all the words in the outcomes
set will have odd length:
report_outcomes(enable1, 1)
Outcomes (55) for player 1: Winners (55): aah aal aargh aas bwana cwm drave dreck drink droit druid dry ewe fjeld fjord gjetost hmm ihram jnana kwashiorkor llano mho nth oquassa prawn prequel prill pro prurigo pry qua quell quiff quomodo qursh rhamnus rheum rhizoid rho rhumb rhyolitic squoosh tchotchke uhlan vying wrath wrest wrist wrong wrung wry xanthic xanthin yttrium zucchetto
Memorize this list and you will never lose as player 1.
How about with the other vocabulary?
report_outcomes(enable1_4, 1)
Outcomes (85) for player 1: Losers (4): hybris hyphen hyte ngultrum Winners (81): aquiculture aquifer aquilegia aquiver bwana cnidarian drave dreck drink droit druid dryness eschatologies eschatology escheat eserine esker esophagus esplanade esquire esquiring essay estuarine estuary esurience fjeld fjord gjetost hyaenic hydatid hyena hyenine hyenoid hygeist hying hylozoism hylozoist hymen hyoid hypha hyraces hyrax hyson ihram jnana kwashiorkor llano mbira ngwee oquassa plaza plethoric plica plonk pluck plyer quaff quell quiff quomodo qursh rhamnus rheum rhizoid rhomb rhumb rhyolitic squoosh tchotchke uhlan vying wrath wrest wrist wrong wrung wryly xanthic xanthin yttrium zucchetto
In this case there are 85 words, four of which are losses for player 1. But the other 81 words are wins, so with this strategy you'd have a good chance against an imperfect opponent.
In the variant SuperGhost, players can add a letter to either the beginning or the end of a fragment, as long as this forms a fragment that is part of some word. As Wikipedia says, given the fragment era
, a player might play bera
or erad
. I was thinking of SuperGhost when I made the design decision to encapsulate legal_plays
as a method of Vocabulary
, rather than as a separate function. Because I did that, I should be able to use all the existing code if I just make a new class, SuperVocabulary
, that finds all fragments (i.e. infixes) rather than just the beginning fragments (i.e. prefixes), and if I change legal_plays
to add letters to both ends.
class SuperVocabulary(Vocabulary):
"Holds a set of legal words and a set of legal infix fragments of those words."
def __init__(self, words, minlength=3):
self.words = {word for word in words if len(word) >= minlength}
self.fragments = {word[i:j] for word in self.words
for i in range(len(word))
for j in range(i, len(word) + 1)}
def legal_plays(self, fragment):
"All plays (adding a letter to fragment) that form a valid infix."
return {p for L in alphabet for p in (fragment + L, L + fragment)} & self.fragments
Now I will create SuperVocabulary
objects for 3- and 4-letter versions of enable1
, and check out how many fragments there are in each variant:
enable1_3super = SuperVocabulary(enable1.words, 3)
enable1_4super = SuperVocabulary(enable1.words, 4)
# Example legal plays
enable1_3super.legal_plays('crop')
{'acrop', 'cropa', 'crope', 'croph', 'cropi', 'cropl', 'cropo', 'cropp', 'cropr', 'crops', 'cropt', 'cropu', 'cropy', 'ecrop', 'icrop', 'ncrop', 'rcrop', 'tcrop'}
# Can the first player win in SuperGhost with 3-letter words?
win(enable1_3super)
'u'
# How about with a 4-letter limit?
win(enable1_4super)
'u'
The first player can win with or without three-letter words. And unless the first player is perfect, the rational strategy can do pretty well as seond player as well. Here is a sample game:
play(enable1_3super, ask('P'), rational)
Player P given "" plays? q Player P given "cq" plays? cqu Player P given "cque" plays? acque Player P given "acques" plays? acquest
(1, 'acquest')
I would like to give a concise summary of the strategy for SuperGhost, but my existing outcomes
function won't do it. That's because it is not enough to know that a particular word results in a win; we have to know in what order the letters of the word are added. I'll leave it as an exercise to find a good way to summarize SuperGhost strategies.
In the variant SuperDuperGhost, players have an option to reverse the fragment before adding a letter to the beginning or end. As Wikipedia says, given the fragment era
, a player might play bera, erad, nare,
or aren
.
Wikipedia is not clear, but I interpret this as meaning that the fragment played must still be a fragment of a word (not a reversed fragment of a word). Again, all we need is a new subclass:
class SuperDuperVocabulary(SuperVocabulary):
"Holds a set of legal words and a set of legal infix fragments of those words."
def legal_plays(self, fragment):
"All plays that form a valid infix; optionally reverse fragment first."
def all4(f, L): return (f + L, f[::-1] + L, L + f, L + f[::-1])
return {p for L in alphabet for p in all4(fragment, L)} & self.fragments
enable1_3duper = SuperDuperVocabulary(enable1.words)
enable1_4duper = SuperDuperVocabulary(enable1.words, 4)
# Example legal plays
enable1_3duper.legal_plays('crop')
{'acrop', 'cropa', 'crope', 'croph', 'cropi', 'cropl', 'cropo', 'cropp', 'cropr', 'crops', 'cropt', 'cropu', 'cropy', 'ecrop', 'icrop', 'iporc', 'ncrop', 'nporc', 'porce', 'porch', 'porci', 'porcu', 'rcrop', 'tcrop'}
Now we should check who wins. But I'm impateint: I tried win(enable1_3duper)
, waited a minute, and it still hadn't returned, so I interrupted that computation and threw in a lru_cache
decorator; which stops win
from wasting time repeating the same computation over and over. This brings the time down to about 2 seconds.
from functools import lru_cache
@lru_cache(None)
def win(vocab, fragment=''):
"""Does the current player have a forced win?
Return fragment if the player has already won, or return a play that forces a win."""
if fragment in vocab.words or fragment not in vocab.fragments:
return fragment
for play in vocab.legal_plays(fragment):
if not win(vocab, play):
return play
return False
win(enable1_3duper)
'p'
win(enable1_4duper)
'p'
The first player can win with either vocabulary. Here's a sample game.
play(enable1_3duper, rational, rational, verbose=True)
Player 0, given "", plays "p". Player 1, given "p", plays "pr". Player 0, given "pr", plays "lpr". Player 1, given "lpr", plays "rple". Player 0, given "rple", plays "rplex". Player 1, given "rplex", plays "rplexe". Player 0, given "rplexe", plays "urplexe". Player 1, given "urplexe", plays "urplexes". Player 0, given "urplexes", plays "ourplexes". Player 1, given "ourplexes", plays "fourplexes".
(0, 'fourplexes')
Let's see how many fragments each vocabulary takes:
[len(v.fragments)
for v in [enable1, enable1_4, enable1_3super, enable1_4super, enable1_3duper, enable1_4duper]]
[387878, 387844, 1076434, 1076431, 1076434, 1076431]
Here's a summary of what we have learned. (Note: the bold qursh means it is a losing word):
| Variant | Shortest | Winner | First Player Forced Outcomes | 2nd Outcomes | Fragments |:----: |:---: |:---: |:---: |:---: |---: | Ghost | 3 | Second | qaid qiviut qoph qursh qurush qwerty | 55 words | 387,878 | Ghost | 4 | First | naan nene ngultrum nirvanic nolo null nyctalopia | 85 words | 387,844 | Super | 3 | First | ? | ? | 1,076,434 | Super | 4 | First | ? | ? | 1,076,431 | SuperDuper | 3 | First| ? | ? | 1,076,434 | SuperDuper | 4 | First| ? | ? | 1,076,431
Here are some additional ideas to play with:
vocab.words
, inserting or deleting some crucial words to ensure victory. Can you harden play
(and/or change Vocabulary
) to protect against that?Vocabulary
saves words and fragments that could never be reached in a game. For example, because 'the'
is a word that ends the game, we could never reach 'them'
or 'theme'
or 'thermoluminescences'
. Can you prune away these redundant words/fragments?play(enable1, ask('A'), ask('B'), ask('C'))
will play a three-player game. But rational
(along with win
and winner
) would no longer work, since they assume there are exactly two players. Can you alter them to allow n players?'era'
you could play 'erba'
.'era'
you could play 'bear'
.