Mutually Orthogonal Wordle Words
Wordle is a very popular New York Times game that (as I recall) surged in popularity during the COVID-19 pandemic (https://www.nytimes.com/games/wordle/index.html).
In case you haven’t seen it or played it before, here’s a
simple overview.
The point of Wordle is to guess a five-letter word in six tries. (The pool of possible answers, ~2300 words, is much smaller than the set of all five-letter words in the English language, ~13000; this is what makes the project practical.)
Each guess gives you three types of possible
information.
- If the letter of your word is grey, that means it does not appear in the word whatsoever.
- If the letter is yellow, that means it appears one or more times in the word, but not in its current position.
- If the letter is green, that means it is in its correct position. (This does not imply that the letter has another duplicate elsewhere in the word, however.)
With this background, it should be obvious that there are many possible “correct” ways to play Wordle, each a different method of cutting down the decision tree of possible words. This led, during the peak of its popularity, to much debate over the mathematically “optimal” first guess which would reduce the tree as much as possible. On the other hand, I prefer to eliminate as many letters as possible, and let my grasp of the English language do the rest. It is difficult, however, to come up with words that eliminate only unused letters to reduce the search space effectively. Thus, this project.
Mutually Orthogonal Covering Words
With the idea of reducing the search space as much as possible, I propose the method of mutually orthogonal covering words. For example, “abc,” “def,” and “ghi” are three mutually orthogonal words that “cover” the set {a, b, c, d, e, f, g, h, i}. So the task before us is simple: find a covering set of five-letter words to make solving Wordle easy. This leads to two questions: one, how many letters should we cover, and two, which letters should we choose?
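To make the definition concrete, here is a minimal sketch in plain Python (the helper names are mine, not from the notebook):

def mutually_orthogonal(words):
    # True if no two words share a letter
    seen = set()
    for w in words:
        letters = set(w)
        if letters & seen:
            return False
        seen |= letters
    return True

def cover(words):
    # the set of letters covered by all the words together
    return set().union(*(set(w) for w in words))

print(mutually_orthogonal(["abc", "def", "ghi"]))  # True
print(sorted(cover(["abc", "def", "ghi"])))        # ['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i']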
Given that Wordle only allows six guesses, we will be unable to cover the entire alphabet in a useful way. To make an informed decision, though, it is useful to consider the frequency of letters in English. One snag with this approach is that Wordle words are not uniformly drawn from the English language: for one thing, they are all five letters long, and for another, they are selected by humans for a game, which will of course introduce bias. As anyone who has ever watched Wheel of Fortune knows, the most common letters in English as a whole are R, S, T, L, N, and E. It turns out that this is not nearly correct for Wordle words. Running frequency analysis on all possible Wordle answers gives E, A, R, O, and T as the most frequent (in descending order), and plotting a histogram shows a sharp downturn in frequency for the last four letters, Z, X, Q, and J. Thus it seems clear that we should cover only the top twenty letters; covering more would “waste” guess letters on the likes of Z, X, and Q, which almost never appear. Since four mutually orthogonal five-letter words cover exactly twenty letters, we will search for all possible four-tuples of mutually orthogonal covering words.
Procedure
This analysis was completed in Python3 in a Jupyter notebook. That notebook may be found here, and excerpts from the code will be explained in detail.
Filtering
The first step is to run frequency analysis and filter out all words that contain infrequent letters and those with double letters (as they waste our allowed information entropy).
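The code excerpts below refer to a variable x, the letters sorted from most to least frequent. It is not shown in these excerpts; a minimal sketch of how it might be built (assuming words is the list of possible answers) is:

from collections import Counter

# count every letter occurrence across all candidate answers
counts = Counter("".join(words))
# letters ordered from most to least frequent
x = [letter for letter, _ in counts.most_common()]
print(x[:5])   # the most frequent letters (E, A, R, O, T per the analysis above)
print(x[20:])  # the six least-frequent letters, which we filter against below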
We can then filter out exactly those words which we don't care about because their letters are too rare:

from functools import reduce

words = list(filter(lambda m: reduce(lambda q, p: q and p,
                                     [y not in m for y in x[20:]]),
                    words))

Here, x is the list of letters ordered by frequency, and x[20:] takes only the six least-frequent letters.

Another useful class of words we can throw out is those with double letters, as they are effectively stealing entropy from us.

words = list(filter(lambda a: [i for i in set(a) if a.count(i) > 1] == [],
                    words))

Vectorization
As you may know, strings are very hard for computers. Numbers, and by extension vectors, matrices, tensors, and all those lovely objects, are very easy for computers. (Why else would they call it TensorFlow instead of StringFlow?)
To save time, then, we’ll cast all of our words into a 26-dimensional vector space, where the value along each letter’s axis is that letter’s position in the word (and zero if the letter does not appear).
import numpy as np

def vector_word(word):
    # component i holds the 1-based position of letter i in the word (0 if absent)
    v = np.zeros(26)
    for char in zip(word, range(1, 6)):
        v[ord(char[0]) - ord('a')] = char[1]
    return v
As an example, “abcde” would look like
[1,2,3,4,5,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0]
in this representation.
Now, instead of checking whether each word shares any letters with another word (which is slow), we can simply compute the dot product of the two word vectors (which is very fast). Since every component is non-negative, the dot product is zero exactly when the two words share no letters.
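As a quick illustrative check (using vector_word from above; these particular words are just examples):

v_barge = vector_word("barge")
v_child = vector_word("child")
v_badge = vector_word("badge")

print(np.dot(v_barge, v_child))  # 0.0  -> "barge" and "child" share no letters
print(np.dot(v_barge, v_badge))  # 46.0 -> they share b, a, g, and e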
Finding All Possible Covers
It turns out that these dot products are so fast, and narrow down the search tree so drastically, that we can take the most naive approach possible and still find every possible four-cover within a few seconds. This is simply nested loops.
for v1 in vwords:  # vwords: the list of word vectors from vector_word
    if np.all(v1 == 0): continue  # just in case
    # lazily keep only the words sharing no letters with v1
    l1 = filter(lambda a: np.dot(a, v1) == 0 and not np.all(a == 0), vwords)
    for v2 in l1:
        l2 = filter(lambda b: np.dot(b, v2) == 0 and not np.all(b == 0), l1)
        for v3 in l2:
            l3 = filter(lambda c: np.dot(c, v3) == 0 and not np.all(c == 0), l2)
            for v4 in l3:
                print("")
                print(unvector(v1))
                print(unvector(v2))
                print(unvector(v3))
                print(unvector(v4))
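The loops print results with unvector, which does not appear in these excerpts; a plausible reconstruction (my sketch, simply inverting vector_word above) might be:

def unvector(v):
    # v[i] holds the 1-based position of letter i, or 0 if the letter is absent
    positions = [(pos, chr(i + ord('a'))) for i, pos in enumerate(v) if pos > 0]
    return ''.join(letter for _, letter in sorted(positions))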
This gives us the output (and the answer to our initial question!):
barge
child
funky
stomp

birch
adept
flung
smoky

crimp
abled
funky
ghost

flack
begin
dumpy
short

flaky
begin
chord
stump

fleck
abhor
dingy
stump

girth
abled
comfy
spunk

glade
birch
funky
stomp

gloat
bench
dumpy
frisk

grasp
befit
chunk
moldy

right
abled
comfy
spunk

tango
belch
dumpy
frisk

tonga
belch
dumpy
frisk
This is a total of thirteen four-covers.
Corollaries
To wrap up this project, here are a few interesting things we may notice about these results:
- The letter “y” only ever appears as the final letter of a word, never in the interior.
- Some words appear more often than we might expect (particularly the words ending in “y”).
- There are more of these than we might expect intuitively, though not necessarily from an information entropy perspective.