Codebox Software
Wordle Solver Shell Script
The UNIX command-line has a number of tools, such as grep and awk, that are great at reading text files and finding lines that match a set of criteria. Solving the word game Wordle involves exactly this process - using the list of valid words, find the one that matches a set of clues.
For example, let's say you guessed that the word was CRATE and Wordle gave you the following clues:
[Tiles: C grey, R green, A grey, T grey, E yellow]
The grey tiles mean that the letter does not appear in the word, the yellow tile means that the letter is in the word but in a different position, and the green tile means that the letter is in the correct place. So we know:
- C, A and T don't appear anywhere in the word
- The second letter is R
- There is an E somewhere in the word, but it's not the 5th letter
All 3 of these statements can be expressed quite easily using grep:
> cat words.txt | grep -v c | grep -v a | grep -v t | grep .r... | grep e | grep '....[^e]'
pries
fryer
greek
brier
...
These filters reduce the word list down from 12,971 to just 105. By using the reduced list to choose the next word, and then adding further grep commands to the list of filters, we can quickly arrive at an answer.
One complication arises due to the way that Wordle handles duplicate letters in guesses. For example, if the target word is OLDIE and we guess BOOST, we will get the following tiles:
[Tiles: B grey, O yellow, O grey, S grey, T grey]
Because the target word only has one O in it, only one of the Os in our guess is coloured yellow; the other is grey. This extra information is useful, but it means that when transforming tile colours into grep expressions we can no longer consider the 5 tiles individually as we did in the previous example (doing so would result in the 3rd tile producing a grep expression of grep -v o and excluding any words containing the letter O). Instead we must consider each unique letter rather than each tile, so in this example we would create 4 sets of expressions, one for each of the letters B, O, S and T.
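The pitfall can be reproduced with a tiny word list. This is only a sketch: the three words are made up for illustration, and the per-letter pattern is one possible way of saying 'exactly one O', not necessarily the form used in the final script.

```shell
# A tiny illustrative word list; OLDIE is the target from the example above.
words='oldie
amend
fudge'

# Naive per-tile filter: the grey O in BOOST becomes "grep -v o",
# which wrongly removes OLDIE along with every other word containing an O.
naive=$(printf '%s\n' "$words" | grep -v o)

# Per-letter filter: one yellow O plus one grey O means "exactly one O".
per_letter=$(printf '%s\n' "$words" | grep -E '^[^o]*o[^o]*$')

echo "naive keeps:"
echo "$naive"
echo "per-letter keeps:"
echo "$per_letter"
```

The naive filter keeps only the two words that cannot be the answer, while the per-letter filter keeps OLDIE.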
The clues provided by Wordle give us 2 pieces of information about each letter - positional information indicating where it appears in the word, and also information about how many times the letter appears.
How many times the letter appears
The number of times a letter appears in the target word is fairly simple to derive: just add together the number of yellow and green tiles for that letter. If there are also some grey tiles for that letter then we know the 'yellow + green' total is the exact number; otherwise it is a lower bound. For example, let's say we guessed GREEK and got the following tiles:
[Tiles: G grey, R yellow, E grey, E green, K grey]
We only have one R in our guess and it was coloured yellow, so we know there is at least one R in the target word, but there might be more - for example the words RIVER and RISER would both match this set of clues. In our grep commands we can express this using the repetition operator {1,} which means '1 or more matches'.
We have 2 Es in our guess, but one of them was coloured grey. If there were 2 Es in the target word then both of them would have been coloured yellow or green, but that isn't the case here, so we know there is only one E in the word. We can express this using the operator {1} which means 'exactly 1 match'.
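As a rough sketch of how these counts might look as grep commands, each occurrence of a letter can be matched together with the run of other letters that follows it, and the repetition operator applied to that group. The patterns and the four-word list below are assumptions made for illustration, not taken from the script. The -E flag makes grep treat {1,} and {1} as repetition operators without needing backslashes.

```shell
# Illustrative word list for the GREEK example.
words='river
riser
rebel
roman'

# At least one R: the group "r followed by non-r letters" repeats 1 or more times.
# Exactly one E: the same kind of group for e repeats exactly once.
matches=$(printf '%s\n' "$words" |
  grep -E '^[^r]*(r[^r]*){1,}$' |
  grep -E '^[^e]*(e[^e]*){1}$')

echo "$matches"
```

Here REBEL is rejected because it has two Es, and ROMAN because it has none.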
Where the letter appears
Creating positional expressions for each letter is also quite easy: we can just replace each letter in our guess with an expression that either matches that letter (if the tile was green) or does not match that letter (if the tile was grey or yellow). In our GREEK example above we know that:
- The first letter is not G
- The second letter is not R
- The third letter is not E
- The fourth letter is E
- The fifth letter is not K
These statements can be translated into the grep expression: [^g][^r][^e]e[^k]
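The translation from tiles to a positional expression is mechanical, so it can be sketched as a small shell function. The function name positional_pattern is hypothetical, and the tile colours are assumed to be written as one letter per tile: b for grey, y for yellow and g for green.

```shell
# positional_pattern GUESS CLUES
# Hypothetical helper: builds a positional expression from a 5-letter guess
# and a 5-letter clue string (b = grey, y = yellow, g = green).
positional_pattern() {
  guess=$1
  clues=$2
  pattern=''
  i=1
  while [ "$i" -le 5 ]; do
    letter=$(printf '%s' "$guess" | cut -c"$i")
    clue=$(printf '%s' "$clues" | cut -c"$i")
    if [ "$clue" = g ]; then
      pattern="${pattern}${letter}"        # green: must be this letter
    else
      pattern="${pattern}[^${letter}]"     # grey or yellow: must not be this letter
    fi
    i=$((i + 1))
  done
  printf '%s\n' "$pattern"
}

positional_pattern greek bybgb
```

For the GREEK tiles this prints the same expression as above, [^g][^r][^e]e[^k], which can then be passed to grep.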
By combining these 2 types of filters together we can fully utilise the information that Wordle gives us in its clues, and remove as many non-matching words from the list as possible on each attempt.
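Putting the two kinds of filter together, a minimal sketch of the whole idea might look like the function below. It is an illustration under stated assumptions rather than the actual script: the name filter_words is hypothetical, words are assumed to be lowercase and one per line on standard input, clues use b/y/g for grey/yellow/green, and letters are counted with awk's gsub instead of a repetition operator.

```shell
# filter_words GUESS CLUES
# Hypothetical helper: prints the words from stdin that are consistent
# with one guess and its clues (b = grey, y = yellow, g = green).
filter_words() {
  awk -v guess="$1" -v clues="$2" '
    BEGIN {
      n = length(guess)
      for (i = 1; i <= n; i++) {
        letter[i] = substr(guess, i, 1)
        clue[i] = substr(clues, i, 1)
        seen[letter[i]] = 1
        if (clue[i] == "b") grey[letter[i]] = 1   # at least one grey tile
        else hits[letter[i]]++                    # yellow + green total
      }
    }
    {
      # Positional rules: green must match, grey and yellow must not.
      for (i = 1; i <= n; i++) {
        same = (substr($0, i, 1) == letter[i])
        if (clue[i] == "g" && !same) next
        if (clue[i] != "g" && same) next
      }
      # Count rules: the yellow + green total is a lower bound,
      # and an exact count if the letter also has a grey tile.
      for (l in seen) {
        tmp = $0
        count = gsub(l, "", tmp)
        if (count < hits[l] + 0) next
        if ((l in grey) && count != hits[l] + 0) next
      }
      print
    }
  '
}

# Keeps river and riser; rejects rebel (two Es), green (G) and poker (K).
printf '%s\n' river rebel green poker riser | filter_words greek bybgb
```

Several guesses can be applied by piping one call into the next, in the same way as the chained grep commands earlier.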
wordle.sh
I have implemented the algorithm described above in this shell script. I ended up using awk rather than grep to do the filtering, because that made it easier to string multiple filtering operations together in a single command. The script automatically downloads a Wordle word list the first time you use it.
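To illustrate why awk makes chaining convenient, here is the CRATE example from earlier written as a single awk command, with the individual conditions joined by &&. This is only a sketch of the general approach, not an excerpt from the script, and the four-word list is made up for the example.

```shell
# Same filters as the chained grep commands for CRATE, in one awk program:
#   no c, a or t; second letter r; contains e; fifth letter is not e.
matches=$(printf '%s\n' pries crate brine fryer |
  awk '!/[cat]/ && /^.r...$/ && /e/ && !/^....e$/')

echo "$matches"
```

Only PRIES and FRYER survive: CRATE contains excluded letters and BRINE ends in E.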
To use the script, run it with one command-line argument for each guess you have made in your game of Wordle. The arguments should be of the form <word>,<clues> where word is the 5-letter word that you guessed, and clues contains the colours of the tiles displayed by Wordle for that word. The tile colours are represented by the letters 'b' for black (or grey), 'y' for yellow and 'g' for green. By default the script will show you up to 10 words that match the clues provided; however, this number can be changed using the optional --count parameter.
For example:
bash# ./wordle.sh crane,byybb
LARIS LIRAS RAILS RATOS ROTAS SORTA TAROS TORAS SOLAR SORAL
[508 matches found in total]

bash# ./wordle.sh crane,byybb rails,yybbb
AMOUR KORAT ABORT DOUAR DOURA TORAH AMORT MORAT APORT PORTA
[76 matches found in total]

bash# ./wordle.sh --count=30 crane,byybb rails,yybbb
MOWRA FORAM BORAK FORAY GOBAR AORTA OTTAR TORTA ABHOR YURTA KORMA DOWAR BOYAR OMRAH KOURA MORAY ABORD DOBRA DORBA GOURA AMOUR KORAT ABORT DOUAR DOURA TORAH AMORT MORAT APORT PORTA
[76 matches found in total]

bash# ./wordle.sh crane,byybb rails,yybbb abort,ybyyb
QORMA JORAM DORAD FORAM FORAY KORMA DOWAR OMRAH MORAY DOUAR
[13 matches found in total]
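An argument of the form <word>,<clues> can be split apart with standard shell parameter expansion. The function below is a hypothetical sketch of that step (the name parse_guess and the validation rules are assumptions, not taken from the script).

```shell
# parse_guess ARG
# Hypothetical helper: splits "word,clues" and checks that both halves
# are 5 characters long and that the clues only use b, y and g.
parse_guess() {
  arg=$1
  case $arg in
    *,*) ;;
    *) echo "expected <word>,<clues>: $arg" >&2; return 1 ;;
  esac
  word=${arg%%,*}    # text before the first comma
  clues=${arg#*,}    # text after the first comma
  if [ "${#word}" -ne 5 ] || [ "${#clues}" -ne 5 ]; then
    echo "word and clues must both be 5 characters: $arg" >&2
    return 1
  fi
  case $clues in
    *[!byg]*) echo "clues may only contain b, y and g: $arg" >&2; return 1 ;;
  esac
  printf '%s %s\n' "$word" "$clues"
}

parse_guess crane,byybb
```

For the first guess in the session above, this prints the word and its clues separated by a space.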