May we see some of your APL code? I like APL's syntax a lot but I haven't come across much code to read.
Sure, I'll provide an explanation of the Wordle solver. So, here's the Wordle solver in APL:
Code:
d←'abcdefghijklmnopqrstuvwxyz'
∇ wordle l;⎕IO;g;i;s;w ⍝ I have l for list; g for guess; i for index; s for summary; and w for word.
⎕IO←i←1◊s←+/d∘.=w←l[?↑⍴l;]◊'Word: ',w
now: ⎕←g←l[?↑⍴l;]◊→end⍴⍨g≡w◊i←i+1◊l←(((+/[2]l∘.=d)∧.≤(6,⍳5)[1+(s<+/d∘.=g)×s⌊+/d∘.=g])∧((+/[2]l∘.=d)∧.≥s⌊+/d∘.=g)∧(((g=w)/l)∧.=(g=w)/w)∧(((~g=w)/l)∧.≠(~g=w)/g)∧(~∨/(l∊(~g∊w)/g)))⌿l◊→now
end: 'Guesses: ',⍕i
∇
I asked Josh, and my posting it here does give him a special license to redistribute the code in relation to Kiwi Farms, but everyone else may use the code under the AGPLv3. Clearly, this is very difficult to read, and practically impossible without knowledge of APL, so I'll now describe it. Here's an alternative form of the code which splits the solver into its parts:
Code:
∇ wordle l;⎕IO;g;i;s;w
⎕IO←i←1◊s←+/d∘.=w←l[?↑⍴l;]◊'Word: ',w
now: ⎕←g←l[?↑⍴l;]◊→end⍴⍨g≡w◊i←i+1
l←(~∨/(l∊(~g∊w)/g))⌿l
l←(((~g=w)/l)∧.≠(~g=w)/g)⌿l
l←(((g=w)/l)∧.=(g=w)/w)⌿l
l←((+/[2]l∘.=d)∧.≥s⌊+/d∘.=g)⌿l
l←((+/[2]l∘.=d)∧.≤(6,⍳5)[1+(s<+/d∘.=g)×s⌊+/d∘.=g])⌿l◊→now
end: 'Guesses: ',⍕i
∇
The solver is based on a very simple principle: There are only so many five-letter words in the English language, and not very many at all, meaning a machine may hold the entire dictionary of such words in its storage and gradually narrow it down. I constructed a simple dictionary using a public corpus, but some of you may wish to look into the New York Times leak to get theirs. I'm soon going to write about improving the solver slightly by reordering the dictionary to make picking the first word optimal, but all current iterations of the solver choose from the dictionary randomly.
I particularly liked the rigidity of APL here, since it was very convenient to represent the dictionary as a two-dimensional array with an unbounded first dimension and a last dimension of five.
The first line of the APL function is a definition:
I defined a function named wordle returning no value with one parameter named l; all remaining names are local parameters, including the Index Origin ⎕IO value which may be set to zero or one.
The second line performs initializations and displays the word which the solver will find. APL expressions are evaluated right-to-left. The diamond ◊ separates expressions, which are evaluated left-to-right:
Code:
⎕IO←i←1◊s←+/d∘.=w←l[?↑⍴l;]◊'Word: ',w
So, ⎕IO and i are both set to one, with i only serving as a count of the guesses; w is set to the word which the solver will find, chosen at random; s is set to the summary of w; and then the string is displayed, joined by the comma function which combines arrays. The code which I've named Summarize in other languages is very special:
With a domain d, it compares each element of the given value to the domain; when comparing with a vector such as a word, this will be a two-dimensional matrix of ones and zeroes as its result; reduction with addition across the last dimension converts this to a count. An example will make this very clear:
Code:
+/d∘.='kiwifarms'
1 0 0 0 0 1 0 0 2 0 1 0 1 0 0 0 0 1 1 0 0 0 1 0 0 0
Here's an even clearer result:
Code:
d,[0.5]+/d∘.='kiwifarms'
a b c d e f g h i j k l m n o p q r s t u v w x y z
1 0 0 0 0 1 0 0 2 0 1 0 1 0 0 0 0 1 1 0 0 0 1 0 0 0
The code ignores all values not in the domain, but I take no advantage of this detail. There's a lot of beauty in a name, but notice the APL needs no name, as the code is literally shorter than the name Summarize here. Therefore, inlining the code is always preferable.
The last line of the function reports the number of guesses after converting the count of guesses i into a string:
The Common Lisp reported the count in English, even capitalizing everything properly, but that was infeasible here. Follows is the COMMON-LISP:FORMAT expression which does so, for those curious:
Code:
(format t "~&~:(~R~) ~:[guesses were~;guess was~] needed.~%"
count (= 1 count))
The second line is where most of the real work happens, obviously:
Code:
now: ⎕←g←l[?↑⍴l;]◊→end⍴⍨g≡w◊i←i+1◊l←(((+/[2]l∘.=d)∧.≤(6,⍳5)[1+(s<+/d∘.=g)×s⌊+/d∘.=g])∧((+/[2]l∘.=d)∧.≥s⌊+/d∘.=g)∧(((g=w)/l)∧.=(g=w)/w)∧(((~g=w)/l)∧.≠(~g=w)/g)∧(~∨/(l∊(~g∊w)/g)))⌿l◊→now
APL lacks indefinite loops in its basic constructs, so it's necessary to build them out of goto, basically. The last expression with the right arrow → jumps back to the line to start it anew.
Firstly, g is initialized at random from the dictionary, no differently than w, and is also displayed by sending the expression to the box character ⎕:
Secondly, this expression is the exit condition. It was easy enough to use the match function ≡ for this, which returns one if both sides are identical, and zero otherwise. That result is used to reshape the expression naming the third line of code. I'm fond of using the swap operator ⍨ for eliminating parentheses. The expression
l f⍨ r
is equivalent to
r f l
. It just switches the elements, which is useful to avoid parentheses with the evaluation order. So, this code goes to the third line if it's reshaped to size one, but does nothing if it's reshaped to size zero:
Thirdly, i is incremented:
The original Common Lisp solver used a function WORDLE::CHECK-WORD which took in both words and returned a vector of symbolic results. Here's an example:
Code:
CL-USER> (WORDLE::CHECK-WORD "kiwis" "milks")
#((:NONE #\m) (:HERE #\i) (:NONE #\l) (:SOME #\k 6 1 #(T NIL T T NIL))
(:HERE #\s))
The Common Lisp solver was actually very clumsy compared to the APL, but this isn't too hard to understand. The solver itself used these symbolic results to reject words containing letters in the :NONE case, reject words not containing letters in the right positions for the :HERE case, and the :SOME case was particularly clumsy in indicating that the letter is somewhere in the word but not where it was given in the guess. The Common Lisp also had another case :MISS purely for handling a poor interaction of these cases, and there are other details I won't explain.
The APL was originally going to follow the Common Lisp in structure, but I knew it could be better.
The APL solver handles each of these cases more elegantly and splits the problem into five; rather than word-at-a-time thinking, the APL solver handles all letter matches simultaneously, and likewise for the other four cases.
This is a filter expression in which the bit-vector a filters the rows of the array l based on whether a holds a one or a zero, keeping the row in the former case:
The solver is built around this expression, and the one-line form combines the masks with logical AND before filtering the dictionary. The first expression filters words containing letters known to not be not present, equivalent to the :NONE case of the Common Lisp:
The member function ∊ returns a result indicating whether an element in the left appears in the right or not. These examples will make it clear:
Code:
'take it down now now now'∊'kiwi farms'
0 1 1 0 1 1 0 1 0 0 1 0 1 0 0 1 1 0 0 1 1 0 0 1
1 2 3∊1 1 4 3 5
1 0 1
The following expression returns letters in the guess not present in the word to guess. The tilde ~ negates its parameter:
Those letters are then used on the entire dictionary simultaneously in the same way:
The last little bit
~∨/
is logical OR reduction along the last dimension followed by negation. Afterwards, this is a nice little vector in which a one indicates the row lacked those letters, and zero indicates the opposite.
The second expression filters words containing letters known to not be present in the given positions:
The comparison function = returns a one for each element that matches on both sides, and zero otherwise. The following expression reduces the guess to only those letters which didn't match in value and position:
This example will help, and notice the order is maintained:
Code:
(~'kiwis'='milks')/'kiwis'
kwi
The following expression does the same to the dictionary, but removes entire columns instead:
So, the filtered dictionary has the same number of columns as the filtered guess has letters afterwards. This is necessary for the inner product code. To be brief, the code
∧.≠
will compare the guess with each row of the filtered dictionary with inequality and then reduce the vectors with logical AND afterwards. Therefore, the resulting vector will hold a one in a position only if all values in the row weren't equal with any letter of the guess.
Here's a nice explanation of inner product:
To be brief, I needed to compare the filtered guess with every filtered row of the dictionary, and this was the way to do it. In some vector languages, the language notices this kind of code more simply, and will allow it. Had I only needed to compare a single letter, it would've worked without inner product, which was a little frustrating when I was writing this.
The third expression filters words not containing letters known to be present in their correct positions, equivalent to the :HERE case of the Common Lisp, and is similar to the second expression:
Notice the code is almost identical, but it doesn't negate the intermediate values, and uses equality over inequality in the inner product.
Lastly, the fourth and fifth expressions handle the remainder of the :SOME case, that case in which the correct letters were guessed but not in the correct positions. I learned a lot about how to properly handle this case when writing it in APL; the Common Lisp handling of it is actually slightly flawed, in that it's too open and failed to eliminate guesses in a very specific set of circumstances. Since I can do everything simultaneously in APL, I noticed it was very easy to handle. These two expressions also use the Summarize code again.
The fourth expression summarizes the guess and compares it to the summary of the word to guess with the minimum function ⌊ before using that information to filter the dictionary by its summaries:
Code:
((+/[2]l∘.=d)∧.≥s⌊+/d∘.=g)
This reveals only what is known when playing the game, namely the minimum count of known letters, and examples make this clear:
Code:
+/d∘.='kiwis'
0 0 0 0 0 0 0 0 2 0 1 0 0 0 0 0 0 0 1 0 0 0 1 0 0 0
+/d∘.='milks'
0 0 0 0 0 0 0 0 1 0 1 1 1 0 0 0 0 0 1 0 0 0 0 0 0 0
(+/d∘.='kiwis')⌊+/d∘.='milks'
0 0 0 0 0 0 0 0 1 0 1 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0
So, in this example, the guess "kiwis" has more instances of the letter i than the word to guess "milks" and so the code correctly shows that the word contains a minimum of one letter i. I won't explain the following code beyond pointing out that it summarizes the entire dictionary, and reduces along the second axis instead, due to the way the code works:
After this, both sides contain summaries, and the code
∧.≥
will reject any words in the dictionary whose corresponding letter counts aren't equal to or greater than the summary obtained on the right side of the inner product expression.
The fifth and final expression is similar, but calculates the known maximum letter counts:
Code:
((+/[2]l∘.=d)∧.≤(6,⍳5)[1+(s<+/d∘.=g)×s⌊+/d∘.=g])
A more complex operation was needed to reduce the maximums; after all, using the maximum function ⌈ wouldn't work, because that would reveal knowledge when it shouldn't be known. What I did was calculate the array of minimums and then multiply it with a bit vector of comparisons. The code
(s<+/d∘.=g)
produces an array in which every element is a zero unless the summary of the guess is greater than the summary of the word to guess for that letter, in which case the game would show grey squares and thus the maximum would be known. By multiplying the array of minimums by this array of comparisons, I obtain an array where only the known maximums remain, and all else will be zero:
I still need to convert zero to the proper integer for unknown maximums, however, and I used six. I saw no better way to perform this translation than a table mapping. The codes are incremented by one due to Index Origin ⎕IO. The code
6,⍳5
will produce the vector
6 1 2 3 4 5
:
Code:
(6,⍳5)[1+(s<+/d∘.=g)×s⌊+/d∘.=g]
After that, the fifth expression is identical to the fourth, except that its inner product code is
∧.≤
instead, meaning it requires the corresponding letter counts to be less than or equal to those on the right.
With all of these masks combined, the dictionary can be stripped of all words that can't be the word to guess. Sooner or later, the word to guess is found by this method.