Let’s use Wordle as an excuse to practice with string manipulation.
library(tidyverse)

dict <- data.frame(word = read.csv("Collins Scrabble Words (2019).txt",
                                   header = FALSE, skip = 2)[, 1]) %>%
  mutate(word = tolower(word))
sample_n(dict, 100) %>%
  unlist() %>%
  unname()
## [1] "afferent" "waybill" "wammul" "mitigates"
## [5] "castlings" "socialises" "dawding" "prolocutors"
## [9] "brandishes" "prehending" "coleopters" "inthroning"
## [13] "embryologists" "riotous" "misspent" "consultantships"
## [17] "paramounts" "tabogganing" "retypes" "polarograph"
## [21] "shivery" "diandries" "inegalitarian" "teuchats"
## [25] "easts" "creates" "pedanticisms" "semiologists"
## [29] "certificates" "oratrixes" "utterless" "nilghau"
## [33] "superpersons" "scowrie" "calisthenics" "caricatured"
## [37] "hornists" "voluntary" "orogenic" "clairaudience"
## [41] "figurately" "cataract" "mopoke" "gooks"
## [45] "scalableness" "vinegary" "humidities" "monochlorides"
## [49] "outacting" "unthatch" "endangerer" "pollman"
## [53] "recertify" "notching" "virandas" "verberations"
## [57] "forevermore" "microcopy" "pabulums" "hearsy"
## [61] "emperorships" "blindfolded" "piscator" "magnificat"
## [65] "vacantnesses" "reassessed" "scrimpers" "gizzards"
## [69] "rearhorses" "marjoram" "surmasters" "octostyle"
## [73] "smoothpate" "sokol" "conjunctions" "rizzoring"
## [77] "dysphemistic" "douc" "frontlists" "caviars"
## [81] "civils" "clysters" "neurular" "midstories"
## [85] "endogenous" "fourfoldness" "thistledowns" "methanations"
## [89] "syndicate" "ignorances" "satsangs" "jacklights"
## [93] "whither" "slushily" "comices" "pileorhizas"
## [97] "spivvery" "weblish" "hexaemeric" "swealings"
First word: ‘adieu’. All grey!
⬜⬜⬜⬜⬜
Let’s filter out all those vowels and keep only five-letter words.
out <- dict %>%
  filter(str_detect(word, "a|e|i|u", negate = TRUE),
         nchar(word) == 5)
nrow(out)
## [1] 1200
1,200 words to work with.
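As an aside, the two conditions above can be collapsed into a single anchored regex: exactly five characters, none of them a, e, i, or u. A quick check on a made-up toy word list (not the Collins file):

```r
library(tidyverse)

# Toy word list, just to show the pattern.
dict_toy <- data.frame(word = c("adieu", "strop", "poncy", "proxies"))

# "^[^aeiu]{5}$": exactly five characters, none of them a, e, i or u.
dict_toy %>%
  filter(str_detect(word, "^[^aeiu]{5}$"))
# keeps "strop" and "poncy"
```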
sample_n(out, 75) %>%
  unlist() %>%
  unname()
## [1] "cocco" "proto" "jomon" "hymns" "kloof" "donny" "bohos" "motts" "poboy"
## [10] "poots" "boozy" "phons" "orlop" "poynt" "myops" "typos" "kroon" "vlogs"
## [19] "trons" "cobbs" "sodom" "flogs" "poopy" "golly" "sybow" "sooky" "molds"
## [28] "crook" "octyl" "torts" "phots" "bowrs" "wormy" "jowls" "dolor" "gotch"
## [37] "trots" "hoosh" "corno" "socko" "toffy" "comby" "jolty" "goory" "swoon"
## [46] "clomp" "polls" "proyn" "mosks" "smogs" "sprod" "forth" "jonty" "noops"
## [55] "nolls" "shows" "north" "wooly" "showd" "hoppy" "glost" "jolls" "bonks"
## [64] "zocco" "bobby" "bowls" "looms" "goopy" "cyton" "sords" "sowff" "konks"
## [73] "prosy" "snobs" "crool"
Second try: ‘strop’. r, o, and p yellow.
⬜⬜🟨🟨🟨
Let’s filter for the most common consonants, as well as for the information from the previous guess.
out2 <- out %>%
  filter(str_detect(word, "t|n|s"),
         str_detect(word, "r|o|p"))
nrow(out2)
## [1] 909
909 words to work with.¹
sample_n(out2, 75) %>%
  unlist() %>%
  unname()
## [1] "molds" "tophs" "tronc" "coopt" "wonts" "confs" "torcs" "zobos" "wonky"
## [10] "cloys" "omovs" "gonof" "shogs" "frory" "moots" "plong" "scrod" "corso"
## [19] "jolls" "tofts" "shock" "bosky" "poynt" "whort" "forky" "gyron" "dorts"
## [28] "gymps" "nobby" "thoft" "stong" "snobs" "volks" "yorks" "towzy" "flogs"
## [37] "rocks" "yolks" "shorn" "shool" "tromp" "showy" "troth" "gongs" "forgo"
## [46] "scoop" "pooks" "shots" "pomps" "hoods" "softs" "proso" "worms" "boons"
## [55] "story" "flops" "sycon" "flows" "prost" "props" "koffs" "hoghs" "torot"
## [64] "holon" "stonk" "cowks" "sownd" "posho" "rotls" "rooms" "yocks" "notch"
## [73] "bolos" "boxty" "lownd"
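Strictly speaking, the three yellows carry more information than the filter above uses: the word must contain all of r, o, and p, and none of them in the positions where they appeared (3, 4, 5). A stricter version, sketched as a helper function on a toy word list (not the code used above):

```r
library(tidyverse)

# Sketch: use everything the yellow r, o, p tell us -- each letter is
# present, but not at the position where it was guessed (3, 4, 5).
strict_yellow <- function(words) {
  words %>%
    filter(str_detect(word, "r"),
           str_detect(word, "o"),
           str_detect(word, "p"),
           substr(word, 3, 3) != "r",
           substr(word, 4, 4) != "o",
           substr(word, 5, 5) != "p")
}

strict_yellow(data.frame(word = c("proxy", "strop", "story")))
# keeps only "proxy": "strop" has r in position 3, "story" has no p
```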
Third try: ‘poncy’. p and y green, o yellow.
🟩🟨⬜⬜🟩
In fact that’s a regression, since I did not choose a word containing r.
Let’s filter for the first and last letters.
out3 <- out2 %>%
  filter(substr(word, 1, 1) == "p",
         substr(word, 5, 5) == "y")
nrow(out3)
## [1] 14
Only 14 words possible.
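The same positional filter can also be written as one anchored regex, "^p...y$" (p first, y fifth, any three characters in between). On a toy list:

```r
library(tidyverse)

# Equivalent positional filter: p in position 1, y in position 5.
data.frame(word = c("poncy", "prosy", "torts")) %>%
  filter(str_detect(word, "^p...y$"))
# keeps "poncy" and "prosy"
```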
out3 %>%
  unlist() %>%
  unname()
## [1] "phony" "poncy" "pongy" "ponty" "popsy" "porgy" "porky" "porny" "porty"
## [10] "potsy" "potty" "powny" "prosy" "proxy"
Next try: ‘pygmy’. No help.
🟩⬜⬜⬜🟩
In fact that’s an even worse regression, because I again forgot to use a word that included o and r. Let’s filter for o and r so I won’t forget, and also remove the consonants we’ve used so far.
out4 <- out3 %>%
  filter(str_detect(word, "o|r"),
         str_detect(word, "d|s|t|n|c|g|m", negate = TRUE))
nrow(out4)
## [1] 2
Only two words left.
out4 %>%
  unlist() %>%
  unname()
## [1] "porky" "proxy"
Tried ‘proxy’ on the fifth attempt and got it!
Zero credit for using any vocabulary skills (I still don’t know what ‘strop’ or ‘poncy’ mean). But it’s a good occasion to do some string manipulation. To keep some skill involved, next time I won’t filter for the most common consonants, but only filter based on past guesses.
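That guess-only strategy can be sketched as a single helper, hypothetical and not from this post, that applies one guess’s feedback to the word list (simplified: it ignores the subtleties of repeated letters within a guess):

```r
library(tidyverse)

# Hypothetical helper: filter a word list by one guess's feedback.
# feedback is a five-character string: "g" = green, "y" = yellow, "-" = grey.
apply_guess <- function(words, guess, feedback) {
  g <- strsplit(guess, "")[[1]]
  f <- strsplit(feedback, "")[[1]]
  for (i in seq_along(g)) {
    words <- switch(f[i],
      "g" = filter(words, substr(word, i, i) == g[i]),
      "y" = filter(words, str_detect(word, g[i]),
                   substr(word, i, i) != g[i]),
      "-" = filter(words, str_detect(word, g[i], negate = TRUE)))
  }
  words
}

apply_guess(data.frame(word = c("proxy", "poncy", "porky")),
            "poncy", "gy--g")
# keeps only "proxy"
```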
Wordle 213 5/6
⬜⬜⬜⬜⬜ ⬜⬜🟨🟨🟨 🟩🟨⬜⬜🟩 🟩⬜⬜⬜🟩 🟩🟩🟩🟩🟩
¹ I later noticed that the sequential calls to str_detect did some additional filtering.
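The footnote’s point can be seen on a toy example: two sequential str_detect filters require a match against both patterns (an AND of ORs), whereas one combined pattern only requires a match against either letter group:

```r
library(tidyverse)

words <- data.frame(word = c("synth", "story", "glyph"))

# Sequential calls: a word must contain one of t/n/s AND one of r/o/p.
words %>%
  filter(str_detect(word, "t|n|s"),
         str_detect(word, "r|o|p"))
# keeps only "story"

# One combined call: a word need only contain any one of the six letters.
words %>%
  filter(str_detect(word, "t|n|s|r|o|p"))
# keeps all three words
```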