Searching for a blander blender


7 Feb 2019 Code on Github

Introduction

In the village where I live there are two pubs, quite close to one another, one called The Bull, the other called The Bell. I like to imagine someone opening pubs called The Ball, The Bill, and The Boll. (I wasn't 100% sure boll was a word, but I once did a summer project on bollworm, so it seemed possible; it turns out boll means the round pods of certain plants, like cotton, which is what bollworm eat.)

I've also always liked how you sing a song, sang a song, and then the song was sung. It feels like seng should also be a word, maybe an adjective meaning something like musical. I wish more words followed this pattern. Maybe we could ring a rong or bring a brong.

Anyway, all this got me thinking: which English words can have their vowel(s) swapped for any other vowel and still be a valid English word?

Vowel variants

I set a program to search through all the words in my word list, replacing the As in each word with each of the other vowels and checking whether the resulting variants were also words on my list. The longest words for which every vowel variant was valid had seven letters:

  • balling, belling, billing, bolling, bulling
  • blander, blender, blinder, blonder, blunder
  • patting, petting, pitting, potting, putting
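As a rough sketch of that search (not the actual code from the repository), here is one way it could look in Python. The function names are made up for illustration, and I'm assuming that every A in a word is replaced at once and that the word list is a plain collection of lowercase strings.

```python
def a_variants(word):
    """Return the word with every 'a' replaced by e, i, o and u in turn."""
    return [word.replace("a", vowel) for vowel in "eiou"]


def full_vowel_sets(words):
    """Find words containing 'a' whose four vowel variants are all in the list."""
    word_set = set(words)  # set membership tests are fast
    return [
        [word] + a_variants(word)
        for word in sorted(word_set)
        if "a" in word and all(v in word_set for v in a_variants(word))
    ]


# Tiny hand-made word list, standing in for the real one.
sample = ["pat", "pet", "pit", "pot", "put", "cat", "cot", "cut"]
print(full_vowel_sets(sample))  # [['pat', 'pet', 'pit', 'pot', 'put']]
```

Here cat is rejected because cet and cit are not in the sample list, while pat survives because all four of its variants are.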

The blander variants are my favourite since they all seem like reasonable words (unlike bolling and bulling). The patting ones seem like cheating, as they are verb forms: you can remove the last four letters to get the base word. I seem to remember I once found the pat, pet, pit, pot, put group when thinking about this problem in a particularly boring lecture.

With the blander variants, removing the final -er gets you a simpler word in all cases, except for blunder - there is no blund. Also the -er ending is used in two ways: one is to turn an adjective into a comparative (blander and blonder), the other to turn a verb into a noun for something that does the verb (blender and blinder).

Shorter variants

There are two groups with six letters: patted (which again seems like cheating), and masses (musses is pretty rare, only appearing 16 times in my word list).

With five letters, we have: mares, dally, massy (?), packs, balls, tales, mates, bands.

Longer variants

If I use a more lenient word list (the UNIX word list at /usr/share/dict/words), then I find one set of variants with 9 letters: massiness. None of the words massiness, messiness, missiness, mossiness or mussiness appear in my word list, but my spell check accepts messiness and mossiness, which seems reasonable.

The Unix word list has no variants with 8 letters, and for 7 letters it has only unstack and chacker. So the word list includes the words unsteck and chacker, but not blonder, patting or balling (because they are verb forms, I guess). And that's why I'm not using that word list.

Varying any letter

If we allow any letter in a word to be replaced by any other letter, then each word can have up to 25 variants for each of its letters.
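To make that concrete, here is a minimal sketch of generating the single-letter variants of a word. The function name and the tiny word set are invented for the example; the real search presumably runs over the full word list.

```python
from string import ascii_lowercase


def variants(word, word_set):
    """List every word in word_set that differs from word at exactly one position."""
    found = []
    for i, original in enumerate(word):
        for letter in ascii_lowercase:
            if letter == original:
                continue  # replacing a letter with itself is not a variant
            candidate = word[:i] + letter + word[i + 1:]
            if candidate in word_set:
                found.append(candidate)
    return found


words = {"bin", "tin", "bun", "bit", "cat"}
print(variants("bin", words))  # ['tin', 'bun', 'bit']
```

A word with n letters has 25 × n candidates to check, so this is cheap for any single word.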

Most variants at a given position

Unsurprisingly, there aren't any words with a letter that can be replaced by all other letters. The best I found was three sets of 16 words, so each word in those sets has 15 variants at that position.

For example, one set of words that can have 16 different letters in its first position is: ain, bin, din, fin, gin, hin (an ancient Hebrew unit of liquid measure apparently), jin, kin, lin, pin, rin, sin, tin, vin, win and yin.

Word   Count  Letter options
*in    16     a, b, d, f, g, h, j, k, l, p, r, s, t, v, w, y
*at    16     b, c, e, f, g, h, k, l, m, o, p, r, s, t, v, w
ta*    16     b, d, e, g, j, m, n, o, p, r, s, t, u, v, w, x
*ats   15     b, c, e, f, g, h, k, l, m, o, p, r, t, v, w
*ays   15     b, c, d, f, g, h, j, k, l, m, n, p, r, s, w
*ill   15     b, d, f, g, h, j, k, m, n, p, r, s, t, v, w

For the sets with 15 letter options, I only listed the ones that have four letters. There are also three-letter sets, such as *ay and *ot - I'll let you figure out the words they can make.
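One way to find sets like these, sketched below, is to replace each letter of each word with a wildcard and group words that share the resulting pattern. This is an illustration rather than the original program, and the function name and sample words are my own.

```python
from collections import defaultdict


def pattern_groups(words):
    """Map each pattern (one letter replaced by '*') to the letters that fill it."""
    groups = defaultdict(set)
    for word in words:
        for i, letter in enumerate(word):
            groups[word[:i] + "*" + word[i + 1:]].add(letter)
    return groups


groups = pattern_groups(["bin", "din", "fin", "bit", "tat"])
best = max(groups, key=lambda pattern: len(groups[pattern]))
print(best, sorted(groups[best]))  # *in ['b', 'd', 'f']
```

Each word is visited once per letter, so this scales to a full word list without comparing every word against every other word.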

One thing I noticed is that the first letter tends to be the easiest to vary. Also, most of these words have A as their vowel, and they can all take a B, G, P, R, or W. That got me wondering which letter is the most replaceable (which I'll come back to below).

Most variants in total

This graph shows the distribution of the total number of variants per word. Of the 69,000 words in my list, 35,000 cannot vary any of their letters. On average, words have 2.06 variants each.

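[Bar chart: number of words (log scale) against number of variants]

Number of variants: number of words
0: 35070, 1: 12824, 2: 5845, 3: 3266, 4: 2302, 5: 1731, 6: 1339, 7: 1058, 8: 929, 9: 769, 10: 696, 11: 543,
12: 509, 13: 413, 14: 304, 15: 277, 16: 273, 17: 210, 18: 181, 19: 155, 20: 122, 21: 101, 22: 87, 23: 76,
24: 79, 25: 53, 26: 34, 27: 23, 28: 18, 29: 11, 30: 2, 31: 5, 32: 1, 33: 1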

The most common word without a variant is people (which makes up 0.19% of words according to my counts).

One word - tat - had 33 variants. Also in the top ten for most variants are pat, tap, tag, sat, gat and lat, which are all themselves variants of tat. The only word in the top 20 without an A is pins.

Word Variants Count
tat bat, cat, eat, fat, gat, hat, kat, lat, mat, oat, pat, rat, sat, vat, wat, tit, tot, tut, tab, tad, tae, tag, taj, tam, tan, tao, tap, tar, tas, tau, tav, taw, tax 33
say bay, cay, day, fay, gay, hay, jay, kay, lay, may, nay, pay, ray, way, shy, sky, sly, soy, spy, sty, sab, sac, sad, sae, sag, sal, sap, sat, sau, saw, sax 31
pat bat, cat, eat, fat, gat, hat, kat, lat, mat, oat, rat, sat, tat, vat, wat, pet, pit, pot, put, pac, pad, pah, pal, pam, pan, pap, par, pas, paw, pax, pay 31
tap cap, dap, gap, hap, lap, map, nap, pap, rap, sap, wap, yap, zap, tip, top, tup, tab, tad, tae, tag, taj, tam, tan, tao, tar, tas, tat, tau, tav, taw, tax 31
cares bares, dares, fares, hares, lares, mares, nares, pares, tares, wares, ceres, cores, cures, cades, cafes, cages, cakes, cames, canes, capes, cases, cates, caves, cards, carls, carns, carps, carts, cared, carer, caret 31
tag bag, dag, fag, gag, hag, jag, lag, mag, nag, rag, sag, wag, zag, teg, tog, tug, tab, tad, tae, taj, tam, tan, tao, tap, tar, tas, tat, tau, tav, taw, tax 31

It's somewhat surprising that the words with the most variants nearly all have three letters, since the more letters a word has, the more opportunities it has for making variants. On the other hand, three letters is the most common word length, so you're more likely to find a valid three letter English word.

This graph shows the maximum number of variants for words of different lengths. It shows the peak at three letters, though four- and five-letter words can also have a lot of variants.

With one letter, there is just one variant, since the only two one-letter words are a and I.

[Bar chart: maximum variants against word length]

Word length  Maximum variants
1            1
2            21
3            33
4            30
5            31
6            19
7            17
8            9
9            8
10           5
11           2
12           2
13           2
14           1
15           1
16           1

There are two pairs of 16-letter words with one variant each: counterterrorism / counterterrorist, and nationalizations / rationalizations.

Words with variants at every position

It's not too surprising that a three letter word can have every one of its letters changed (separately) to make three other words, e.g. bin can change to tin, bun and bit. But what's the longest word that can have each of its letters changed? I found 11 six-letter words:

Word    Variants
balled  called, billed, bailed, balked, ballad, ballet
canter  banter, center, carter, cancer, cantor, canted
shores  chores, scores, shares, shoves, shorts, shored
shares  chares, stares, shores, shapes, sharks, shared
prises  crises, poises, proses, prices, prisms, prised
paster  master, poster, patter, passer, pastor, pastel
spices  apices, slices, spaces, spikes, spicks, spiced
coster  foster, caster, cotter, cosier, costar, costed
bailer  mailer, boiler, baller, baiter, bailor, bailey
chases  phases, ceases, choses, chafes, chasms, chased
limped  pimped, lumped, lipped, limned, limpid, limper

For the variant words in this table, I picked the most common variant at each position.
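The check for "a variant at every position" could be sketched like this. It's an illustration with an invented function name and a toy word set, not the code behind the table.

```python
from string import ascii_lowercase


def has_variant_at_every_position(word, word_set):
    """True if each letter of word can be replaced, on its own, to give another word."""
    for i, original in enumerate(word):
        replaceable = any(
            word[:i] + letter + word[i + 1:] in word_set
            for letter in ascii_lowercase
            if letter != original
        )
        if not replaceable:
            return False  # this position has no valid variant
    return True


words = {"bin", "tin", "bun", "bit"}
print(has_variant_at_every_position("bin", words))  # True
print(has_variant_at_every_position("bun", words))  # False
```

In the toy set, bin passes via tin, bun and bit, while bun fails at its first letter because no other word ends in -un.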

Swappable letters

Looking at these variant words, it seems that in many cases the letter that can be swapped is a T, P, S or B, so I thought I'd take a look at which letters were most "swappable".

The table below shows the number of times each letter can be swapped. For example, there are over 12,000 words where the S can be swapped for another letter, but only 23 words where the Q can be swapped.

Letter  Number of swaps
s       12188
d       11328
r       10873
t       10089
l       7909
p       7581
n       6743
e       6641
m       6425
a       6299
b       6297
c       5956
g       5012
h       4979
o       4968
w       4738
f       4463
i       4272
k       3575
y       3505
u       3297
v       2467
j       1592
z       975
x       755
q       23

It makes sense that there are more words where the S can be swapped compared to Q, since there are more Ss in words than Qs. So I divided the number of swaps by the number of times each letter appears in my word list. This gives a percentage for the number of times each instance of a letter can be swapped on average.

Letter  Percent swappable
j       150%
w       95%
k       65%
b       60%
f       57%
d       52%
z       48%
x       48%
p       48%
v       44%
m       43%
y       41%
h       41%
g       30%
t       28%
l       28%
c       28%
r       27%
s       26%
u       18%
n       18%
a       15%
o       15%
e       11%
i       9%
q       2%
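Here is a small sketch of how that ratio could be computed. I'm assuming a "swap" means one (word, position, replacement letter) combination that produces another word on the list, which fits J coming out above 100%; the function name and sample words are invented.

```python
from collections import Counter
from string import ascii_lowercase


def swappability(words):
    """Return {letter: swaps / occurrences} for every letter that occurs in words."""
    word_set = set(words)
    occurrences = Counter()  # how often each letter appears in the list
    swaps = Counter()        # how often it can be replaced to give another word
    for word in word_set:
        for i, original in enumerate(word):
            occurrences[original] += 1
            for letter in ascii_lowercase:
                if letter != original and word[:i] + letter + word[i + 1:] in word_set:
                    swaps[original] += 1
    return {letter: swaps[letter] / occurrences[letter] for letter in occurrences}


scores = swappability(["jin", "bin", "tin", "bit"])
print(scores["j"])  # 2.0
print(scores["i"])  # 0.0
```

In this toy list the single J can become B or T, giving 2 swaps for 1 occurrence (200%), while I appears four times and can never be swapped.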

Now we can see that S has a relatively low "swappability", while Q remains at the bottom of the table.

Amazingly, J has a swappability of over 100%; on average, each letter J can be swapped for 1.5 other letters. This table shows the distribution of the number of variants for each word with a J. Of the 1052 words that have a J, 400 have at least one variant. Two, which we've seen before, have 15 variants: jin and taj.

[Bar chart: number of words (log scale) against number of variants, for words with a J]

Number of variants  Number of words
0                   652
1                   133
2                   55
3                   42
4                   32
5                   28
6                   24
7                   18
8                   15
9                   17
10                  16
11                  6
12                  4
13                  4
14                  4
15                  2

Swapping Q

At the other end of the scale is Q. Of the 325 possible letter swaps, only 19 cannot convert at least one English word into another. All of these impossible swaps involve the letter Q. In other words, there are only 6 letters that you can swap a Q for. Even then, fakir / faqir is a bit of a cheat, since it's the same word spelt two different ways.

Letter  Example word pair
B       built, quilt
D       duality, quality; duelling, quelling
F       fuelling, quelling
G       plague, plaque; guilt, quilt
K       fakir, faqir
S       quit, suit; quite, suite

Letter swap pairs

Once I had a list of every pair of letters that could be swapped to give two valid words, I looked at which swaps were most common. The table below shows the top ten most common pairs of letters that can be swapped for each other. It includes an example of two common words which can be interconverted by that letter swap. The top pair is D and S, which can be swapped in 2951 words (giving rise to 2951 different valid words).

The words I picked for my examples are the ones that had the highest (geometric) mean frequency in my word counts. I think this quite nicely shows common examples of the letter swap. I like how the two shortest words in English (a and I) can be swapped.

Letter pair Variants Example word pair Example word frequency
d - s 2951 had, has 0.33%
d - r 2381 head, hear 0.03%
r - s 1382 there, these 0.21%
a - o 1351 an, on 0.53%
a - i 1212 a, i 1.71%
a - e 1208 and, end 0.37%
l - r 1037 light, right 0.06%
s - t 943 she, the 1.60%
a - u 928 as, us 0.27%
e - o 847 new, now 0.17%
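A sketch of the pair counting and example picking might look like the following. The function names are my own and the frequencies are made-up numbers for illustration, not values from my word counts.

```python
from collections import defaultdict
from math import sqrt


def swap_pairs(frequencies):
    """Group variant word pairs by the unordered pair of letters that differ.

    frequencies maps each word to its (hypothetical) relative frequency.
    """
    pairs = defaultdict(list)
    words = sorted(frequencies)
    for n, first in enumerate(words):
        for second in words[n + 1:]:
            if len(first) != len(second):
                continue
            diffs = [(a, b) for a, b in zip(first, second) if a != b]
            if len(diffs) == 1:  # exactly one letter differs
                pairs[tuple(sorted(diffs[0]))].append((first, second))
    return pairs


def best_example(word_pairs, frequencies):
    """Pick the word pair with the highest geometric mean frequency."""
    return max(word_pairs, key=lambda p: sqrt(frequencies[p[0]] * frequencies[p[1]]))


# Invented frequencies, for illustration only.
freqs = {"had": 0.004, "has": 0.003, "add": 0.0002, "ads": 0.0001}
pairs = swap_pairs(freqs)
print(len(pairs[("d", "s")]))                   # 2
print(best_example(pairs[("d", "s")], freqs))   # ('had', 'has')
```

Comparing every word with every other word is quadratic, which is fine for a toy list but would be slow on a list of tens of thousands of words. The geometric mean favours pairs where both words are common, rather than one very common word paired with an obscure one.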

Comments (3)

Rory Sellers on 23 Jan 2020, 12:42 p.m.

Very cool. I found b-g many years ago. Much later I stumbled upon bl-nder. I’ll bet you also think heteronyms are sexy. My favorite is unionized.

Rory Sellers on 23 Jan 2020, 12:49 p.m.

Another cool heteronym is THOU because both “heteronemes” (my neologism for the variants) are common (and in Webster’s 3rd) but few people guess the version with the “hard” T-H (i.e., thousand) when only shown the written word w/o hearing it spoken. But I applaud your love of our language.

Rorick (Rory) Sellers on 23 Jan 2020, 12:54 p.m.

Very relevant to Scrabble!
