Introduction
In the village where I live there are two pubs, quite close to one another, one called The Bull, the other called The Bell. I like to imagine someone opening pubs called The Ball, The Bill, and The Boll. (I wasn't 100% sure boll was a word, but I once did a summer project on bollworm, so it seemed possible; it turns out boll means the round pods of certain plants, like cotton, which is what bollworm eat.)
I've also always liked how you sing a song, sang a song, and then the song was sung. It feels like seng should also be a word, maybe an adjective meaning something like musical. I wish more words followed this pattern. Maybe we could ring a rong or bring a brong.
Anyway, all this got me thinking: in which English words can you swap a vowel for any other vowel and still have a valid English word?
Vowel variants
I set a program to search through all the words in my word list, replacing any of its As with each of the other vowels and checking whether the resulting variants were also words on my list. The longest words I found for which every vowel variant is valid had seven letters:
- balling, belling, billing, bolling, bulling
- blander, blender, blinder, blonder, blunder
- patting, petting, pitting, potting, putting
The blander variants are my favourite since they all seem like reasonable words (unlike bolling and bulling). The patting ones seem like cheating, as they are verb forms: you can remove the last four letters to get the base word. I seem to remember I once found the pat, pet, pit, pot, put group when thinking about this problem in a particularly boring lecture.
With the blander variants, removing the final -er gets you a simpler word in all cases, except for blunder - there is no blund. Also the -er ending is used in two ways: one is to turn an adjective into a comparative (blander and blonder), the other to turn a verb into a noun for something that does the verb (blender and blinder).
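The search described above could be sketched roughly as follows. This is only an illustration, not the program actually used: it assumes the word list is lowercase, that one A is replaced at a time, and the function name `vowel_variant_groups` and the sample list are made up for the example.

```python
VOWELS = "aeiou"

def vowel_variant_groups(words):
    """Yield (word, variants) when swapping one 'a' for e, i, o and u
    always gives another word in the list (assumed lowercase)."""
    word_set = set(words)
    for word in sorted(word_set):
        for i, ch in enumerate(word):
            if ch != "a":
                continue
            # Build the four candidates for this particular 'a'.
            variants = [word[:i] + v + word[i + 1:] for v in VOWELS if v != "a"]
            if all(v in word_set for v in variants):
                yield word, variants

# Tiny hypothetical word list; a real run would load a full list from a file.
sample = ["pat", "pet", "pit", "pot", "put", "bat", "bet", "bit", "but", "banana"]
for word, variants in vowel_variant_groups(sample):
    print(word, variants)
```

With this sample only the pat group is reported, because bot is missing from the list.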
Shorter variants
There are two groups with six letters: patted (which again seems like cheating), and masses (musses is pretty rare, only appearing 16 times in my word list).
With five letters, we have: mares, dally, massy (?), packs, balls, tales, mates, bands.
Longer variants
If I use a more lenient word list (the Unix word list at /usr/share/dict/words), then I find one set of variants with 9 letters: massiness. None of the words massiness, messiness, missiness, mossiness or mussiness appear in my word list, but my spell check accepts messiness and mossiness, which seems reasonable.
The Unix word list has no variants with 8 letters, and for 7 letters it has only unstack and chacker. So the word list includes the words unsteck and chacker, but not blonder, patting or balling (because they are verb forms, I guess). And that's why I'm not using that word list.
Varying any letter
If we allow any letter in a word to be replaced by any other letter, then each word can have up to 25 variants for each of its letters.
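As a rough sketch of what "varying any letter" means in practice, the snippet below tries all 25 alternatives at each position and keeps the ones that are real words. The function name `single_letter_variants` and the small word set are illustrative assumptions, and the word list is assumed to be lowercase a-z.

```python
import string

def single_letter_variants(word, word_set):
    """Return {position: [valid words]} for one-letter substitutions."""
    result = {}
    for i, original in enumerate(word):
        found = [
            word[:i] + letter + word[i + 1:]
            for letter in string.ascii_lowercase
            if letter != original and word[:i] + letter + word[i + 1:] in word_set
        ]
        if found:
            result[i] = found
    return result

# Hypothetical mini word list.
word_set = {"bin", "tin", "bun", "bit", "ban"}
print(single_letter_variants("bin", word_set))
```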
Most variants at a given position
Unsurprisingly, there aren't any words with a letter that can be replaced by all other letters. The best I found was three sets of words with 16 variants.
For example, one set of words which can have 16 different letters in its first position is: ain, bin, din, fin, gin, hin (an ancient Hebrew unit of liquid measure apparently), jin, kin, lin, pin, rin, sin, tin, vin, win and yin.
Word | Count | Letter options |
---|---|---|
*in | 16 | a, b, d, f, g, h, j, k, l, p, r, s, t, v, w, y |
*at | 16 | b, c, e, f, g, h, k, l, m, o, p, r, s, t, v, w |
ta* | 16 | b, d, e, g, j, m, n, o, p, r, s, t, u, v, w, x |
*ats | 15 | b, c, e, f, g, h, k, l, m, o, p, r, t, v, w |
*ays | 15 | b, c, d, f, g, h, j, k, l, m, n, p, r, s, w |
*ill | 15 | b, d, f, g, h, j, k, m, n, p, r, s, t, v, w |
I only listed the 15-variant words that have four letters. There are also *ay, *at, *in, ta*, and *ot - I'll let you figure out the words they can make.
One thing I noticed is that the first letter tends to be the easiest to vary. Also, most of these words have A as their vowel and they all can take a B, G, P, R, or W. That got me wondering which letter is the most replaceable (which I'll come back to below).
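One way to find patterns like *in without testing every substitution is to group words by a wildcard key. The sketch below is one possible approach, not necessarily how the tables above were made; it assumes no word contains a literal asterisk, and `group_by_pattern` and `top_patterns` are hypothetical names.

```python
from collections import defaultdict

def group_by_pattern(words):
    """Map each wildcard pattern (e.g. '*in') to the letters that fill it."""
    groups = defaultdict(set)
    for word in words:
        for i, letter in enumerate(word):
            groups[word[:i] + "*" + word[i + 1:]].add(letter)
    return groups

def top_patterns(words, n=3):
    """Return the n patterns with the most letter options."""
    groups = group_by_pattern(words)
    # Sort by number of options (descending), then alphabetically for stability.
    ranked = sorted(groups.items(), key=lambda kv: (-len(kv[1]), kv[0]))
    return [(pattern, sorted(letters)) for pattern, letters in ranked[:n]]

# Hypothetical mini word list.
sample = ["bin", "din", "fin", "tin", "tan", "ton", "bit"]
print(top_patterns(sample, n=2))
```

Each word is visited once per letter, so this scales with the total number of letters in the list rather than with 25 lookups per letter.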
Most variants in total
This graph shows the distribution for the total number of variants per word. Of the 69,000 words in my list, 35,000 cannot vary any of their letters. On average, words have 2.06 variants each.
The most common word without a variant is people (which makes up 0.19% of words according to my counts).
One word - tat - had 33 variants. Also in the top ten for most variants are pat, tap, tag, sat, gat and lat, which are all themselves variants of tat. The only word in the top 20 without an A is pins.
Word | Variants | Count |
---|---|---|
tat | bat, cat, eat, fat, gat, hat, kat, lat, mat, oat, pat, rat, sat, vat, wat, tit, tot, tut, tab, tad, tae, tag, taj, tam, tan, tao, tap, tar, tas, tau, tav, taw, tax | 33 |
say | bay, cay, day, fay, gay, hay, jay, kay, lay, may, nay, pay, ray, way, shy, sky, sly, soy, spy, sty, sab, sac, sad, sae, sag, sal, sap, sat, sau, saw, sax | 31 |
pat | bat, cat, eat, fat, gat, hat, kat, lat, mat, oat, rat, sat, tat, vat, wat, pet, pit, pot, put, pac, pad, pah, pal, pam, pan, pap, par, pas, paw, pax, pay | 31 |
tap | cap, dap, gap, hap, lap, map, nap, pap, rap, sap, wap, yap, zap, tip, top, tup, tab, tad, tae, tag, taj, tam, tan, tao, tar, tas, tat, tau, tav, taw, tax | 31 |
cares | bares, dares, fares, hares, lares, mares, nares, pares, tares, wares, ceres, cores, cures, cades, cafes, cages, cakes, cames, canes, capes, cases, cates, caves, cards, carls, carns, carps, carts, cared, carer, caret | 31 |
tag | bag, dag, fag, gag, hag, jag, lag, mag, nag, rag, sag, wag, zag, teg, tog, tug, tab, tad, tae, taj, tam, tan, tao, tap, tar, tas, tat, tau, tav, taw, tax | 31 |
It's somewhat surprising that the words with the most variants nearly all have three letters, since the more letters a word has, the more opportunities it has for making variants. On the other hand, three letters is the most common word length, so you're more likely to find a valid three-letter English word.
This graph shows the maximum number of variants for words of different lengths. It shows the peak at three letters, though four and five letter words can also have a lot of variants.
With one letter there is just one variant, since the only two one-letter words are a and I.
There are two sets of 16-letter words with one variant each, and they are counterterrorism / counterterrorist, and nationalizations / rationalizations.
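The per-length maximum described above could be computed along these lines. This is a minimal sketch assuming a lowercase word list; `count_variants` and `best_by_length` are names invented for the example, and ties are broken alphabetically only because the words are visited in sorted order.

```python
import string

def count_variants(word, word_set):
    """Count valid words reachable by changing exactly one letter."""
    count = 0
    for i, original in enumerate(word):
        for letter in string.ascii_lowercase:
            if letter != original and word[:i] + letter + word[i + 1:] in word_set:
                count += 1
    return count

def best_by_length(words):
    """Return {length: (word, variant_count)} for the top word of each length."""
    word_set = set(words)
    best = {}
    for word in sorted(word_set):
        n = count_variants(word, word_set)
        if len(word) not in best or n > best[len(word)][1]:
            best[len(word)] = (word, n)
    return best

# Hypothetical mini word list (lowercased, so "I" appears as "i").
sample = ["a", "i", "tat", "cat", "tit", "tap", "cares", "dares"]
print(best_by_length(sample))
```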
Words with variants at every position
It's not too surprising that a three letter word can have every one of its letters changed (separately) to make three other words, e.g. bin can change to tin, bun and bit. But what's the longest word that can have each of its letters changed? I found 11 six-letter words:
Word | Variants | |||||
---|---|---|---|---|---|---|
balled | called | billed | bailed | balked | ballad | ballet |
canter | banter | center | carter | cancer | cantor | canted |
shores | chores | scores | shares | shoves | shorts | shored |
shares | chares | stares | shores | shapes | sharks | shared |
prises | crises | poises | proses | prices | prisms | prised |
paster | master | poster | patter | passer | pastor | pastel |
spices | apices | slices | spaces | spikes | spicks | spiced |
coster | foster | caster | cotter | cosier | costar | costed |
bailer | mailer | boiler | baller | baiter | bailor | bailey |
chases | phases | ceases | choses | chafes | chasms | chased |
limped | pimped | lumped | lipped | limned | limpid | limper |
For the variant words in this table, I picked the most common variant at each position.
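A check for "every position can be changed" might look like the sketch below. It assumes a lowercase word list, and `variable_everywhere` is a hypothetical helper name rather than anything from the original program.

```python
import string

def variable_everywhere(word, word_set):
    """True if every position has at least one valid one-letter replacement."""
    for i, original in enumerate(word):
        has_variant = any(
            word[:i] + letter + word[i + 1:] in word_set
            for letter in string.ascii_lowercase
            if letter != original
        )
        if not has_variant:
            return False
    return True

# Hypothetical mini word list.
word_set = {"bin", "tin", "bun", "bit", "bug"}
print(sorted(w for w in word_set if variable_everywhere(w, word_set)))
```

Filtering a full word list with this function and sorting by length would surface the longest qualifying words.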
Swappable letters
Looking at these variant words, it seems that in many cases the letter that can be swapped is a T, P, S or B, so I thought I'd take a look at which letters were most "swappable".
The table below shows the number of times each letter can be swapped. For example, there are over 12,000 words where the S can be swapped for another letter, but only 23 words where the Q can be swapped.
It makes sense that there are more words where the S can be swapped compared to Q, since there are more Ss in words than Qs. So I divided the number of swaps by the number of times each letter appears in my word list. This gives a percentage for the number of times each instance of a letter can be swapped on average.
Now we can see that S has a relatively low "swappability", while Q remains at the bottom of the table.
Amazingly, J has a swappability of over 100%; on average, each letter J can be swapped for 1.5 other letters. This table shows the distribution of the number of variants for each word with a J. Of the 1052 words that have a J, 400 have at least one variant. Two, which we've seen before, have 15 variants: jin and taj.
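The swappability ratio can be sketched as the number of successful swaps for a letter divided by the number of times that letter occurs in the word list. The code below is an illustration under that reading; the function name `swappability` and the sample words are assumptions, and the list is assumed to be lowercase.

```python
from collections import Counter
import string

def swappability(words):
    """Return {letter: successful swaps / occurrences} over the word list."""
    word_set = set(words)
    swaps = Counter()
    occurrences = Counter()
    for word in word_set:
        for i, original in enumerate(word):
            occurrences[original] += 1
            for letter in string.ascii_lowercase:
                if letter != original and word[:i] + letter + word[i + 1:] in word_set:
                    swaps[original] += 1
    return {letter: swaps[letter] / occurrences[letter] for letter in occurrences}

# Hypothetical mini word list.
sample = ["jin", "bin", "tin", "jab", "quit", "suit"]
scores = swappability(sample)
print(round(scores["j"], 2), round(scores["t"], 2), round(scores["a"], 2))
```

A value above 1.0 (over 100%) simply means that, on average, each occurrence of the letter can be replaced by more than one other letter.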
Swapping Q
At the other end of the scale is Q. Of the 325 possible letter swaps, only 19 cannot convert at least one English word into another. All of these impossible swaps involve the letter Q. In other words, there are only 6 letters that you can swap a Q for. Even then, fakir / faqir is a bit of a cheat, since it's the same word spelt two different ways.
Letter | Example word pair |
---|---|
B | built, quilt |
D | duality, quality; duelling, quelling |
F | fuelling, quelling |
G | plague, plaque; guilt, quilt |
K | fakir, faqir |
S | quit, suit; quite, suite |
Letter swap pairs
Once I had a list of every pair of letters that could be swapped to give two valid words, I looked at which swaps were most common. The table below shows the top ten most common pairs of letters that can be swapped for each other. It includes an example of two common words which can be interconverted by that letter swap. The top pair is D and S, which can be swapped in 2951 words (giving rise to 2951 different valid words).
The words I picked for my examples are the ones that had the highest (geometric) mean frequency in my word counts. I think this quite nicely shows common examples of the letter swap. I like how the two shortest words in English (a and I) can be swapped.
Letter pair | Variants | Example word pair | Example word frequency |
---|---|---|---|
d - s | 2951 | had, has | 0.33% |
d - r | 2381 | head, hear | 0.03% |
r - s | 1382 | there, these | 0.21% |
a - o | 1351 | an, on | 0.53% |
a - i | 1212 | a, i | 1.71% |
a - e | 1208 | and, end | 0.37% |
l - r | 1037 | light, right | 0.06% |
s - t | 943 | she, the | 1.60% |
a - u | 928 | as, us | 0.27% |
e - o | 847 | new, now | 0.17% |
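Counting swaps per letter pair and choosing an example by geometric mean frequency could be sketched like this. It is an assumption-laden illustration: the frequencies are invented, `swap_pairs` and `best_example` are hypothetical names, and words are assumed to be lowercase.

```python
from collections import defaultdict
from math import sqrt
import string

def swap_pairs(words):
    """Map each unordered letter pair to the set of word pairs it connects."""
    word_set = set(words)
    pairs = defaultdict(set)
    for word in word_set:
        for i, original in enumerate(word):
            for letter in string.ascii_lowercase:
                if letter <= original:
                    continue  # only look "upwards" so each pair is found once
                other = word[:i] + letter + word[i + 1:]
                if other in word_set:
                    pairs[(original, letter)].add((word, other))
    return pairs

def best_example(word_pairs, frequency):
    """Pick the word pair with the highest geometric mean frequency."""
    return max(
        word_pairs,
        key=lambda p: sqrt(frequency.get(p[0], 0) * frequency.get(p[1], 0)),
    )

# Hypothetical words and made-up relative frequencies.
frequency = {"had": 0.4, "has": 0.3, "bud": 0.01, "bus": 0.05}
pairs = swap_pairs(frequency)
print(len(pairs[("d", "s")]), best_example(pairs[("d", "s")], frequency))
```

The geometric mean favours pairs where both words are common, so a very frequent word paired with a rare one does not win.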
Comments (3)
Rory Sellers on 23 Jan 2020, 12:42 p.m.
Very cool. I found b-g many years ago. Much later I stumbled upon bl-nder. I’ll bet you also think heteronyms are sexy. My favorite is unionized.
Rory Sellers on 23 Jan 2020, 12:49 p.m.
Another cool heteronym is THOU because both “heteronemes” (my neologism for the variants) are common (and in Webster’s 3rd) but few people guess the version with the “hard” T-H (i.e., thousand) when only shown the written word w/o hearing it spoken. But I applaud your love of our language.
Rorick (Rory) Sellers on 23 Jan 2020, 12:54 p.m.
Very relevant to Scrabble!