Spotting Familiarity: The logic behind car license plate coincidences

I was recently driving between Las Vegas and Los Angeles, and to pass the time on the fairly boring desert stretches I was looking at car number plates.

I noticed that a surprising number had recognisable three letter words on them, which got me thinking about how often this should happen. Let’s see what’s going on.

The theoretical answer

I was mainly driving through California, which uses the 1ABC234 number plate format, and as far as I can tell the letters are assigned randomly.

There are 26 x 26 x 26 possible combinations for the three letters. The easiest way to think about this is in stages: AA, AB, AC.. has 26 combinations, then BA, BB, BC.. also has 26 and so on. So the number of ways that two letters can appear is 26 x 26 = 676. Adding a third letter is 26 x 26 x 26 = 17576 by the same logic.

Entering three “wildcards” into this Scrabble three letter word dictionary returns 1015 three letter words in the English language.

That means that 1015 / 17576 = 0.0577 = 5.8% of car number plates form words.
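The counting argument can be checked with a short Python sketch that enumerates every three letter combination and works out the share that are words. The 1015 figure is simply the count from the dictionary lookup above, and the variable names are illustrative.

```python
from itertools import product
import string

# Every ordered triple of capital letters: AAA, AAB, ..., ZZZ
combos = ["".join(t) for t in product(string.ascii_uppercase, repeat=3)]
total = len(combos)

# Number of three letter words reported by the Scrabble dictionary lookup
word_count = 1015

# Share of letter combinations that spell a word
share = word_count / total
print(total)
print("{0:.1%}".format(share))
```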

Done.

The actual answer

That’s a boring answer. Let’s think about this a bit. My real question is more like:

How often will I see a number plate with a three letter word

Two parts of that question change my answer: “I” and “how often”.

“I”

I don’t know every three letter word in the English language, so I won’t recognise every one.

I made a tiny Python program that would print each word in turn and respond to a Y/N input from me as to whether I recognized the word or not.

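# Folder containing plates.txt (one three letter word per line)
wd = "C:"
with open(wd + "\\plates.txt") as f:
    words = [x.strip() for x in f]
yes = 0
no = 0
for w in words:
    print(w)
    # Answer 'y' if the word is recognised; anything else counts as a no
    inp = input()
    if inp.strip().lower() == 'y':
        yes += 1
    else:
        no += 1
print("Yes: {0} No: {1}".format(yes, no))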

I recognised 628 / 1015 = 0.619 = 62% of the words. This means that the proportion of number plates carrying a word I would actually recognise is closer to 628 / 17576 = 0.0357 = 3.6%.

“how often”

We now know a percentage of cars that have a recognisable three letter combination. We can use this to work out how often I should have been seeing one on the road.

The road I was driving was the I-15, the route highlighted in blue.

[Map: the I-15 route between Las Vegas and Los Angeles]

Finding any sort of official traffic flow numbers for this route is tricky. This website, which offers a lot of information although it is seemingly outdated, states that 300,000 vehicles use a busier stretch of the interstate each day, which works out at roughly 200 cars per minute. This sounds like a little too much to be an average, so we’ll cut it to 150 cars per minute.

Let’s assume I’m in the slow lane doing 65 mph whereas all other traffic is doing 70 mph (and therefore overtakes me). The relative traffic flow around me is (5 / 70) * 150 = roughly 11 cars per minute.
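The traffic arithmetic can be written out as a small sketch. The helper names are made up for illustration, and the speeds and the 150 cars per minute figure are the assumptions already stated above rather than measured values.

```python
MINUTES_PER_DAY = 24 * 60

def cars_per_minute(vehicles_per_day):
    """Convert a daily vehicle count into an average per-minute flow."""
    return vehicles_per_day / MINUTES_PER_DAY

def relative_flow(my_speed, traffic_speed, flow_per_minute):
    """Cars per minute overtaking me, scaled by the speed difference."""
    return (traffic_speed - my_speed) / traffic_speed * flow_per_minute

# 300,000 vehicles a day on the busier stretch
raw_flow = cars_per_minute(300000)

# 65 mph in the slow lane, everyone else at 70 mph, 150 cars per minute overall
passing = relative_flow(65, 70, 150)

print(round(raw_flow), round(passing))
```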

Using the 3.57% word rate found above, the number of cars I see up to and including the first word plate follows a negative binomial distribution with r = 1 and p = 0.0357 (that is, a geometric distribution), the expected value of which is r / p = 1 / 0.0357 = 28, i.e. after 28 cars we expect to have seen a number plate with a word on it. In terms of minutes until we see one, this is 28 cars / 11 cars per minute = 2 minutes 33 seconds.
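The expected value is only part of the picture, because the wait for a first success is heavily skewed. As a rough sketch, assuming each passing car is an independent trial with the same p, the chance of having seen a word within a given number of cars and the median wait can be computed directly. The function names here are illustrative.

```python
import math

# Probability that a passing plate shows a word I recognise
p = 628 / 17576

def prob_word_within(n_cars, p):
    """Chance of at least one word plate among the first n_cars cars."""
    return 1 - (1 - p) ** n_cars

def median_cars(p):
    """Smallest number of cars giving at least a 50% chance of a word."""
    return math.ceil(math.log(0.5) / math.log(1 - p))

expected_cars = 1 / p
print(round(expected_cars))
print(prob_word_within(28, p))
print(median_cars(p))
```

The median comes out lower than the mean, which is what you would expect when a few very long waits pull the average up.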

Pretty frequent!

It’s worth pointing out here that the traffic flow can vary hugely in different places, as will your knowledge of English. So the general formula, with T being traffic per minute and W the number of three letter words known, is:

Expected wait time = (1 / (W/17576)) / T = 17576 / (W * T)
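To try other values of W and T, the expression (1 / (W/17576)) / T can be wrapped in a small function. This is a sketch only: the function and parameter names are hypothetical, and the result is in minutes because T is in cars per minute.

```python
TOTAL_COMBOS = 26 ** 3  # 17576 possible three letter combinations

def expected_wait_minutes(words_known, traffic_per_minute):
    """Expected minutes until the first recognisable word plate."""
    expected_cars = 1 / (words_known / TOTAL_COMBOS)
    return expected_cars / traffic_per_minute

def format_wait(minutes):
    """Render a wait in minutes as a string such as '2m33s'."""
    total_seconds = round(minutes * 60)
    mins, secs = divmod(total_seconds, 60)
    return "{0}m{1}s".format(mins, secs)

# The values used in this post: 628 known words, 11 cars per minute
wait = expected_wait_minutes(628, 11)
print(format_wait(wait))
```

Knowing more words or driving in heavier traffic both shorten the wait, since W and T both sit in the denominator.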

Running a simulation

Even though we now know the answer, let’s see it in action with a Python simulation.

import random

# Folder containing rawplates.txt (one three letter word per line)
wd = "C:"
with open(wd + "\\rawplates.txt") as f:
    words = [x.strip() for x in f]
cars = ["ford", "audi", "jaguar", "renault", "lexus", "bmw", "citroen", "mini"]
colors = ["red", "blue", "green", "black", "yellow", "white"]
letters = "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
found = False
runs = 0
while not found:
    # Generate a random passing car with a random three letter plate section
    car = random.choice(cars)
    color = random.choice(colors)
    plate = random.choice(letters) + random.choice(letters) + random.choice(letters)
    print(color + " " + car + " with plate including " + plate)
    runs += 1
    # Stop as soon as the plate letters spell a word from the list
    if plate in words:
        found = True
print(runs)
# Convert the car count to a wait time at 11 cars per minute
print(str(runs // 11) + "m" + str(round(runs % 11 * (60 / 11))) + "s")

Which gives the following example output:

green bmw with plate including SCJ
yellow renault with plate including VSK
yellow lexus with plate including WEB
3
0m16s

That yellow lexus came along pretty quickly to give us our first match! How about again?

black renault with plate including YKI
white renault with plate including KMH
blue mini with plate including MVP
red citroen with plate including WIW
yellow audi with plate including TGJ
blue lexus with plate including XCB
red audi with plate including VPI
blue jaguar with plate including WLD
black lexus with plate including SNY
white lexus with plate including BMJ
white ford with plate including MGJ
yellow audi with plate including RQL
green lexus with plate including GZD
blue bmw with plate including LBX
blue jaguar with plate including NYL
white lexus with plate including JJJ
yellow lexus with plate including MOY
blue ford with plate including XIB
blue audi with plate including SHI
black audi with plate including MWH
yellow ford with plate including OIL
21
1m55s

Much longer before the yellow ford gave us OIL! And actually closer to the theoretical value we worked out earlier.

There’s a lot of variance here, so below is an ordered result of running it 50 times, checking against a list of only the words that I know:

[Chart: ordered wait times from the 50 simulation runs]

This represents a 2 hour 11 minute journey. The average wait until a word is 2m38s (29 cars) with the median being 2m16s (25 cars).

And finally, the longest we have to wait on this 2hr+ journey is just 7m 27s (82 cars).
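The repeated experiment can be sketched without the word list by treating each passing car as an independent trial that shows a recognisable word with probability 628 / 17576. This is a simplified stand-in for the plate-by-plate simulation, not the exact script behind the chart; the seed and function names are assumptions for illustration, so the numbers it prints will differ from those quoted above.

```python
import random
import statistics

P_WORD = 628 / 17576   # chance a plate shows a word I recognise
CARS_PER_MINUTE = 11   # relative traffic flow assumed earlier

def cars_until_word(rng, p=P_WORD):
    """Count cars up to and including the first one showing a word."""
    cars = 1
    while rng.random() >= p:
        cars += 1
    return cars

def simulate_journey(n_words=50, seed=0):
    """Return the sorted car counts for n_words consecutive sightings."""
    rng = random.Random(seed)
    return sorted(cars_until_word(rng) for _ in range(n_words))

waits = simulate_journey()
print("mean cars:", statistics.mean(waits))
print("median cars:", statistics.median(waits))
print("longest wait (cars):", waits[-1])
print("journey length (minutes):", sum(waits) / CARS_PER_MINUTE)
```

Changing the seed gives a different journey each time, which is a quick way to see how much the mean and the longest wait move around with only 50 samples.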

No wonder, then, that I was seeing so many words on my trip. It’s actually very likely indeed.

And now we are truly done.

Extension

It doesn’t necessarily need to be words that catch my eye. Other three letter combinations could just as easily grab my attention, which could change the above. Three repeated letters, e.g. AAA, are pretty interesting, and a number plate with MJB (my initials) would also be enough. Then there are number plates that fall outside the conventional system, e.g. personalised ones. This means that there is potentially a larger and more varied pool of eligible plates.
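One way to explore this is to build the pool of eye-catching plates as a set, so that a combination counted twice (a repeated-letter plate that is also a word, say) is only included once. The sketch below uses a tiny placeholder word set rather than my real list of 628 words, and the function name is hypothetical.

```python
import string

TOTAL_COMBOS = 26 ** 3

def eligible_plates(known_words, initials=()):
    """Union of known words, repeated-letter triples and any initials."""
    triples = {letter * 3 for letter in string.ascii_uppercase}
    return set(known_words) | triples | set(initials)

# Placeholder words for illustration only, not the real word list
sample_words = {"WEB", "OIL", "ZZZ"}

pool = eligible_plates(sample_words, initials={"MJB"})
share = len(pool) / TOTAL_COMBOS
print(len(pool), share)
```

Personalised plates do not fit this model, since they are not drawn from the 17576 standard combinations at all.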
