In 2007, Laura Wattenberg*, arguably the foremost statistician-cum-maven of the U.S. Social Security Administration’s baby name database, appears to have been the first to notice quite a dramatic trend: the rise of boys’ names ending in “n” from about 15% in the 1950s to around 35% now.
A few weeks ago, I made an animated gif to visualize this phenomenon, and it became, well, probably not viral per se, but at least antibiotic-resistant:
The problem, I think, is twofold: (a) most people are exposed to baby names as a “Top 10” list, or at best a “Top 100” list, without context, and these names tell less and less of the whole story. In 1950, the Top 10 names made up 33% of births, while in 2013 they made up less than 9%, as names have become more diverse. (Also, in 1950 there were 5,700 unique names in the database, while in 2013 there were 17,800.) And (b) patterns in the last letter are less obvious to casual analysis than first-letter patterns, especially when the second-last letter varies (EthAn, JasOn, JaydEn, etc.).
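For the curious, here is a minimal sketch of how the Top 10 share and the unique-name count could be computed for a single year. It assumes rows in the `name,sex,count` layout of the SSA's yearly files, restricts to boys (my assumption here), and uses invented toy counts rather than real SSA figures; the function names `boys_counts` and `top_n_share` are made up for illustration.

```python
import csv

def boys_counts(lines):
    """Parse 'name,sex,count' rows and return {name: births} for boys only."""
    counts = {}
    for name, sex, count in csv.reader(lines):
        if sex == "M":
            counts[name] = counts.get(name, 0) + int(count)
    return counts

def top_n_share(counts, n=10):
    """Fraction of all births that went to the n most common names."""
    total = sum(counts.values())
    top = sorted(counts.values(), reverse=True)[:n]
    return sum(top) / total

# Toy rows with invented counts (NOT real SSA data).
toy_rows = [
    "Ethan,M,60",
    "James,M,50",
    "Zed,M,40",
    "Robert,M,30",
    "John,M,20",
    "Mary,F,70",
]

toy_year = boys_counts(toy_rows)
print("unique boys' names:", len(toy_year))
print("share of births in the top 2:", top_n_share(toy_year, n=2))
```

Running the same two calls on each year's file would give the kind of year-by-year comparison quoted above.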
A chart comparing how often names ending in ‘n’ show up in the Top 10 with their overall popularity shows that, except for brief periods, this fad was not reflected in the most popular names (i.e. the black bars are mostly below the red line):
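The two quantities being compared in that chart can be sketched roughly as follows. This is only an illustration under my own assumptions: one year of boys' data is already loaded as a `{name: births}` dict, the counts below are invented, and `share_of_births` and `share_of_top` are hypothetical helper names, not anything from the original notebooks.

```python
def share_of_births(counts, letter="n"):
    """Fraction of all births whose name ends in `letter` (the 'overall' line)."""
    total = sum(counts.values())
    hits = sum(c for name, c in counts.items() if name.lower().endswith(letter))
    return hits / total

def share_of_top(counts, top=10, letter="n"):
    """Fraction of the `top` most popular names that end in `letter` (the bars)."""
    ranked = sorted(counts, key=counts.get, reverse=True)[:top]
    return sum(name.lower().endswith(letter) for name in ranked) / len(ranked)

# Toy year with invented counts (NOT real SSA data).
toy_year = {
    "Michael": 90,
    "Jacob": 80,
    "Ethan": 40,
    "Mason": 35,
    "Jayden": 30,
    "Noah": 25,
}

print("overall share of 'n' births:", share_of_births(toy_year))
print("share of the top 2 names ending in 'n':", share_of_top(toy_year, top=2))
```

In this toy year the ‘n’ names account for a good chunk of births while none of them reaches the very top of the list, which is the same shape of mismatch the chart shows.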
This screams out for a quantile analysis to determine at what level of popularity the rise of names ending in ‘n’ was driven, which you will see at the top of this post.
The pre-World War II distribution, with the most popular name, “John”, three quintiles above the second-most popular, “Benjamin”, is interesting, but the most telling pattern is in the top three quintiles after 1950. Gains in popularity in this kind of graph will always show a “rolling” of peaks from lower quintiles to higher, but in this graph the gain is not symmetrical; most of it was led by the second quintile, and the most popular names still lag behind the overall popularity of ‘n’ names. In other words, U.S. high schools and colleges right now have lots of Ethans, Masons and Jaydens, but they are still most likely to have multiple Michaels and Jacobs.
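A quintile breakdown along these lines could be sketched as below. To be clear, this is one possible definition and not necessarily the one behind the chart: it ranks names by popularity, cuts them into groups that each hold roughly a fifth of all births, and reports the share of ‘n’ births inside each group. The function name `quintile_n_shares` and the toy counts are invented for illustration.

```python
def quintile_n_shares(counts, letter="n", bins=5):
    """Split names, ranked by popularity, into `bins` groups of roughly equal
    total births; return the share of births ending in `letter` within each
    group, most popular group first (None for an empty group)."""
    total = sum(counts.values())
    ranked = sorted(counts.items(), key=lambda kv: kv[1], reverse=True)
    births = [0] * bins
    hits = [0] * bins
    cumulative = 0
    for name, c in ranked:
        # Place each name by the midpoint of its span of cumulative births.
        mid = cumulative + c / 2
        b = min(int(mid * bins / total), bins - 1)
        births[b] += c
        if name.lower().endswith(letter):
            hits[b] += c
        cumulative += c
    return [h / t if t else None for h, t in zip(hits, births)]

# Toy year with invented counts (NOT real SSA data).
toy_year = {
    "Michael": 50,
    "Jacob": 50,
    "Ethan": 25,
    "Mason": 25,
    "Noah": 25,
    "Jayden": 25,
}

# Two bins keep the toy example easy to check by hand.
print(quintile_n_shares(toy_year, bins=2))
# Default of five bins, as in a quintile chart.
print(quintile_n_shares(toy_year))
```

Because a very popular name can span more than a fifth of births on its own, assigning names by the midpoint of their cumulative span is a simplification; splitting a name across bins would be a reasonable alternative.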
The underlying phenomenon, I think, is a linguistic one: there just happen to be a lot of names ending in ‘n’ that strike the balance between conformity and individuality that many parents are looking for. Their son can have a name that doesn’t make him stand out, but that isn’t exactly the same as all the rest. The similarity pool simply happens to be much smaller for other popular names: the #1 name for 2013, announced three days ago, was “Noah”. The rest of the “h” names are even more biblical, which is not what everyone wants. (#334 is “Messiah”; no pressure, kid.) Number 2 is “Jacob”, and after “Caleb” all the other “b” names are uncommon derivatives (“Jakob”, “Kaleb”, etc.). (#4852 is “Gleb”, which I hope means the typists’ fingers slipped from the adjacent “n” key a few times.)
If anyone is interested in the code I used to do the last two analyses and make the charts for boys’ names ending in ‘n’ (plus some nice IPython notebooks that may be of interest even to non-pythonistas), check my sister blog, prooffreaderplus.