Sex ratio is the clearest indicator of bias in the baby names dataset

Sex ratio of male to female births in the Social Security database

I’ve written before about how the U.S. Social Security baby names dataset, despite being trotted out by plenty of commercial websites aimed at parents, needs to be taken with a grain of salt, and with a whole shaker of salt for the years before the 1930s. This is just about the clearest graphical demonstration I’ve come up with.

It’s impossible to quantify the racial makeup of the dataset, but since only certain occupations were covered at first, and those excluded most of the occupations that were open to Black men and women (for example, day labor and domestic work), it’s safe to say the database is severely unbalanced in that regard as well.

Despite having an extensive work history in biology, I never knew that more male babies are born than female babies, a universal phenomenon across the world (exacerbated by sex-selective abortions in some regions, unfortunately).

I’ve updated my previous Tableau Public storyboard on the limitations of the Social Security dataset to include this tidbit.
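For anyone who wants to check the ratio themselves, here is a minimal Python sketch of the calculation. It is not the code behind my chart (that lives in Tableau); it assumes the data comes as one text file per year with `name,sex,count` rows, and the function names and the tiny sample are made up for illustration.

```python
import csv
import io
from collections import defaultdict


def parse_year_file(year, text):
    """Yield (year, sex, count) from text in the assumed name,sex,count layout."""
    for name, sex, count in csv.reader(io.StringIO(text)):
        yield year, sex, int(count)


def sex_ratio_by_year(records):
    """Return {year: male births per female birth} from (year, sex, count) records."""
    totals = defaultdict(int)
    for year, sex, count in records:
        totals[(year, sex)] += count
    years = sorted({year for year, _ in totals})
    # Skip years with no recorded female births to avoid dividing by zero.
    return {
        year: totals[(year, "M")] / totals[(year, "F")]
        for year in years
        if totals[(year, "F")] > 0
    }


# Made-up sample rows, NOT real Social Security counts.
sample = "Mary,F,200\nAnna,F,100\nJohn,M,120\nWilliam,M,30\n"
ratios = sex_ratio_by_year(parse_year_file(1900, sample))
print(ratios)
```

Since slightly more boys than girls are born, a year whose ratio falls well below 1 is a sign that the records for that year are not a representative sample of births.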
