Monday, April 28, 2014

Humans of Earth

Below are both HTML5 and somewhat letterboxed embedded YouTube versions of the video; if the first one doesn't work, try the second!


I caught the bug to make this photo-video-morph while looking at some great albums in Reddit's Human Porn subreddit (totally Safe For Work, not actual pornography) and Pinterest. If you would like to know more about the individual photos including their sources and see them uncropped, here's an album.
The morphing was done using Fantamorph, and ffmpeg was used to convert the result to HTML5-friendly VP8. My first thought was to make an animated GIF, but Photoshop crashed halfway through calculating its predicted size! Trust me when I say it would have been several times the size of this 6 MB .webm file.
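For anyone wanting to try a similar conversion, here is a minimal sketch of how an ffmpeg call for VP8 might be assembled from Python. It is not the exact command used for this video: the file names, the 1M bitrate, and the choice to drop the audio track are all assumptions for illustration.

```python
import subprocess

def vp8_command(src, dst, bitrate="1M"):
    """Build an ffmpeg argument list that encodes src as VP8 video in a WebM file."""
    return [
        "ffmpeg",
        "-i", src,          # input video
        "-c:v", "libvpx",   # VP8 encoder
        "-b:v", bitrate,    # target video bitrate (assumed value)
        "-an",              # drop audio; assumes a silent clip
        dst,
    ]

def convert(src, dst):
    """Run the conversion; requires ffmpeg to be installed and on the PATH."""
    subprocess.run(vp8_command(src, dst), check=True)

# Hypothetical file names; this only prints the command, it does not run it.
cmd = vp8_command("morph.mp4", "morph.webm")
print(" ".join(cmd))
```

Calling `convert("morph.mp4", "morph.webm")` would actually run the encode; raising or lowering the bitrate trades file size against quality.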
The post title is a riff on the wonderful photo blog Humans of New York.
(Any resemblance to a Michael Jackson video is purely coincidental and only pointed out to me after the fact by my much more savvy sister).

Tuesday, April 22, 2014

The stuttering women of romance comics, 1950s-1970s

When I first experienced Roy Lichtenstein's pop art masterpiece M-Maybe, I instantly understood the visual shorthand: the subject is stuttering over that initial "m" because she is emotionally overwrought, probably with the added effect of cognitive dissonance: she can't really believe her beau was a no-show because he suddenly caught an illness. Lichtenstein drew this motif from romance comics, which were huge after WWII but could never really cope with women's lib and faded out by the late '70s. If you flip through some of these vintage comics, you're struck not only by the incessant pathos of these people's first-world problems, but by the same devices being used over and over: the single tear, throwing oneself on one's bed, holding one's hand to face or temple, the downcast ten-yard stare -- and, of course, the stutter. Why, these poor ladies even lose control of their diction in their interior monologues, when their mouths aren't moving!

Men stuttered sometimes too, of course, but usually when they were flummoxed by the emotional behavior of women, to wit:

Ah, Robin. So chaste, so innocent, so totally not at all homoerotic in those green scaly hotpants* living with a man named Bruce.

If you like vintage comics (especially taken out of context), comicallyvintage's tumblr has over a thousand of them! (I should know, I looked at most of them in search of young ladies with temporary speech impediments.) The old-fashioned uses of the words "dick", "gay" and especially "boner" are a chuckle riot.

* Actually, the tights or bare legs, domino masks and capes were a visual shorthand that was totally understandable to the audiences of the '30s and '40s: the circus strongman, the ultimate expression of butch masculinity. It did not stand the test of time.

Wednesday, April 16, 2014

How not to make an infographic

I became unexpectedly unemployed yesterday, and since I don't believe in long mourning periods (or poverty) I started my job search right away, and came across this infographic. Let's be fair: there are far worse infographics out there. But my version of human nature somehow gets more perturbed by almost-competence than by abject failure; I suppose, knowing nothing about the creator, that in my head I'm blaming them for not trying hard enough. Well, if the creator happens to come across this, I totally don't want to hurt your feelings (much); you just need a little more practice, as do we all.

Monday, April 14, 2014

Google ngrams of 'google' and 'ngram'

I like to test data tools and data sets with "edge cases", a fancy term for using them in ways they were not designed to be used (which is, by the way, the definition of hacking). It's informative to see how far things will bend before they break -- and the good thing with data is it's easy to un-break.

Rare occurrences make good edge cases; so do recursive cases, i.e. running a data tool on itself. We looked briefly at the Google Ngram Viewer a couple of weeks ago; what happens if we determine Google Ngrams of the words "Google" and "Ngram"? (By the way, I like to call this kind of approach 'selfremetacursironiferentiality'. I'm sure it will catch on one day so I look like less of a dork when I say it.)

Of course, the frequency of the word "google" after the company was incorporated in September 1998 is predictable: it becomes a very common word (and is even adopted into that hallowed club, The Verb, where Xerox briefly rested and from which Kleenex was inexplicably barred). The only interesting thing about its rise from 2001 to 2008 (where the data set ends) is that it's pretty linear; I would have intuited either positive or negative curvature, but don't forget this is the word's appearance in published, printed matter, not in conversation.

Let's have a look at "google" and "ngram" (both case-insensitive) from 1880 to 2000, before the rise of Google and with a vertical axis about fifty-fold lower so we can see the edge cases (in my experience, the more jagged a line is*, the more interesting it is.**)

That's a lot of use of the word "google" before the company we all know and... well, know... existed. Using Google Books, the mystery is easy to solve: there was a newspaper comic strip character named Barney Google, and a lot of anthologies were published over the years. Not unusually, the technical term "ngram" lags far behind a term used in pop culture; however, it is surprising that around the dawn of the 20th century a term used in computational linguistics would turn up. Again, Google Books solves the mystery: this is an artifact of a lot of directories of names from around this time being poorly scanned; the name "Ingram" is being recorded as "I, ngram" (which sounds like a terrible book title).
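To see how a bad scan can conjure a word out of nothing, here is a small sketch of case-insensitive 1-gram counting. The two directory snippets are invented for illustration (they are not taken from Google Books), and the tokenizer is a deliberately crude assumption, not how the Ngram Viewer actually works.

```python
import re
from collections import Counter

def unigram_counts(text):
    """Count lowercase word tokens (1-grams), ignoring case and punctuation."""
    return Counter(re.findall(r"[a-z]+", text.lower()))

# Invented examples: a clean directory entry and the same entry after a bad scan.
clean_scan = "Ingram, John. Ingram, Mary."
bad_scan = "I, ngram, John. I, ngram, Mary."

clean_hits = unigram_counts(clean_scan)["ngram"]  # "ingram" is a single token here
bad_hits = unigram_counts(bad_scan)["ngram"]      # the split leaves a stray "ngram"

print(clean_hits, bad_hits)
```

The clean text never produces the token "ngram", while the mis-scanned text produces one for every Ingram in the directory, which is enough to put a bump on the chart.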

The moral of this story, as with all data sets too huge to be curated by humans (and, coincidentally, every other Aesop's fable): things are not always what they seem, so be sure to dig a little before drawing conclusions, especially in edge cases. The next time someone brings up over the water cooler how ngrams were being studied in 1902, you can nod to yourself knowingly.

* of course, sometimes that means it's just noise, but I find noise interesting too
** that's what she said.

Webcomic #3: You're so vain

I had a different webcomic planned for this week, but the shiny orange "Publish" button is more tempting to press when I'm half-asleep than the "Save" button, so this one suddenly got to the head of the line!

For those who aren't up on their 1972 celebrity semi-scandals, here's what the comic is referring to. Edmund J. Mittlebaum is entirely invented; the identity of Carly's spurned love has more incompatible theories than the Kennedy assassination. Taylor Swift claims she knows who it is, which makes sense, because if there's one thing Taylor is tight-lipped about, it's ex-boyfriends.

Monday, April 7, 2014

The meteoric rise of boys' names ending in 'n'

It's been noted before that one of the most striking trends when analyzing American baby names is the rise in popularity of boys' names ending with the letter 'n' over the past few decades. What I haven't seen is a visualization that truly demonstrates the scale of this phenomenon. And for good reason: it's difficult to show trends over time in 26 variables. So I made this animated GIF of bar graphs; pay attention to the 'n' after the mid-70s.

I was also interested in the trends for each letter; in the GIF above, there's a rise and a fall of names ending in "d" (although the rise ends in the mid-1930s, which I've already explained is problematic due to the way data was collected). So here's a grid of every letter; the scales are not the same ("n" is far more popular than "q", for example) so I've shaded each one so that darker green goes along with most popularity, and the overall trends of each one can be seen:

There's still more that can be done with this data; only since 2011 have as many as four of the top ten boys' names ended in 'n', so evidently this is a phenomenon that has carried through more than the top tier of popularity; it would be interesting to see the contributions of different names. I also wonder what some of the peaks and valleys for other names represent, and of course one could always do the same analysis on the last letters of girls' names (let me guess: lots of "a"s), the first letters of either sex, and even middle letters or multi-letter patterns. More to come, unless some other shiny data bauble catches my eye first...
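The tally behind charts like these is simple enough to sketch. The version below assumes one year of data arrives as (name, sex, count) records, which is an assumption about the layout and not a description of the actual files or code behind the graphs; the sample counts are made up purely to show the arithmetic.

```python
from collections import Counter

def last_letter_shares(records, sex="M"):
    """Return {letter: fraction of births} by final letter of the name, for one sex."""
    totals = Counter()
    for name, rec_sex, count in records:
        if rec_sex == sex and name:
            totals[name[-1].lower()] += count
    births = sum(totals.values())
    if births == 0:
        return {}
    return {letter: n / births for letter, n in totals.items()}

# Made-up toy counts for a single year, purely illustrative (not real data).
sample = [
    ("Mason", "M", 50),
    ("Ethan", "M", 30),
    ("David", "M", 20),
    ("Emma", "F", 40),
]

shares = last_letter_shares(sample)
print(shares)
```

Running this once per year and stacking the results gives the 26 time series; passing `sex="F"` gives the girls' version, and swapping `name[-1]` for `name[0]` gives first letters.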


Wednesday, April 2, 2014
