Tuesday, May 20, 2014

The five commandments (and fifteen footnotes) of data visualization

The five[1] commandments[2] of data visualization[3]

I'm nobody special in the world of data viz; I have no profound observations or innovations to add to those of the likes of Edward Tufte, Hans Rosling, Hadley Wickham or Mike Bostock; but I think I have a little common sense and boots-on-the-ground experience when it comes to the more mundane, journeyman work of making a PowerPoint slide and being proud it doesn't use Comic Sans[4]. (I use Python now; I'm never going back.) By all means, if you have the time and luxury, concern yourself with data-ink ratio[5]; but before that point, here are a few tips to help ensure there's any ink at all.[6]

(Comments, suggestions, dissenting opinions and, especially, corrections are very welcome; don't be shy. Yes, I'm saying don't be shy to the Internet. Stand back.)

1. A graph is like a paragraph.

A visualization should have more to say than a sentence or a sparkline, but less to say than a short story or, well, the raw data itself. A data visualization needs to strike a balance between saying too much and saying too little.[7]  A good rule of thumb I use is: if one really well-written footnote might help someone understand the graph a bit better, I've done my job right, and often I don't even end up using that footnote. If the footnote is absolutely necessary or two footnotes present themselves, or if not even a five-year-old would see any value in a footnote whatsoever, then maybe the scope of the graph is wrong.

2. Visualization is translation

The Italians have a saying: "Traduttore, traditore", roughly "Translation is treason". Creating a visualization is translating raw data into another medium, and it involves loss of information, and it involves choices, hard choices, desperate choices[8]. Think of it as describing a movie to your significant other. ("I know you don't like action movies, but it was so cool when Schwarzenegger threw this grenade..."[9]) You can't describe the entire movie, nor does your audience want you to. They trust you to make the choices necessary to get the pattern hidden in the data across with skill, clarity and integrity. Which leads me to...

3. Visualize with integrity

When you were a child, someone must have told you honesty is the best policy[10], and they were right. When in doubt, act with integrity. Actually, when not in doubt too. Actually, especially when not in doubt: if you're not wondering if you're making the right choices when you're deciding how to visually present your data, you're doing it wrong.[11] Basically everything I need to say about integrity, Fox News has said more eloquently (and just slightly less intentionally) than me[12]:

4. The second-worst outcome is for someone not to understand your graph.

There are lots of very complicated visualizations out there. They certainly have their place, but they probably should be tackled by the elite among us.[13] But even with more modest goals in a visualization, it's quite possible for the message to get muddled in the medium[14]. Boxplots are brilliant. I absolutely love boxplots. But so few people understand them that more time gets spent explaining how they work than trying to understand the actual data. Of course, it all depends on your audience, and your mileage may vary.
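For anyone stuck doing the explaining, here is a minimal Python sketch of the numbers a conventional boxplot draws. It assumes the common Tukey convention (whiskers reach the most extreme points within 1.5 times the interquartile range of the box, and anything beyond that is drawn as an outlier); that convention is an assumption of this sketch, not something stated above, and plotting libraries differ in how they compute quartiles, so expect small differences. The function name `boxplot_summary` and the sample values are made up for illustration.

```python
import statistics


def boxplot_summary(values):
    """Return the numbers a conventional (Tukey-style) boxplot draws.

    Hypothetical helper for illustration; assumes the 1.5 * IQR whisker rule.
    """
    data = sorted(values)
    # With n=4, statistics.quantiles returns the three quartile cut points.
    q1, median, q3 = statistics.quantiles(data, n=4, method="inclusive")
    iqr = q3 - q1
    # Fences mark how far the whiskers are allowed to reach.
    low_fence = q1 - 1.5 * iqr
    high_fence = q3 + 1.5 * iqr
    inside = [v for v in data if low_fence <= v <= high_fence]
    outliers = [v for v in data if v < low_fence or v > high_fence]
    return {
        "whisker_low": inside[0],    # lowest point still within the fence
        "q1": q1,                    # bottom edge of the box
        "median": median,            # line inside the box
        "q3": q3,                    # top edge of the box
        "whisker_high": inside[-1],  # highest point still within the fence
        "outliers": outliers,        # points drawn individually
    }


# Made-up sample: nine ordinary values and one straggler.
sample = [2, 3, 4, 5, 5, 6, 7, 8, 9, 30]
summary = boxplot_summary(sample)
for name, value in summary.items():
    print(f"{name}: {value}")
```

With this sample, the straggler (30) lands outside the upper fence and is reported as an outlier rather than stretching the whisker, which is usually the part of a boxplot that needs the most explaining.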

5. The worst outcome is for someone to MISunderstand your graph.

What's obvious to you as the creator might not be so to your audience. Always ask yourself what a fresh pair of eyes might understand from your graph. If necessary, go and find a fresh pair of eyes.[15] When someone misunderstands what your visualization presents, no matter how obvious you think it is, you failed. ("You had one job!") You are intimately familiar with your data: others are not, and first erroneous impressions are hard to erase, and people get upset when they need to rearrange their brains. Don't scoff, you do it too.

6. Don't limit yourself to your first idea.

Like "This will be a list with five items on it."[16] 


[1] I just read an article that claimed the most successful web content is list-related, has an odd number of items in the list (exception: 10 items), turns browsers into subscribers and has a (relevant) photo of Jennifer Lawrence. I've got at least three of those criteria covered.

[2] I really don't have any basis to be commanding anyone, but "suggestments" just didn't have the same ring.

[3] That's what you're supposed to say now, instead of "graphing". For one thing, graphs (which I used to draw on graph paper with a ruler and a bendy curve thing) are now called charts; graphs look like this: (O brave new world that has such network diagrams in't.)

[4] Go for Papyrus or Mistral instead.

[5] But if you want a lesson to keep in mind from the gurus, you could do a lot worse than Edward Tufte's idea of data-ink ratio. In most human endeavours, less is more: art, design, psychology, John Travolta movies, you name it.

[6] Or, in most cases, pixels or toner. Don't be pedantic, you know what I mean.

[7] Here's where a more trite writer would whip out pairs of words that are oh-so-similar-yet-oh-my-god-crucially-different: Simple, not simplistic. Complex, not complicated. Focused, not targeted. Zuul, not Dana.

[8] I would firmly like to state, so that there is no misunderstanding in the future, that I fully support a data visualizer's right to choose. (And they said I could never work an abortion joke into a data viz how-to list. Who's laughing now? Um, probably nobody, it was a really bad joke.)

[9] Many times I sat through my girlfriend giving me these rundowns, yet she wouldn't even listen to me gush about the Zooey Deschanel movie I'd just seen in the other cinema.

[10] Ironically, as adults, we tend to use the word "policy" mostly to describe the work of politicians, insurance agents and businesspeople. I'm not saying they're not honest, but... actually, yeah, that's pretty much what I'm saying. Then again... I might be lying.

[11] If you're agonizing about the choices so much that you curl up into a foetus of self-doubt and vow never to make so much as a bar chart ever again... well, you're doing it wrong then, too.

[12] I have seen many equally egregious though perhaps less deliberate non-zero graph origins over the years: don't do it if you don't have a good reason to, and if you have an excellent reason to, don't do it just the same.

[13] Preferably when very, very drunk. I'm looking at you, Nathan Yau. (I don't know anything about Nathan Yau or his personal habits, but God, that was fun to write. I'll apologize for my nasty whimsy by suggesting you buy his books.)

[14] McLuhan might've moaned at my misuse of a memorable maxim, mmm?

[15] Hopefully still attached to a head; they tend to be more useful that way.

[16] That goes double for footnotes.

