I've just been on a data visualisation workshop and we were shown visuals that really blew my mind.
Then the presenter said, "now tell me what the visual is actually telling you, extract the useful facts".
His point was well made. A lot of visuals are very pretty and often entertaining, but they lag behind when it comes to giving precise, actionable facts.
He talked about something called the data-ink ratio, which can be expressed in three equivalent ways:
- = data ink / total ink used to print the graphic
- = proportion of a graphic’s ink devoted to the non-redundant display of data-information
- = 1.0 − proportion of a graphic that can be erased without loss of data-information
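To make the ratio concrete, here is a minimal sketch in Python. The function name `data_ink_ratio` and the ink amounts are my own illustrative assumptions (arbitrary units), not figures from the workshop.

```python
def data_ink_ratio(data_ink: float, total_ink: float) -> float:
    """Return the share of a graphic's ink that displays data."""
    if total_ink <= 0:
        raise ValueError("total_ink must be positive")
    if not 0 <= data_ink <= total_ink:
        raise ValueError("data_ink must be between 0 and total_ink")
    return data_ink / total_ink


# Hypothetical ink estimates in arbitrary units (assumed for illustration).
# Same data ink in both charts; the cluttered one spends far more ink overall.
cluttered = data_ink_ratio(data_ink=30, total_ink=120)
decluttered = data_ink_ratio(data_ink=30, total_ink=40)

# Third definition: the ratio is 1.0 minus the erasable proportion,
# so the erasable proportion is 1.0 minus the ratio.
erasable_in_cluttered = 1.0 - cluttered

print(cluttered, decluttered, erasable_in_cluttered)
```

With these made-up numbers, three quarters of the cluttered chart could be erased without losing any data.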
The amount of information a human can take in is limited, so 3D charts with multiple colours might look pretty, but they actually distract from the true information content.
He listed a number of principles:
1. Display neither more nor less than what is relevant to your message
2. Do not include visual differences in a graph that do not correspond to actual differences in the data
3. Use the lengths or 2-D locations of objects to encode quantitative values in graphs unless they have already been used for other variables
4. Differences in the visual properties that represent values (that is, differences in their lengths or 2-D locations) should accurately correspond to the actual differences in the values they represent.
5. Do not visually connect values that are discrete, thereby suggesting a relationship that does not exist in the data
6. Make the information that is most important to your message more visually salient in a graph than information that is less important
7. Augment people’s short-term memory by combining multiple facts into a single visual pattern that can be stored as a chunk of memory and by presenting all the information they need to compare within eye span
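Principle 4 can be checked with simple arithmetic. The sketch below is my own illustration, not something shown in the workshop. It assumes a bar chart whose value axis starts at some baseline, and the helper `drawn_length_ratio` and the values 100, 90 and 80 are hypothetical.

```python
def drawn_length_ratio(a: float, b: float, baseline: float = 0.0) -> float:
    """Ratio of the drawn lengths of two bars when the axis starts at baseline."""
    if a <= baseline or b <= baseline:
        raise ValueError("both values must be above the baseline")
    return (a - baseline) / (b - baseline)


# Hypothetical values, assumed for illustration only.
a, b = 100.0, 90.0

actual_ratio = a / b                                   # about 1.11
honest_ratio = drawn_length_ratio(a, b)                # axis starts at 0
distorted_ratio = drawn_length_ratio(a, b, baseline=80.0)  # truncated axis

print(actual_ratio, honest_ratio, distorted_ratio)
```

With the axis starting at zero, the bar lengths have the same ratio as the values. Starting the axis at 80 makes the first bar look twice as long as the second, even though the values differ by only about 11%.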
During the workshop we were given various tests that demonstrated the validity of the principles. One of the exercises was to look at a typical Excel chart and subtract all the irrelevant parts from it.
On a 3D bar chart, the things that were removed were:
- 3D effects. 2D communicates facts better
- Borders around legends, bars and the graph itself
- Tick marks on the vertical axis. The human brain is very good at working out differences in length and height, and if you want precision you can always go back to the raw data
- Data labels above bars. Again the human brain gives a very good approximation
- Legend key in combination with data labels. You've got the legend in any case
- Vertical background lines. They serve no purpose on a bar chart
- Grey background
- Bold/Underline text on items not requiring emphasis. Don't use bold on axis labels.
Apart from extending the life of your inkjet cartridge, the facts in the resulting chart simply jumped out at me. Less really was more!
There is real science behind the points raised. One example given was the chart used before the Challenger disaster to recommend that the shuttle not be flown in excessively cold conditions. The relevant information is in that chart if you spend the time to look for it, but if a simple plot of faults vs temperature had been drawn, the risk of launching in low temperatures would have jumped out.