Data Visualization 101: The Most Important Rule for Developing a Graph

I suspect everyone has seen a bad graph: a mess of bars, lines, pie slices, or what have you that you dreaded having to look at. Maybe you have even made one, and you look at it today and wonder what on earth you were thinking.

These graphs violate the most basic graph-making rule in data visualization:

A graph is like a sentence, expressing one idea.

This rule applies to all uses of graphs, whether you are a data scientist, data analyst, statistician, or just making graphs for your friends for fun.

In grade school, your grammar teachers likely explained that a sentence, at its most basic, expresses one thought or idea. Graphs are visual sentences: they should state one and only one thought or idea about the data.

When you look at a graph, you should be able to say, in one sentence, what the graph is saying, such as “Group A is greater than Group B” or “Y at first improved but is now declining.” If you cannot, then you have yourself a run-on graph.

For example, the above graph is trying to make too many statements at once: it depicts the immigration patterns of twenty-two different countries over the course of nearly a century. There are likely useful statements in this data, but cramming them into one graph prevents a viewer/reader from easily deciphering them.

Likewise, this graph shows way too many lens sizes to meaningfully express a single, coherent idea, leaving the reader/viewer struggling to determine which fields to focus on.

Potential Objection #1: But I have more to say about the data than a single statement.

Great! Then provide more than one graph. Say everything you need to say about the data; just use one graph for each of your statements.

Don’t fall into the One-Graph-to-Rule-Them-All Fallacy: trying to use one graph to express all your statements about the data, which ends up as an incomprehensible visual mess. Create multiple easy-to-read graphs, each demonstrating one of your points. Condensing everything into one graph just prevents your viewers from determining what you have to say at all.

One-Graph-to-Rule-Them-All Fallacy: Trying to use one graph to express all your thoughts about the data that ends up a visual mess of incomprehensibility

Instead, use one graph for each of your points
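
As a rough illustration of the one-graph-per-statement idea, here is a minimal sketch in Python with matplotlib. The data values, the `plot_statement` helper, and the two example statements are all made up for this sketch; the point is only that each figure gets a title that is the single sentence it is meant to say.

```python
import matplotlib
matplotlib.use("Agg")  # render off-screen so the sketch runs without a display
import matplotlib.pyplot as plt

# Made-up placeholder values, used only to illustrate the layout.
years = [2019, 2020, 2021, 2022, 2023]
group_a = [12, 15, 18, 21, 24]
group_b = [10, 11, 10, 12, 11]


def plot_statement(title, draw):
    """Create one figure whose title is the single sentence it should convey."""
    fig, ax = plt.subplots(figsize=(5, 3))
    draw(ax)
    ax.set_title(title)
    fig.tight_layout()
    return fig


def draw_comparison(ax):
    # Statement 1: compare only the latest values of the two groups.
    ax.bar(["Group A", "Group B"], [group_a[-1], group_b[-1]])
    ax.set_ylabel("Value")


def draw_trend(ax):
    # Statement 2: show only Group A's trajectory over time.
    ax.plot(years, group_a, marker="o")
    ax.set_xlabel("Year")
    ax.set_ylabel("Value")


# One figure per statement, rather than one figure for everything.
figures = [
    plot_statement("Group A is greater than Group B", draw_comparison),
    plot_statement("Group A has grown every year", draw_trend),
]
```

Each figure in `figures` can then be saved with `fig.savefig(...)` or placed on its own slide. If you cannot come up with a one-sentence title for a figure, that is a hint it is trying to say more than one thing.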

Potential Objection #2: I want the viewers to interpret the findings for themselves, not just impart my own ideas/conclusions.

Fair point. When presenting/communicating data, there is a time for showing your own insights and a time to open-endedly display the information for your viewers/readers to interpret for themselves. Graphs are tools for the former, and for the latter, use tables. Tables, among other potential uses, convey a wide scope of information for the reader/viewer to interpret on their own.

Remember that first example above about U.S. immigration from various parts of Europe? A table (see below) would convey that information much more easily and allow readers to track whatever places, patterns, or questions they would like to learn about. Are you in a situation where you would like to report a large amount of information that your readers can use for their own purposes? Then tables are a much better starting point than graphs.

Some situations require that I lean towards sharing my insights/analysis and others towards encouraging my readers/viewers to form their own conclusions, but since most situations require a combination of the two, I generally combine graphs and tables. I try, when I can, to put smaller tables in the document or slides themselves and, when I cannot, to include full tables in an appendix.
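
For readers who build their tables in code, here is a minimal sketch in plain Python of turning long-form records into a wide table that readers can scan for themselves. The place names, decades, and counts are made-up placeholders (not real immigration figures), and `build_table` is a hypothetical helper written for this sketch.

```python
# Made-up placeholder counts; the place and decade labels are hypothetical too.
records = [
    ("Country A", "1900s", 120), ("Country A", "1910s", 95),
    ("Country B", "1900s", 40), ("Country B", "1910s", 70),
    ("Country C", "1900s", 15), ("Country C", "1910s", 22),
]


def build_table(rows):
    """Pivot (place, decade, count) records into a plain-text table."""
    places = sorted({place for place, _, _ in rows})
    decades = sorted({decade for _, decade, _ in rows})
    counts = {(place, decade): n for place, decade, n in rows}

    # Header row: one column per decade.
    lines = [f"{'Place':<12}" + "".join(f"{d:>8}" for d in decades)]
    for place in places:
        # A missing place/decade pair is shown as "-" rather than as zero.
        cells = "".join(f"{counts.get((place, d), '-'):>8}" for d in decades)
        lines.append(f"{place:<12}{cells}")
    return "\n".join(lines)


table = build_table(records)
print(table)
```

The table makes no claim about which row or column matters; it leaves that to the reader, which is exactly the job a graph should not be asked to do.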

Potential Objection #3: My main idea/point has multiple subpoints.

Many sentences also need multiple subpoints to express their single idea, and this does not prevent the sentence structure from meaningfully capturing that idea. The fancy grammar word for such a subpoint is a clause. Even though some sentences are simple and straightforward with only one subject and predicate, many (like this very sentence) require multiple sets of subjects and predicates to express their thought.

Likewise, some graphical ideas require multiple subordinate or compounded subpoints, and there are types of graphs that allow this. Consider joint plots, like the one below. To present the relationship and combined distribution of the two variables adequately, they also display each variable’s individual distribution above and to the right. That way, the viewer can see how both distributions might be influencing the combined distribution. The joint plot thus displays each variable’s distribution on the side like a subordinate clause.

The darker colors in this graph signify a higher density of data points, showing the combined joint distribution of the variables.
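
For a sense of how such a figure is put together, here is a minimal sketch using only matplotlib and NumPy. It is not necessarily how the graph above was made: the data is randomly generated, and the layout values (grid ratios, bin counts, colormap) are assumptions chosen for illustration. Libraries such as seaborn offer a `jointplot` function that builds this kind of figure in a single call.

```python
import matplotlib
matplotlib.use("Agg")  # render off-screen so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np

# Randomly generated placeholder data: two loosely correlated variables.
rng = np.random.default_rng(0)
x = rng.normal(size=1000)
y = 0.6 * x + rng.normal(scale=0.8, size=1000)

fig = plt.figure(figsize=(6, 6))
grid = fig.add_gridspec(2, 2, width_ratios=(4, 1), height_ratios=(1, 4),
                        wspace=0.05, hspace=0.05)

# Main clause: the joint distribution; darker hexagons hold more points.
ax_joint = fig.add_subplot(grid[1, 0])
ax_joint.hexbin(x, y, gridsize=25, cmap="Blues")
ax_joint.set_xlabel("x")
ax_joint.set_ylabel("y")

# Subordinate clauses: each variable's own distribution, above and to the right.
ax_top = fig.add_subplot(grid[0, 0], sharex=ax_joint)
ax_top.hist(x, bins=30)
ax_top.tick_params(labelbottom=False)

ax_right = fig.add_subplot(grid[1, 1], sharey=ax_joint)
ax_right.hist(y, bins=30, orientation="horizontal")
ax_right.tick_params(labelleft=False)
```

Giving the central panel most of the space and sharing its axes with the two histograms keeps the side panels visually subordinate, so the figure still reads as one sentence about how the two variables relate.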

These are advanced graphs to make because, as with multi-part sentences, one must present the subpoints carefully and organize them cohesively to make clear what the main point is. I intend to write a post later describing how to develop these multi-part graphs in more detail.

The general rule still applies for these more complicated graphs:

Can you summarize what the graph is saying in one coherent sentence?

If you cannot, do not use/show that graph. Our brains are very good at intuiting whether a sentence carries one thought, so use this to determine whether your graph is effective.

Photo/Graph credit #1: kreatikar at https://pixabay.com/illustrations/statistics-graph-chart-data-3411473/

Photo/Graph credit #2: Linux Screenshots at https://www.flickr.com/photos/xmodulo/23635690633/

Photo/Graph credit #3: Andrew Guyton at https://www.flickr.com/photos/disavian/4435971394/

Photo/Graph credit #4: TymonOziemblewski at https://pixabay.com/illustrations/bar-chart-chart-statistics-1264756/

Photo/Graph credit #5 (the first graph again): kreatikar at https://pixabay.com/illustrations/statistics-graph-chart-data-3411473/

Photo/Graph credit #6: Michael Waskom provides a helpful tutorial that formed the inspiration behind the random graph I created.
