Every skeptic’s new favorite website is Spurious Correlations. The site is brilliant – it mines multiple data sets (such as causes of death, consumption of various products, and divorce rates by state) and then looks for correlations between different variables. The results are often hilarious.
The point of this exercise is to demonstrate that correlation does not necessarily equal causation. Often it is more effective to demonstrate a principle than simply to explain it. By showing impressive-looking graphical correlations between phenomena that are clearly not related (or at least, proposing a causal connection between them seems absurd on its face), the site drives home the point that correlation alone is not enough to conclude causation.
I think most people can intuitively understand that spending on science, space, and technology is unlikely to have a meaningful causal connection to suicide by hanging, strangulation, or suffocation.
Yet – look at those curves. If a similar graph were shown with two variables that might be causally connected, that would seem very compelling.
There are a couple of points about this I want to explore a bit further. First is the important caveat that, while correlation is not necessarily causation, sometimes it is. Two variables that are causally related would correlate. I dislike the oversimplification that is sometimes presented: “correlation is not causation.” But it can be.
The second point is a statistical one. The important deeper lesson here is the power of data mining. Humans are great at sifting through lots of data and finding apparent patterns. In fact we have a huge bias toward false positives in this regard – we find patterns that are not really there but are just statistical flukes or complete illusions.
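To make the data mining point concrete, here is a minimal sketch in Python (using NumPy) of what happens when you go looking for correlations among variables that are unrelated by construction. Everything in it is my own assumption for illustration: the function name, the 200 series, the 10 data points per series, and the use of pure random noise. It is not how the Spurious Correlations site works, just the same statistical effect in miniature.

```python
import numpy as np


def strongest_chance_correlation(n_series=200, n_points=10, seed=0):
    """Search independent random series for the most correlated pair.

    Every series is independent noise, so any correlation found here
    is a fluke of the search and not a real relationship.
    """
    rng = np.random.default_rng(seed)
    # Each row is one made-up "variable" measured at n_points times.
    data = rng.normal(size=(n_series, n_points))

    # Pearson correlation between every pair of rows.
    corr = np.corrcoef(data)
    # A series always correlates perfectly with itself; ignore that.
    np.fill_diagonal(corr, 0.0)

    # Locate the pair with the largest absolute correlation.
    i, j = np.unravel_index(np.argmax(np.abs(corr)), corr.shape)
    return int(i), int(j), float(corr[i, j])


i, j, r = strongest_chance_correlation()
n_pairs = 200 * 199 // 2  # number of distinct pairs that were searched
print(f"Searched {n_pairs} pairs; best match: series {i} and {j}, r = {r:.2f}")
```

With only 200 variables there are 19,900 pairs to compare, and with so few data points per series some of those pairs will line up very closely by chance alone. Plot the winning pair and you would have a graph that looks compelling, even though the search guaranteed you would find one.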
Correlations, however, seem compelling to us. If we dream about a friend we haven’t seen in 20 years and then they call us the next day, that correlation seems uncanny, and we hunt for a cause. We aren’t even aware of the fact that . . .