# This (does not equal) That

##### Also See: Spurious Correlations


# Correlation and Causation

via NeuroLogica Blog

Every skeptic’s new favorite website is Spurious Correlations. The site is brilliant – it mines multiple data sets (such as causes of death, consumption of various products, divorce rates by state, etc.) and then tries to find correlations between different variables. The results are often hilarious.

The point of this exercise is to demonstrate that correlation does not necessarily equal causation. Often it is more effective to demonstrate a principle than simply to explain it. By showing impressive-looking graphical correlations between phenomena that are clearly not related (or at least, proposing a causal connection between them seems absurd on its face), the site drives home the point that correlation is not enough to conclude causation.

I think most people can intuitively understand that spending on science, space, and technology is unlikely to have a meaningful causal connection to suicide by hanging, strangulation, or suffocation.

Yet – look at those curves. If a similar graph were shown with two variables that might be causally connected, that would seem very compelling.

There are a couple of points about this I want to explore a bit further. First is the important caveat that, while correlation is not necessarily causation, sometimes it is. Two variables that are causally related would correlate. I dislike the oversimplification that is sometimes presented: “correlation is not causation.” But it can be.

The second point is a statistical one. The important deeper lesson here is the power of data mining. Humans are great at sifting through lots of data and finding apparent patterns. In fact we have a huge bias toward false positives in this regard – we find patterns that are not really there but are just statistical flukes or complete illusions.
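To make the data-mining point concrete, here is a minimal Python sketch. It is an illustration added for this post, not code from the Spurious Correlations site, and everything in it is an assumption: 100 made-up "variables", each with 10 yearly values of pure random noise, and hypothetical helper names `pearson` and `mine_best_pair`. It compares the correlation of one pair chosen in advance with the best pair found by searching all 4,950 possible pairs.

```python
import itertools
import math
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def mine_best_pair(series):
    """Return (|r|, i, j) for the most strongly correlated pair of series."""
    return max(
        (abs(pearson(series[i], series[j])), i, j)
        for i, j in itertools.combinations(range(len(series)), 2)
    )


rng = random.Random(0)  # fixed seed so the run is repeatable

# Assumption: 100 unrelated "variables", 10 yearly values each, pure noise.
series = [[rng.gauss(0, 1) for _ in range(10)] for _ in range(100)]

single_r = abs(pearson(series[0], series[1]))  # one pair picked in advance
best_r, i, j = mine_best_pair(series)          # best of all 4,950 pairs

print(f"pre-chosen pair: |r| = {single_r:.2f}")
print(f"mined pair ({i}, {j}): |r| = {best_r:.2f}")
```

Because every series is independent noise, any strong correlation the search turns up is a fluke by construction. It is the act of searching thousands of pairs, not any real relationship, that makes an impressive-looking match likely to appear.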

Correlations, however, seem compelling to us. If we dream about a friend we haven’t seen in 20 years and then they call us the next day, that correlation seems uncanny, and we hunt for a cause. We aren’t even aware of the fact that . . .


# True Fact: The Lack of Pirates Is Causing Global Warming

By Erika Andersen via Forbes

It’s true.  This extremely scientific graph proves it:

You can see that as the number of pirates in the world has decreased over the past 130 years, global warming has gotten steadily worse. In fact, this makes it entirely clear that if you truly want to stop global warming, the most impactful thing to do is — become a pirate.

Hope you’re laughing.  My husband told me this wonderful premise a few months ago, and I couldn’t resist sharing it with you, for a very specific reason. I’m fascinated by why it’s so funny. I believe it’s because it’s an only slightly more extreme version of the fake logic we hear every day — the conclusions that pass for critical thinking in these days of completely unleashed 24-7 communication. For example:

• Someone who has cancer drinks gallons of lemon water and their cancer goes into remission: they create a website to talk about how lemon water cures cancer.
• A business is doing badly and they move to a new building and things start to pick up: the CEO writes a book about how changing your environment is the key to success.
• Statistics show that people who leave their jobs after less than a year are more likely to smoke: someone starts a campaign to reduce smoking by encouraging people to stay at their jobs longer.

My older sister, a very wise and smart woman who is a political scientist at Syracuse University, teaches a statistics class to freshmen, where she endeavors to teach them critical thinking. She describes this as the most common error in logic: confusing simultaneity with causality. In other words, assuming that because two things are happening at the same time, they exist in a cause-and-effect relationship with each other.

Because anyone can say anything anywhere these days (pretty much), there’s a lot of fuzzy thinking floating around that seems more legitimate than it would have in former times because it’s in print. Now, don’t get me wrong: I’m a huge proponent of free speech.  I just feel we all have to be more discriminating than ever before about what we believe.  Not cynical or negative: discriminating.

So, when someone proposes a cause and effect relationship between two things – reduction in pirates causing global warming; Obama creating the global economic crisis; young people ruining American business – ask for the data that shows they’re related, rather than simply that they’re happening at the same time.

But if you’re dead set on becoming a pirate, I’m not going to stop you.


# The Curious Case of Correlation ≠ Causation

In my last post, I wrote about how not having enough contextual data can outright boggle the mind. Today, we’re going to read about something else that similarly boggles the mind, albeit not really related to any linguistic phenomena. It’s an interesting little logical fallacy known as cum hoc ergo propter hoc (‘with this, therefore because of this’), which is what the more familiar maxim ‘correlation does not prove causation’ warns against. Here, we loosely define correlation as ‘when two things tend to happen together’, and causation as ‘when one thing causes the other’.

This logical fallacy is great at showing the glaring inaccuracies caused by a lack of data on a specific subject, and how this lack can cause us to reach blindly for (often incorrect) conclusions in the proverbial fog of our minds. Additionally, the comedic factor here is amplified if you forgo the law of parsimony (also known as Occam’s razor), which holds that, of all the possible explanations for something, the one requiring the fewest assumptions is usually the one to prefer.

Have you ever been in a recording booth or a really quiet place? If you’re in there for a long time your mind begins to create its own sounds. Essentially, you begin to hallucinate due to a lack of external stimuli. This is basically what goes on in the aforementioned logical fallacy: you end up compensating for a lack of data by drawing a perceived (and often inaccurate) connection between the sole items of data you have.

What does this have to do with a language blog? Essentially, it’s a great way of showing how a lack of the background information required for comprehension can yield wildly inaccurate knowledge. Dig this:

Did you know that children with bigger feet are statistically better at spelling? It’s true. Without additional contextual information, I could hypothesize that a larger foot size means the children perform better at sports and have better balance while carrying large and cumbersome schoolbags, making them less prone to falling over in bustling school hallways, making them less likely targets for bullies, leading to an inevitable increase in confidence, leading to better scholastic performance, and thereby, better spelling skills!

The truth is, it’s actually because children with larger feet are probably a lot older than children with smaller feet. Duh.
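If you’d like to see that hidden third variable at work, here is a rough Python sketch. Every number in it is invented for illustration (the growth rule, the spelling scores, the 1,000 simulated children per age), and `pearson` is just a hypothetical helper name. In this simulation, age drives both foot length and spelling score, while feet have no effect whatsoever on spelling.

```python
import math
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


rng = random.Random(1)  # fixed seed so the run is repeatable

children = []
for age in range(6, 13):       # ages 6 to 12
    for _ in range(1000):      # 1,000 simulated children per age
        # Made-up rules: both quantities depend on age plus random noise,
        # and neither depends on the other.
        foot_cm = 12 + 1.0 * age + rng.gauss(0, 1)
        spelling = 5.0 * age + rng.gauss(0, 5)
        children.append((age, foot_cm, spelling))

# Correlation across all children, ignoring age.
overall_r = pearson([c[1] for c in children], [c[2] for c in children])

# Correlation among children of the same age (here, the nine-year-olds).
nine_year_olds = [c for c in children if c[0] == 9]
within_r = pearson(
    [c[1] for c in nine_year_olds], [c[2] for c in nine_year_olds]
)

print(f"all ages together: r = {overall_r:.2f}")
print(f"nine-year-olds only: r = {within_r:.2f}")
```

Across all ages the two measurements move together strongly, but once you compare only children of the same age the link all but disappears, which is exactly what you would expect if age is doing all the work.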

Did you know that you are more likely to get cancer if you always wear a seat-belt?
