Funny Correlation Is Not Causation Statistics

Correlation is not causation

Why the confusion of these concepts has profound implications, from healthcare to business management

Introduction

In correlated data, a pair of variables are related in that one is likely to change when the other does. This relationship might lead us to assume that a change in one thing causes the change in the other. This article clarifies that kind of faulty thinking by explaining correlation, causation, and the bias that often lumps the two together.

The human brain simplifies incoming data so we can make sense of it. Our brains often do that by making assumptions about things based on slight relationships, or bias. But that thinking process isn't foolproof. An example is when we mistake correlation for causation. Bias can make us conclude that one thing must cause another if both change in the same way at the same time. This article clears up the misconception that correlation equals causation by exploring both of those subjects and the human brain's tendency toward bias.

About correlation and causation

Correlation is a relationship or connection between two variables where whenever one changes, the other is likely to change as well. But that doesn't mean a change in one variable causes the other to change. That's a correlation, but it's not causation. Your growth from a child to an adult is an example. When your height increased, your weight increased as well. Getting taller didn't make you also get wider. Instead, maturing to adulthood caused both variables to increase. That's causation.
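The growth example can be sketched as a small simulation. Everything here is an assumption made for illustration: the growth formulas, the noise levels, and the names `pearson`, `r_all`, and `r_same_age` are hypothetical. In this toy world, age drives both height and weight, and neither one appears in the other's formula.

```python
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


rng = random.Random(0)  # fixed seed so the run is repeatable

# Hypothetical growth rules: age drives both height and weight.
# Height never appears in the weight formula, and vice versa.
ages = [rng.uniform(2, 18) for _ in range(1000)]
heights = [80 + 5.5 * age + rng.gauss(0, 5) for age in ages]  # cm
weights = [8 + 3.0 * age + rng.gauss(0, 4) for age in ages]   # kg

r_all = pearson(heights, weights)

# Hold the common cause roughly constant: keep only 9-year-olds.
nine = [(h, w) for a, h, w in zip(ages, heights, weights) if 9 <= a < 10]
r_same_age = pearson([h for h, _ in nine], [w for _, w in nine])

print(f"all ages:   r = {r_all:.2f}")
print(f"age 9 only: r = {r_same_age:.2f}")
```

Across all ages the two measurements move together strongly. Among children of about the same age, most of that relationship should disappear, which is what we would expect when a shared cause, rather than height itself, produces the correlation.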

Causation in business

Let's say that we want to offer a promotion or discount to some of our customers. Our marketing department wants to maximize the delta, in other words, the increase in sales as a result of the promotion. So we need to decide which customers will give us the best return on our investment in the promotion or discount. Do we want to offer it only to the top 10% of our clients? Or the bottom 10%?

You might assume that the customers who drive more sales are the ones most responsible for your business success. Still, this assumption could be wrong. The best choice of which customers to offer the promotion to might be totally different. In the absence of valid experimentation or analytics, you don't have accurate answers to those questions.
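To make the delta concrete, here is a minimal sketch that assumes we already ran a randomized promotion test. The segment names and all the numbers in `RESULTS` are invented for illustration. They are not real data.

```python
# Hypothetical results of a randomized promotion test: average sales per
# customer in each segment, with and without the promotion.
RESULTS = {
    "top 10%": {"control": 200.0, "promo": 204.0},
    "middle 80%": {"control": 60.0, "promo": 66.0},
    "bottom 10%": {"control": 10.0, "promo": 19.0},
}


def delta(segment):
    """Increase in average sales attributable to the promotion."""
    return RESULTS[segment]["promo"] - RESULTS[segment]["control"]


# The segment that buys the most is not necessarily the one that
# responds the most to the promotion.
biggest_spenders = max(RESULTS, key=lambda s: RESULTS[s]["control"])
biggest_delta = max(RESULTS, key=delta)

for segment in RESULTS:
    print(f"{segment:>10}: delta = {delta(segment):+.1f}")
print("highest sales:", biggest_spenders)
print("highest delta:", biggest_delta)
```

With these made-up numbers, the top customers have the highest sales but the smallest delta. That is the kind of outcome the assumption above would miss.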

Cognitive bias

There are many forms of cognitive bias or irrational thinking patterns that often lead to faulty conclusions and economic decisions. These types of cognitive bias are some of the reasons why people assume false causations in business and marketing:

  • Confirmation bias. People want to be right. They often can't acknowledge or accept that they're wrong about something, even if that attitude causes eventual damage and loss.
  • The illusion of causality. Putting too much weight on your own personal beliefs, over-confidence, and other unproven sources of information often produces an illusion of causality. An economic example is the recent U.S. housing bubble. Millions of people believed that buying a home for much more than its actual value would continue to result in a return on the investment just because that happened in the past.
  • Money. You want to sell your product. You might spend more than your return on investment (ROI) on marketing and other business expenses if the desire to make money clouds your logic.
  • Major marketing implications. Marketing statistics and data are often complicated and confusing. It can be easy to see relationships between changing sales numbers and the many other variables in your business when no causation exists.

Experimentation

To know that something is valuable takes experimentation. Experimentation helps you understand whether you're making the right choices. But it has a cost. If you hold a group back by not giving them a feature that brings in value, you'll lose money. But you'll learn the importance of that feature.

The value of an experiment lies in the accomplishment of these two things:

  • Decide between different choices.
  • Quantify the value of the best choice.
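Both goals can be sketched with a simulated holdout experiment. This is only an illustration under stated assumptions: the revenue distribution, the group sizes, and `TRUE_EFFECT` are made up, and in a real experiment the true effect is exactly the thing we don't know.

```python
import random
import statistics

rng = random.Random(42)  # fixed seed so the run is repeatable
TRUE_EFFECT = 5.0        # hypothetical revenue the feature adds per user

# Made-up baseline revenue for 2000 users.
baseline = [rng.gauss(100, 10) for _ in range(2000)]

# Random assignment: shuffle the users and split them in half, so the two
# groups differ only by chance and by the feature itself.
users = list(range(2000))
rng.shuffle(users)
treated, held_back = users[:1000], users[1000:]

treated_rev = [baseline[u] + TRUE_EFFECT for u in treated]
held_back_rev = [baseline[u] for u in held_back]

# Decide: which option is better? Quantify: by how much, and how surely?
estimate = statistics.mean(treated_rev) - statistics.mean(held_back_rev)
std_error = (
    statistics.variance(treated_rev) / len(treated_rev)
    + statistics.variance(held_back_rev) / len(held_back_rev)
) ** 0.5

print(f"estimated value of the feature: {estimate:.2f} +/- {std_error:.2f}")
```

The held-back group is the cost of the experiment: those users generate less revenue. In return, the difference in means gives an estimate of what the feature is worth, with a standard error that says how much to trust it.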

Experimental variables

A scientifically valid experiment needs three types of variables: controlled, independent, and dependent.

  • A controlled variable is kept constant, so other variables that change in relation to each other can be measured in a static environment.
  • An experiment's independent variable is the only one that is deliberately changed.
  • Dependent variables are the results that are observed when changes are made to independent variables.

Any uncontrolled variables, such as confounding or mediator variables, can cloud an experiment's accuracy. So they need to be identified and eliminated in order to properly assess the experiment's results. Differences in uncontrolled variables can also affect the relationship between independent and dependent variables.

Uncontrolled variables add the influence of unrelated factors to an experiment's results. Correlations might be assumed, and a hypothesis might be formed, where none exists. Accurate analysis becomes difficult or impossible. Examples of conclusions drawn from uncontrolled variables are shown in the children's music lessons and mobile phone cancer examples that follow.

How our brain tricks us

It's easy to watch correlated data change in tandem and assume that one thing causes the other. That's because our brains are wired for cause-relation cognitive bias. We need to make sense of large amounts of incoming data, so our brain simplifies it. These mental shortcuts are called heuristics, and they're often useful and accurate. But not always. An example of where heuristics go wrong is whenever you believe that correlation implies causation.

Spurious correlations

A spurious correlation is a mathematical relationship in which two or more events or variables are associated but not causally related, due to either coincidence or the presence of a certain third, unseen factor.

Children and music lessons

After a study of human brain development, researchers concluded that kids between four and six years old who took music lessons showed evidence of boosted brain development in the areas related to memory and attention. Based on this study, our biased brain might connect the dots quickly and conclude that music lessons improve brain development. But there are other variables to consider. The fact that the children took music lessons is an indicator of wealth. So they probably had access to other resources that are known to boost brain development, like good nutrition.

The point of this example is that researchers can't assume from only this much data that music lessons affect brain development. Yes, there's clearly a correlation, but there's no actual evidence of causation. We need more data to get a true causal explanation.
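The music lessons example can be sketched as a simulation in which wealth is the unseen third factor. All of it is hypothetical: the probabilities, the score formula, and the names `naive_gap`, `gap_wealthy`, and `gap_other` are assumptions for illustration. They are not taken from the study. In this toy world, lessons have no effect on the score at all.

```python
import random
import statistics

rng = random.Random(1)  # fixed seed so the run is repeatable

# Each child is (wealthy, takes_lessons, score). Wealth raises both the
# chance of taking lessons and the score; lessons themselves do nothing.
children = []
for _ in range(4000):
    wealthy = rng.random() < 0.5
    lessons = rng.random() < (0.7 if wealthy else 0.1)
    score = 100 + (10 if wealthy else 0) + rng.gauss(0, 5)
    children.append((wealthy, lessons, score))


def gap(rows):
    """Mean score with lessons minus mean score without lessons."""
    with_lessons = [s for _, took, s in rows if took]
    without = [s for _, took, s in rows if not took]
    return statistics.mean(with_lessons) - statistics.mean(without)


naive_gap = gap(children)
gap_wealthy = gap([c for c in children if c[0]])
gap_other = gap([c for c in children if not c[0]])

print(f"all children:         {naive_gap:+.2f}")
print(f"wealthy families:     {gap_wealthy:+.2f}")
print(f"non-wealthy families: {gap_other:+.2f}")
```

Comparing all children shows a clear gap in favor of lessons. Comparing children within the same wealth group should shrink that gap to roughly zero, because the gap came from wealth, not from the lessons.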

Cancer and mobile phones

If you study a chart that shows both the number of cancer cases and the number of mobile phones, you'll find that both numbers went up in the last twenty years. If your brain processes this information with cause-relation cognitive bias, you might decide that mobile phones cause cancer. But that conclusion isn't justified. There's no proof other than both data points happening to increase. A lot of other things have also increased in the past 20 years, and they can't all cause cancer or be caused by mobile phone use.
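A short sketch shows how two series that merely trend upward over the same period end up correlated. The two series below are invented (a straight-line trend plus independent noise) and stand in for phone counts and case counts. They are not real figures.

```python
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


rng = random.Random(7)  # fixed seed so the run is repeatable
years = range(20)

# Two made-up series that share nothing except an upward trend over time.
phones = [100 + 50 * t + rng.gauss(0, 20) for t in years]
cases = [1000 + 12 * t + rng.gauss(0, 5) for t in years]

r_levels = pearson(phones, cases)

# Year-over-year changes remove the shared trend.
phone_changes = [b - a for a, b in zip(phones, phones[1:])]
case_changes = [b - a for a, b in zip(cases, cases[1:])]
r_changes = pearson(phone_changes, case_changes)

print(f"correlation of yearly levels:  {r_levels:.2f}")
print(f"correlation of yearly changes: {r_changes:.2f}")
```

The yearly levels correlate almost perfectly even though the series were generated independently. Looking at year-over-year changes is one simple check: if the only link is a shared trend, that correlation tends to be much weaker.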

Explainability

To find causation, we need explainability. In the era of artificial intelligence and big data analysis, this topic is becoming increasingly important. AIs make data-based recommendations. Sometimes, humans can't see any reason for those recommendations except that an AI made them. In other words, they lack explainability.

Explainability in medicine

The FDA won't approve cancer treatments that lack explainability. Think about this situation for a minute. Do you want the best possible treatment for your cancer, based on an AI's analysis of your genome, your cancer DNA, millions of other cases, and more data, even if you can't explain how the computer's neural network came up with that exact treatment? Or would you rather have a suboptimal treatment that you can explain the reasoning for?

Medical explainability will probably be one of the biggest topics of this century.

One way versus two way

Correlations go both ways. We can say that mobile phone usage correlates with increased cancer risk and that cancer cases correlate with the number of mobile phones. Basically, you can swap the correlation. In causal relationships, we can say that a new marketing campaign caused an increase in sales. But saying that the increase in sales (after the campaign ran) caused the marketing campaign doesn't make any sense.

Any causal statement, by definition, is one way. That's a big clue about whether you're dealing with correlation or causation.
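The difference in direction can be sketched in a few lines. The spend and sales figures are invented, and `run_world` is a hypothetical toy model in which campaign spend sets sales and never the reverse. That rule is an assumption for the sake of the example.

```python
def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


# Hypothetical observations, in thousands of dollars.
campaign_spend = [0, 10, 20, 30, 40]
sales = [100, 118, 145, 160, 182]

# Correlation is symmetric: swapping the variables changes nothing.
r_spend_sales = pearson(campaign_spend, sales)
r_sales_spend = pearson(sales, campaign_spend)


def run_world(spend, forced_sales=None):
    """Toy causal model (an assumption): spend sets sales, not the reverse."""
    sales_value = 100 + 2 * spend if forced_sales is None else forced_sales
    return {"spend": spend, "sales": sales_value}


# Changing the cause changes the effect.
lift = run_world(spend=40)["sales"] - run_world(spend=0)["sales"]

# Forcing the effect to a new value leaves the cause where it was.
spend_after = run_world(spend=10, forced_sales=500)["spend"]

print(r_spend_sales, r_sales_spend)
print("lift from spending 40:", lift)
print("spend after forcing sales to 500:", spend_after)
```

The two correlation values are identical, so the correlation alone can't tell us which way the arrow points. Only the intervention behaves differently depending on the direction.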

The big dilemma

In "The causal effect of education on earnings," David Card says that better education is correlated with higher earnings. But the most important thing he says is that if we can't do an experiment, with all our variables constant, we can't infer causation from a correlation. We can always bring explainability to the table. But in real life and with big enough problems, causations based on explainability are hard to prove. From a scientific viewpoint, they can't be called anything more than a theory.

In the absence of experimental evidence, it is very difficult to know whether the higher earnings observed for better-educated workers are caused by their higher education, or whether individuals with greater earning capacity have chosen to acquire more schooling.

- David Card, "The causal effect of education on earnings"

Do higher earnings cause higher education? Does higher education cause higher earning potential? We don't know. However, we can make predictions. We can use this correlation to predict the earning potential of an individual based on their education. We can also predict their education based on their earnings.
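Prediction in both directions can be sketched with an ordinary least-squares line. The five data points are invented for illustration, and `fit_line` is a hypothetical helper. Nothing here comes from Card's data.

```python
def fit_line(xs, ys):
    """Least-squares slope and intercept for predicting ys from xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sum(
        (x - mean_x) ** 2 for x in xs
    )
    return slope, mean_y - slope * mean_x


# Hypothetical sample: years of education and yearly earnings in thousands.
education_years = [10, 12, 14, 16, 18]
earnings = [30, 38, 50, 58, 74]

# Predict earnings from education.
slope, intercept = fit_line(education_years, earnings)
predicted_earnings = intercept + slope * 15

# Predict education from earnings, using the very same correlation.
slope_rev, intercept_rev = fit_line(earnings, education_years)
predicted_education = intercept_rev + slope_rev * 60

print(f"15 years of education -> about {predicted_earnings:.1f}k per year")
print(f"60k per year -> about {predicted_education:.1f} years of education")
```

Both predictions are usable, and neither says anything about which variable causes the other. The fit works the same way in either direction.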

Good predictions are based on correlations

It sounds like a contradiction, given the context of this article. Correlation is about analyzing static historical datasets and considering the relationships that might exist between observations and outcomes. However, predictions don't change a system. That's decision making. To make software development decisions, we need to understand the difference it would make in how a system evolves if we take an action or don't take it. Decision making requires a causal understanding of the impact of an action.

What are predictions?

We don't make better predictions by developing a better causal understanding. Instead, we need to know the precise limits of the techniques we use to make predictions and what each method can do for us.

References

Lovestats (2019). "Cartoons." The LoveStats Blog. Retrieved from lovestats.wordpress.com.

Card, D. (1999). "The causal effect of education on earnings." Handbook of Labor Economics, vol. 3.

Source: https://towardsdatascience.com/correlation-is-not-causation-ae05d03c1f53
