Perils of Injecting Correlation in Decision-Making

Ricardo Baeza-Yates
Dec 15, 2021

by Ricardo Baeza-Yates and Carlos Escapa

Equivant is a US consulting company that provides software and services for law enforcement and legal processes. It owns a software product called COMPAS that has been used to predict recidivism in criminal offenders. In May 2016, the non-profit newsroom ProPublica flagged COMPAS as being racially biased and highly punitive toward African Americans. The problem was that COMPAS’s data scientists confused correlation (patterns in past crime data) with causation (the notion that being African American makes you more likely to commit a crime).

This correlation comes from a societal bias that is encoded in police enforcement (neighborhoods that are patrolled more heavily naturally generate more data, some races are more likely to be arrested, etc.) and in justice decisions. COMPAS was the first case of mistaking correlation for causation to reach the front page of American news. Sadly, it was only the first: as the use of machine learning (“ML”) grows ever faster, similar mistakes keep following the same pattern. Since the confusion also happens among data scientists, in this article we try to clarify these concepts and outline their implications and dangers.

Differentiating correlation and causation is actually quite easy. Correlation describes whether two variables change together, either directly or inversely. It is a mathematical term, so you can think of correlation as an abstraction. Causation is much deeper: it implies that one event is always preceded by another, and that we understand the fundamental reasons why this is so. Below is an example of a spurious correlation, where there is no causality between the two variables.

Spurious correlation.
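
A minimal sketch in Python shows how easily such a correlation arises: any two unrelated quantities that both trend over time (the series below are invented purely for illustration) will show a near-perfect correlation coefficient.

```python
import numpy as np

rng = np.random.default_rng(0)

# Two unrelated quantities that both trend upward over ten years,
# e.g. yearly cheese consumption and yearly number of CS PhDs awarded
# (made-up numbers, for illustration only).
years = np.arange(2010, 2020)
cheese = 30 + 0.8 * (years - 2010) + rng.normal(0, 0.3, len(years))
phds = 1500 + 40 * (years - 2010) + rng.normal(0, 15, len(years))

# The Pearson correlation is close to 1, yet neither quantity causes the
# other: the shared upward trend is enough to produce the correlation.
r = np.corrcoef(cheese, phds)[0, 1]
print(f"correlation: {r:.2f}")
```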

Causality holds that every event has a cause, and therefore focuses on finding relationships of cause and effect. However, causation is a tricky business, as medicine shows us. For example, we may detect an illness from a symptom, but the symptom is not the cause. Worse, sometimes the cause is not a single fact but a combination of facts, and then it becomes difficult or impossible to untangle the chain of events that yields a given effect. Hence, modeling causality is hard because it needs new concepts and tooling. As Judea Pearl, a Turing Award winner from UCLA and co-author of The Book of Why, stated in 2019: “There is no way to answer causal questions without snapping out of statistical vocabulary.” To find causes, he proposes a formal way to study causality that uses counterfactuals, an approach that remains controversial among many statisticians. So, in comparison, correlation is much easier.
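
To make the difference between observing and intervening concrete, here is a minimal simulated sketch (our own illustration, with invented numbers) of the symptom example above: a hidden illness drives both a symptom and an outcome, so the two are correlated in the observed data, yet setting the symptom ourselves, which is what an intervention does, removes the correlation entirely.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 100_000

# Hidden common cause: an underlying illness drives both a symptom and an
# outcome; the symptom itself has no effect on the outcome.
illness = rng.normal(size=n)
symptom = illness + rng.normal(scale=0.5, size=n)
outcome = illness + rng.normal(scale=0.5, size=n)

# Observation: symptom and outcome are strongly correlated (about 0.8).
print(np.corrcoef(symptom, outcome)[0, 1])

# Intervention: we set the symptom ourselves, cutting its link to the
# illness. The correlation with the outcome vanishes (about 0).
forced_symptom = rng.normal(size=n)
print(np.corrcoef(forced_symptom, outcome)[0, 1])
```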

ML is exclusively correlative. To be clear, there is no causation available in the field of ML: the steps that we take in building ML models are (1) collect data about an event or phenomenon, (2) translate it into numbers, (3) find a function that fits those numbers, and (4) test the function. If the function yields sufficiently accurate results, then it can be used to make predictions.
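
As a minimal sketch of those four steps, assuming simulated data and scikit-learn as the fitting tool (any function-fitting library would do):

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

# (1)-(2) Collect data about a phenomenon and translate it into numbers.
# Here the "data" is simulated: 1,000 examples with 5 numeric features.
rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 5))
y = (X[:, 0] + 0.5 * X[:, 1] + rng.normal(scale=0.5, size=1000) > 0).astype(int)

# (3) Find a function that fits those numbers.
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)
model = LogisticRegression().fit(X_train, y_train)

# (4) Test the function. If accuracy is deemed sufficient, the model is used
# to make predictions -- it has learned a correlation, not a cause.
print(accuracy_score(y_test, model.predict(X_test)))
```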

And under what circumstances can we rely on an ML model to solve a problem? We can now rephrase the question as “can I legally, morally or ethically use a correlation to drive an automated decision, without understanding the underlying cause?”

That is why the low-hanging fruit for ML has been online advertising, which is far and away its largest use case. For instance, ML models can find that a specific background color or font in an advert gets more clicks, so the AI can be allowed to tune the adverts automatically to drive more clicks. We do not need to know why users prefer a certain color or font, and we do not ask ourselves ethical or legal questions about it.
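
One common way to do this kind of automatic tuning is a multi-armed bandit; the sketch below uses a simple epsilon-greedy strategy with made-up click-through rates, purely as an illustration.

```python
import random

# Hypothetical ad variants (e.g. background colors) and their true
# click-through rates, which are hidden from the algorithm.
variants = ["blue", "green", "orange"]
true_ctr = {"blue": 0.030, "green": 0.034, "orange": 0.025}

clicks = {v: 0 for v in variants}
shows = {v: 0 for v in variants}
epsilon = 0.1  # fraction of traffic used for exploration

for _ in range(100_000):
    # Explore occasionally; otherwise exploit the best-looking variant so far.
    if random.random() < epsilon or all(shows[v] == 0 for v in variants):
        v = random.choice(variants)
    else:
        v = max(variants, key=lambda v: clicks[v] / shows[v] if shows[v] else 0.0)
    shows[v] += 1
    clicks[v] += random.random() < true_ctr[v]

print({v: round(clicks[v] / max(shows[v], 1), 4) for v in variants})
```

After enough impressions, the variant with the highest observed click-through rate receives nearly all the traffic, without anyone ever knowing why users prefer it.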

There are also vast areas of our economy where we can make automated decisions without understanding the underlying causes, such as entertainment, gaming, logistics and manufacturing. Activities where our well-being, our food chain and vital services are not affected can potentially be automated and enhanced with correlations without needing a causality analysis.

Healthcare is at the opposite end of the spectrum. AI is being used to assist medical personnel, make them more efficient and (particularly) to help avoid mistakes. In other words, correlations can help guide judgment and make better decisions. However, AI is not used to communicate diagnoses or provide prescriptions without human validation and supervision. Vital services in our societies function as a chain of responsibilities, and the chain would break if we based medical decisions on correlations.

And yes, there is a gray zone in between, where causation is desired but impossible or impractical to establish, and human judgment is applied to determine whether we can accept the risk of using correlations to fix problems. This can be the case with an incurable disease for which a given drug has been shown to have a positive effect, but researchers are unable to understand why. The doctor and patient may decide to accept the risk of taking the drug, knowing that it may have side effects or accelerate death.

Autonomous vehicles (AVs) are caught in this limbo for precisely this reason: ML can handle routine driving in predictable environments, but we cannot reliably anticipate how an autonomous vehicle would behave in exceptional weather like freezing rain, or in places where pedestrians and drivers share the road and follow a protocol based on eye contact and hand gestures. The consensus is that, today, correlations are not sufficiently reliable to enable AVs outside of highly structured environments like farms and mining operations.

So when researchers from well-known universities use ML to predict political or sexual preferences from facial biometrics and find correlations of around 70%, without any proof of causality, they are mostly picking up stereotypes of style in hair, clothing, ears, etc. In this case, even if causality existed, ethically we should not use ML-based correlations to infer personal preferences, as in many countries this would put people in danger.

Hence, when we add the ethical dimension to ML, everything becomes more complicated. For example, today accuracy is the driving measure behind putting a model into operation. However, you may have 99% accuracy on a prediction, but what really matters is the impact of the 1% of errors that you make. Our societies do not accept the use of ML-based solutions that can cause harm in certain cases, even if they are more accurate than human beings; ML solutions are expected to function correctly 100% of the time whenever human lives are at stake, as with autonomous vehicles.
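
A toy calculation makes the point: two hypothetical models with identical 99% accuracy can carry very different risk once we assign (made-up) costs to the kinds of errors they make.

```python
# Toy numbers: two hypothetical models, both 99% accurate on 10,000 cases,
# but with different mixes of harmful vs. benign errors.
n_cases = 10_000
n_errors = n_cases // 100              # 1% errors for both models
harmful_cost, benign_cost = 1_000, 1   # made-up costs per error type

# Model A: 5% of its errors are harmful; Model B: 95% are harmful.
cost_a = 0.05 * n_errors * harmful_cost + 0.95 * n_errors * benign_cost
cost_b = 0.95 * n_errors * harmful_cost + 0.05 * n_errors * benign_cost

print(f"Model A total error cost: {cost_a:,.0f}")   # 5,095
print(f"Model B total error cost: {cost_b:,.0f}")   # 95,005
```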

In summary, the injection of ML-powered correlations into decision-making is enhancing productivity and reducing errors, but there are clear ethical bounds beyond which correlation must not be used unsupervised or without a clearly documented causal analysis. Article 5 of the EU’s proposed AI regulation forbids the use of AI when it may cause psychological or physical harm to citizens, but leaves to the courts the decision as to what counts as “harm”. For example, imagine that you predict that a user likes fast food and you target her with that type of ad. If that person has a metabolic illness, you might be causing psychological and physical damage. Therefore, much as architects are required to file blueprints for buildings, we propose requiring ML model developers to document the data and features engineered for their models, as well as the intended use of the correlations and any available causal analysis. We expect legislation to be enacted in this regard, as there is increasing evidence that transparency in ML fuels innovation and adoption rather than hindering them.
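
As a purely hypothetical illustration, not a prescribed format, a minimal record of this kind could look like the following (all names and values are invented):

```python
# Hypothetical sketch of the documentation proposed above; the fields and
# values are illustrative, not a standardized format.
model_record = {
    "model": "loan_default_scorer_v3",          # made-up name
    "intended_use": "rank applications for human review, never auto-reject",
    "data_sources": ["2015-2020 loan outcomes", "credit bureau features"],
    "engineered_features": ["debt_to_income", "payment_history_24m"],
    "excluded_features": ["race", "zip_code"],  # known proxies for protected attributes
    "causal_analysis": "none available; correlations only",
    "known_risks": "historical lending bias may be encoded in outcomes",
}
```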

Also published on LinkedIn.

Note: This article was triggered by one of the key questions in an earlier article by the first author: Ten Key Questions Every Company Should Ask Before Using AI.

