"While it is easy to lie with statistics, it is even easier to lie without them." - Frederick Mosteller
Statistics and probability are complicated. Here at Invrea, we are acutely aware of that, partly because it's what we spend most of our time doing, and partly because, when we're not doing it, we're trying to explain what we've done to others. Since much of our statistics work is now available to alpha users in release 1.4.0, we thought it would make sense to put out another blog post doing a little more explaining.
In this post, we’re going to focus on one of the most common statistical errors made by non-statisticians and statisticians alike: confusion of the inverse. This fallacy occurs when likelihoods are used to reinforce conclusions that should be made using posterior distributions. If you already know what that means, Bayesian probability aficionados, sit down, because this post is not for you. This post is for those who have little idea of what the above terms mean, and more importantly, little idea just how many seemingly-bulletproof conclusions from data this simple error has laid waste to.
The best way to learn to reason correctly is to be faced with examples of wildly incorrect reasoning. To that end, imagine the following scenario. You enter the lottery tomorrow after broadly ignoring all statistical arguments to the contrary, and after a few hours, you find out that you have won. You immediately buy a yacht and set a course for Bermuda, but before you've even left port, you receive a call from the police accusing you of cheating and requesting that you appear in court tomorrow.
In court, you and your fabulous (and fabulously well-paid) lawyers argue your case. You can produce the lottery ticket, show that you bought it, and argue that there is no evidence that you cheated. The prosecutor even admits that there is no evidence that you cheated. However, he then argues, "The odds that someone who plays the lottery wins by chance are a million to one against. Therefore, there is upwards of a 99.99% chance that you cheated. At the very least, your winnings should be revoked; at worst, you should be charged with fraud and breach of contract."
What do you feel in response to that argument? Do you have a specific counterargument, a multitude of counterarguments competing for space in your head, or just a general feeling of "This is wrong"? Whatever your response to what's known as the prosecutor's fallacy, hold on to it, because this should always be your response to confusion of the inverse - even when the errors are much more insidious and harder to root out, which they will be.
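One way to make that feeling precise is to work out the number the prosecutor should have quoted: the probability of cheating given a win, rather than the probability of a win given honest play. The sketch below does this with Bayes' rule in Python. The cheating rate and the assumption that cheating always succeeds are invented purely for illustration; only the million-to-one figure comes from the story above.

```python
def posterior_cheat(p_cheat, p_win_given_cheat, p_win_given_honest):
    """P(cheated | won), computed with Bayes' rule."""
    win_and_cheat = p_cheat * p_win_given_cheat
    win_and_honest = (1.0 - p_cheat) * p_win_given_honest
    return win_and_cheat / (win_and_cheat + win_and_honest)


# The prosecutor's figure: an honest player wins one time in a million.
P_WIN_GIVEN_HONEST = 1e-6
# ASSUMPTION (hypothetical): cheating always produces a win.
P_WIN_GIVEN_CHEAT = 1.0
# ASSUMPTION (hypothetical): one player in ten million cheats.
P_CHEAT = 1e-7

result = posterior_cheat(P_CHEAT, P_WIN_GIVEN_CHEAT, P_WIN_GIVEN_HONEST)
print(f"P(cheated | won) = {result:.3f}")  # about 0.091, nowhere near 0.9999
```

Under these made-up numbers, a winner is still far more likely to be honest than not. The prosecutor's 99.99% only follows if cheating is assumed to be common to begin with, which is exactly the thing he has no evidence for.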
The standard textbook example of confusing the inverse is the cancer-screening test. Assume that 0.1% of the general population has cancer. You are a doctor who has access to a cancer-screening test with the following properties. When the patient is cancer-free, it outputs 'cancer-free' with 99.5% probability, leaving a 0.5% chance of a false positive. When the patient has cancer, it outputs 'patient has cancer' with 99.9% probability, leaving a 0.1% chance of a false negative. You have just tested a patient, and the machine has reported that they have cancer. What is the probability that the reading was not in error, and that they actually do have cancer?
This question achieved fame because a study in 1982 asked doctors a similar question, and 95% of those surveyed answered wrongly - and not just slightly wrongly, but off by a factor of ten. We could find the answer to the above question using Bayes' rule, but here, we'll tackle it using Invrea Scenarios. Consider the following spreadsheet:
Cancer screening spreadsheet
To follow along with the demonstration, if you have the Invrea Scenarios plugin, download and run the cancer screening spreadsheet here.
In this model, cell F4 is 1 if the patient in question has cancer, and cell G4 is 1 if the detector gives a positive result. By right-clicking cell G4 and recording an actual value of 1, as we explained in our previous blog posts, we focus our attention only on cases where the detector gives a positive result. (Note that this has already been done in the demonstration spreadsheet.) Given that, to find the probability that the patient has cancer, we can simply run 50,000 simulations and plot the result of cell F4:
As you can see, given a positive detection result, there is actually only a small chance (approximately 16.7%) that the patient has cancer. Most people are surprised by this because they forget to compare the large number of false positives to the small fraction of the population that actually has cancer.
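For readers without the plugin, the same figure can be checked by hand with Bayes' rule. The Python sketch below computes the exact posterior and then estimates it by simulation, keeping only the simulated patients who test positive. That filtering loop is our stand-in for the effect of recording an actual value of 1 in cell G4. It is an assumption about what conditioning achieves, not a description of how Scenarios works internally, and the function names are ours. The block evaluates the posterior at a base rate of 0.1%, which gives the 16.7% reported above, and at 0.2%, which gives about 28.6%, so the answer is quite sensitive to how rare the disease is.

```python
import random


def posterior_positive(prevalence, sensitivity, false_positive_rate):
    """Exact P(cancer | positive test) from Bayes' rule."""
    true_pos = prevalence * sensitivity
    false_pos = (1.0 - prevalence) * false_positive_rate
    return true_pos / (true_pos + false_pos)


def simulate_posterior(prevalence, sensitivity, false_positive_rate,
                       n_positive=5_000, seed=0):
    """Estimate the same probability by discarding every negative test."""
    rng = random.Random(seed)
    kept = 0
    with_cancer = 0
    while kept < n_positive:
        has_cancer = rng.random() < prevalence
        p_positive = sensitivity if has_cancer else false_positive_rate
        if rng.random() < p_positive:  # condition on a positive result
            kept += 1
            with_cancer += has_cancer
    return with_cancer / kept


print(f"{posterior_positive(0.001, 0.999, 0.005):.3f}")  # 0.167
print(f"{posterior_positive(0.002, 0.999, 0.005):.3f}")  # 0.286
print(f"{simulate_posterior(0.001, 0.999, 0.005):.3f}")  # close to the exact 0.167
```

The exact calculation makes the comparison explicit: out of every 1,000,000 people screened at a 0.1% base rate, roughly 999 true positives are swamped by roughly 4,995 false positives.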
The fallacies above are simple examples of a class of errors that have been made by scientists and statisticians alike for as long as there have been scientists and statisticians. If you're not convinced that you have made or may make these mistakes, consider the story of Charles Reep.
In 1951, Brentford Football Club (soccer to an American reader) hired ex-RAF Commander Charles Reep as an analyst and consultant. Reep sat on the sidelines during Brentford football games, and later did the same for Wolverhampton, taking down statistics and notes for multiple thousands of games. He later used these to provide the teams with advice on which play styles performed better than others.
Reep's most famous and controversial advice was to espouse the "long ball" - making risky passes far upfield - over shorter, safer, slower passes. His reasoning was as follows. According to his data, 50% of goals are scored after zero or one passes, and 80% of goals are scored after three or fewer passes. Therefore, long chains of passing don't increase the likelihood of scoring, and may actually decrease it, as each additional pass increases the likelihood of losing possession. Therefore, minimizing the number of passes will maximize the number of goals scored.
Remember that feeling of wrongness generated by the prosecutor's fallacy? Did you feel it again?
Reep aimed to demonstrate that the probability of scoring given that there were few passes is higher than the probability of scoring given that there were many passes. However, he only took into account the probability of there being few passes, or many passes, given that scoring occurred. This is the same mistake made by the prosecutor and doctors in the previous examples. Reep's argument fails because football is a game in which the vast majority of possessions are short-lived. Therefore, even if making more passes substantially increases the probability of a goal, we would still expect the majority of goals to occur after three or fewer passes.
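A toy calculation shows how both statements can be true at once. In the Python sketch below every number is invented for illustration and none of it is Reep's data: each pass succeeds with probability 0.5, so short possessions dominate, and the chance of scoring is assumed to rise steadily with the number of passes completed.

```python
# ASSUMPTION: all figures below are hypothetical, chosen only to illustrate
# the inversion. They are not taken from Reep's notes or any real match data.
P_KEEP = 0.5        # chance that each pass succeeds and possession continues
MAX_PASSES = 60     # truncation point; longer chains are vanishingly rare


def p_passes(k):
    """P(a possession contains exactly k successful passes)."""
    return (1 - P_KEEP) * P_KEEP ** k


def p_goal_given_passes(k):
    """Hypothetical scoring chance that RISES with every extra pass."""
    return 0.01 * (1 + k)


# Joint probability of "k passes and a goal" for each chain length k.
joint = {k: p_passes(k) * p_goal_given_passes(k) for k in range(MAX_PASSES + 1)}
p_goal = sum(joint.values())

# The inverse quantity Reep measured: P(three or fewer passes | goal).
p_short_given_goal = sum(joint[k] for k in range(4)) / p_goal

print(f"P(goal | 0 passes) = {p_goal_given_passes(0):.2f}")     # 0.01
print(f"P(goal | 6 passes) = {p_goal_given_passes(6):.2f}")     # 0.07
print(f"P(<= 3 passes | goal) = {p_short_given_goal:.2f}")      # 0.81
```

In this made-up world a six-pass move is seven times more likely to end in a goal than a move with no passes, yet about four goals in five still come from chains of three or fewer passes, simply because those chains are so much more common.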
In fact, this is what the data does say. For a given team, the probability of scoring increases with the number of passes that a team can successfully make without losing possession. This does not necessarily mean that more passes equals more goals - simply that a team that is good at keeping possession by making successful passes will on average score more than a team that is not. One possible explanation of this phenomenon can be seen in the following spreadsheet:
Spreadsheet model for the impact of passing in football
Again, if you have the Scenarios plugin, download the spreadsheet model to follow along with the demo here.
This model makes the assumption that the more possession time a team has, the more likely they are to score on any given attempt, and the longer their chains of successful passes will be on average. The number of passes per offensive is sampled in column E, and whether or not a goal was scored on each attempt is stored in column F. If a goal was scored, the number of passes made is in column H.
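To give a feel for this kind of model outside Excel, here is a rough Python sketch under the same broad assumption. It does not reproduce the spreadsheet: the "skill" variable, the uniform range it is drawn from, the number of offensives per game and the goal probabilities are all hypothetical choices of ours.

```python
import random


def simulate_game(rng, n_offensives=60):
    """Simulate one team in one game.

    Returns (average passes per offensive, goals scored).
    """
    # ASSUMPTION: a single hidden "possession skill" drives both quantities.
    skill = rng.uniform(0.3, 0.8)      # chance that each pass succeeds
    p_goal = 0.01 + 0.08 * skill       # better possession, better chances
    total_passes = 0
    goals = 0
    for _ in range(n_offensives):
        passes = 0
        while rng.random() < skill:    # keep passing until possession is lost
            passes += 1
        total_passes += passes
        goals += rng.random() < p_goal
    return total_passes / n_offensives, goals


rng = random.Random(1)
games = [simulate_game(rng) for _ in range(5_000)]

# Average chain length over all games, and over games with at least 5 goals.
overall = sum(avg for avg, _ in games) / len(games)
high = [avg for avg, goals in games if goals >= 5]
high_scoring = sum(high) / len(high)

print(f"average passes per offensive, all games:  {overall:.2f}")
print(f"average passes per offensive, 5+ goals:   {high_scoring:.2f}")
```

Filtering to games with at least five goals plays the same role as conditioning in Scenarios: it should pull the average chain length upwards, because high-scoring games are more likely to come from teams that hold on to the ball.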
By using Invrea Scenarios, we can easily validate Reep's observation that the majority of goals occur after a small number of passes. The following is a plot of the average number of passes before a goal, over 15,000 simulations:
However, if we plot the average number of passes overall against the goals scored over the whole game, we see a very different story:
This heatmap shows that as the number of goals scored goes up, the average number of passes per possession is likely to go up as well. This effect can be seen more plainly using the techniques explained in our previous blog posts: we condition on at least five goals being scored, and then plot the posterior distribution over the average number of consecutive successful passes for teams that manage to score at least five goals. The distribution leans to the right, indicating that longer average pass chains correlate with more goals scored:
As a result of Reep's faulty advice, many of his clients were coached to make longer, riskier passes that only served to lose possession and decrease the number of goals scored.
This fallacy, confusion of the inverse, is far from the only common statistical error. Optimism bias, generalisation error, and hindsight bias seek to falsify your models of reality. When your models of reality are wrong, your advice and predictions cannot be trusted.
Invrea Scenarios helps a lot with making these kinds of predictions, but it doesn't stop there. The plugin can model uncertainty and make predictions given our assumptions and new data for business decisions, insurance claims, consulting cases, etc. If you can model your decision as relationships between cells in an Excel spreadsheet, then it's quite likely that Scenarios can help. The team at Invrea is dedicated to opening this kind of machine learning to every industry possible. If you would like more information, a more detailed demo, or some help setting up a worksheet of your own, we'd love to lend a hand. You can find us at this email.
The alpha version of Invrea Scenarios is free. You can request a download link here.