When making decisions with data, the idea that things will “even out” may ring true, but it’s not always helpful. A gambler at the roulette table might see an unlikely sequence of five reds and believe that black is highly likely to come up next. In fact, the odds are the same, regardless of previous results.
The Law of Large Numbers, often loosely called the “Law of Averages”, is at the heart of probability theory and is an underlying assumption in most statistical models. It is also the most misused and misunderstood statistical law.
The law states that “the average of a sequence of independent random events will converge to its expected value”.
A tale of Two-Face and Batman
Let’s look at an example. Out walking in Gotham, you run into the villain Two-Face. He’s captured Batman and is about to throw him in a shark tank. Two-Face pulls out a coin and tells you it is fair, which means there is an equal probability it will land on heads or tails. He will flip his coin to decide whether or not to throw Batman in the tank. Accompanied by some suitably villainous banter, he flips his coin five times, and gets the following results: heads, tails, tails, tails, tails.
Two-Face is about to flip the coin to make his final decision, and asks you to decide whether heads or tails will put Batman in the shark tank. You know the Law of Large Numbers. Do you pick heads or tails? There have been four tails in a row. Does that mean heads is more likely on the next toss? If the Law of Large Numbers says the heads-to-tails ratio needs to “even out” to 50/50, won’t heads be more likely?
The common misconception is that heads is more likely, but in reality it’s still a 50/50 chance.
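One way to check this is to list every possible outcome. The sketch below assumes the coin really is fair and that flips are independent, so all sequences of six flips are equally likely. It keeps only the sequences that start the way Two-Face's did and counts how often the sixth flip is heads. The variable names are illustrative only.

```python
from itertools import product

# Every sequence of six flips of a fair coin is equally likely.
sequences = list(product("HT", repeat=6))

# Keep only the sequences that begin like Two-Face's: H, T, T, T, T.
observed = ("H", "T", "T", "T", "T")
matching = [seq for seq in sequences if seq[:5] == observed]

# Among those, count how often the sixth flip comes up heads.
heads_next = sum(1 for seq in matching if seq[5] == "H")
prob_heads_next = heads_next / len(matching)

print(len(matching), prob_heads_next)  # prints: 2 0.5
```

Only two sequences match, one ending in heads and one in tails, so the four tails carry no information about the next flip.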
If the Law of Large Numbers does not tell you that heads is more likely, what does it say, and does it apply to Batman’s predicament? To start, let’s break down the definition.
The expected value is the weighted sum of all possible outcomes, with each outcome weighted by its probability. In the Two-Face example, there are two possible outcomes, heads and tails. The expected value, or probability, of getting heads is 1 (the outcome of heads) divided by 2 (all possible outcomes), which is 50%. The law states that in a sequence of random events repeated over and over again, the average of the results for a specific outcome will be ever closer to, or converge to, its expected value. That is to say, over a very long run you’ll expect to get heads about 50% of the time.
The misconception about the law comes from the convergence portion. Convergence means that if we could watch Two-Face flip a coin an infinite number of times, the proportion of heads would equal 50%. The law does not guarantee anything about a finite number of events. When we are talking about an infinite number of events, eventually the individual event of flipping a coin one time will not matter. Adding one event to an average over an infinite number of events has no impact on the final value.
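Convergence is easier to see in a simulation. The following is a minimal sketch, assuming a fair coin modelled with Python's standard `random` module; the function name `running_heads_fraction` and the fixed seed are choices made for this illustration. It tracks the fraction of heads after each flip.

```python
import random

def running_heads_fraction(num_flips, seed=0):
    """Return the fraction of heads seen after each of num_flips fair flips."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    heads = 0
    fractions = []
    for flips_so_far in range(1, num_flips + 1):
        if rng.random() < 0.5:  # heads with probability 0.5
            heads += 1
        fractions.append(heads / flips_so_far)
    return fractions

fractions = running_heads_fraction(100_000)

# The fraction can wander early on and settles near 0.5 only as flips pile up.
for n in (10, 100, 1_000, 10_000, 100_000):
    print(n, round(fractions[n - 1], 4))
```

The early values can sit well away from 0.5, which is the point: the law describes the long run, not the next handful of flips.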
When can I apply the law?
There are two questions you should ask yourself before deciding to apply the Law of Large Numbers and assuming that things will even out.
1. Is each event in the sequence independent?
If the outcome of an event depends on any of the previous outcomes, do not apply the law. This is very often the case when making a prediction, because most predictions are based on the outcomes of previous events. For example, predicting which candidate will win an election depends on prior events such as how much campaigning they have done, which political party won the previous election, and so on.
2. Are you considering a large number of events?
Naturally, the Law of Large Numbers does not apply to an individual event or a small number of events. If you’re not sure where to draw the line between a small number of events and a large number of events, here’s a rule of thumb. Ask yourself: will it take me less than three minutes to count the number of events I’ve observed? If yes, do not apply the law.
The most common example of a violation of this rule is in sports. Just because Kobe is a 78% free throw shooter and has made 6 out of 6 free throws this game, it doesn’t mean he’s more likely to miss his next free throw.
This rule is also commonly violated when making decisions about which historical data to keep and which historical data to throw away. The law does not imply that knowledge of individual events becomes less valuable over time: it applies to the expected value of an outcome, not a small set of individual events. As more data is collected, observing one more event will have less weight in determining the expected value of the outcome, but it is not a less valuable observation. In fact, observing more events yields more accurate predictive models and statistical results.
If you can answer “yes” to both questions 1 and 2, you can apply the Law of Large Numbers. The law is a foundational statistical law and is usually only applied in theory. Though rare, there are a few real-life applications. And as the law’s name suggests, you don’t need to go all the way to infinity to begin to see the same result.
Let’s say you’ve developed an app and would like to know the probability of it crashing during any given hour. You first check rules 1 and 2 and determine that each crash is an independent event, and that you have observed crashes over 1,000 hours, a number that will take more than three minutes to count to. To be specific, say you’ve seen 35 crashes in the last 1,000 hours, so the current estimate of the probability of crashing in any given hour is 35/1000 = 3.5%. If you observe one more hour and see no crashes, you have now seen 35 crashes in 1,001 hours: 35/1001, or about 3.497%, which is close enough to 3.5% to make little difference. Even with 1,000 hours, one additional event did not significantly change the estimate.
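The arithmetic can be checked in a few lines. This sketch uses the figures from the example (35 crashes in 1,000 hours, then one more crash-free hour); the variable names are made up for illustration. As a fraction, 35/1000 is 0.035, which is 3.5% when written as a percentage.

```python
crashes = 35
hours = 1_000

# Estimated probability of a crash in any given hour, as a fraction.
rate_before = crashes / hours

# Same estimate after observing one more hour with no crash.
rate_after = crashes / (hours + 1)

# How much that single extra observation moved the estimate.
change = rate_before - rate_after

print(f"before: {rate_before:.4%}")
print(f"after:  {rate_after:.4%}")
print(f"change: {change:.6%}")
```

The estimate moves by only a few thousandths of a percentage point, which is why one extra hour matters so little once 1,000 hours have been observed.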
Meanwhile, back in the shark tank…
Let’s return to Batman’s predicament and see if we can apply the Law of Large Numbers.
#1: Does the outcome of this event depend on the previous outcome? No
#2: Will it take less than three minutes to count the number of events I’ve observed? Yes
Since we answered “yes” to Question #2, the Law of Large Numbers will not be of any help in Batman’s predicament.
Finally, there is still one law that can be applied in dealing with Two-Face, and that is the unwritten law that all villains must lie. Two-Face has told us that all outcomes, heads or tails, have an equal probability of occurring, and is most likely lying to us about his coin being fair. Because 4 tails out of 5 coin flips is somewhat unlikely for a fair coin, tails may actually be more likely on the next coin flip.
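To put a number on “somewhat unlikely”, the sketch below computes the binomial probability of Two-Face's result, assuming the coin is fair and the flips are independent. The helper `prob_exact_tails` is a name invented for this example.

```python
from math import comb

def prob_exact_tails(k, n, p_tails=0.5):
    """Binomial probability of exactly k tails in n independent flips."""
    return comb(n, k) * p_tails**k * (1 - p_tails) ** (n - k)

# Probability of exactly 4 tails in 5 flips of a fair coin.
p_exactly_4 = prob_exact_tails(4, 5)

# Probability of a result at least that lopsided toward tails (4 or 5 tails).
p_at_least_4 = prob_exact_tails(4, 5) + prob_exact_tails(5, 5)

print(p_exactly_4, p_at_least_4)  # prints: 0.15625 0.1875
```

A fair coin produces four or more tails in five flips a little under one time in five, so the result is mildly suspicious rather than proof that the coin is rigged.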