Frequentist vs Bayesian statistics: this is an age-old debate, seemingly without an end in sight. Both methods approach the same problems in different ways, which is why there is so much talk about which is better.
This is particularly important because proponents of the Bayesian approach blame the Frequentist approach for the reproducibility crisis in scientific studies.
For instance, a team at biotech company Amgen found that it could not replicate 47 out of the 53 cancer studies it had analyzed.
Many experts believe this is because of the use of frequentist statistics and that the Bayesian approach is an alternative that could solve this crisis. In order to understand the difference between the two approaches, let’s begin by figuring out how they work.
Frequentist vs Bayesian Statistics
In order to illustrate what the two approaches mean, let’s begin with the main definitions of probability. These include:
- The probability of an event is equal to the long-term frequency of the event occurring when the same process is repeated multiple times. As per this definition, the probability of a coin toss resulting in heads is 0.5 because tossing the coin many times over a long period results roughly in those odds.
- The probability of an event is measured by the degree of belief. In other words, the likelihood of an event occurring depends on the beliefs about the occurrence of such an event, the truth of a hypothesis, or the truth of any statement. That is, probabilities simply represent how certain you are about the truth of statements.
- The probability of an event is measured by the degree of logical support there is for the event to occur. According to this definition, a probability is nothing but a generalization of classical logic.
(i) Ronald Fisher – Probability as Long-Term Frequency
(ii) Frank Ramsey – Probability as Degree of Belief
(iii) Rudolf Carnap – Logical Probability
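The first definition, probability as long-term frequency, is easy to see in a simulation. Here is a minimal sketch (the function name and toss counts are illustrative) showing that the fraction of heads in repeated fair-coin tosses settles around 0.5 as the number of tosses grows:

```python
import random

def heads_frequency(n_tosses, seed=0):
    """Fraction of heads in n_tosses simulated fair-coin tosses."""
    rng = random.Random(seed)
    heads = sum(rng.random() < 0.5 for _ in range(n_tosses))
    return heads / n_tosses

# The long-run frequency converges toward the probability, 0.5.
for n in (100, 10_000, 1_000_000):
    print(n, heads_frequency(n))
```

With only 100 tosses the frequency can stray noticeably from 0.5; with a million tosses it hugs it closely, which is exactly what the frequentist definition relies on.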
The frequentist approach follows from the first definition of probability. According to the frequentist definition of probability, only events that are both random and repeatable, such as flipping of a coin or picking a card from a deck, have probabilities.
These probabilities are equal to the long-term frequencies of such events occurring. The frequentist approach does not attach probabilities to any hypothesis or to any values that are fixed but not known.
The Bayesian approach, on the other hand, is rooted in the second and third definitions described above. Therefore, the Bayesian approach views probability as a more general concept; thereby allowing the assigning of probabilities to events which are not random or repeatable.
For example, Bayesians would find it perfectly okay to assign a probability to an event like Donald Trump winning the 2016 election. In the frequentist approach, this wouldn’t be possible because you can’t repeat the event many times over a long period of time.
Frequentist vs Bayesian Example
The best way to understand Frequentist vs Bayesian statistics is through an example that highlights the difference between the two approaches.
Here’s a Frequentist vs Bayesian example that reveals the different ways to approach the same problem.
Say the problem involves estimating the average height of all men who are currently in or have ever attended college. We assume that height follows a normal distribution and that the standard deviation is known. Therefore, all we need to estimate is the mean.
(i) The Frequentist Approach
A frequentist would reason that since the mean height is a fixed (though unknown) number, it makes no sense to assign a probability to it being equal to, less than, or greater than a certain value.

Therefore, a frequentist would collect sample data from the population and estimate the mean as the value most consistent with the observed data. This is known as a maximum likelihood estimate. When the distribution is normal, this estimate is simply the sample mean.
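The frequentist estimate is a one-liner. A minimal sketch, using hypothetical height measurements in centimeters (the numbers are made up for illustration):

```python
import statistics

# Hypothetical sample of heights in cm (illustrative, not real data).
sample = [178.2, 172.5, 181.0, 175.3, 169.8, 183.1, 176.4, 171.9]

# Under a normal model with known standard deviation, the maximum
# likelihood estimate of the mean is simply the sample mean.
mle_mean = statistics.mean(sample)
print(mle_mean)
```

The whole frequentist answer is a single point estimate; there is no distribution over possible values of the mean.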
(ii) The Bayesian Approach
A Bayesian, on the contrary, would reason that although the mean is an actual number, there is no reason not to assign it a probability. The Bayesian approach will do so by defining a probability distribution based on possible values of the mean.
This distribution will then be updated using data from the sample. The update is done by applying Bayes' theorem:

P(mean | data) = P(data | mean) × P(mean) / P(data)

The sample data makes the probability distribution narrower around the parameter's true but unknown value. Bayes' theorem is applied to each possible value of the parameter.
Frequentist vs Bayesian Statistics – The Differences
Based on our understanding from the above Frequentist vs Bayesian example, here are some fundamental differences between the two approaches.
(i) Use of Prior Probabilities
The use of prior probabilities in the Bayesian technique is the most obvious difference between the two. Frequentists argue that assigning prior probabilities introduces bias, making the approach subjective and less accurate. Bayesians, on the other hand, believe that not assigning prior probabilities is one of the biggest weaknesses of the frequentist approach.
(ii) Data Prediction
Since Frequentists don't believe in assigning prior probabilities, their estimate is based on the maximum likelihood estimate. Bayesians, on the other hand, have a complete posterior distribution over possible parameter values. This allows them to account for the uncertainty in the estimate by integrating over the entire distribution, not just the most likely value.
(iii) Mitigating Uncertainty
The Bayesian approach to mitigating uncertainty is to treat it probabilistically. Frequentists don't have that luxury. However, this doesn't mean that there is no uncertainty in the frequentist approach. The estimate derived from sample data can be, and often is, wrong. In order to mitigate this uncertainty, Frequentists use two techniques.
- The use of confidence intervals.
- Null hypothesis significance testing (NHST) which is related to P-values.
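Both techniques can be sketched in a few lines. Assuming (as in the earlier example) a known standard deviation, a z-based confidence interval and a two-sided z-test look like this; the sample and the null-hypothesis mean of 170 are hypothetical:

```python
import math
import statistics

# Same hypothetical height sample (cm); sigma assumed known.
sample = [178.2, 172.5, 181.0, 175.3, 169.8, 183.1, 176.4, 171.9]
sigma = 7.0
n, xbar = len(sample), statistics.mean(sample)
se = sigma / math.sqrt(n)          # standard error of the mean

# 1) 95% confidence interval for the mean.
ci = (xbar - 1.96 * se, xbar + 1.96 * se)

# 2) NHST: two-sided p-value for H0: mean == 170, via the normal CDF.
z = (xbar - 170.0) / se
p_value = 2 * (1 - 0.5 * (1 + math.erf(abs(z) / math.sqrt(2))))
print(ci, p_value)
```

Note the frequentist reading: the 95% refers to the long-run behavior of the interval-building procedure, not to a 95% probability that this particular interval contains the mean.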
Frequentist vs Bayesian AB Testing – Which is a Better Approach?
The Frequentist approach has held sway in the world of statistics through most of the 20th century. It has been particularly attractive to statisticians because it promises no-nonsense objectivity.
However, in the last 15 years, the Bayesian approach has really been coming into its own, leading to a lot of debates about which approach is superior.
In 2013, for instance, the US Coast Guard used the Bayesian approach to find a Long Island fisherman in the Atlantic Ocean. The Coast Guard knew the nine-hour time window in which the fisherman had fallen off his boat, but nothing more than that.

They fed this information into a Bayesian program called SAROPS (Search and Rescue Optimal Planning System) and kept adding more information, like prevailing currents, clues found by the boat's captain, and places the search helicopters had already flown. As a result, the program was able to narrow down the location and the fisherman was rescued.
Similarly, scientists have been able to use the Bayesian approach to determine the age of the Universe. They have factored in events like supernova explosions, patterns seen in radiation left over from the Big Bang, and the distribution of galaxies to calculate that the Universe is 13.8 billion years old. Previously, they could only estimate that its age was between 8 and 15 billion years.
With the examples above and other Bayesian approaches showing dramatic results, people have begun to question the efficacy of the Frequentist approach. Many advocates of the Bayesian approach point out a major limitation of the Frequentist approach.
A result is considered statistically significant if it has a p-value of less than 0.05. However, this threshold means that even when there is no real effect, roughly 1 out of every 20 tests will still produce a "statistically significant" result by chance alone.
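This "1 in 20" point is easy to demonstrate by simulation. A minimal sketch (the sample size, seed, and trial count are arbitrary choices): every experiment below draws data from a distribution where the null hypothesis is true by construction, yet about 5% of them still come out "significant."

```python
import math
import random
import statistics

def one_null_experiment(rng, n=30):
    """Two-sided p-value for H0: mean == 0 on data where H0 is true."""
    data = [rng.gauss(0.0, 1.0) for _ in range(n)]   # true mean is 0
    z = statistics.mean(data) / (1.0 / math.sqrt(n)) # known sigma = 1
    return 2 * (1 - 0.5 * (1 + math.erf(abs(z) / math.sqrt(2))))

rng = random.Random(42)
trials = 5000
false_positives = sum(one_null_experiment(rng) < 0.05 for _ in range(trials))
print(false_positives / trials)   # close to 0.05
```

None of these "discoveries" is real; they are exactly the noise that the 0.05 threshold lets through.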
Findings published in reputed journals are even more likely to be error-prone, because journals favor surprising, unexpected findings.
While a certain bias towards Bayesian statistics is emerging, most statisticians feel that the debate is overrated. According to them, most errors in Frequentist approaches are not a result of choosing the Frequentist approach but of applying it incorrectly.
The major lapses and error-prone results are due to errors of critical reasoning rather than due to an inherent shortcoming of any statistical approach.
Moreover, the frequentist approach continues to be used in path-breaking research. For instance, physicist Kyle Cranmer helped develop a frequentist technique that was used in the discovery of the Higgs boson.
Plus, it's not as if the Bayesian approach is without its own inherent limitations. The Bayesian approach requires starting with a prior, and assigning numbers to subjective assumptions can often be very difficult.
At the end of the day, both the Frequentist and Bayesian approaches have their own merits and limitations. Most errors in research arise not from an inherent weakness in either of the approaches but from a wrong choice of approach or its incorrect application.
Both Frequentist and Bayesian approaches have been used in data science to facilitate path-breaking findings and that is unlikely to change in the near future.