So for the past month or so, I have been writing a lot of posts about Bayesian Statistics. You might be wondering why I’ve spent so much time in the area. You might even suspect that you know the answer, but let me share with you 3 reasons why I enjoyed working with Bayesian Statistics this last month.
1. These Models are Interpretable
One of the biggest reasons I like Bayesian statistics is that I can share the credible intervals with confidence. Compare that with the Frequentist confidence interval. I understand what a confidence interval is, and I even know how to calculate it. But try explaining it to an executive. “If we repeated this experiment 100 times and built an interval each time, about 95 of those 100 intervals would contain the true population parameter (usually the mean).” What?! That’s a nutty concept to explain to an executive. That isn’t to say that confidence intervals are bad. I think it is incredibly useful to say something about how repeatable your experiments should be. Frequentist approaches are actually pretty good at that, and repeatability is a cornerstone of modern science, but it doesn’t help anyone make a gut-level decision.
Let’s try interpreting a credible interval for an executive. “Given that my prior is correct, and given the data, there is a 95% probability that the true parameter value lies in this interval.” Okay, that sounds good. Here’s the wrinkle in this problem. I’ve said nothing about how trustworthy my parameter estimate is. Do I believe that my prior is correct? What if I change my prior? Will it change that credible interval? Yep. So I lose a certain level of objectivity. And if my prior is off, my intervals might miss the true value way more than 5% of the time.
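To make the prior question concrete, here is a minimal sketch of how the same data can give different 95% credible intervals under different priors. Everything in it is an assumption for illustration: the data (7 successes in 20 trials), the two Beta priors, and the helper name credible_interval. It uses a conjugate Beta-binomial update and takes percentiles of posterior draws, which is one simple way to get an interval, not the only one.

```python
import random

def credible_interval(successes, trials, prior_a, prior_b,
                      level=0.95, draws=20000, seed=0):
    """Equal-tailed credible interval for a success rate under a Beta prior."""
    rng = random.Random(seed)
    # Beta prior + binomial data gives a Beta posterior (conjugate update).
    post_a = prior_a + successes
    post_b = prior_b + (trials - successes)
    # Approximate the posterior quantiles with sorted Monte Carlo draws.
    samples = sorted(rng.betavariate(post_a, post_b) for _ in range(draws))
    lo_idx = int((1 - level) / 2 * draws)
    hi_idx = int((1 + level) / 2 * draws) - 1
    return samples[lo_idx], samples[hi_idx]

# Hypothetical data: 7 successes out of 20 trials.
flat = credible_interval(7, 20, prior_a=1, prior_b=1)          # "I know nothing" prior
opinionated = credible_interval(7, 20, prior_a=20, prior_b=2)  # strong belief in a high rate

print("flat prior:       ", flat)
print("opinionated prior:", opinionated)
```

Both intervals are honest 95% statements given their priors, but the opinionated prior pulls the whole interval upward, away from the observed rate of 0.35. That is the loss of objectivity in action.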
Here’s the problem as I see it. The Bayesian vs. Frequentist debate, at least in my experience, is really about being completely objective and possibly getting nonsense answers vs. being completely interpretable and possibly setting up the problem so that you can never get the right answer. What is an applied person to do?
Pick the tool that works best for what you are trying to do. My goal was to explain something to a non-technical person in a way that they would understand, so Bayesian is the way to go. If you are writing a scientific paper that needs to be verified, go Frequentist.
So one last thing. Usually, if you set up your Bayesian and Frequentist approaches correctly, and there isn’t some sort of structural issue with the thing you are measuring (like being a highly discretized problem space), your confidence and credible intervals will most likely be nearly identical. So you can get away with interpreting a Frequentist confidence interval incorrectly, usually.
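Here is a rough sketch of that agreement for the simplest case I can think of, estimating a mean. The data are made up, the prior (mean 0, standard deviation 100) is an arbitrary wide choice of mine, and I treat the sample standard deviation as known so the Normal-Normal update stays in closed form. A real analysis with a small sample would use a t-based interval and put a prior on the variance as well.

```python
import random
from math import sqrt
from statistics import NormalDist, mean, stdev

# Hypothetical measurements (simulated here only so the example is self-contained).
rng = random.Random(7)
data = [rng.gauss(50, 10) for _ in range(40)]

n = len(data)
xbar = mean(data)
se = stdev(data) / sqrt(n)           # standard error of the mean
z = NormalDist().inv_cdf(0.975)      # about 1.96

# Frequentist 95% confidence interval (normal approximation).
conf_int = (xbar - z * se, xbar + z * se)

# Bayesian 95% credible interval with a wide Normal prior on the mean.
prior_mean, prior_sd = 0.0, 100.0    # assumed, deliberately vague
post_precision = 1 / prior_sd**2 + 1 / se**2
post_mean = (prior_mean / prior_sd**2 + xbar / se**2) / post_precision
post_sd = sqrt(1 / post_precision)
posterior = NormalDist(post_mean, post_sd)
cred_int = (posterior.inv_cdf(0.025), posterior.inv_cdf(0.975))

print("confidence interval:", conf_int)
print("credible interval:  ", cred_int)
```

With a prior this wide the data dominate and the two intervals land almost on top of each other. Shrink prior_sd to something like 1.0 and they start to pull apart, which is the prior sensitivity from earlier showing up again.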
2. You Can Simulate from the Posterior
One of the cool things about Bayesian Statistics is that you get a full posterior distribution for all of your parameters. That means that, given some data, you get probability distributions from which to draw. This is great news. It means that you can draw samples for values that you have never seen, and thus you can simulate a process.
I did this in the post about March Madness. After estimating each team’s latent offensive and defensive strengths, we simulated games between different teams using the posterior distribution of the teams. This is really cool because it gives us the opportunity to simulate situations we have never observed. For example, we can simulate games between two teams that haven’t played against each other.
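The idea can be sketched in a few lines. This is not the model from the March Madness post. The team names, the strength numbers, the baseline score, and the normal scoring model are all placeholders I made up for illustration, and the (mean, sd) pairs stand in for what would normally be MCMC draws from a fitted model.

```python
import random

# Hypothetical posterior summaries (mean, sd) for each team's latent strengths,
# in points relative to an average team. Higher defense means fewer points allowed.
posterior = {
    "Team A": {"offense": (6.0, 2.0), "defense": (3.0, 2.0)},
    "Team B": {"offense": (2.0, 2.5), "defense": (1.0, 2.5)},
}
BASELINE = 70.0        # assumed average points scored per game
GAME_NOISE_SD = 10.0   # assumed game-to-game scoring noise

def simulate_win_probability(team, opponent, n_sims=20000, seed=1):
    """Share of simulated games that `team` wins against `opponent`."""
    rng = random.Random(seed)
    wins = 0
    for _ in range(n_sims):
        # Draw one plausible set of strengths from the posterior...
        off_t = rng.gauss(*posterior[team]["offense"])
        def_t = rng.gauss(*posterior[team]["defense"])
        off_o = rng.gauss(*posterior[opponent]["offense"])
        def_o = rng.gauss(*posterior[opponent]["defense"])
        # ...then play one game with those strengths.
        score_t = rng.gauss(BASELINE + off_t - def_o, GAME_NOISE_SD)
        score_o = rng.gauss(BASELINE + off_o - def_t, GAME_NOISE_SD)
        if score_t > score_o:
            wins += 1
    return wins / n_sims

p_a_beats_b = simulate_win_probability("Team A", "Team B")
print(f"P(Team A beats Team B) is roughly {p_a_beats_b:.2f}")
```

Each simulated game draws fresh strengths, so the win probability reflects both how uncertain we are about the teams and how noisy a single game is. Nothing here requires the two teams to have ever met.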
In terms of retail sales, this means that we can simulate the sales for a given item on a store-by-store basis. That way we can come up with better purchasing decisions. Using a simulation methodology like this, you can get the right mix of different products in the right stores. You can imagine having a model that gives the different selling strength for different stores for different product types, sizes, etc. That way you get a very detailed picture of the future demand at different locations for different types of products.
3. It’s Rationally Consistent
Here is a subject that I haven’t given a lot of attention to. You get the full posterior distribution of your parameters, which means that you have proper probabilities to work with. That means that you can make rational decisions. What do I mean by that?
I mean that you can drop your decisions into an expected utility or expected profit scenario. You can make decisions that are neither risk-loving nor risk-averse. That means that you can price your risks appropriately and take the most actuarially favorable actions.
That is important because if you do not have actuarially fair to favorable assessments of the risk, you can make mistakes by just going with your gut, or going with a naive approach. So these Bayesian models will help you make the right decisions and also tell you how uncertain you should be about that decision.
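To show what dropping a decision into an expected profit scenario can look like, here is a small sketch of a purchasing decision. The demand numbers, the unit cost and price, and the candidate order quantities are all invented, and the demand draws are a stand-in for posterior predictive samples from a real model. Averaging profit over the draws is the risk-neutral choice, neither risk-loving nor risk-averse.

```python
import random

rng = random.Random(42)

def draw_demand():
    """One hypothetical posterior predictive draw of weekly demand for an item."""
    mean_demand = rng.gauss(100, 10)                   # uncertainty about the parameter
    return max(0, round(rng.gauss(mean_demand, 15)))   # uncertainty about the outcome

demand_draws = [draw_demand() for _ in range(20000)]

UNIT_COST = 6.0    # assumed cost per unit ordered
UNIT_PRICE = 10.0  # assumed price per unit sold; unsold units are worth nothing

def expected_profit(order_qty, draws):
    """Average profit of ordering `order_qty` units over all demand draws."""
    total = 0.0
    for demand in draws:
        sold = min(order_qty, demand)
        total += UNIT_PRICE * sold - UNIT_COST * order_qty
    return total / len(draws)

candidates = range(60, 141, 5)
profits = {qty: expected_profit(qty, demand_draws) for qty in candidates}
best_qty = max(profits, key=profits.get)

print("best order quantity:", best_qty)
print("expected profit at that quantity:", round(profits[best_qty], 2))
```

Because an unsold unit costs more here (6) than a missed sale does (4), the best order tends to sit a little below average demand rather than right at it. A naive order of exactly the average demand leaves money on the table, and the spread of the profit across draws tells you how uncertain that decision is.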