This is the last of three posts. In the first post we did some really fun Bayesian modeling. Specifically, we built a Bayesian longitudinal (panel data) model with fixed effects for states and time. What we found was that increasing the cigarette tax would decrease health care expenditures in a state. In the next post we built a bokeh app that displayed these results. This post is a little bit different: it is less data science and more computer science. We’re going to spin up a light server with the configuration that we need, and go on our merry way to publishing our previously built app.
What is Heroku?
For those of you who don’t know, heroku is a cloud service provider. They let you build apps on their servers for free, and they host them for you. When I heard what they do and the price tag (FREE), I was sold.
Of course their hope is that you become wildly successful, and that moving your code anywhere else would be so painful that you will gladly pay once you outgrow the free limits.
So how do I get my bokeh app on heroku?
Great question, and the answer is a little bit surprising. We need to make a detour through github. I tend to gloss over this, mostly because with this blog, and the code that I share here, I tend to be pretty bad at committing it to github, or bitbucket, or any other version control. That being said, YOU SHOULD BE USING VERSION CONTROL! Yes, yes, I know I just shouted at you, but I was making a point. Version controlling your code is important. For anything that isn’t throwaway code, I use bitbucket, since again, it offers free private git repositories.
But today we will use github because it has free public repositories, it is the best-known git host around, and it plays really nicely with heroku. If you want to cheat and grab the code, here is the full github repo. I’m going to assume that you already have the following files in a github repo since we generated them in the last two blog posts. I know that I did. 😉
- Expenditures.csv
- Taxes.csv
- analysis.py
- bokah.py
- trace.pkl
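If you want to double check that your local copy of the repo really has all five files before going further, a small sketch like the one below would do it. The file names come from the list above; the function name `missing_files` and the choice of the current directory as the repo location are just assumptions for illustration.

```python
from pathlib import Path

# File names taken from the list in this post.
REQUIRED_FILES = ["Expenditures.csv", "Taxes.csv", "analysis.py", "bokah.py", "trace.pkl"]


def missing_files(repo_dir, required=REQUIRED_FILES):
    """Return the required file names that are not present in repo_dir."""
    root = Path(repo_dir)
    return [name for name in required if not (root / name).is_file()]


if __name__ == "__main__":
    # Assumption: the script is run from the top of the local repo clone.
    missing = missing_files(".")
    if missing:
        print("Missing from repo:", ", ".join(missing))
    else:
        print("All files present.")
```

An empty result means you are ready to move on to linking the repo with heroku.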
Next you need to open an account on heroku and another one on github. Again, I’m going to assume that you can create an account on heroku by yourself, and that if you don’t have an account on github yet, you can fill out a registration form or two. Once those two steps are complete, you just need to link github with heroku. It is a typical OAuth process that you do through heroku. I won’t belabor it, mostly because we’re still in the prerequisite phase. If you need help with it, you just need to follow these instructions. It really isn’t that hard. In fact, you just need to find the button that connects heroku to github. Once it is clicked, you enter your github credentials and select the repository that you want to link the app to.
Setting Up The Server
Alright, so at this point if you linked to the github repo that you already have, you are good to go. Heroku will automatically detect that you have python code. We just need to create instructions for heroku to download any dependencies that we have, and another file to tell heroku how we want to make the code accessible to the outside world. Luckily, we don’t have a terribly complicated app that needs a database and a whole bunch of other bells and whistles.
So we’ll need a requirements.txt file. This will allow us to tell heroku to install a bunch of libraries that we want to use when it spins up our server. It isn’t a terribly complicated document. So we’ll just dive right in and make it by hand. Generally, we do something like pip freeze on a python environment, or a conda environment, but again, our app is pretty simple.
Open a text editor and type the following into it:
pandas
numpy
matplotlib
pymc3
bokeh
Save it as requirements.txt and commit it to your github repo, then push that bad boy out to your remote repo. You’ll notice that it is just a list of the libraries that we used to build the app, one per line, and that they are pretty standard data science libraries to boot. When you do that, heroku will automatically update, provided that you connected it to your github repo. So that is pretty slick.
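If you would rather not type the file by hand, here is a minimal sketch that builds the same list and, where a library is installed locally, pins it to the installed version (roughly what pip freeze would give you, but only for the libraries we care about). The helper name `requirement_lines` is made up for this example, and pinning is an optional extra; the unpinned list above works too.

```python
from importlib import metadata

# Library names come from the requirements list in this post.
LIBRARIES = ["pandas", "numpy", "matplotlib", "pymc3", "bokeh"]


def requirement_lines(names, pin=True):
    """Build one requirements.txt line per library.

    With pin=True, a library that is installed locally is written as
    name==version; a library that is not installed is left unpinned.
    """
    lines = []
    for name in names:
        version = None
        if pin:
            try:
                version = metadata.version(name)
            except metadata.PackageNotFoundError:
                version = None  # not installed here, so leave it unpinned
        lines.append(f"{name}=={version}" if version else name)
    return lines


if __name__ == "__main__":
    # Prints the file contents; redirect the output into requirements.txt.
    print("\n".join(requirement_lines(LIBRARIES)))
```

Pinning versions makes it more likely that the server installs the same versions you built the app with, which matters for something like the pickled trace.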
Now heroku knows what it needs in order to run the app, but we are still missing the bit where we tell it how to run the app. So we need to create another file. Now the name of this file is very important. We can’t go around naming it willy nilly. It has to be exactly what heroku wants it to be. So open a text editor and type the following:
web: bokeh serve --port=$PORT --address=0.0.0.0 --allow-websocket-origin=tobacco-tax-viz.herokuapp.com --use-xheaders bokah.py
Then save that as “Procfile”. Note that it isn’t “Procfile.txt”; it is “Procfile” with a capital “P” and no file extension. Please remember to ignore the quotation marks. So, let’s break down what the one line in the file does.
“web: bokeh serve” – The “web:” part tells heroku that this is the process that should receive traffic from the internet, and “bokeh serve” is the command that heroku runs to start it. Hence, on the web, serve your bokeh app.
“--port=$PORT” – Heroku decides which port the app has to listen on and passes it in through the PORT environment variable. This tells bokeh to use that port.
“--address=0.0.0.0” – This portion tells bokeh to listen on all network interfaces, and not only on localhost, so that traffic coming from outside the server can reach it.
“--allow-websocket-origin=tobacco-tax-viz.herokuapp.com” – This part is super important. Bokeh uses websockets to talk to the browser, and it only accepts websocket connections from the hosts that we explicitly allow. So we have to tell bokeh the public host name of our heroku app. This value will change depending on what you name your heroku app.
“--use-xheaders” tells bokeh to take the IP and protocol information from the X-headers that heroku’s proxy adds to each request.
“bokah.py” tells bokeh which file to run, in this case the file bokah.py.
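Since the only part of that line that changes from app to app is the app name (and maybe the script name), here is a small sketch that assembles the line from those two pieces. The function name `procfile_line` is hypothetical, and the sketch assumes your app lives at your-app-name.herokuapp.com like the one in this post.

```python
def procfile_line(app_name, script="bokah.py"):
    """Return the web process line for a bokeh app hosted at <app_name>.herokuapp.com."""
    host = f"{app_name}.herokuapp.com"
    args = [
        "bokeh", "serve",
        "--port=$PORT",  # kept literal; it is filled in when the process starts
        "--address=0.0.0.0",
        f"--allow-websocket-origin={host}",
        "--use-xheaders",
        script,
    ]
    return "web: " + " ".join(args)


if __name__ == "__main__":
    # Prints the line used in this post; save the output in a file named Procfile.
    print(procfile_line("tobacco-tax-viz"))
```

Swap in your own app name and you get a line that is ready to paste into your Procfile.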
Now, commit this new file and push it to the remote github repo. If that repo is connected to heroku, your app will automatically update again. This time though, heroku will start a server, download all the dependencies, and put the app on the internet. It will assign a url to the app, and you can open it from anywhere in the world and see those cool dynamic visualizations that we built last time. That’s it: you’ve successfully pushed that dynamic visualization into a production environment. Congratulations are in order.