SKLearn in Production

Alright, I’m in a bit of a time crunch this week because I took off for Labor Day weekend. As such, I just want to write a quick little post so that you guys don’t think I’m slacking on posting something every Wednesday. This is basically a quick tutorial on how to go from a basic sklearn model to a production-ready model served up as a REST API. What we’ll need is:

  • Some training data
  • Sklearn
  • Pandas
  • A “pickling” function
  • Flask

So without further ado, we’re going to use the hello world of datasets: the iris dataset. You may recognize it from such posts as every tutorial ever. We’ll need some libraries to train a model. You can import them using the following code:

import pandas as pd
from sklearn.linear_model import LinearRegression
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import PolynomialFeatures
import joblib  # sklearn.externals.joblib was removed in sklearn 0.23; import joblib directly
from sklearn.datasets import load_iris

df = load_iris()  # note: this is a dict-like Bunch, not a DataFrame

Next we need to build a pipeline. Now I’m going to get on a soapbox for a little bit. You absolutely need to do this inside of a pipeline. Anything that you are going to send to production needs to be inside of a pipeline! There is no substitute. Don’t try to change this. You have to build your models in a pipeline! Always use a pipeline! Let me repeat it one more time to get it into your head, and hopefully your heart. Production machine learning models go in a pipeline!

Okay, let me explain why. Pipelines are completely transferable. What do I mean by that? On a production server, you want your machine learning model to be plug and play, so that you are never, ever changing code on the server itself. You just supply a file to the server, and the server code reads the machine learning model from that file. That way, if you want to swap out your model, add a feature, or change from a logistic regression to a neural network, the server only needs to be updated with a new file.
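To make the plug-and-play idea concrete, here’s a minimal sketch (not from the original post) showing two very different models that expose the same predict() interface once wrapped in a Pipeline, so server code that only calls predict() never has to change:

```python
# Two different pipelines, one calling convention: this is why server code
# that only ever calls pipeline.predict() can stay untouched when the
# model file is swapped out.
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import PolynomialFeatures, StandardScaler
from sklearn.linear_model import LinearRegression, Ridge

quadratic = Pipeline([('square', PolynomialFeatures(2)),
                      ('lr', LinearRegression())])
ridge = Pipeline([('scale', StandardScaler()),
                  ('ridge', Ridge())])

# Toy data: y = x squared, just to have something to fit.
X, y = [[1.0], [2.0], [3.0]], [1.0, 4.0, 9.0]

# From the caller's point of view, both models are interchangeable.
for model in (quadratic, ridge):
    model.fit(X, y)
    print(model.predict([[2.5]]))
```

Either pipeline could be dumped to the same pickle file, and the server wouldn’t know the difference.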

So let’s build a really dumb model: one that uses a single feature plus the square of that feature, i.e. a quadratic in one variable. Here’s the code to do it in a pipeline:

lr = LinearRegression(fit_intercept=False)
pipeline = Pipeline([('square', PolynomialFeatures(2)), ('lr', lr)])
pipeline.fit(df['data'][:, 0].reshape(-1, 1), df['target'].astype('float'))

print(pipeline.named_steps['lr'].coef_)

The last line prints the coefficients of the fitted model. It prints an array of 3 numbers. These are, in order, the constant term (PolynomialFeatures adds the bias column, which is why we set fit_intercept=False), the linear term, and the quadratic term. Here’s the output:

[-8.79995804  2.57637728 -0.15088507]

So what does this pipeline do? It takes a single feature as input, squares it, and feeds both the original feature and the squared feature to the linear regression. The already-fitted linear regression can then make a prediction.
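A quick sanity check (not in the original post): the pipeline’s prediction is just the quadratic implied by the coefficients printed above. For a sepal length of 5.0:

```python
# Evaluate the fitted quadratic by hand using the coefficients printed above.
a, b, c = -8.79995804, 2.57637728, -0.15088507  # constant, linear, quadratic
x = 5.0
print(a + b * x + c * x ** 2)  # roughly 0.31, i.e. close to class 0
```

This is exactly the number pipeline.predict([[5.0]]) would return.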

Perfect, now let’s move it to production. All that we need to do is save the pipeline to a file. We can do that with a single line of code:

joblib.dump(pipeline, '/home/ryan/model.pkl') 
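It’s worth verifying the round trip. Here’s a self-contained sketch (using a temp-directory path and toy data of my own, not the author’s path) showing that a dumped pipeline loads back with identical predictions:

```python
# Round-trip check: dump a fitted pipeline, load it back, and confirm
# the restored copy predicts identically.
import os
import tempfile

import joblib  # sklearn.externals.joblib was removed in sklearn 0.23
from sklearn.linear_model import LinearRegression
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import PolynomialFeatures

pipeline = Pipeline([('square', PolynomialFeatures(2)),
                     ('lr', LinearRegression(fit_intercept=False))])
pipeline.fit([[4.0], [5.0], [6.0], [7.0]], [0.0, 0.0, 1.0, 2.0])  # toy data

path = os.path.join(tempfile.gettempdir(), 'model.pkl')  # illustrative path
joblib.dump(pipeline, path)
restored = joblib.load(path)

# The restored pipeline is byte-for-byte the same model.
assert (restored.predict([[5.5]]) == pipeline.predict([[5.5]])).all()
```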

Okay, now all that we need to do is build a server that wraps our saved pipeline. Flask makes this super easy to do. In fact, we can build our server in about 10 lines of code. Here is what you need to run a server:

from flask import Flask, jsonify, make_response
import joblib  # sklearn.externals.joblib was removed in sklearn 0.23
import pandas as pd

model = joblib.load('/home/ryan/model.pkl')

app = Flask(__name__)

@app.route('/<float:var>', methods=['GET'])
def index(var):
    # float() keeps jsonify happy: numpy float64 isn't JSON serializable
    return make_response(jsonify({'output': float(model.predict(pd.DataFrame([var]).T)[0])}))

if __name__ == '__main__':
    app.run(debug=True)

So let’s just go through this code really fast so that you understand it. The first 3 imports are just the flask framework, joblib to read the file we created, and pandas to deal with the data. Nothing too fancy there. Then we load our model. We do this as a global variable so that we aren’t reloading the model from disk every time the server is called.

The next few lines are where the server is actually built. We define the flask app, and then we use a decorator to say: when this http route gets hit, fire off this function. The route captures the number in the url and passes it to the function as var. The function makes an http response out of a dictionary that we turn into JSON. This dictionary has one entry, our predicted value. The neat thing here is that the url itself is the variable that we are feeding into the model.
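You can exercise the route without even starting a server, using Flask’s built-in test client. Here’s a minimal sketch with a stand-in model (a hypothetical DummyModel of mine, so it runs without the pickle file):

```python
from flask import Flask, jsonify, make_response

class DummyModel:
    # Stand-in for the joblib-loaded pipeline, so this sketch runs anywhere.
    def predict(self, X):
        return [2.0 * X[0][0]]

model = DummyModel()
app = Flask(__name__)

@app.route('/<float:var>', methods=['GET'])
def index(var):
    return make_response(jsonify({'output': float(model.predict([[var]])[0])}))

# Hit the route in-process with the test client, no running server needed.
with app.test_client() as client:
    resp = client.get('/5.0')
    print(resp.get_json())  # {'output': 10.0}
```

Swap DummyModel for joblib.load(...) and this is the real server.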

The last two lines start up the server when the python script is run. And that’s it! We now have a model living in “production”. We just need to move this code to a server, like Amazon AWS or Heroku. Go ahead, fire yours up and see. Here is a screenshot from when I ran mine.

You can see that you can change the url to any number you want and the web page will display the output. What’s even neater is that if you change the pipeline and save it, the server adjusts accordingly. Go ahead and try changing the number of polynomial terms to 3, 4, or 5, or switching to some other model type. It works really well.
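Here’s a hedged sketch of that hot swap (using a temp-directory path rather than the author’s; point it at wherever your server loads from): refit with a degree-3 polynomial and overwrite the same pickle, and the server code stays untouched.

```python
# Hot-swap the model: refit with a higher-degree polynomial and overwrite
# the pickle file the server loads from. The server code never changes.
import os
import tempfile

import joblib  # sklearn.externals.joblib was removed in sklearn 0.23
from sklearn.datasets import load_iris
from sklearn.linear_model import LinearRegression
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import PolynomialFeatures

df = load_iris()
pipeline = Pipeline([('cube', PolynomialFeatures(3)),  # degree 3 instead of 2
                     ('lr', LinearRegression(fit_intercept=False))])
pipeline.fit(df['data'][:, 0].reshape(-1, 1), df['target'].astype('float'))

path = os.path.join(tempfile.gettempdir(), 'model.pkl')  # illustrative path
joblib.dump(pipeline, path)

print(pipeline.named_steps['lr'].coef_)  # now four coefficients, not three
```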

I put these into 2 different scripts to illustrate that you can separate the machine learning from the server. Here are the two files I wrote for this blog post:

import pandas as pd
from sklearn.linear_model import LinearRegression
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import PolynomialFeatures
import joblib  # sklearn.externals.joblib was removed in sklearn 0.23
from sklearn.datasets import load_iris

df = load_iris()
lr = LinearRegression(fit_intercept=False)

pipeline = Pipeline([('square', PolynomialFeatures(2)), ('lr', lr)])
pipeline.fit(df['data'][:, 0].reshape(-1, 1), df['target'].astype('float'))
print(pipeline.named_steps['lr'].coef_)


joblib.dump(pipeline, '/home/ryan/model.pkl')

and

from flask import Flask, jsonify, make_response
import joblib  # sklearn.externals.joblib was removed in sklearn 0.23
import pandas as pd

model = joblib.load('/home/ryan/model.pkl')

app = Flask(__name__)

@app.route('/<float:var>', methods=['GET'])
def index(var):
    # float() keeps jsonify happy: numpy float64 isn't JSON serializable
    return make_response(jsonify({'output': float(model.predict(pd.DataFrame([var]).T)[0])}))

if __name__ == '__main__':
    app.run(debug=True)