How to Create and Deploy a Pre-Trained Word2Vec Deep Learning REST API

By Arun Kirubarajan, Alibaba Cloud Tech Share Author. Tech Share is Alibaba Cloud’s incentive program to encourage the sharing of technical knowledge and best practices within the cloud community.

In production, standard local development workflows are often too slow for computationally expensive programs. For deep learning engineers, training a model on a laptop or local machine can take hours or even days for a single build. It is therefore industry standard to use cloud resources with more compute hardware to both train and subsequently serve machine learning models. This is good practice because it abstracts away the heavy computation and lets client applications simply make HTTP requests as needed. In this tutorial, we will make a pre-trained deep learning model named Word2Vec available to other services by building a REST API from the ground up on Alibaba Cloud Elastic Compute Service (ECS).

Prerequisite Knowledge

  1. Understanding of Python and pip commands
  2. Knowledge of how to create, navigate, and edit folders and files on a Linux operating system

An Introduction to Word Vectors

At its core, a word embedding assigns each word a vector that captures the word's meaning. This can be demonstrated by arithmetic relationships between the vectors, such as king − queen ≈ boy − girl. Word vectors are used to build everything from recommendation engines to chatbots that genuinely understand the English language.

Another point worth considering is how we obtain word embeddings, as no two sets of word embeddings are the same. Word embeddings aren't random; they're generated by training a neural network. A powerful recent word embedding implementation from Google, named Word2Vec, is trained by predicting words that appear near other words in a language. For example, for the word "cat", the neural network will predict the words "kitten" and "feline". This intuition of words appearing "near" each other is what allows us to place them in vector space.
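To make this idea of "nearness" concrete, here is a toy sketch in plain Python. The cosine similarity formula is the standard one, but the 3-dimensional vectors are invented purely for illustration (real Word2Vec embeddings are learned and have 300 dimensions):

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors: close to 1.0 means similar direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Invented 3-d "embeddings" -- real embeddings are learned, not hand-written.
toy_vectors = {
    "cat":    [0.90, 0.80, 0.10],
    "kitten": [0.85, 0.75, 0.20],
    "car":    [0.10, 0.20, 0.90],
}

# "cat" lands much closer to "kitten" than to "car" in vector space.
print(cosine_similarity(toy_vectors["cat"], toy_vectors["kitten"]))
print(cosine_similarity(toy_vectors["cat"], toy_vectors["car"]))
```

Words used in similar contexts end up pointing in similar directions, so their cosine similarity is high; unrelated words point in different directions and score low.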

However, it is industry standard to use the pre-trained models of large organizations such as Google in order to prototype quickly and simplify the deployment process. In this tutorial we will download Google's pre-trained Word2Vec word embeddings, converted to Magnitude format, into our working directory.


Setting Up Python Environment

First, we install virtualenv, a Python module that lets us isolate each project's dependencies so that libraries don't interfere with one another.

pip3 install virtualenv

Next, we create a virtual environment named venv. Note that it is important to both specify and consistently use the same Python version; Python 3 is recommended for best support. The venv folder will contain all the Python modules we install for this project.

virtualenv -p python3 venv

Although we've created a virtual environment, we haven't activated it yet. Whenever we want to work on the project and use its dependencies, we must activate it with the source command. The file we actually call source on is named activate, located in the venv/bin folder.

source venv/bin/activate

Once we are finished with our project, or we want to switch virtual environments, we can use the deactivate command to exit the virtual environment.


Installing the Magnitude Package

Magnitude is a fast, lightweight package for working with pre-trained word embeddings. We install it together with Flask, the web framework we will use to serve our API:

pip3 install pymagnitude flask

We'll also record our dependencies with the following command. This creates a file named requirements.txt listing our installed Python libraries, so we can re-install them later with pip3 install -r requirements.txt.

pip3 freeze > requirements.txt

Making Model Predictions


Next, we'll create a file named model.py (our server will import from this module later) and add the following lines to import Magnitude and load the pre-trained vectors.

from pymagnitude import Magnitude 
vectors = Magnitude('GoogleNews-vectors-negative300.magnitude')

We can play around with the Magnitude package and the deep learning model by using the query method, providing a word as the argument.

cat_vector = vectors.query('cat')

However, for the core of our API, we will define a function that returns the similarity in meaning between two words. This is the backbone of many deep learning solutions, such as recommendation engines (i.e. surfacing content containing similar words).

We can experiment with this idea using the similarity and most_similar methods.

print(vectors.similarity("cat", "dog"))
print(vectors.most_similar("cat", topn=100))

We implement the similarity calculator as follows. This function will be called by the Flask API in the next section. Note that it returns a cosine similarity, a real value between -1 and 1 (close to 1 for words with similar meanings).

def similarity(word1, word2):
    return vectors.similarity(word1, word2)

Wrapping The Model in a REST API

We will create a new file for our server with the following contents. We import Flask and request to handle our server capabilities, and we import the similarity function from the model module we wrote earlier.

from flask import Flask, request
from model import similarity

app = Flask(__name__)

@app.route("/", methods=['GET'])
def welcome():
    return "Welcome to our Machine Learning REST API!"

@app.route("/similarity", methods=['GET'])
def similarity_route():
    word1 = request.args.get("word1")
    word2 = request.args.get("word2")
    return str(similarity(word1, word2))

if __name__ == "__main__":
    app.run(host='0.0.0.0', port=5000, debug=True)

Our server is rather bare-bones, but it can easily be extended by creating more routes with the @app.route decorator.

Dockerizing the Application

To begin the containerization process, we will begin by creating a Dockerfile. A Dockerfile is the entry point for the entire Docker process. We use this file to define dependencies, access files, set environment variables and to run our application.

touch Dockerfile

Next, we will add an instruction that copies requirements.txt from our current directory (the directory containing the Dockerfile) into the image. Then, we will install our Python dependencies for the server.

ADD requirements.txt /
RUN pip install -r requirements.txt

Next, we will install wget inside the image so that we can download the word embeddings during the build. If the downloaded file's name differs from the one our Flask server expects, we can rename it with a RUN mv instruction (mv is a shell command; Docker has no MV instruction of its own).

RUN apt-get update && apt-get install -y wget
RUN wget

Finally, we can start our server by adding the final line to our Dockerfile. The CMD instruction runs our Flask server when the container starts.

CMD [ "python", "./" ]
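Assembled, the full Dockerfile might look like the sketch below. The python:3.6 base image, the file names, and the embeddings URL placeholder are illustrative assumptions, not fixed by this tutorial:

```dockerfile
# Base image -- an assumption; any Python 3 image should work.
FROM python:3.6

# Install Python dependencies first so this layer is cached between builds.
ADD requirements.txt /
RUN pip install -r requirements.txt

# Download the pre-trained embeddings at build time.
RUN apt-get update && apt-get install -y wget
RUN wget <embeddings-url>

# Copy the application code (file names are illustrative).
ADD model.py app.py /

# Start the Flask server when the container launches.
CMD [ "python", "./app.py" ]
```

Ordering the instructions this way means changing the application code does not invalidate the cached dependency and download layers, keeping rebuilds fast.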

Running the Dockerized Application

We will first run the docker build command, specifying a -t flag to create a name for our image and a . to tell Docker our Dockerfile is in our current directory.

docker build -t model .

Finally, we'll run our image using the docker run command, specifying a -p flag to map port 8000 on our localhost (the port we want to expose) to port 5000 inside the container (the port our Flask server is running on).

docker run -p 8000:5000 model

Making API Calls

One convenient way to test our API is from the command line. While we can use a browser (e.g. Chrome or Safari) to test GET routes, we are unable to send POST requests that way. An alternative is the curl tool, which comes bundled with Unix operating systems.

We use curl to specify both the word1 and word2 arguments and to view the response in the command line.

curl -X GET 'http://localhost:8000/similarity?word1=dog&word2=cat'

In our terminal, we should see the similarity score for the two words.
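The same call can be made from Python using only the standard library. The helper below is a sketch that builds the query URL; the urlopen line is commented out since it requires the server to be running:

```python
from urllib.parse import urlencode
from urllib.request import urlopen

def similarity_url(word1, word2, base="http://localhost:8000/similarity"):
    """Build the GET URL for the /similarity route."""
    return base + "?" + urlencode({"word1": word1, "word2": word2})

url = similarity_url("dog", "cat")
print(url)  # http://localhost:8000/similarity?word1=dog&word2=cat

# With the container running, the score can be fetched like this:
# score = float(urlopen(url).read())
```

urlencode also takes care of escaping, so words containing spaces or special characters are passed safely.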



Follow me to keep abreast of the latest technology news, industry insights, and developer trends. Alibaba Cloud website: