Running PyODPS and Jupyter Notebook on Alibaba Cloud in 10 Minutes

Why Notebook by Jupyter

When you’re exploring a dataset, you start by loading the data and getting it into a convenient format. If the dataset is large, just reading the training data from disk can take almost half a minute. If you’re fitting a model, you usually need to set up a feature matrix and do other pre-processing first, so it can start to feel like “Edge of Tomorrow”: read the data, trim the outliers, build some features, make a feature matrix, start fitting a model, and then you suddenly get killed because you forgot to load a library you needed. That means going back to the beginning and starting the cycle all over again. A notebook keeps the results of earlier cells in memory, so after a mistake you rerun only the cell that failed.

Our Beloved MaxCompute (formerly known as ODPS)

According to its definition on the project website, PyODPS is the Python version of the ODPS SDK. It provides basic operations on ODPS objects, plus a DataFrame framework that makes it easy to perform data analysis on ODPS. ODPS has since been rebranded as MaxCompute, but the SDK keeps the ODPS name.
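To make the "basic operations plus DataFrame" description concrete, here is a minimal sketch of how a notebook cell might connect to MaxCompute once PyODPS is installed. The environment variable names and the table name `my_table` are assumptions made for illustration, not part of the original setup; the credentials and endpoint come from your own Alibaba Cloud account.

```python
import os


def odps_settings_from_env(environ=None):
    """Collect connection settings from environment variables.

    The variable names are hypothetical; use whatever your setup defines.
    """
    env = os.environ if environ is None else environ
    return {
        "access_id": env.get("ODPS_ACCESS_ID", ""),
        "secret_access_key": env.get("ODPS_SECRET_ACCESS_KEY", ""),
        "project": env.get("ODPS_PROJECT", ""),
        "endpoint": env.get("ODPS_ENDPOINT", ""),
    }


def connect(settings):
    """Create the PyODPS entry object (requires PyODPS to be installed)."""
    # Imported lazily so the helper above also works without PyODPS.
    from odps import ODPS

    return ODPS(
        settings["access_id"],
        settings["secret_access_key"],
        settings["project"],
        endpoint=settings["endpoint"],
    )


# Example usage inside a notebook cell (needs valid credentials):
# o = connect(odps_settings_from_env())
# print(o.exist_table("my_table"))          # "my_table" is a placeholder
# from odps.df import DataFrame
# df = DataFrame(o.get_table("my_table"))   # wrap the table for analysis
# print(df.head(5))
```

Reading the settings from the environment keeps the access key out of the notebook itself, which matters if the notebook file is shared.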


Prerequisites

  1. An ECS instance (at least 4 cores and 8 GB of memory)
  2. Approximately 10 minutes of your time

How to install

  1. Installing Docker on your instance
  2. Installing the Jupyter Notebook container
  • $ docker pull jupyter/notebook
  • $ docker run -p 8080:8888 jupyter/notebook jupyter notebook --no-browser
  • To keep your notebooks in a directory on the host, mount it with -v instead:
  • $ docker run -p 8080:8888 -v /home/notebooks:/notebooks jupyter/notebook jupyter notebook --no-browser
  3. Installing PyODPS
  • $ docker exec -i -t [container name] /bin/bash
  • $ pip install --upgrade pip
  • $ pip install 'pyodps[full]'
  • $ python -m unittest discover
  4. Finalizing the installation
  • $ docker commit -c='CMD ["jupyter", "notebook", "--no-browser"]' [existing container name] [your new container name with tag]
  5. Running Notebook as normal
  • $ docker run -p 8080:8888 [your container name with tag]
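
Once the notebook is running, a quick sanity check from a notebook cell can confirm that the committed image really contains PyODPS. The following is a minimal sketch using only the standard library; it assumes the package is importable under the module name `odps`, and the helper name `is_installed` is made up for this example.

```python
import importlib.util
import sys


def is_installed(module_name):
    """Return True if a top-level module can be found, without importing it."""
    return importlib.util.find_spec(module_name) is not None


# Report the interpreter version and whether PyODPS is visible to it.
print("Python:", sys.version.split()[0])
print("odps importable:", is_installed("odps"))
```

If the last line reports False, the notebook is probably running from the original image rather than the one created with docker commit.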



