Installing a Dependency Library for Function Compute

In common programming practice, projects, libraries, and system environments must be installed and configured together. Alibaba Cloud Function Compute runs code in a prebuilt runtime environment, which trades flexibility for higher concurrency and security. Because the system and code directories are read-only at runtime, dependency libraries must be installed in advance into the code directory rather than the system directory. Installation tools have not yet adapted to this constraint of new platforms such as Function Compute. This article explains how to use existing tools to install dependency libraries into a project with minimal manual intervention.

Developing for Function Compute typically involves two types of dependency packages: DEB software packages installed by the APT package manager, and packages installed by a language-specific package manager (such as Maven or pip). The following sections analyze the package managers of different language environments.

Installation Directory of a Package Manager

maven

pip

Before 2004, setup.py was recommended for Python installation. To use it, download any module and use the setup.py file supplied with the module.

python setup.py install

setup.py is built on Distutils. Released in 2000 as part of the Python standard library, Distutils is used to build and install Python modules.

Therefore, you can also use setup.py to release a Python module as a source distribution:

python setup.py sdist

You can even package the module into an RPM or EXE file:

python setup.py bdist_rpm
python setup.py bdist_wininst

Similar to a Makefile, setup.py can be used for both building and installation. However, the two processes are coupled, so a module must be built every time it is installed, which wastes resources. In 2004, the Python community released setuptools, which includes the easy_install tool. With it, Python began to support the EGG format and introduced the online repository PyPI, which correspond respectively to the JAR format and the Maven repository in the Java community.

The online module repository PyPI offers two key advantages:

  1. It only requires the installation of the pre-compiled EGG package, improving efficiency.
  2. It automatically downloads dependent packages from PyPi and installs them.

Since its release in 2008, pip has gradually replaced easy_install to become the de facto standard Python package manager. While compatible with the EGG format, pip prefers the Wheel format and also supports installing a module from a version control repository (for example, GitHub).

The following describes the directory structure of a Python module. The directories of installation files in both EGG and Wheel formats are classified into five types: purelib, platlib, headers, scripts, and data.

Directory   Installation location                         Purpose
purelib     $prefix/lib/pythonX.Y/site-packages           Pure Python implementation library
platlib     $exec-prefix/lib/pythonX.Y/site-packages      Platform-related DLL
headers     $prefix/include/pythonX.Yabiflags/distname    C header files
scripts     $prefix/bin                                   Executable files
data        $prefix                                       Data files, such as .conf configuration files and SQL initialization files

$prefix and $exec-prefix are set when Python is built and can be read at runtime from sys.prefix and sys.exec_prefix. Both default to /usr/local on Linux.
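To see where these directories actually resolve on a given machine, the interpreter can be asked directly. The following is a minimal sketch, assuming Python 3 and the standard sysconfig module; the helper name install_dirs is made up for this illustration, and sysconfig reports the headers directory under the key "include".

```python
import sys
import sysconfig


def install_dirs():
    """Return the install locations reported by the running interpreter.

    install_dirs is a hypothetical helper name used only for this sketch.
    sysconfig uses the key "include" for what the table calls "headers".
    """
    paths = sysconfig.get_paths()
    keys = ("purelib", "platlib", "include", "scripts", "data")
    return {key: paths[key] for key in keys}


if __name__ == "__main__":
    print("prefix     :", sys.prefix)
    print("exec_prefix:", sys.exec_prefix)
    for name, path in install_dirs().items():
        print(f"{name:8}: {path}")
```

The printed values depend on how the interpreter was built and on whether a virtual environment is active, so they may differ from the defaults listed above.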

npm

Troubleshooting Problems

Dependencies Installed to the Global System Directory

Native Dependencies

When Function Compute runs on Debian or Ubuntu, the APT package manager is used to manage system programs and libraries. By default, these programs and libraries are installed to system directories, for example, /usr/bin, /usr/lib, /usr/local/bin, or /usr/local/lib. Because these directories are read-only at runtime, native dependencies must also be installed to a local directory.

Recommended Solutions

  1. Ensure that the development system for dependency installation is consistent with the production execution system. Use fcli sbox to install the dependencies.
  2. Place all the dependency files in a local directory. Copy the modules, executable files, and .dll or .so files installed by pip to the current directory.

However, in practice, it is difficult to place the dependency files into the current directory.

  1. Library files installed by pip and apt-get are scattered across different directories. This means that you must be familiar with each package manager in order to find these files.
  2. Library files have transitive dependencies. When a library is installed, other libraries on which the library depends are also installed. This makes it very tedious to manually retrieve these dependencies.

In this case, how can we install dependencies to the current directory with minimal manual intervention? The following describes some methods used with the pip and APT package managers and compares their pros and cons.

Installation of Dependencies to the Current Directory

Python

Method 1: Use the --install-option parameter

pip install --install-option="--install-lib=$(pwd)" PyMySQL

When --install-option is used, its parameters are passed to setup.py. However, neither .egg nor .whl files contain a setup.py file. Therefore, using --install-option triggers installation from the source package, and setup.py in turn triggers the module build process.

--install-option supports the following options:

File type            Option
Python modules       --install-purelib
extension modules    --install-platlib
all modules          --install-lib
scripts              --install-scripts
data                 --install-data
C headers            --install-headers

When --install-lib is used, the values of --install-purelib and --install-platlib are overridden.

In addition, --install-option="--prefix=$(pwd)" supports installation to the current directory, but a sub-directory named lib/python2.7/site-packages will be created under the current directory.

Advantages:

  1. You can install the module to a local directory, such as purelib.

Disadvantages:

  1. This method is inapplicable to modules that do not provide a source package.
  2. The module is built from source, so the pre-built Wheel package is not used.
  3. To fully install the module, many more parameters need to be configured, which is tedious.

Method 2: Use the --target or -t parameter

pip install --target=$(pwd) PyMySQL

--target is a parameter newly provided by pip. When this parameter is used, the module is directly installed to the current directory without creating the sub-directory named lib/python2.7/site-packages. This method is easy to use and is applicable for modules with a few dependencies.

Method 3: Use PYTHONUSERBASE in conjunction with --user

PYTHONUSERBASE=$(pwd) pip install --user PyMySQL

When --user is used, the module is installed under the site.USER_BASE directory. The default value of this directory is ~/.local on Linux, ~/Library/Python/X.Y on macOS, and %APPDATA%\Python on Windows. The environment variable PYTHONUSERBASE can be used to change the value of site.USER_BASE.

Similar to --prefix=, when --user is used, the sub-directory named lib/python2.7/site-packages is created.
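The effect of PYTHONUSERBASE can be checked without installing anything. The sketch below, which assumes Python 3, starts a fresh interpreter with the variable set and asks the standard site module where a --user install would go. The helper name user_site_for and the path /tmp/fc-demo are made up for this illustration.

```python
import os
import subprocess
import sys


def user_site_for(base):
    """Report (user base, user site-packages) for a given PYTHONUSERBASE.

    user_site_for is a hypothetical helper name used only for this sketch.
    A child interpreter is used because site computes these values once
    per process.
    """
    env = dict(os.environ, PYTHONUSERBASE=base)
    code = "import site; print(site.getuserbase()); print(site.getusersitepackages())"
    out = subprocess.check_output([sys.executable, "-c", code], env=env)
    user_base, user_site = out.decode().splitlines()
    return user_base, user_site


if __name__ == "__main__":
    # /tmp/fc-demo is an arbitrary example path.
    print(user_site_for("/tmp/fc-demo"))
```

The second value is the directory that would have to be added to sys.path or PYTHONPATH at runtime.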

Method 4: Use virtualenv

pip install virtualenv
virtualenv path/to/my/virtual-env
source path/to/my/virtual-env/bin/activate
pip install PyMySQL

virtualenv is a method recommended by the Python community because it does not contaminate the global environment. When virtualenv is used, both the desired modules (such as PyMySQL) and the packaging tools (such as setuptools, pip, and wheel) are saved to a local directory. These tools increase the size of the package even though they are not used at runtime.

apt-get

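# Download the DEB packages (without installing them) into ./archives
apt-get install -d -o=dir::cache=$(pwd) libx11-6 libx11-xcb1 libxcb1
# Unpack every downloaded package into the current directory
for f in ./archives/*.deb
do
    dpkg -x "$f" "$(pwd)"
done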

Running Method

Python

> import sys
> print '\n'.join(sys.path)
/usr/lib/python2.7
/usr/lib/python2.7/plat-x86_64-linux-gnu
/usr/lib/python2.7/lib-tk
/usr/lib/python2.7/lib-old
/usr/lib/python2.7/lib-dynload
/usr/local/lib/python2.7/dist-packages
/usr/lib/python2.7/dist-packages

By default, sys.path also includes the current directory. Therefore, Method 2 needs no change to sys.path, because the --target (or -t) parameter installs the module directly into the current directory.

Because sys.path is a mutable list, you can call sys.path.append(dir) when the program starts. To improve the portability of the program, you can instead set the PYTHONPATH environment variable.

export PYTHONPATH=$PYTHONPATH:$(pwd)/lib/python2.7/site-packages
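For the sys.path.append approach, a minimal sketch might look like the following. The helper name add_local_site_packages and the default version string "2.7" are assumptions for illustration; the directory layout matches the lib/python2.7/site-packages sub-directory created by --prefix and --user.

```python
import os
import sys


def add_local_site_packages(code_dir, python_version="2.7"):
    """Append <code_dir>/lib/pythonX.Y/site-packages to sys.path once.

    add_local_site_packages is a hypothetical helper name used only for
    this sketch; python_version must match the interpreter used by pip.
    """
    site_dir = os.path.join(code_dir, "lib", "python" + python_version, "site-packages")
    if site_dir not in sys.path:
        sys.path.append(site_dir)
    return site_dir


if __name__ == "__main__":
    # Resolve the directory relative to this file so the code does not
    # depend on the process working directory.
    here = os.path.dirname(os.path.abspath(__file__))
    print(add_local_site_packages(here))
```

The call has to run before the first import of any module installed in that directory.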

apt-get

PATH

The PATH variable indicates a list of directories that the system uses to search for executable programs. Add bin or sbin directories such as bin, usr/bin, and usr/local/bin to the PATH variable.

export PATH=$(pwd)/bin:$(pwd)/usr/bin:$(pwd)/usr/local/bin:$PATH

Note that the preceding command applies to Bash. In Java, Python, or Node.js, modify the PATH environment variable of the current process using the facilities of that language.
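In Python, the equivalent of the export command is to update os.environ, which child processes started afterwards inherit. The following is a minimal sketch; the helper name prepend_to_path is made up for this illustration, and the sub-directories mirror the ones used in the Bash command.

```python
import os


def prepend_to_path(code_dir, subdirs=("bin", "usr/bin", "usr/local/bin")):
    """Put the local bin directories in front of the current PATH.

    prepend_to_path is a hypothetical helper name used only for this sketch.
    """
    extra = [os.path.join(code_dir, sub) for sub in subdirs]
    current = os.environ.get("PATH", "")
    parts = extra + ([current] if current else [])
    os.environ["PATH"] = os.pathsep.join(parts)
    return os.environ["PATH"]


if __name__ == "__main__":
    print(prepend_to_path(os.getcwd()))
```

Programs launched later through subprocess are then looked up in the local directories first.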

LD_LIBRARY_PATH

Similar to PATH, LD_LIBRARY_PATH is a list of directories that the dynamic linker searches for dynamic link libraries (.so files). Typically, the system places these libraries in the /lib, /usr/lib, and /usr/local/lib directories. Some libraries are placed into sub-directories of these directories, such as /usr/lib/x86_64-linux-gnu. Typically, these sub-directories are recorded in the files under /etc/ld.so.conf.d/.

cat /etc/ld.so.conf.d/x86_64-linux-gnu.conf
# Multiarch support
/lib/x86_64-linux-gnu
/usr/lib/x86_64-linux-gnu

Therefore, the .so files in the directories declared in the files under $(pwd)/etc/ld.so.conf.d/ must also be reachable through the directory list set in the LD_LIBRARY_PATH environment variable.
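One way to derive that directory list is to read the extracted configuration files and re-root each declared directory under the local directory. The sketch below assumes Python 3 and that the .conf files contain one absolute path per line plus optional # comments, as in the example above; the helper name library_dirs is made up for this illustration.

```python
import glob
import os


def library_dirs(root):
    """List directories declared in <root>/etc/ld.so.conf.d/*.conf, re-rooted under root.

    library_dirs is a hypothetical helper name used only for this sketch.
    Lines that are not absolute paths (comments, include directives) are skipped.
    """
    dirs = []
    pattern = os.path.join(root, "etc", "ld.so.conf.d", "*.conf")
    for conf in sorted(glob.glob(pattern)):
        with open(conf) as fh:
            for line in fh:
                entry = line.split("#", 1)[0].strip()
                if not entry.startswith("/"):
                    continue
                local = os.path.join(root, entry.lstrip("/"))
                if local not in dirs:
                    dirs.append(local)
    return dirs


if __name__ == "__main__":
    # Print a value suitable for LD_LIBRARY_PATH.
    print(":".join(library_dirs(os.getcwd())))
```

The printed string can be exported as LD_LIBRARY_PATH before the process that needs the libraries is started.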

Note that modifications to the LD_LIBRARY_PATH environment variable at runtime may not take effect; this is the case for Python, at least. The /code/lib directory is already preset in LD_LIBRARY_PATH, so you can instead create symbolic links to all the dependent .so files in the /code/lib directory.
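The soft-linking step can be scripted at packaging time. The following is a minimal sketch, assuming Python 3 and that the local directory is later deployed as /code; the helper name link_shared_objects is made up for this illustration. Relative links are used so that they remain valid after the directory is moved.

```python
import os


def link_shared_objects(code_dir, lib_dir_name="lib"):
    """Symlink every .so file found under code_dir into <code_dir>/lib.

    link_shared_objects is a hypothetical helper name used only for this
    sketch. Matches both libfoo.so and versioned names such as libfoo.so.1.
    """
    lib_dir = os.path.join(code_dir, lib_dir_name)
    os.makedirs(lib_dir, exist_ok=True)
    linked = []
    for dirpath, _dirnames, filenames in os.walk(code_dir):
        if os.path.abspath(dirpath) == os.path.abspath(lib_dir):
            continue  # files already directly in lib/ need no link
        for name in filenames:
            if not (name.endswith(".so") or ".so." in name):
                continue
            link = os.path.join(lib_dir, name)
            if os.path.lexists(link):
                continue  # keep the first library found with this name
            target = os.path.relpath(os.path.join(dirpath, name), lib_dir)
            os.symlink(target, link)
            linked.append(name)
    return linked


if __name__ == "__main__":
    print(link_shared_objects(os.getcwd()))
```

If two directories contain a library with the same file name, this sketch keeps the first one it encounters, which may not be the desired version.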

Conclusion

The four methods provided by Python are applicable to any common scenario. Despite the slight differences described above, you can choose an appropriate method based on your needs.

apt-get is another method. Compared with other methods, it reduces the package size because it does not need to install DEB packages that are already installed in the system. To further reduce the size, you can delete unnecessary installed files, such as user manuals.

This article is part of the groundwork for building better tooling. Building on it, we plan to provide better tools in the future to simplify development.

Reference:https://www.alibabacloud.com/blog/installing-a-dependency-library-for-function-compute_594410?spm=a2c41.12532268.0.0
