Using Brotli to Compress Large Files

This article discusses how you can use Brotli to compress large files so that you’re able to upload them to Function Compute.

As background, Function Compute limits the size of the zipped code package that you upload to 50 MB. Some code packages exceed this limit, for example, a serverless-chrome package that has not been trimmed down. Other examples include LibreOffice and trained machine learning models.

Currently there are three methods that can be used to solve this problem:

  1. Use a compression algorithm with a high compression ratio, for example, the Brotli algorithm described in this article.
  2. Download the files from OSS at runtime.
  3. Use NAS for file sharing.

The following table looks at the advantages and disadvantages of these three methods.

[Table: advantages and disadvantages of the three methods; image not reproduced]

Normally, the startup speed is relatively fast if the code package is less than 50 MB. Putting data and code together can simplify the engineering work: No additional scripts are required to update OSS or NAS.

Brotli: The Compression Algorithm

Brotli is an open-source compression algorithm developed by Google engineers. Current versions of popular browsers support it as a compression algorithm for HTTP transfer. The following figures, obtained from the Internet, show benchmark results for Brotli and other common compression algorithms.

[Figures: three benchmark charts comparing Brotli with other common compression algorithms; images not reproduced]

As the three figures show, Brotli has the highest compression ratio when compared with gzip, xz, and bz2, but also the slowest compression speed. Its decompression speed is close to that of gzip.

However, the scenario discussed in this article is not sensitive to compression speed: the files need to be compressed only once, during the development preparation phase.

Create a Compressed File

First, let’s see how to create a compressed file.

Install Brotli

If you're using macOS, you can install Brotli with Homebrew by running the command brew install brotli.

And if you're using Windows, you can go to this page to download Brotli:

Package and Compress Files

The two files before packaging are 7.5 MB and 97 MB, respectively.

After the files are packaged and compressed with gzip, the size is 44 MB.

Package the files again, this time without the z option of tar, so that the archive is not compressed with gzip. The resulting tar file is 104 MB.

After compression with Brotli, the file is 33 MB, much smaller than the file compressed with gzip (44 MB). However, the compression takes 6 minutes and 18 seconds, whereas gzip takes only 5 seconds.
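To put these numbers in perspective, the following minimal Java sketch works through the arithmetic. The sizes and timings are the ones quoted above. The class and method names (CompressionStats, ratio, savings) are hypothetical and exist only for illustration.

```java
import java.util.Locale;

/** Worked arithmetic for the sizes and timings quoted in this article. */
final class CompressionStats {

    /** Compressed size as a fraction of the original size (smaller is better). */
    static double ratio(double compressedMb, double originalMb) {
        return compressedMb / originalMb;
    }

    /** Fraction by which the candidate output is smaller than the baseline output. */
    static double savings(double candidateMb, double baselineMb) {
        return 1.0 - candidateMb / baselineMb;
    }

    public static void main(String[] args) {
        double tarMb = 104, gzipMb = 44, brotliMb = 33;  // sizes from the article
        double gzipSec = 5, brotliSec = 6 * 60 + 18;     // timings from the article

        System.out.printf(Locale.ROOT, "gzip output vs. tar:    %.1f%%%n", 100 * ratio(gzipMb, tarMb));
        System.out.printf(Locale.ROOT, "Brotli output vs. tar:  %.1f%%%n", 100 * ratio(brotliMb, tarMb));
        System.out.printf(Locale.ROOT, "Brotli vs. gzip saving: %.1f%%%n", 100 * savings(brotliMb, gzipMb));
        System.out.printf(Locale.ROOT, "Brotli time vs. gzip:   %.1fx%n", brotliSec / gzipSec);
    }
}
```

In other words, gzip shrinks the archive to roughly 42% of its original size and Brotli to roughly 32%. The Brotli output is 25% smaller than the gzip output, at the cost of a compression run that takes about 76 times as long.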

Decompress at Runtime

Take a Java Maven project as an example.

Add the Decompression Dependency

commons-compress is a compression library from Apache that provides a consistent abstraction over various compression algorithms. For Brotli, it supports only decompression, which is all that this scenario requires. The org.brotli:dec package is the underlying implementation of the Brotli decompression algorithm provided by Google.

Implement the Initialize Method

Implement the initialize method of the FunctionInitializer interface. The decompression begins with four nested streams:

  1. FileInputStream: reads files
  2. BufferedInputStream: buffers reads, which reduces the context switches caused by system calls and improves read speed
  3. BrotliCompressorInputStream: decodes byte streams
  4. TarArchiveInputStream: extracts the files from the tar package
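The layering of the four streams can be sketched with JDK classes alone. This is not the code used in the project described here: because the JDK ships neither a Brotli decoder nor a tar reader, GZIPInputStream stands in for BrotliCompressorInputStream and ZipInputStream stands in for TarArchiveInputStream. The class name NestedStreamExtractor and the method extract are hypothetical.

```java
import java.io.BufferedInputStream;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.zip.GZIPInputStream;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;

/** Illustrates the four-layer stream nesting using JDK stand-ins (gzip + zip). */
final class NestedStreamExtractor {

    /** Extracts a gzip-wrapped zip archive into destDir and returns the elapsed milliseconds. */
    static long extract(Path archive, Path destDir) throws IOException {
        long start = System.nanoTime();
        Path root = destDir.toAbsolutePath().normalize();
        Files.createDirectories(root);

        try (InputStream file = new FileInputStream(archive.toFile());   // 1. reads the file
             InputStream buffered = new BufferedInputStream(file);       // 2. buffers the reads
             InputStream decoded = new GZIPInputStream(buffered);        // 3. stand-in for BrotliCompressorInputStream
             ZipInputStream entries = new ZipInputStream(decoded)) {     // 4. stand-in for TarArchiveInputStream
            ZipEntry entry;
            while ((entry = entries.getNextEntry()) != null) {
                Path target = root.resolve(entry.getName()).normalize();
                // Reject entries whose names would place them outside the destination.
                if (!target.startsWith(root)) {
                    throw new IOException("Entry escapes destination: " + entry.getName());
                }
                if (entry.isDirectory()) {
                    Files.createDirectories(target);
                } else {
                    Files.createDirectories(target.getParent());
                    // Copies the bytes of the current entry only; the archive stream stays open.
                    Files.copy(entries, target, StandardCopyOption.REPLACE_EXISTING);
                }
            }
        }
        return (System.nanoTime() - start) / 1_000_000;
    }
}
```

In a project that has commons-compress and org.brotli:dec on the classpath, the two stand-ins would be replaced by BrotliCompressorInputStream and TarArchiveInputStream, and the loop would iterate over tar entries instead of zip entries. The try-with-resources statement closes all four layers in reverse order.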

Files.setPosixFilePermissions restores the permissions of files in the tar package. The code is too long and not provided here.
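Although the full code is not shown, the core of the permission handling can be sketched as follows. This minimal example converts a Unix mode value into the Set<PosixFilePermission> that Files.setPosixFilePermissions expects. The class name TarModes and the method fromMode are hypothetical, and the sketch assumes that the mode is read from the tar entry (for example, through TarArchiveEntry.getMode() in commons-compress).

```java
import static java.nio.file.attribute.PosixFilePermission.*;

import java.nio.file.attribute.PosixFilePermission;
import java.util.EnumSet;
import java.util.Set;

/** Converts the permission bits of a Unix mode into PosixFilePermission values. */
final class TarModes {

    // Ordered from the most significant permission bit (0400) to the least (0001).
    private static final PosixFilePermission[] BITS = {
        OWNER_READ, OWNER_WRITE, OWNER_EXECUTE,
        GROUP_READ, GROUP_WRITE, GROUP_EXECUTE,
        OTHERS_READ, OTHERS_WRITE, OTHERS_EXECUTE
    };

    /** Returns the permissions encoded in the lowest nine bits of mode; higher bits are ignored. */
    static Set<PosixFilePermission> fromMode(int mode) {
        Set<PosixFilePermission> permissions = EnumSet.noneOf(PosixFilePermission.class);
        for (int i = 0; i < BITS.length; i++) {
            if ((mode & (0400 >> i)) != 0) {
                permissions.add(BITS[i]);
            }
        }
        return permissions;
    }
}
```

With this helper, an extracted file could be given its original permissions by calling Files.setPosixFilePermissions(target, TarModes.fromMode(mode)). Note that this call throws UnsupportedOperationException on file systems that do not support POSIX permissions.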

The initialize method prints the decompression time, which is about 3.7 seconds in this example.

Do not forget to configure Initializer and InitializationTimeout in template.yml.

By Yi Xian
