Static Compilation of Java Applications at Alibaba at Scale

By Sanhong Li, Ziyi Lin, Chuansheng Lu and Kingsum Chow from the Alibaba JVM team.


A microservice architecture breaks a monolith application into many micro-applications (microservices). This is an attractive approach for applications targeting cloud computing platforms. We can start with the number of microservice instances needed to handle the initial load and scale out with more instances when demand is higher, improving resilience by leveraging the ability of clouds to scale horizontally.

The Java platform has become one of the most widely used platforms. Despite its popularity, Java has drawn many criticisms: it is slow to boot, it takes too much memory, and its syntax is verbose. Notably, long boot time has inhibited horizontal scalability. From a business standpoint, customers may have to wait a long time for an application to boot before they receive the results of a request. Speeding up the boot time of Java applications on a horizontally scalable platform is our motivation. To achieve that, we adopted GraalVM native image in serverless computing.

GraalVM native image at Alibaba

At Alibaba, we use the native image technology of GraalVM to statically compile a microservice application into an ELF executable file, which gives Java applications native-code startup times. This is needed to address the horizontal scaling challenge described above.

In our scenario, the serverless application is developed on the SOFABoot framework. Its fat jar is over 120 MB and includes many typical Java enterprise components, such as Spring, Spring Boot, Tomcat, MySQL-Connector, and many others. We refer to applications using this framework as SOFABoot applications. They originally ran on top of Alibaba Dragonwell (which is OpenJDK based) and are designed for a distributed architecture: they handle online transactions and communicate with many other applications through RPC.

In the global online shopping festival (also called Double 11, or Nov 11) last year, we deployed a number of SOFABoot applications compiled as native images. They managed to serve real online requests in our production environment on a day with the highest transaction volume.

Besides the SOFABoot application, we have also explored the possibility of introducing statically compiled applications into Alibaba Cloud. We successfully deployed a native image version of the Micronaut demo application on Alibaba Cloud’s function computing platform.

In the following sections, we describe the challenges we overcame in using GraalVM native image for static compilation, and the performance gains it brought to our production environment.

How We Did That

While most traditional Java features are supported by native image to build and run applications, there are still some limitations that prevent the automatic migration from traditional Java to statically compiled Java programs. Native image requires programmers to provide additional information or modify the original implementation of an application to get the program compiled and run as expected. The challenges we faced while adapting the SOFABoot application were:

Slow build time: Static compilation consumes a large amount of memory and time. In the beginning, it took around 100 GB of memory and 4000 seconds to build the SOFABoot application. We observed that most of the time was spent on type flow analysis in the static analysis phase, so we employed a less precise but much more lightweight Class Hierarchy Analysis (CHA) in place of the original type flow analysis for scenarios that require a fast build. With CHA, the memory needed for the build dropped from 100 GB to 20 GB and the build time dropped from 4000 seconds to less than 1000 seconds. We were delighted to see a 4x speedup in build time, which helped speed up the deployment of our applications.
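To illustrate why CHA is cheaper but less precise, here is a minimal sketch of how a class hierarchy analysis resolves a virtual call: every override found in the subtree of the declared receiver type counts as a possible target, whether or not an instance of that class is ever created. The ChaSketch class, its method names, and the toy hierarchy are hypothetical and are not part of GraalVM or of our implementation.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical toy model of Class Hierarchy Analysis (CHA); illustration only.
final class ChaSketch {
    // Direct subclasses of each class.
    private final Map<String, List<String>> subclasses = new HashMap<>();
    // For each method name, the classes that declare or override it.
    private final Map<String, Set<String>> declares = new HashMap<>();

    void addClass(String name, String superName, String... methods) {
        subclasses.computeIfAbsent(name, k -> new ArrayList<>());
        if (superName != null) {
            subclasses.computeIfAbsent(superName, k -> new ArrayList<>()).add(name);
        }
        for (String m : methods) {
            declares.computeIfAbsent(m, k -> new HashSet<>()).add(name);
        }
    }

    // CHA: walk the subtree of the declared type and collect every override.
    // Simplification: assumes the declared type itself declares the method.
    Set<String> chaTargets(String declaredType, String method) {
        Set<String> targets = new TreeSet<>();
        Set<String> declaring = declares.getOrDefault(method, Set.of());
        Deque<String> work = new ArrayDeque<>();
        work.push(declaredType);
        while (!work.isEmpty()) {
            String current = work.pop();
            if (declaring.contains(current)) {
                targets.add(current + "." + method);
            }
            for (String sub : subclasses.getOrDefault(current, List.of())) {
                work.push(sub);
            }
        }
        return targets;
    }
}
```

With a hierarchy where Dog and Cat extend Animal and all three declare speak, chaTargets("Animal", "speak") reports all three methods even if the program never creates a Cat. A flow-based analysis could rule Cat.speak out, which is the precision that CHA trades for a faster build.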

Class initialization: Classes are initialized at runtime in traditional Java programs. Native image initializes classes at build time whenever possible to improve runtime performance. However, eager class initialization at build time is not always safe, so programmers still need to adjust the class initialization timing manually. Class initialization may happen in a chain, so postponing the initialization of one class to runtime without also postponing its predecessors in the chain may lead to class initialization errors at build time. For example, the following code has a class initialization chain of A->B->C.

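class A {
    static B b = new B();        // initializing A triggers B
}
class B {
    static {
        C.dosomething();         // initializing B triggers C
    }
}
class C {
    static long currentTime;
    static {
        currentTime = System.currentTimeMillis();
    }
    static void dosomething(){…}
}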

For application correctness, class C MUST be initialized at runtime because of the call to System.currentTimeMillis(). As a result, class A MUST also be initialized at runtime, since class A is the root of this class initialization chain: when class A gets initialized, it triggers the initialization of B and then eventually C. In practice, however, when developers observed that class C had been mistakenly initialized at build time, it was difficult for them to find out that class A was the root cause, i.e., that class A had been mistakenly configured to be initialized at build time. Native image provides an initialization tracing feature based on instrumentation to resolve this kind of issue, but it fails when the class cannot be instrumented, e.g., when the class is loaded by the bootstrap class loader. In our solution, we modified the HotSpot code to track the class initialization chain at the VM level, which helped our developers follow the chain and locate the root cause of this kind of mistake. Our solution thus enables broader use of ahead-of-time compilation by Java developers.
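The chain can be observed on a regular JVM with a small, self-contained sketch. InitChainDemo, the ORDER list, and the trace() helper are hypothetical names added for illustration; the nested classes mirror A, B, and C from the example above and record the moment each static initializer starts.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo: records the order in which static initializers start.
final class InitChainDemo {
    static final List<String> ORDER = new ArrayList<>();

    static class A {
        static { ORDER.add("A"); }
        static B b = new B();            // initializing A triggers B
    }

    static class B {
        static {
            ORDER.add("B");
            C.dosomething();             // initializing B triggers C
        }
    }

    static class C {
        static long currentTime;
        static {
            ORDER.add("C");
            // Build-time initialization would freeze this value into the image.
            currentTime = System.currentTimeMillis();
        }
        static void dosomething() { }
    }

    // Touching A.b is the first active use of A and starts the whole chain.
    static List<String> trace() {
        Object root = A.b;
        return new ArrayList<>(ORDER);
    }

    public static void main(String[] args) {
        System.out.println(trace());
    }
}
```

Only A is touched directly, yet B and C are initialized as a side effect. This is why marking C alone for runtime initialization (for example with native image's --initialize-at-run-time option) is not enough while A is still initialized at build time.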

Dynamic class loading: Dynamic class loading means defining and loading classes at runtime whose bytecode is not known at build time. It is widely used in real-world applications, libraries, and frameworks. Typical examples include the serialization/deserialization mechanism in Java, which relies on dynamically generated constructor accessors; Spring, which uses cglib for proxies; and Derby, which uses dynamically generated classes for SQL statements. We support dynamic class loading in 4 steps: 1) Change the dynamic names of generated classes to fixed ones, so that the same class always has the same name across different runs. 2) Implement method interceptors in the native image agent to dump the dynamically generated classes, with a fixed name pattern, to the file system. 3) Compile the dumped classes into the native image at build time. 4) At native image runtime, find the prepared “dynamically generated” classes instead of defining them. We have committed this feature to the community.
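The four steps can be sketched roughly as follows. This is an illustrative model only, not the code we contributed: PreparedClassStore and all of its methods are hypothetical, the fixed name is assumed to be derived by hashing a stable key (such as the name of the proxied class), and an in-memory map stands in for the classes compiled into the image.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of "prepare at build time, look up at runtime".
final class PreparedClassStore {
    // Stand-in for the classes compiled into the image (step 3).
    private final Map<String, byte[]> prepared = new HashMap<>();

    // Step 1: derive a name that is identical across runs from a stable key.
    static String fixedName(String prefix, String stableKey) {
        try {
            byte[] digest = MessageDigest.getInstance("SHA-256")
                    .digest(stableKey.getBytes(StandardCharsets.UTF_8));
            StringBuilder hex = new StringBuilder();
            for (int i = 0; i < 8; i++) {
                hex.append(String.format("%02x", digest[i] & 0xff));
            }
            return prefix + "$$Fixed" + hex;
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e);
        }
    }

    // Step 2: dump the generated bytecode under its fixed name.
    static Path dump(Path dir, String name, byte[] bytecode) {
        try {
            Files.createDirectories(dir);
            return Files.write(dir.resolve(name + ".class"), bytecode);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Step 3 (stand-in): register a dumped class file "at build time".
    void include(Path classFile) {
        String file = classFile.getFileName().toString();
        String name = file.substring(0, file.length() - ".class".length());
        try {
            prepared.put(name, Files.readAllBytes(classFile));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Step 4: at runtime, find the prepared class instead of defining it.
    byte[] find(String name) {
        byte[] bytecode = prepared.get(name);
        if (bytecode == null) {
            throw new IllegalStateException("class was not prepared at build time: " + name);
        }
        return bytecode;
    }
}
```

The lookup in step 4 only works because step 1 makes the name reproducible: if the generated name changed between the run that dumps the class and the run of the native image, find would never hit.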

Slow GC performance: The existing garbage collector in native image is relatively straightforward and works well for many small workloads, but when we tried to support larger workloads such as Spring-based services, the full GC time and frequency became a headache. We observed that a single GC pause could exceed 1.5 seconds for some Java services, which is unacceptable for online applications. So we made the following improvements to the garbage collector component of native image:

  • Age information is added to objects in the young generation. The age is recorded on the memory chunk that holds a group of objects and is incremented by 1 each time those objects survive a young GC; live objects are promoted to the old generation only after reaching a certain age threshold.
  • A background thread is used to un-map memory asynchronously. Native image uses memory chunks to hold Java objects. When it wants to release free chunks to the OS to lower the footprint, it simply un-maps them. We observed that un-mapping memory can take a long time in a typical Java application, so we made this operation asynchronous and execute it outside the stop-the-world pause.
  • Image roots are scanned based on a card table in the young GC. For some workloads the final executable image may be large after static compilation. Such an image usually holds a vast set of GC roots, all of which have to be scanned thoroughly on every GC, and in the existing design of the native image garbage collector this can take a long time. We added a card table for the image roots, so that a young GC only scans the references that have been dirtied since the last GC pause.
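The card table idea in the last item can be sketched as follows. This is a simplified model under stated assumptions, not the collector code: ImageRootCardTable, the card size of 512 slots, and the method names are all hypothetical, and a plain Object array stands in for the reference slots in the image heap.

```java
import java.util.ArrayList;
import java.util.BitSet;
import java.util.List;

// Hypothetical model of card-table-based scanning of image roots.
final class ImageRootCardTable {
    private static final int CARD_SIZE = 512;   // slots per card (assumed value)

    private final Object[] imageRoots;          // stand-in for image heap reference slots
    private final BitSet dirtyCards;            // one bit per card

    ImageRootCardTable(int slots) {
        imageRoots = new Object[slots];
        dirtyCards = new BitSet((slots + CARD_SIZE - 1) / CARD_SIZE);
    }

    // Write barrier: every reference store also marks the covering card dirty.
    void writeRoot(int slot, Object value) {
        imageRoots[slot] = value;
        dirtyCards.set(slot / CARD_SIZE);
    }

    // Young GC: visit only the slots in dirty cards, then clean the table.
    // A real collector would also check whether each reference points into
    // the young generation; this sketch returns every non-null reference.
    List<Object> scanForYoungGc() {
        List<Object> found = new ArrayList<>();
        for (int card = dirtyCards.nextSetBit(0); card >= 0;
                 card = dirtyCards.nextSetBit(card + 1)) {
            int end = Math.min((card + 1) * CARD_SIZE, imageRoots.length);
            for (int slot = card * CARD_SIZE; slot < end; slot++) {
                if (imageRoots[slot] != null) {
                    found.add(imageRoots[slot]);
                }
            }
        }
        dirtyCards.clear();
        return found;
    }
}
```

With 2048 slots and two writes since the last pause, the scan touches at most two cards (1024 slots) rather than the whole root set, and a pause with no intervening writes scans nothing.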

Some of these changes have been committed into the GraalVM project.

Performance Gains

Startup time speedup

SOFABoot Application Startup Time Comparison

We also ran the statically compiled version of the Micronaut-based application on the function computing platform of Alibaba Cloud, and the results are just as encouraging. native_image_hello is the statically compiled application, and springboot_hello is the same application deployed as a jar and run on top of a traditional Java runtime. As the figure below shows, native_image_hello starts 100x faster with 1/6 of the memory cost, which can help customers save 80% in billed duration ("billed duration" is the time the customer is charged for on the cloud platform). The response times of the two deployed applications are nearly the same.

Traditional Java Function vs Statically Compiled Function

GC performance

GC time improvements


We are very pleased with the results in our production environment. We are looking forward to driving innovation through the collaboration with the GraalVM community.

Big thanks to the Alibaba JVM team for sharing their experience using GraalVM and contributions to the community!

Thanks to Shaun Smith.


