Status Quo and Technology Trend Report of Java
With the release of the Java Development Manual, let’s summarize the latest technology trends and discuss Java’s future. This article describes the status quo of basic Java technologies in terms of JavaSE open-source status, OpenJDK version ecosystem, and OpenJDK technology trends. It will further discuss the future evolution trends of Java Virtual Machine (JVM) technology. JVM technology is the foundation for supporting Java applications in the cloud-native, Artificial Intelligence (AI), and multi-language ecosystem fields.
In 1991, James Gosling and his team started the Oak project, which was Java’s predecessor. In 1995, Java 1.0 was released. You are probably familiar with the Java language slogan, “Write once, run anywhere”. In the beginning, Java was mainly intended for interactive television. A few years later, SUN (acquired by Oracle in 2010) wanted to use Java to build a network operating system for desktops to replace Windows. Java achieved little in the desktop field but unexpectedly achieved a lot in enterprise applications, a field it now almost dominates.
JavaSE Open-source Status
At the JavaOne conference in 2006, SUN announced that it would open-source Java technology. At the end of 2006, SUN released HotSpot and javac under the GNU General Public License (GPL), which was a milestone in Java development. In 2012, Alibaba signed the Oracle Contributor Agreement (OCA) and began participating in the development of OpenJDK.
OpenJDK is the open-source Reference Implementation of JavaSE. In the JavaOne keynote of 2017 (Oracle renamed JavaOne to CodeOne in 2018), Oracle promised to open-source all commercial features contained in Oracle JDK.
In Java 11, released in 2018, Oracle made the OpenJDK and Oracle JDK binaries functionally almost identical, although some options still differ between the two.
In addition to OpenJDK, the open-source trend of basic Java technologies has intensified in recent years. In 2017, IBM open-sourced the J9 virtual machine, which it had used internally for more than 20 years, and contributed it to the Eclipse Foundation. In 2018, Oracle open-sourced GraalVM 1.0. Its core includes the Graal Just-in-Time (JIT) compiler, which is written in Java, SubstrateVM, and the Truffle framework, which supports interpreters for multiple languages. Enterprises expect to build, and benefit from, a more powerful language ecosystem through open source.
The combination of cloud and open source lets ordinary developers use top-tier tools and toolchains with a lower barrier to entry. Any enterprise can use the same technologies as the largest organizations (democratization), which ushers in a golden age for developers.
Which JDK Should You Choose While Java Is Still Free?
Java is still free of charge. However, now that Oracle charges for the Oracle JDK license, OpenJDK will gradually replace Oracle JDK as the mainstream choice. The JVM 2020 ecosystem report shows the same trend: the usage of Oracle JDK dropped from 70% in 2019 to 34% in 2020.
Charging for Oracle JDK has, as a side effect, accelerated the fragmentation of the OpenJDK ecosystem. Multiple OpenJDK-based distributions have emerged, including Alibaba Dragonwell.
Select a Java vendor’s JDK based on the following factors:
- Security and Stability: Check whether the latest upstream updates are synchronized to the JDK version in time, including security patches and critical bug fixes.
- JavaSE Standard Compatibility: Check whether the JDK version is compatible with standard Java.
- Performance and Efficiency: Check whether effective tools are provided for troubleshooting and performance tuning to help developers efficiently solve Java problems. Check whether optimization features for enterprise business scenarios are available at the JVM and JDK (class library) levels to help improve the resource utilization and stability of production systems.
- Fast Adoption of New Technologies: Along with charging, Oracle manages the Java version lifecycle through Long Term Support (LTS). Oracle designates one Java version as an LTS version every three years. Java 8 and Java 11 are LTS versions. It is difficult for most enterprises, especially large- and medium-sized ones, to keep up with the semiannual feature release (FR) versions such as Java 12 and Java 13. So if you stay on an LTS version such as Java 11, can you get the technical benefits of JVM or JDK features released in later versions (Java 11+) without upgrading?
I will share the plans for Alibaba Dragonwell and our thoughts on these aspects.
Alibaba Dragonwell is the open-source version of Alibaba JDK (AJDK), which is widely used within Alibaba. As a cornerstone, Alibaba Dragonwell supports almost all Java-based businesses in the Alibaba economy and has withstood the test of big promotions such as the Double 11 Shopping Festival. Alibaba Dragonwell aims to improve the stability, efficiency, and performance of Java applications that are deployed at large scale in data centers.
Alibaba Dragonwell 8.0.0 was open-sourced in March 2019. Since then, we have been fulfilling our commitment to gradually open-source the internal features of AJDK. In the recently released Alibaba Dragonwell 8.3.3, we open-sourced many features, such as JWarmup, ElasticHeap, multitenancy, and JFR. We also plan to open-source the Wisp 2.0 coroutine implementation and GC Invisible Heap (GCIH).
As the downstream of OpenJDK, Alibaba Dragonwell synchronizes the latest upstream updates in each released version, including security updates and bug fixes. In addition, each released version has passed the large-scale application cluster test in Alibaba.
In terms of new technology adoption, Alibaba Dragonwell has released and maintained two LTS versions: Java 8 and Java 11. The Alibaba JVM team will port features of Java 11+ to Java 8 and Java 11 based on actual business conditions. In this way, Alibaba Dragonwell users enjoy the technical benefits of these features, without upgrading to FR versions such as Java 12 and Java 13.
OpenJDK Technology Trend
Over the past 20 years, Java technology has evolved around productivity and performance, and in many design decisions productivity has outweighed performance. The garbage collector frees programmers from complex memory management, but Java applications pay for it with Garbage Collection (GC) pauses. The intermediate bytecode design, based on a stack virtual machine, abstracts away the differences between platforms (such as Intel and ARM), and the JIT compiler addresses the peak performance of Java applications. However, JIT compilation brings a warmup cost: in normal cases, a Java application must load classes and run code in the interpreter before it executes highly optimized code.
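The warmup cost can be observed with a small experiment. The following is a minimal sketch, not a rigorous benchmark: the class and method names are made up for illustration, and the timings depend entirely on the JVM, its flags, and the hardware. On a typical HotSpot JVM the first rounds tend to be slower than later ones, because the method starts in the interpreter and is compiled once it becomes hot.

```java
final class WarmupDemo {
    // A small CPU-bound method that the JIT compiler can optimize once it is hot.
    static long sumOfSquares(int n) {
        long sum = 0;
        for (int i = 1; i <= n; i++) {
            sum += (long) i * i;
        }
        return sum;
    }

    // Returns the wall-clock time of a single call, in nanoseconds.
    static long timeOnce(int n) {
        long start = System.nanoTime();
        long result = sumOfSquares(n);
        long elapsed = System.nanoTime() - start;
        // Use the result so that the call cannot be optimized away entirely.
        if (result < 0) {
            throw new IllegalStateException("unexpected overflow");
        }
        return elapsed;
    }

    public static void main(String[] args) {
        int n = 1_000_000;
        for (int round = 1; round <= 10; round++) {
            System.out.printf("round %d: %d us%n", round, timeOnce(n) / 1_000);
        }
    }
}
```

For real measurements, a harness such as JMH is a better choice than hand-written timing loops like this one.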
In terms of JVM, the emerging technologies of the OpenJDK community can be classified into tools, GC, compilers, and runtime:
In tools, Oracle open-sourced Java Flight Recorder (JFR), previously a commercial feature, in Java 11. JFR is a powerful tool for Java application troubleshooting and performance analysis. Alibaba Cloud is a major contributor and has ported JFR to OpenJDK 8 together with the community, including Red Hat. OpenJDK 8u262 (Java 8), to be released in July 2020, will provide JFR by default, so Java 8 users will be able to use JFR for free.
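As an illustration of what JFR offers to application code, the sketch below defines a custom event, records it, and reads it back with the jdk.jfr API. The event name demo.OrderProcessed, its field, and the class names are hypothetical and exist only for this example; it assumes a JDK build that ships JFR.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;

import jdk.jfr.Event;
import jdk.jfr.Label;
import jdk.jfr.Name;
import jdk.jfr.Recording;
import jdk.jfr.consumer.RecordedEvent;
import jdk.jfr.consumer.RecordingFile;

final class JfrDemo {
    // Hypothetical application-level event; the name and field are illustrative only.
    @Name("demo.OrderProcessed")
    @Label("Order Processed")
    static class OrderProcessed extends Event {
        @Label("Order ID")
        long orderId;
    }

    // Emits `count` events into a recording, dumps it, and returns how many were read back.
    static int recordOrders(int count) {
        try {
            Path file = Files.createTempFile("orders", ".jfr");
            try (Recording recording = new Recording()) {
                // Record every event of this type, regardless of its duration.
                recording.enable(OrderProcessed.class).withoutThreshold();
                recording.start();
                for (int i = 0; i < count; i++) {
                    OrderProcessed event = new OrderProcessed();
                    event.begin();
                    event.orderId = i;
                    event.commit();
                }
                recording.stop();
                recording.dump(file);
            }
            int seen = 0;
            for (RecordedEvent e : RecordingFile.readAllEvents(file)) {
                if (e.getEventType().getName().equals("demo.OrderProcessed")) {
                    seen++;
                }
            }
            Files.deleteIfExists(file);
            return seen;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("events read back: " + recordOrders(5));
    }
}
```

In production, recordings are more commonly started from the command line or with jcmd and analyzed in JDK Mission Control; the programmatic route shown here is mainly useful for tests and custom tooling.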
In GC, concurrent copying collection is available in both the Z Garbage Collector (ZGC), which Oracle released with Java 11, and Shenandoah, which Red Hat has been developing for several years. Concurrent copying solves the GC pause problem on large heaps. In JDK 15, to be released in September, ZGC will change from experimental to production-ready. In AJDK 11, the Alibaba JVM team has done a lot of work to port ZGC changes from later releases (Java 11+) back to Java 11 and to fix related problems. During the Double 11 Shopping Festival in 2019, the Alibaba JVM team worked with the Alibaba database team to run database applications on ZGC. With a heap of more than 100 GB, the GC pause time stayed below 10 ms.
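To check which collector a JVM is actually running, the standard management API can be queried. The sketch below uses made-up class and method names. On JDK 11 to 14, ZGC is enabled with -XX:+UnlockExperimentalVMOptions -XX:+UseZGC; from JDK 15, -XX:+UseZGC alone is enough. The collector names printed depend on the JVM and its flags.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.List;

final class GcInfo {
    // Returns the names of the collectors that the running JVM exposes.
    static List<String> collectorNames() {
        List<String> names = new ArrayList<>();
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            names.add(gc.getName());
        }
        return names;
    }

    public static void main(String[] args) {
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            // getCollectionTime() is accumulated collection time as reported by the
            // collector, which is not necessarily the same as application pause time.
            System.out.printf("%s: %d collections, %d ms accumulated%n",
                    gc.getName(), gc.getCollectionCount(), gc.getCollectionTime());
        }
    }
}
```

For pause times specifically, GC logging (-Xlog:gc on JDK 9 and later) or JFR gives a more accurate picture than these counters.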
In compilers, Graal, a new-generation JIT compiler written in Java, is intended to replace the C1/C2 compilers of the HotSpot JVM. The Ahead-of-Time (AOT) technology in OpenJDK is also based on the Graal compiler.
In runtime, the coroutine project of the OpenJDK community (Project Loom) corresponds to the Wisp 2.0 implementation in AJDK.
Aggressive Java: Future-oriented Evolution
2020 is a brand new stage for Java. This article discusses Java’s future development in three aspects: cloud-native, AI, and multi-language ecosystem. Some discussions are beyond Java.
Language Evolution for Cloud-native
In the era of cloud-native, the way of delivering software has been fundamentally changed. Take Java as an example. Java developers previously delivered applications as JAR or WAR files. In the era of cloud-native, Java developers deliver containers.
In terms of operation, requirements for cloud-native applications are as follows:
- Always Watching
- Extremely Low Memory Footprint
- Fast Boot Time
As a leader in enterprise computing and the internet field, the Java language provides consistency, a rich ecosystem, rich third-party libraries, and a wide range of serviceability support. However, as applications move to microservices and serverless in the cloud era, the startup and runtime overhead of Java limits how far Java applications can be sped up.
In the new cloud-native context, our discussion about language evolution is not limited to the compiler and runtime levels. A new form of computing must be accompanied by a transformation of the programming model, which involves changes to the libraries, frameworks, and tools of a programming language. The industry has many emerging projects, such as the next-generation programming frameworks Quarkus, Micronaut, and Helidon, and the static compilation technology for Java in GraalVM (SubstrateVM, SVM). Quarkus puts forward the concept of “container first”. Its layer-based lightweight uber-jar is in line with the trend of delivering software as containers. The “Checkpoint Restore Fast Start-up” technology, jointly developed by Red Hat’s Java team and OS team, implements fast Java startup at the level of the underlying technology stack (Azul proposed a similar idea at the JVM Language Summit 2019).
We have also carried out R&D work on cloud-native Java. Java is a statically typed language, but it has many dynamic characteristics, including reflection, class loading, and Bytecode Instrumentation (BCI). These dynamic characteristics run counter to the Closed-World Assumption (CWA) of GraalVM and SVM, which is the main reason why it is difficult to compile and run traditional JVM-based Java applications on SVM. The Alibaba JVM team has statically tailored AJDK to find a definite boundary between the static and dynamic characteristics of Java, which makes static compilation possible at the JDK level. In addition, the JVM team works with the middleware team of Ant Financial to define a Java programming model for static compilation and uses a programming framework to ensure that Java applications are developed in a way that is friendly to static compilation. We have statically compiled the meta node application of the service registry, which is built on SOFAStack, Ant Financial’s open-source middleware. Compared with the same application running on a traditional JVM, the results improved greatly: the service starts 17 times faster, the executable file is 3.4 times smaller, and the runtime memory usage is reduced by half.
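The conflict between dynamic features and the closed-world assumption can be illustrated with plain Java. In the sketch below, all class and method names are hypothetical. The load method receives a class name as data, so a static compiler cannot know from the code alone which classes must be included in the image; the create method names every implementation explicitly, which a closed-world analysis can follow. GraalVM native-image typically needs extra reflection configuration for code shaped like load.

```java
final class ClosedWorldDemo {
    interface Greeter {
        String greet(String name);
    }

    static final class EnglishGreeter implements Greeter {
        public EnglishGreeter() {
        }

        @Override
        public String greet(String name) {
            return "Hello, " + name;
        }
    }

    // Dynamic: the class to load is only known at run time (for example from a
    // configuration file), so static analysis cannot see which class is reachable.
    static Greeter load(String className) {
        try {
            Class<?> type = Class.forName(className);
            return (Greeter) type.getDeclaredConstructor().newInstance();
        } catch (ReflectiveOperationException e) {
            throw new IllegalArgumentException("Cannot load " + className, e);
        }
    }

    // Static-friendly: every possible implementation is named in the code.
    static Greeter create(String language) {
        switch (language) {
            case "en":
                return new EnglishGreeter();
            default:
                throw new IllegalArgumentException("Unsupported language: " + language);
        }
    }

    public static void main(String[] args) {
        String className = System.getProperty("greeter", EnglishGreeter.class.getName());
        System.out.println(load(className).greet("reflection"));
        System.out.println(create("en").greet("static dispatch"));
    }
}
```

A programming model for static compilation would steer application code toward the second form, or make the framework generate the equivalent wiring at build time.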
AI Emerging as a New Challenge for the Heterogeneous Computing of Programming Languages
In 2005, Justin Rattner, the CTO of Intel at that time, said “We are at the cusp of a transition to multicore, multithreaded architectures”. For more than a decade since then, the fields of programming languages and compilers have been exploring optimizations for the parallel architectural paradigm. With the emergence of artificial intelligence (AI) in recent years, these fields face a similar challenge at a different time: heterogeneous computing on Field Programmable Gate Arrays (FPGAs) and graphics processing units (GPUs).
Beyond the automatic parallelization done by traditional compilers such as IBM XL Compilers and Intel Compilers, compilation optimization based on the polyhedral (polytope) model has become a hot research topic in the pursuit of ultimate performance, as a solution for program parallelization and data locality optimization.
In terms of parallel languages, CUDA made GPU programming easier for C and C++ developers. However, the essential differences between GPUs and CPUs still make development excessively expensive, because developers need to learn the details of the underlying hardware. For developers who use high-level languages such as Java, there is also a huge gap between the underlying hardware model and the language.
In the Java field, AMD shared its Sumatra project as early as the JVM Language Summit in 2014, in an attempt to let the JVM interact with target hardware of the Heterogeneous System Architecture. More recently, the University of Manchester launched the TornadoVM project, which contains a JIT compiler (supporting the mapping from Java bytecode to OpenCL), an optimized runtime engine, and a memory manager that maintains memory consistency between the Java heap and the heap of the heterogeneous device. TornadoVM aims to enable developers to write heterogeneous parallel programs without knowledge of GPU programming languages or GPU architecture. It runs transparently on AMD GPUs, NVIDIA GPUs, Intel integrated GPUs, and multi-core CPUs.
In the general CPU field, the Vector API project (a subproject of Panama) of the OpenJDK community improves computing performance severalfold by using the single instruction, multiple data (SIMD) instructions of the CPU. The Vector API is also widely applicable to big data and AI computing. The Alibaba JVM team has ported the Vector API project to AJDK 11 and will open-source it in Alibaba Dragonwell. We have obtained the following basic performance data.
The shorter the time (in milliseconds), the better the performance.
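The loop structure that SIMD code takes can be sketched in plain Java. The example below is not the Vector API; it is a scalar emulation under stated assumptions, with a hypothetical lane count of 8 and made-up names. It shows the two parts that vectorized code generally has: a main loop that handles one full group of lanes per step, and a scalar tail for the leftover elements. With the Vector API, each step of the main loop would map to vector instructions instead of the inner loop shown here.

```java
final class LaneWiseDot {
    // Hypothetical lane count; real hardware decides this (for example, 8 floats in 256 bits).
    static final int LANES = 8;

    static float dot(float[] a, float[] b) {
        if (a.length != b.length) {
            throw new IllegalArgumentException("arrays must have the same length");
        }
        float[] acc = new float[LANES];           // one partial sum per lane
        int bound = a.length - (a.length % LANES); // largest multiple of LANES
        int i = 0;
        // Main loop: each step processes one full group of lanes.
        for (; i < bound; i += LANES) {
            for (int lane = 0; lane < LANES; lane++) {
                acc[lane] += a[i + lane] * b[i + lane];
            }
        }
        // Horizontal reduction of the per-lane partial sums.
        float sum = 0f;
        for (float partial : acc) {
            sum += partial;
        }
        // Scalar tail: elements that do not fill a whole group of lanes.
        for (; i < a.length; i++) {
            sum += a[i] * b[i];
        }
        return sum;
    }

    public static void main(String[] args) {
        float[] a = {1, 2, 3, 4, 5, 6, 7, 8, 9, 10};
        float[] b = {2, 2, 2, 2, 2, 2, 2, 2, 2, 2};
        System.out.println(dot(a, b));
    }
}
```

Because floating-point addition is not associative, lane-wise accumulation can give slightly different results from a strictly sequential sum on general inputs.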
Polyglot Programming Connecting Multi-language Ecosystems
Polyglot programming is not a new concept. In the managed runtime field, IBM open-sourced Open Managed Runtime (OMR) in 2017, and Oracle open-sourced the Truffle and Graal technologies in 2018. OMR and Graal allow developers to build languages at much lower cost. OMR provides GC, JIT, and reliability, availability, and serviceability (RAS) as C and C++ components, and developers implement high-performance languages by “gluing” these components together. Truffle is a Java framework for implementing a new language as an abstract syntax tree (AST) interpreter. Essentially, it maps your new language to the JVM world. This differs from languages such as Scala and JRuby, which are built on the JVM ecosystem and are essentially still JVM languages. OMR and Truffle provide production-level GC, JIT, and RAS support, so newly developed languages do not need to re-implement these underlying technologies.
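The idea of an AST interpreter can be sketched in a few lines of plain Java. The example below does not use the Truffle API; the node classes are invented for illustration. Each node of the tree knows how to execute itself, and evaluating a program means calling execute on the root. In a framework such as Truffle, trees of this shape are what the compiler specializes and optimizes.

```java
final class MiniAst {
    // Every node of the syntax tree can evaluate itself.
    interface Node {
        long execute();
    }

    static final class Literal implements Node {
        private final long value;

        Literal(long value) {
            this.value = value;
        }

        @Override
        public long execute() {
            return value;
        }
    }

    static final class Add implements Node {
        private final Node left;
        private final Node right;

        Add(Node left, Node right) {
            this.left = left;
            this.right = right;
        }

        @Override
        public long execute() {
            return left.execute() + right.execute();
        }
    }

    static final class Mul implements Node {
        private final Node left;
        private final Node right;

        Mul(Node left, Node right) {
            this.left = left;
            this.right = right;
        }

        @Override
        public long execute() {
            return left.execute() * right.execute();
        }
    }

    public static void main(String[] args) {
        // Tree for the expression 2 + 3 * 4.
        Node program = new Add(new Literal(2), new Mul(new Literal(3), new Literal(4)));
        System.out.println(program.execute());
    }
}
```

A language implementer writes only nodes like these plus a parser that builds the tree; GC, JIT compilation, and tooling come from the host runtime.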
Java is over 20 years old but is growing in use and popularity. This report aims to help Java developers review the status quo of Java technologies and discuss the evolution trends of Java technologies in important fields such as cloud and AI. It also introduces some practices of Alibaba. As one of the largest Java users in the world, Alibaba has been exploring the application of cutting-edge Java technologies in real production environments through experiments in various business scenarios at Alibaba. We are also very willing to share our experience, practices, and technical insights in the Java field, including the Java Development Manual, to promote Java’s development.
The views expressed herein are for reference only and don’t necessarily represent the official views of Alibaba Cloud.