Simulation Cloud with E-HPC for the Manufacturing Industry

By Alibaba Cloud E-HPC Team

When Alibaba Cloud began public beta testing of Elastic High Performance Computing (E-HPC) in September 2017, an enterprise customer specializing in simulation asked to use the service. The customer was already using several cloud computing products for its business and was facing the development challenges of the traditional manufacturing industry.

After initial discussions with the customer, we found its major pain points to be as follows. The simulation enterprise earned its revenue from the traditional manufacturing industry, and its major customers were automakers, aerospace companies, and shipyards. These customers needed different amounts of computing power at different stages. When they had large simulation demands, the simulation enterprise sometimes could not meet production requirements because the machines in its own equipment room were too small. In general, customers’ requirements for computing power kept changing throughout the year.

Initial Elasticity

Auto-Scaling Elasticity

Integration with Cloud Desktop and GPU Servers

Super Computing Cluster

At the beginning of 2018, Alibaba Cloud Super Computing Cluster (SCC) began its public beta testing. By providing computing, storage, and network infrastructure that can run super-computing applications, SCC can deliver a nearly linear speed-up ratio for finite element analysis (FEA) software, such as fluid simulation software. With the help of the elasticity provided by E-HPC, the customer quickly completed proof of concept (POC) testing for SCC.

From the preceding figure, we can see that SCC has helped significantly increase both the single-node computing power and the multi-node speed-up ratio during FEA of tens of billions of elements. The customer gave feedback on the test as follows:

  1. “Superior computing performance: Both the single-node computing power and multi-node distributed computing power have been significantly improved. Within the computing scale of the test project, satisfactory speed-up ratio has been achieved.”
  2. “High cluster interconnection I/O performance: The high-speed interconnection of RDMA can meet the computing requirements of large-scale mechanical and fluid simulation applications within a certain range and achieve remarkable effectiveness.”

With performance and elasticity, the customer had more confidence to migrate the simulation production system to the cloud.
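
For readers unfamiliar with the term, the speed-up ratio discussed in this section is commonly defined as the single-node run time divided by the run time on N nodes, and "nearly linear" means that this ratio stays close to N. The following Python sketch only shows that arithmetic. The function names and the timings are illustrative assumptions, not figures from the customer's test.

```python
def speedup(t_single: float, t_parallel: float) -> float:
    """Speed-up ratio: single-node run time divided by multi-node run time."""
    return t_single / t_parallel


def parallel_efficiency(t_single: float, t_parallel: float, nodes: int) -> float:
    """Fraction of the ideal (linear) speed-up achieved on `nodes` nodes."""
    return speedup(t_single, t_parallel) / nodes


# Hypothetical wall-clock times in seconds, keyed by node count.
# These numbers are made up for illustration only.
timings = {1: 1000.0, 2: 520.0, 4: 270.0, 8: 140.0}

for nodes, seconds in timings.items():
    s = speedup(timings[1], seconds)
    e = parallel_efficiency(timings[1], seconds, nodes)
    print(f"{nodes} node(s): speed-up {s:.2f}, efficiency {e:.0%}")
```

An efficiency close to 100% at every node count is what a nearly linear speed-up ratio looks like in practice.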

Migrating Simulation Applications to the Cloud

The simulation system provides a unified portal through which different manufacturing enterprises can complete the simulation workflow with a consistent experience. The early system structure is shown in the following figure. It was based on a traditional super-computing architecture and integrated computer-aided engineering (CAE) parallel computing, computing resource scheduling, software and hardware resource management, remote graphical desktops, CAE professional applications, and other technologies to provide simulation computing services for users.

Owning this infrastructure as a means of production to serve its own customers was costly for the customer. From our discussions, we learned that the customer specialized in simulation rather than in operating IT infrastructure, which existed only to support its simulation production system. The customer wanted to migrate the IT infrastructure to the cloud so that it could focus on simulation services.

By migrating the simulation system to the cloud, the customer expected to achieve the following effects:

  1. Carry out simulation analysis through the web without the need to purchase any IT hardware resources.
  2. Manage and deploy professional software in a uniform manner to make full use of expensive CAE software resources.
  3. Make full use of cloud resources for simulation through the elasticity of cloud computing.

After gradual verification, the customer has simplified the structure of the simulation system on Alibaba Cloud, as shown in the following figure.

From the preceding figure, we can see that the customer now focuses more on the simulation workflow and uses IT infrastructure simply by calling the open APIs on Alibaba Cloud. When a cluster is required, the customer can create an SCC cluster through an open API. When computing power is insufficient, the customer can create a compute cluster through an open API. When there are no jobs, the customer can release a compute cluster through an open API. When the customer does not want manual operations, clusters can be scaled in or out automatically through an open API. The customer no longer needs to worry about building equipment rooms, stocking up on hardware, expanding capacity, or maintaining equipment.
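
The scale-out and scale-in decisions described above can be sketched as a small planning function. This is a minimal sketch under stated assumptions, not the customer's implementation and not the E-HPC API: `ClusterState`, `plan_scaling`, `nodes_per_job`, and `max_nodes` are hypothetical names, and the code makes no cloud calls. In a real system, the returned node delta would be passed to whichever open API creates or releases compute nodes.

```python
from dataclasses import dataclass


@dataclass
class ClusterState:
    """Hypothetical snapshot of a cluster, as a scheduler might report it."""
    compute_nodes: int  # nodes currently provisioned
    queued_jobs: int    # jobs waiting for resources
    running_jobs: int   # jobs currently executing


def plan_scaling(state: ClusterState, nodes_per_job: int = 1, max_nodes: int = 100) -> int:
    """Return the change in node count needed to match current demand.

    Positive means scale out, negative means scale in, zero means no change.
    Assumes every job needs `nodes_per_job` nodes (a simplification).
    """
    demand = (state.queued_jobs + state.running_jobs) * nodes_per_job
    target = min(demand, max_nodes)  # never plan beyond the configured cap
    return target - state.compute_nodes


# Computing power is insufficient: 6 queued + 4 running jobs on 4 nodes.
print(plan_scaling(ClusterState(compute_nodes=4, queued_jobs=6, running_jobs=4)))

# There are no jobs: all 8 nodes can be released.
print(plan_scaling(ClusterState(compute_nodes=8, queued_jobs=0, running_jobs=0)))
```

Keeping the decision separate from the API calls makes the policy easy to test without provisioning any resources.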


Like other enterprise-class IT applications, simulation applications are being greatly changed by cloud computing technologies. On the simulation cloud platform, enterprises can design, improve, and innovate products, quickly verify models, and compare schemes. For the traditional manufacturing industry, the biggest value of cloud computing is that it frees enterprises from purchasing and managing physical computing clusters, so that they can change the traditional simulation process and focus on simulation services. With cloud computing, enterprises can use software under more flexible pricing and perform modeling anytime and anywhere to solve complex simulation problems. Because multiple design schemes can be simulated simultaneously, cloud-based simulation makes product design and engineering simulation easier for the traditional manufacturing industry.

By running simulation on Alibaba Cloud, enterprises can quickly acquire elastic resources and complete an entire simulation production process in a short time. Whether enterprises want to accelerate product innovation, meet the growing simulation demands of the manufacturing industry, or strengthen global collaboration to increase the return on IT investment, E-HPC can deliver immediate results.

About the Author

Mu Hui, a virtualization platform expert from Alibaba Cloud, specializes in E-HPC and has received several patent certificates and awards in virtualization technology.


