Breakthrough in Alibaba Cloud Computing Capabilities — BigBench Reaches 100 TB World Record
On the first day of the 2017 Hangzhou Computing Conference, Oct. 11, Alibaba Cloud President Hu Xiaoming introduced MaxCompute + PAI, a next-generation computing platform.
At the main forum on Oct. 12, Zhou Jingren, Alibaba Group Vice President and director of the Search Division and Computing Platform Division, said that data lays the foundation for artificial intelligence innovation and that ample computing capability is needed to fully release the value of that data. Zhou Jingren then released BigBench on MaxCompute 2.0 + PAI together with Rob Hays, Vice President of Intel’s Data Center Division. The release broke the best records published for TPCx-BB and reflected the robust data processing capabilities of MaxCompute and the strength of public cloud compared with the traditional model.
At present, the largest data capacity published by TPC is 10 TB, the best performance is 1491.23 BBQpm, and the best price/performance ratio is 589 Price/BBQpm. Alibaba Cloud’s BigBench on MaxCompute 2.0 + PAI expands that capacity to 100 TB for the first time in the world and is also the first benchmark to be based on public cloud services. Engines running on this platform achieve a score of 7000 points.
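To make the price/performance figure concrete, the following minimal Python sketch shows how such a ratio is typically computed, assuming the usual TPC convention of dividing the total priced cost of the system by its throughput score. The function name and the cost and throughput values are hypothetical and chosen only for illustration; only the 589 figure comes from the text above.

```python
# Best published price/performance ratio quoted in the article (lower is better).
PUBLISHED_BEST_RATIO = 589.0


def price_performance(total_cost: float, bbqpm: float) -> float:
    """Return cost per unit of throughput (price per BBQpm)."""
    if bbqpm <= 0:
        raise ValueError("bbqpm must be positive")
    return total_cost / bbqpm


# Hypothetical system: these two numbers are illustrative assumptions,
# not results reported by TPC or Alibaba Cloud.
hypothetical_cost = 500_000.0
hypothetical_bbqpm = 1_000.0

ratio = price_performance(hypothetical_cost, hypothetical_bbqpm)
is_better = ratio < PUBLISHED_BEST_RATIO

print(f"ratio = {ratio:.2f} per BBQpm")
print("better than published best" if is_better else "not better than published best")
```

Because the ratio is cost divided by throughput, a system can improve it either by raising its BBQpm score or by lowering the cost that is priced into the result.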
It was reported that the MaxCompute test environment would be open on public cloud for one month after the conference, and that the BigBench on MaxCompute + PAI SDK (derived from TPCx-BigBench and adapted to run in Alibaba Cloud’s big data environment) would be open-sourced for developers to use.
The capacity breakthrough of BigBench on MaxCompute is due to MaxCompute’s massive data processing capabilities and the efficiency of its machine learning algorithms. MaxCompute, built on the Apsara distributed OS developed by Alibaba Cloud, can connect more than 10,000 servers in a single cluster and process exabytes of data.
MaxCompute’s next-generation engines receive continuous, in-depth performance optimization in the compiler, optimizer, and runtime. In addition to high-performance computing, Alibaba Cloud PAI provides users with a robust algorithm experimentation platform that covers traditional machine learning as well as the latest in deep learning and reinforcement learning. PAI provides a large number of algorithms and tools to meet algorithm requirements in different business scenarios. The platform is also optimized for performance and data capacity.
Furthermore, the integration and in-depth optimization of MaxCompute with Intel processors make full use of the architectural strengths of Intel® Xeon® Scalable processors. Rob Hays, Vice President of Intel’s Data Center Division, said, “We are delighted to be working with Alibaba Cloud to optimize MaxCompute on the latest Intel® Xeon® Scalable processor platform and to witness the excellent performance of MaxCompute in the BigBench test.”
So what computing benefits does BigBench on MaxCompute 2.0 + PAI bring to developers to help them seize more market opportunities?
- Break through the capacity bottleneck. When BigBench data capacity exceeds 10 TB, most products hit a bottleneck and cannot scale further. BigBench on MaxCompute allows data capacity to expand to 100 TB, meeting users’ growing data capacity requirements.
- Lower cost. The conventional model of building on self-owned hardware and software requires purchasing servers. Although the server cost can be amortized over the servers’ lifespan, hardware prices inevitably drop year by year, so buying hardware up front means paying a relatively higher price for future computing resources. BigBench uses price/BBQpm to calculate the price/performance ratio. Compared with the conventional hardware model, MaxCompute supports both prepayment and usage-based payment, which offers pricing flexibility and competitive price/performance ratios.
- Meet scalability requirements. Demand for data on the internet means that a traffic surge could happen at any time. The traditional hardware model requires a long adjustment period to meet increased demand. BigBench on MaxCompute enables on-demand expansion of computing capacity, satisfying enterprises’ expansion requirements at any time.
- Reduce O&M workload. Traditionally, a data center must be maintained by an O&M team, and maintenance quality often cannot be guaranteed. BigBench on MaxCompute runs on public cloud, so enterprise customers do not need to invest additional manpower in maintenance.
BigBench on MaxCompute is adapted from TPCx-BB, so it is compatible with all TPCx-BB semantics. As an industry benchmark, TPCx-BB covers all operation types of big data processing, including SQL, MapReduce, streaming, and machine learning. The full coverage of BigBench on MaxCompute reflects the completeness of MaxCompute’s software stack for big data processing. The following table lists the software stacks of BigBench on MaxCompute:
BigBench on MaxCompute is also an industry benchmark, one that demonstrates the completeness of MaxCompute’s software stack for big data processing and its superior performance in capacity, cost, and scalability.
BigBench on MaxCompute is easy to access. Enterprise customers can connect to the platform once they have prepared:
- an Alibaba Cloud account;
- the BigBench on MaxCompute toolkit;
- the MaxCompute client.
For details on using the platform, see the official MaxCompute documentation.
BigBench on MaxCompute is derived from TPCx-BB, so it is compatible with all TPCx-BB semantics.
TPCx-BB (BigBench) was released by the Transaction Processing Performance Council (TPC) in Feb. 2016. It is the first end-to-end, application-level benchmark for big data analytics.