How Can Xianyu Achieve Trillion-level GMV Transaction with Cloud Native?
By Wang Shubin, Head of Alibaba Xianyu Architecture
Contributed by Alibaba Cloud Serverless
On June 28, 2014, a team of 28 people, after working day and night in a tea room in Hangzhou for three months, launched a secondhand trading platform called Xianyu, also known as Idle Fish. In May this year, Alibaba released the data of Xianyu in its annual report, including 200 billion yuan in gross merchandise volume (GMV), which is an increase of 100% year-on-year, with more than 30 million active sellers per day. In just 6 years, Xianyu has grown from an ordinary business product to a leading C2C platform in China.
According to iiMedia Data, the transaction volume of the second-hand goods trading market in 2020 will reach more than a trillion yuan. In order to maintain the growth of this industry, businesses need to constantly adjust and evolve the technical architecture to support the rapid development of their business. For Alibaba, Xianyu represents innovation, which is more than just bringing revenue to the company. Xianyu is innovative not only in its business model, but also in its exploration of technical architecture, that is, towards Flutter, cloud native, and Serverless.
In 2009, Wang Shubin, who had worked at UTStarcom for three years after graduating from Zhejiang University, joined Alibaba. In 2017, he introduced Flutter to Xianyu. In 2018, Wang began leading the Xianyu technical team into a bigger undertaking: exploring Serverless architecture. Disruptive innovations often take place at the margins. For Xianyu, adopting cloud native technologies and Serverless is a brand new path. Its success, however, will be a valuable example for many companies that handle online transactions.
Now, let’s get to the story of Xianyu and cloud native.
As a frontend business that relies on the Alibaba e-commerce system, Xianyu has unique business features and user needs. While relying on the Alibaba system at the bottom layer, Xianyu needs a more suitable, rapid, and flexible R&D system at the presentation and business layers.
If the IT system of Xianyu had continued to be developed in its original way, it would have faced many challenges, such as:
- The boundaries among the client interaction layer, the business glue layer, and the domain layer on the service side are not clearly defined. As a result, even small business demands require personnel from every stage of the process, resulting in high collaboration costs and long development and debugging cycles.
- Huge, resource-hungry applications run on the server, with serious coupling in R&D, release, and O&M. Moreover, system stability is greatly challenged because a single business problem often affects the entire application.
- The O&M cost is extremely high. To ensure business stability and availability, Alibaba has regulations and rules for every application to be released. Even if an application is used only internally and receives only one or two requests per day, it must comply with the existing release regulations, which inevitably consumes some fixed resources. The resources consumed by a single application of this kind may be limited, but the total consumed by all such applications is no longer small. Meanwhile, huge applications with great impact require more rigorous release processes and procedures. In that case, each release takes at least 6 hours, resulting in high O&M costs.
With the emergence of Serverless, cloud-end integrated R&D becomes possible, greatly reducing the collaboration cost of small business demands. In addition, compared with microservices, Serverless provides a more reasonable way to split up huge applications in the business glue layer.
The following triangle shows the mutual restrictions of the cost (speed), stability, and quality of traditional huge applications.
New technologies such as cloud native and Serverless can push the O&M capabilities of applications down into the platform. This makes it possible to break the mutual restrictions among cost (speed), stability, and quality that constrain traditional huge applications. In implementing these new technologies, Xianyu first focused on developing the hybrid engineering system and the high-performance component library of Flutter. Then, it focused on the cloud-end integrated R&D system and a Serverless-based service-side architecture for the business assembly layer.
The Xianyu client has evolved and innovated its architecture based on Flutter. After improving R&D efficiency by unifying Android and iOS development with Flutter, Xianyu hopes to solve many collaboration problems among different roles by combining Flutter and Serverless. Those problems lower overall R&D efficiency and cause client developers to drift further away from the business. As a result, service-side developers have no time to improve the underlying domain layer. By introducing Serverless, Xianyu expects its overall R&D efficiency to improve noticeably.
Practicing While Exploring
In 2018, the Xianyu technical team began to explore Serverless. The exploration can be divided into four stages: building its own Dart server, relying on the Functions as a Service (FaaS) platform, realizing cloud-end integration, and transforming traditional huge applications with Serverless.
In May 2018, the team built a Serverless-based Dart Server framework with a 2-second cold boot, designed for lightweight development of the service-side business glue layer.
From the end of 2018 to the beginning of 2019, Xianyu launched a project in collaboration with the Gaia team to build a Dart Runtime on the Gaia platform. Some services have already been released online. Note: Gaia is a FaaS platform for Taobao business, built on Alibaba Cloud and tailored to the features of Taobao’s business.
In 2019, building on the Dart Runtime, Xianyu explored on-cloud integrated programming that combines Flutter and FaaS, as well as metadata-directed domain interfaces. Eventually, glue layer architectures such as Nexus came into being and were implemented in more than 20 Xianyu businesses.
In 2020, Xianyu began to integrate engineering and tools in the cloud, with the goal of deploying a single project to multiple targets. At present, Wang is working with the technical team to solve governance issues of traditional huge applications in the glue layer. They are also integrating huge applications with Serverless. “We will make everything right in 3 to 6 months,” said Wang.
Specifically, the practical results of Xianyu’s Serverless adoption over the past two years fall into the following five aspects:
1) On-cloud Integrated Programming Framework (Nexus API)
This framework aims to unify the programming models of Flutter and FaaS, and to integrate user interface (UI), interaction, data, and logic. Wang mentioned that when they decided to integrate Flutter and FaaS, they had only a vague understanding of the word “integration”. All they knew was that Dart could be used to write FaaS functions, which was integration only at the language level. As for FaaS capabilities, they knew only a few at the backend for frontend (BFF) level, which had long been implemented in the frontend.
It took them quite a long time to realize that, in the Dart ecosystem, frontend FaaS was actually not efficient in R&D delivery. There are two major problems in the R&D phase:
The programming language is not unified. Although the programming language itself is not the biggest obstacle, it does raise the barrier for frontend developers. What’s more, the ecosystem, environment, and system behind the language are even more difficult and complex for frontend developers.
The development mode and architecture are separated, and the environment is complex. The end-side project and the FaaS-side project are independent. Each has its own tool chain for building, debugging, integrating, and publishing. In addition, FaaS has its own supporting environment, runtime, and framework. Faced with such a complex FaaS R&D environment and dual R&D workflows, developers can hardly achieve efficient delivery.
After much discussion, Wang and his team finally reached a clear consensus on what integration means: achieving integration at two key points:
• Language integration
• Integration of development mode and architecture
The integration of programming languages provides developers with a technology stack that they are familiar with. The integration of development modes and architectures helps developers solve the problem of separated projects and handle the complicated local running environment of FaaS. By doing so, developers can have the same experience as in their original R&D model.
The purpose of the above two integrations is to minimize the gap between developing a Flutter page and developing FaaS. For example, the Xianyu Flutter client used to be developed under the Redux framework. Now, with the Nexus API framework, Redux and FaaS calls can be seamlessly integrated.
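To make the idea concrete, here is a minimal sketch of how a Redux-style store could treat a FaaS call as just another action. It is written in Python for brevity rather than the Dart that Xianyu uses, and it is not the Nexus API, which this article does not describe in detail. Every name in it (`Store`, `fake_faas_invoke`, the `faas` and `on_success` action keys, the `item.detail` function) is a hypothetical stand-in.

```python
from typing import Any, Callable, Dict

State = Dict[str, Any]
Action = Dict[str, Any]
Invoker = Callable[[str, Dict[str, Any]], Dict[str, Any]]


def fake_faas_invoke(function_name: str, payload: Dict[str, Any]) -> Dict[str, Any]:
    """Hypothetical stand-in for a remote FaaS call; a real client would use the network."""
    if function_name == "item.detail":
        return {"id": payload["id"], "title": "Used bike", "price": 300}
    raise KeyError(f"unknown function: {function_name}")


def reducer(state: State, action: Action) -> State:
    """Pure reducer: returns a new state and never performs I/O itself."""
    if action["type"] == "ITEM_LOADED":
        return {**state, "item": action["payload"]}
    return state


class Store:
    """Redux-style store whose dispatch step also understands FaaS-backed actions."""

    def __init__(self, reducer: Callable[[State, Action], State],
                 state: State, invoke: Invoker) -> None:
        self._reducer = reducer
        self._invoke = invoke
        self.state = state

    def dispatch(self, action: Action) -> None:
        call = action.get("faas")
        if call is not None:
            # Middleware step: run the cloud function, then turn its result
            # into an ordinary action so the reducer stays unaware of FaaS.
            result = self._invoke(call["function"], call["payload"])
            action = {"type": action["on_success"], "payload": result}
        self.state = self._reducer(self.state, action)


store = Store(reducer, {"item": None}, fake_faas_invoke)
store.dispatch({
    "faas": {"function": "item.detail", "payload": {"id": 7}},
    "on_success": "ITEM_LOADED",
})
print(store.state["item"]["title"])
```

The design choice illustrated here is that page code only dispatches actions and reads state; whether the data came from local logic or a cloud function is hidden in the dispatch step.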
2) Standardized CLI Development Tools
In cloud-based integrated development, a command line interface (CLI) shields developers from some details of FaaS development. This standardizes the FaaS development experience and matches the local development practices of client developers.
3) BaaS-based Basic Services
Over the past two years, we have been simplifying the basic service capabilities, such as object storage, messaging, and search. At the same time, a metadata center for business domain layer services is built. These simplified basic service capabilities, together with the existing business domain layer services, allow the client developers to quickly assemble different services.
4) On-Cloud Project Integration
After the successful introduction of Flutter, Xianyu has formed a cross-end R&D system on the end side, with Flutter as the main part and H5 as the auxiliary. In this way, Xianyu unifies what used to be separate R&D on Android and iOS. As productivity on the end side is freed up, end developers have the opportunity to move a little deeper into the lower layer. Thus, the simple data-assembly logic on the service side can be completed by end developers in a closed loop. This model is especially suitable for small business demands. Similar attempts have already been made in the industry, as the popularity of GraphQL frameworks and the formation of the frontend BFF layer show. With Serverless, however, the development of lightweight code on the service side can be greatly simplified. That is why Xianyu chose to promote cloud-based integration at this time.
Cloud-based integration covers cloud-based programming frameworks, tool chains, engineering systems, BaaS-based basic services, and the pushing down of domain services. It also involves organizational support, reshaping the division of labor, and production safety training.
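As an illustration of the kind of simple data-assembly logic that an end developer could own in the glue layer, the following Python sketch shows a single function that calls two domain services and returns only the fields a page needs. The services `get_item` and `get_seller`, the handler name, and the event shape are all hypothetical; in a real system the two calls would be remote invocations of domain-layer interfaces.

```python
from typing import Any, Callable, Dict

DomainCall = Callable[[int], Dict[str, Any]]


def get_item(item_id: int) -> Dict[str, Any]:
    """Hypothetical domain service: returns the full item record."""
    return {"id": item_id, "title": "Used bike", "price": 300, "seller_id": 42}


def get_seller(seller_id: int) -> Dict[str, Any]:
    """Hypothetical domain service: returns the full seller record."""
    return {"id": seller_id, "nick": "alice", "rating": 4.8}


def item_page_handler(event: Dict[str, Any],
                      fetch_item: DomainCall = get_item,
                      fetch_seller: DomainCall = get_seller) -> Dict[str, Any]:
    """Glue-layer function: assemble one page's data from two domain services."""
    item = fetch_item(event["item_id"])
    seller = fetch_seller(item["seller_id"])
    # Return only what the page renders; domain records stay in the domain layer.
    return {
        "title": item["title"],
        "price": item["price"],
        "seller_nick": seller["nick"],
        "seller_rating": seller["rating"],
    }


print(item_page_handler({"item_id": 1}))
```

Because the function holds no state and contains no domain rules, it is the sort of lightweight code that fits a FaaS deployment and that a client developer can change without involving the service-side team.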
5) Serverless Transformation of Traditional Huge Applications
Serverless is not a silver bullet, but it magically matches the features of the business glue layer. It is very suitable for splitting up traditional huge applications of the glue layer. As such, how to bring it into practice is the next challenge that Xianyu is tackling.
Challenges and Solutions
It is not easy for Xianyu to implement a Serverless transformation. Wang mentioned that during Serverless cloud-based integration, the team was confronted with some technical difficulties. They had to solve problems such as heterogeneous language access to Java rich clients, the unification of development environments, and a lack of familiarity with domain interfaces among client developers.
The Java system of Xianyu involves a large number of Java rich client applications. To solve the problem of heterogeneous language access to these Java rich clients, Xianyu set up a Java proxy in Sidecar mode.
Then, Xianyu developed its own CLI tool, GCLI, to unify the development environment. GCLI is a command-line tool that supports the whole FaaS R&D lifecycle. It defines the closed loop of Xianyu FaaS development, unifies the FaaS R&D environment, and serves as a powerful tool for improving FaaS R&D efficiency. GCLI breaks the R&D loop down into development commands suited to Serverless R&D. To let developers keep their existing R&D habits and tools, Xianyu chose a local development solution. Xianyu also adopted Docker to unify the development environment: the running environment (software plus configuration) on which the Dart FaaS technology stack depends is declared in Docker. With container technology, the FaaS software environment can be ported to any operating system that can run Linux containers, which solves the problem of unifying environments. Moreover, GCLI connects the local environment with the function platform through FaaS OpenAPIs, forming a complete R&D loop.
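The following Python sketch shows one way a CLI could map lifecycle steps onto a single pinned container image so that every developer runs the same environment. It is not GCLI, whose commands are not listed in this article. The step names, the `/workspace` mount point, and the choice of the `dart:stable` image are assumptions made for illustration; the sketch only builds the `docker run` argument list and does not execute it.

```python
from typing import Dict, List

# Assumption: a publicly available Dart image stands in for the team's own image.
DEFAULT_IMAGE = "dart:stable"

# Hypothetical lifecycle steps mapped to commands run inside the container.
LIFECYCLE: Dict[str, List[str]] = {
    "deps": ["dart", "pub", "get"],
    "test": ["dart", "test"],
}


def docker_command(step: str, project_dir: str, image: str = DEFAULT_IMAGE) -> List[str]:
    """Build the argv that runs one lifecycle step inside the shared image."""
    if step not in LIFECYCLE:
        raise ValueError(f"unknown step: {step}")
    return [
        "docker", "run", "--rm",           # remove the container when the step ends
        "-v", f"{project_dir}:/workspace",  # mount the local project into the container
        "-w", "/workspace",                 # run the step from the project root
        image,
        *LIFECYCLE[step],
    ]


print(" ".join(docker_command("test", "/home/dev/my_function")))
```

Keeping the source on the local disk and only the toolchain in the container is what lets developers continue to use their own editors and habits while the build and test environment stays identical across machines.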
In addition, to solve the problem that client developers are unfamiliar with domain interfaces, Xianyu developed a domain layer metadata center.
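A metadata center of this kind can be pictured as a searchable registry of interface descriptions. The Python sketch below is a simplified illustration under that assumption; the article does not describe the real data model, so `InterfaceMeta`, `MetadataCenter`, and the two registered interfaces are hypothetical.

```python
from dataclasses import dataclass
from typing import Any, Dict, List


@dataclass
class InterfaceMeta:
    """Description of one domain interface (hypothetical schema)."""
    name: str
    description: str
    params: Dict[str, type]


class MetadataCenter:
    """Registry that lets client developers discover and check domain interfaces."""

    def __init__(self) -> None:
        self._interfaces: Dict[str, InterfaceMeta] = {}

    def register(self, meta: InterfaceMeta) -> None:
        self._interfaces[meta.name] = meta

    def search(self, keyword: str) -> List[str]:
        """Return names of interfaces whose name or description mentions the keyword."""
        needle = keyword.lower()
        return sorted(
            meta.name for meta in self._interfaces.values()
            if needle in meta.name.lower() or needle in meta.description.lower()
        )

    def check_call(self, name: str, args: Dict[str, Any]) -> List[str]:
        """List problems with a planned call; an empty list means it matches the metadata."""
        meta = self._interfaces[name]
        problems = [f"missing parameter: {p}" for p in meta.params if p not in args]
        problems += [
            f"{p} should be {t.__name__}"
            for p, t in meta.params.items()
            if p in args and not isinstance(args[p], t)
        ]
        return problems


center = MetadataCenter()
center.register(InterfaceMeta("item.get", "Fetch a listed item by id", {"item_id": int}))
center.register(InterfaceMeta("seller.get", "Fetch a seller profile", {"seller_id": int}))
print(center.search("seller"))
print(center.check_call("item.get", {"item_id": "7"}))
```

With descriptions and parameter types in one place, a client developer can find the right interface and catch a malformed call before writing any glue code.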
Cloud-end integration has reshaped the traditional boundaries between the cloud and ends, reduced the need for collaboration, and made the division of labor more flexible. In addition, it significantly improves R&D efficiency and quality. These changes directly benefit the business by letting it iterate and adapt more quickly to changes in the market and in user demands.
Cloud-end integration is currently applied in both interaction-centered and lightweight business scenarios of Xianyu. The improvement in R&D efficiency and quality can be presented as quantitative data. For example, based on sampling statistics of typical large and medium-sized business demands, the personnel and time required for development decreased by 30%, and the bug rate per 1,000 lines of code fell by 20%. The improvement would be even more obvious if scattered small demands were counted. In the past, it often took several weeks to deliver a small business demand because of the number of developers that had to be involved. Thanks to cloud-end integration, the flexibility of resources is significantly enhanced, greatly improving the demand response speed.
“However, there are still some problems to be solved,” said Wang. In splitting up huge applications with Serverless, Xianyu has encountered more serious problems, such as:
- Selection of microservices and Serverless
- Code reuse between functions
- Unified upgrade of function dependency
The solutions for these problems are still being verified. We will share them with you as soon as they are proven useful.
Reference and Reflection
Which companies, applications, or scenarios should choose Serverless architecture? At present, there is no definitive answer. The key is to think through the situation, which means balancing benefit, cost, efficiency, and the ability to respond to the market. Among them, cost is the factor that requires the most attention from enterprises, and it includes the cost of infrastructure construction, O&M, expansion, and security.
Netflix is a successful example of adopting Serverless. Netflix is always innovative in product design. In addition to continuous A/B testing, it releases many new features every week. To sustain this pace, it needs an API service platform that helps client developers deploy changes to the service layer quickly and effectively. To that end, FaaS abstracts all the service-related platform components so that developers only write business logic. Serverless, in turn, gives Netflix a platform on which engineers without server or operations experience can develop highly available services.
Adopting FaaS is essentially a trade-off between delivery speed and customizability. FaaS performs very well for some applications, such as the Netflix API. Netflix runs relatively uniform microservices that only need to access and modify the data of downstream services. However, if services need to be customized, for example, by changing components of the service platform such as Remote Procedure Call (RPC), data access, caching, and authentication, the FaaS mode may not be flexible enough.
A self-built Serverless platform places higher requirements on the IT personnel of an enterprise, and the construction cost is also high. In addition, adopting Serverless requires a mature service ecosystem. In most cases, enterprises that are already in the cloud should give priority to the Serverless products of cloud service providers. Enterprises that are not in the cloud should consider whether the ecosystem of their existing system is compatible with the Serverless products of their potential cloud service providers.
When selecting Serverless products, it is suggested to consider the following aspects: ecosystem maturity, supported development languages, feature richness, and pricing. The key is to consider the needs of the enterprise’s own business development.
O’Reilly once conducted a survey on the adoption of Serverless. Unsurprisingly, the survey showed that Serverless attracted the most attention and adoption from developers in the software industry. However, the financial and banking industries are also paying close attention to Serverless. One of the reasons is that more and more financial technology start-ups are being founded. These start-ups with traditional architectures are accepting and embracing Serverless with a more open attitude.
As for the reasons for rejecting Serverless, 60% of the interviewees said they were worried about security problems. Many industries have high security requirements for IT environments, and the adoption of any new technology may bring security risks.
In addition, developers also worried about being locked in to providers. As a result, organizations of a certain size have started to build their own Serverless platforms based on open-source solutions such as Knative. Once an open-source solution becomes mainstream, cloud providers will take the initiative to provide services that are compatible with open-source standards and increase their investment in the open-source community.
Serverless not only affects technologies and business services, but also imposes new requirements on the organizational structures of enterprises and technicians.
Firstly, Serverless has changed the communication structure. According to Conway’s law, the organizational structure should align with the new communication structure. In the past, the client and the service side at Xianyu were handled by different groups of people. Now, under the brand-new Flutter and Serverless architecture, the organizational structure needs to be adjusted accordingly. After discussion, Xianyu decided to regroup the client and service-side developers by business line.
Secondly, Serverless offers client developers more opportunities to learn about the business. Therefore, client developers must be more sensitive to the business than before. At the same time, Serverless helps client developers expand their technical boundaries, so they also need to better understand certain concepts of service-side development.
Finally, Serverless requires the original service-side developers to have better data modeling and domain modeling capabilities, so that higher reusability of the underlying interface can be achieved.
In the beginning, Xianyu was not favored by the public and was even ridiculed as “salted fish” (someone with no achievement or prospect). But now, Xianyu has tens of millions of daily active users (DAU) and has revitalized a market worth trillions of yuan. The emergence of Xianyu has greatly influenced both the frontend e-commerce ecosystem and user lifestyles on the Internet.
To support trillion-yuan-level transactions, Wang and his technical team are racing to transform traditional huge applications through Serverless. “I will be satisfied if we can totally apply Serverless,” says Wang.