How Yuque, Alibaba’s Work Collaboration Software, Has Evolved Over Time

The Evolution of Yuque’s Technical Architecture

Yuque in the Early Stages of Its Development

Yuque was created in 2016, when Ant Financial needed a tool to host its documents. Technical staff at Ant Financial built the documentation tool in their spare time. In the early stage of the project, no dedicated personnel or resources were available, so to verify the prototype quickly, the team chose the least costly technical solution. The underlying services were built entirely on the BaaS offerings and the container-hosting platform provided by the Technology Experience Department in Alibaba Group:

  • File service: This is a file storage service that wraps Alibaba Cloud Object Storage Service (OSS).
  • DockerLab: This is a container-hosting platform.

An Internal Service

As the potential of this online documentation tool became clearer, the goal of Yuque grew from providing a documentation tool for Ant Financial to offering an internal solution that could replace products such as Confluence, and then further to becoming an important knowledge management platform within Alibaba. Yuque is oriented towards technical innovators, team leaders, and knowledge-base creators. However, there were still hiccups. The major problem was that a Markdown editor alone was not enough for non-technical personnel to use Yuque efficiently.

Becoming an External Product

With the increasing internal influence of Yuque, some Alibaba alumni who had left the company began to ask Yubo: “Yuque is very useful. Have you ever considered releasing it as a product, so external companies can use it?” After less than 6 months of preparation and refactoring, Yuque became an official product of Alibaba in 2018.

  • Task service: For example, the preview service for massive local files provided by Yuque consumes many resources and has complex dependencies. We extracted it from the primary service so that its uncontrollable dependencies and resource consumption would not affect the primary service.
  • Function Compute: Tasks such as PlantUML preview and Mermaid preview are not sensitive to response time, and their dependencies can be packaged into Alibaba Cloud Function Compute. So, we run these tasks in Function Compute to reduce costs and ensure security.
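To make the Function Compute idea more concrete, here is a minimal sketch of what a diagram-preview function might look like. It assumes a callback-style Node.js handler of the form `(event, context, callback)`, which in a real deployment would be exported (for example as `exports.handler`). The request shape and the `renderDiagram` helper are hypothetical stand-ins, not Yuque's actual code; a real function would call the PlantUML or Mermaid toolchain packaged with it.

```javascript
// Hypothetical stand-in for a real renderer packaged with the function.
// It only returns a placeholder SVG so that the sketch stays self-contained.
function renderDiagram(type, source) {
  return `<svg data-type="${type}" data-length="${source.length}"></svg>`;
}

// Diagram types this sketch accepts (taken from the examples in the text).
const SUPPORTED_TYPES = new Set(['plantuml', 'mermaid']);

// Assumed handler shape: the event carries a JSON payload such as
// {"type": "mermaid", "source": "graph TD; A-->B"}.
function handler(event, context, callback) {
  let request;
  try {
    request = JSON.parse(event.toString());
  } catch (err) {
    callback(new Error('invalid JSON payload'));
    return;
  }

  const { type, source } = request;
  if (!SUPPORTED_TYPES.has(type) || typeof source !== 'string') {
    callback(new Error('unsupported diagram request'));
    return;
  }

  // Rendering happens inside the function, isolated from the primary service.
  callback(null, JSON.stringify({ type, svg: renderDiagram(type, source) }));
}
```

Because the function only receives text and returns a rendered result, the primary service never has to install or execute the rendering dependencies itself.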

Full-Stack JavaScript

Full-Stack JavaScript and Product Engineers

In Yuque, we don’t define developers who work in full-stack JavaScript as “full-stack engineers,” but rather as versatile product engineers. They are the “technical partners” of products. Many of them feel a sense of ownership for the products, take part in product discussion and design with product managers, and offer technical suggestions for product design solutions. They independently complete the full-stack development of product functions and track product performance after release.

Full-Stack JavaScript and Node.js

When we talk about full-stack JavaScript, Node.js is a topic that cannot be avoided. As a server runtime that integrates closely with the frontend, Node.js has become a driving force behind full-stack development. Yet is Node.js really suitable for large-scale commercial projects? Many people have their doubts.

Hybrid Application Architecture

When the system grows to a certain size, should we continue to add functions to a large standalone application or split it into microservices? The coexistence of these two architectures shows that each has its own pros and cons. Your choice of architecture should be based on your current business scale and team distribution. Following this logic, the technical architecture of Yuque became a hybrid architecture as our business evolved.

  • Task cluster: Some CPU-intensive tasks or services with complex third-party dependencies are placed in an independent task cluster. For example, various file preview services may depend on other services and account for a large share of computing costs. Therefore, it is best to place these services into a task cluster and limit concurrency through queues.
  • Serverless Function Compute: As far as possible, we try to migrate services that are less sensitive to response time and can be turned into functions to Alibaba Cloud Function Compute. Such services include PlantUML, Mermaid, and other text drawing services.
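The role of the queue in front of the task cluster can be illustrated with a small in-process sketch. The `TaskQueue` class and its `push` and `done` names are hypothetical and only show the idea of capping how many expensive tasks run at once; the source does not describe Yuque's actual queueing setup, and a real deployment would more likely place a message queue between the primary service and the task cluster.

```javascript
// A minimal concurrency-limited queue: at most `limit` tasks run at a time,
// and the rest wait in FIFO order until a running task reports completion.
class TaskQueue {
  constructor(limit) {
    this.limit = limit;   // maximum number of tasks running at once
    this.running = 0;     // number of tasks currently running
    this.pending = [];    // tasks waiting for a free slot
  }

  // A task is a function that receives a `done` callback and must call it
  // once its (possibly asynchronous) work has finished.
  push(task) {
    this.pending.push(task);
    this._drain();
  }

  _drain() {
    while (this.running < this.limit && this.pending.length > 0) {
      const task = this.pending.shift();
      this.running += 1;
      let finished = false;
      const done = () => {
        if (finished) return; // ignore repeated completion signals
        finished = true;
        this.running -= 1;
        this._drain();        // a slot is free, so start the next waiting task
      };
      task(done);
    }
  }
}
```

A caller might wrap each file preview in a task, for example `queue.push((done) => convert(file).finally(done))`, where `convert` is a hypothetical promise-returning preview function. A burst of requests then waits in the queue instead of saturating the CPU.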

Common Concerns Beyond the Programming Language

In addition to the programming language, other aspects need to be considered in any commercial system. Among them, the two most important aspects are security and stability.

  • Server security risks: horizontal permission issues, unauthorized access, sensitive information leakage, Server-Side Request Forgery (SSRF), and SQL injection
  • Cloud service security risks: SMS or email bombing, data leakage, and content security
  • When running user code on the server, put it in a sandbox.
  • When the server requests a resource whose address is supplied by a user, the request must be filtered to prevent SSRF.
  • Filter out sensitive information when serializing responses.
  • Never build SQL statements by splicing strings together. Use parameterized queries instead.
  • Abnormality monitoring and tracing: This includes the monitoring of frontend business tracking logs and exception logs, end-to-end tracking and collection of logs on the server, and the monitoring and analysis of system performance. Eventually, we will be able to promptly detect and track exceptions and locate and analyze performance problems.
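Three of the server-side security measures listed above (SSRF filtering, response serialization, and non-spliced SQL) can be sketched as follows. This is a minimal illustration under stated assumptions, not Yuque's implementation: the function names, the `docs` table, and the field names are all hypothetical, and the `?` placeholder style is an assumption that depends on the database driver in use. The SSRF check is deliberately coarse. A production filter would also resolve the hostname and re-check the resulting address, and would handle redirects.

```javascript
// Returns true for loopback, private, and link-local IPv4 literals.
function isPrivateIPv4(host) {
  const m = /^(\d{1,3})\.(\d{1,3})\.(\d{1,3})\.(\d{1,3})$/.exec(host);
  if (!m) return false;
  const a = Number(m[1]);
  const b = Number(m[2]);
  return (
    a === 0 || a === 10 || a === 127 ||
    (a === 169 && b === 254) ||
    (a === 172 && b >= 16 && b <= 31) ||
    (a === 192 && b === 168)
  );
}

// Coarse SSRF pre-check for a user-supplied URL before the server fetches it.
function isUrlAllowed(rawUrl) {
  let url;
  try {
    url = new URL(rawUrl);
  } catch (err) {
    return false; // not a parseable absolute URL
  }
  if (url.protocol !== 'http:' && url.protocol !== 'https:') return false;
  const host = url.hostname;
  if (host === 'localhost' || host.endsWith('.localhost')) return false;
  if (host.startsWith('[')) return false; // IPv6 literal: rejected outright in this sketch
  return !isPrivateIPv4(host);
}

// Response serialization: copy only allow-listed fields, so that columns such
// as password hashes never reach the client by accident.
const PUBLIC_USER_FIELDS = ['id', 'name', 'avatar_url'];

function serialize(record, fields) {
  const out = {};
  for (const field of fields) {
    if (Object.prototype.hasOwnProperty.call(record, field)) {
      out[field] = record[field];
    }
  }
  return out;
}

// Parameterized query: user input travels in `values`, never inside the SQL text.
function findDocsByAuthor(authorId) {
  return {
    text: 'SELECT id, title FROM docs WHERE author_id = ?',
    values: [authorId],
  };
}
```

For example, `isUrlAllowed('http://169.254.169.254/latest')` returns `false`, so a request aimed at a cloud metadata address is rejected before the server fetches anything.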

How Did Yuque Choose an Appropriate Technology Stack?
