How Do We Manage Code Branches at Alibaba?
There are a lot of exciting engineering practices gaining popularity internally at Alibaba Group and Alibaba Cloud. Some of these practices rely on tools and workflows that are effective in the massive environment of the Alibaba Group but would be difficult to replicate in the outside world. Others are standard practices that we see everywhere. On the issue of branch management, for example, tools and habits both play important roles. Some of the tools and habits discussed in this article are unique to Alibaba Cloud, but organizations around the world can replicate them for their own benefit.
Alibaba Group has a lot of research teams, with various departments using different release workflows. The branch strategies are not necessarily the same across all of them, but from a broader perspective, they are still quite unified. Among these practices, one popular release method and its associated branching method are collectively called “AoneFlow.” The thinking behind this set of working methods is unique and not frequently seen outside of Alibaba. This article focuses on these practices and touches on the broader issue of branch management.
When speaking about branch management, people generally restrict themselves to TrunkBased and GitFlow methodologies.
The TrunkBased methodology came about as a result of the continuous integration (CI) line of thinking. It includes a master branch and several release branches, each of which is created from a specific commit on the master branch and is used for online deployment and hotfixes. In the TrunkBased method, there is no explicit feature branch. Of course, in reality, Git’s distributed nature allows each individual to have a local branch, and TrunkBased does not rule out short-term feature branches. However, when speaking about this method, most people do not place much emphasis on this point.
Even though there have been a lot of example cases over the past few years, TrunkBased has still not managed to win the hearts and minds of developers. Its weakness is quite apparent: when multiple teams work on the master branch at the same time, release time can easily turn into chaos (especially when teams are developing two or more versions concurrently). One way of compensating for this is to use FeatureToggle, merge frequently, and ensure sufficient test coverage. However, this places high demands on the ability of the development team. Currently, most teams use the TrunkBased method primarily in SaaS projects that do not need to maintain multiple historical versions concurrently. It is especially useful for small services that have been split into microservices.
There are two variants of the TrunkBased method. The OneFlow method employs much of the same philosophy as TrunkBased, but defines the operation workflow more strictly and adds things such as a hotfix branch. The multiple trunk method (usually two trunks, a fixed development trunk and a fixed production trunk) is a special version of TrunkBased that employs a fixed production branch.
The GitFlow method is a combination of several other methods and involves a master branch, a development branch, multiple feature branches, numerous production branches, and hotfix branches, as well as a set of rather tedious rules. There is a Git plugin for this method, but it hasn’t been maintained for quite some time. The plugin clearly defines the operations in every stage, and it was once the darling of several enterprises that placed heavy emphasis on workflow. However, using this methodology isn’t simple. Its frequent merge conflicts and its unfriendliness to integration testing are the biggest maladies of the system.
There is also the GithubFlow method. Its strategy is similar to that of TrunkBased, with the addition of personal repositories and pull request operations. The process is equivalent to adding personal branches to the same repository and is best suited to distributed teams. GithubFlow has also evolved into variants that support multiple-environment deployment, such as GitlabFlow, which connects a repository or branch to an environment. Nonetheless, these methods are either rough and heavy-handed like TrunkBased, or elaborate and tedious like GitFlow. So are there any other options?
AoneFlow is Alibaba’s alternative to GitFlow. You will see the influences of some other branching methods in AoneFlow. It essentially combines the ease of continuous integration that comes with TrunkBased with the ease of management of GitFlow, while still avoiding the tediousness of GitFlow.
Let’s walk through AoneFlow. It uses only three types of branches (master branch, feature branch, and production branch) and three basic rules.
Rule 1: Before starting the work, create a feature branch from master.
AoneFlow’s feature branches are based on those of GitFlow and do not feature any noteworthy changes. Each time the team begins a new task (for example, a new feature or a bug fix), they must first create a branch from the master, which represents the code most recently pushed to production, and give it a name that begins with “feature/.” Then, they make their code changes on this branch. That is, each work item (whether completed by one individual or by multiple developers collaborating) has a corresponding feature branch. Nobody can submit a change directly to the master.
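As a minimal sketch of Rule 1, the Python helper below derives a branch name with the “feature/” prefix from a work item title and lists the Git commands that would create the branch from an up-to-date master. The function names and the slug format are assumptions made for illustration, not part of Aone, and the commands are returned as data rather than executed.

```python
from __future__ import annotations

import re


def feature_branch_name(work_item: str) -> str:
    """Build a branch name with the required "feature/" prefix.

    The lowercase, dash-separated slug is an assumption of this sketch;
    the article only requires that the name begins with "feature/".
    """
    slug = re.sub(r"[^a-z0-9]+", "-", work_item.lower()).strip("-")
    if not slug:
        raise ValueError("work item title has no usable characters")
    return f"feature/{slug}"


def start_work_commands(work_item: str, base: str = "master") -> list[list[str]]:
    """Return the Git commands that create a feature branch from `base`."""
    branch = feature_branch_name(work_item)
    return [
        ["git", "checkout", base],           # start from the released master
        ["git", "pull", "--ff-only"],        # make sure it is up to date
        ["git", "checkout", "-b", branch],   # one branch per work item
    ]
```

For example, `start_work_commands("Add Login Page")` ends with a command that creates `feature/add-login-page`. Each inner list could be passed to `subprocess.run` by a caller that wants to execute it.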
Rule 2: Generate the production branch by merging the feature branches.
The production branch of AoneFlow is quite ingenious, and it is the lifeline of the entire system. In GitFlow, you first merge the completed feature branches into the common trunk (the development branch), and then pull the production branch from that trunk. TrunkBased takes a similar approach: it waits until all the necessary features have been merged into the master and then pulls the production branch from a specific point on the master. AoneFlow’s approach, on the other hand, is to pull a new branch from the master first, then merge each of the feature branches that the present version needs onto this branch one by one, and thereby form the production branch. The name of a production branch usually begins with “release/.”
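The assembly step in Rule 2 can be sketched as follows. This is an illustrative Python helper, not Aone’s implementation: the function name is hypothetical, and the use of `--no-ff` (to keep one merge commit per feature) is an assumption of the sketch rather than something the article prescribes.

```python
from __future__ import annotations


def build_release_commands(
    release: str, features: list[str], base: str = "master"
) -> list[list[str]]:
    """Return Git commands that assemble `release` from `base` plus features."""
    if not release.startswith("release/"):
        raise ValueError("production branch names should begin with 'release/'")
    # Pull a fresh branch from master first ...
    commands = [["git", "checkout", "-b", release, base]]
    # ... then merge the selected feature branches one by one.
    for feature in features:
        commands.append(["git", "merge", "--no-ff", feature])
    return commands
```

Calling `build_release_commands("release/test", ["feature/a", "feature/b"])` yields one branch-creation command followed by two merges, in the order the features were listed.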
This rule is quite simple, but putting it into action involves a lot of details.
First, the use of the production branch can be quite flexible. The basic methodology is to take each production branch and correlate it to a specific environment; for example, the release/test branch would correlate to the deployment and testing environment, and release/prod would correlate to the formal online environment, etc. Then, we need to combine a branch with the pipeline tool, scan the code in each environment and pass it through an automated testing stage. After that, we can push the produced deployment package directly to the appropriate environment.
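To make the branch-to-environment link concrete, here is a small Python sketch. The two branch names come from the paragraph above; the environment labels, the stage names, and the `pipeline_plan` function are hypothetical and only paraphrase the scan, test, and deploy steps described there.

```python
from __future__ import annotations

# Branch names are from the text; the environment labels are assumptions.
BRANCH_ENVIRONMENTS = {
    "release/test": "test",
    "release/prod": "production",
}

# Stage names paraphrase the paragraph above; they are not Aone identifiers.
PIPELINE_STAGES = ["code-scan", "automated-tests", "deploy"]


def pipeline_plan(branch: str) -> list[str]:
    """List the pipeline stages to run for a production branch, in order."""
    environment = BRANCH_ENVIRONMENTS.get(branch)
    if environment is None:
        raise KeyError(f"no environment is linked to {branch}")
    return [f"{stage}:{environment}" for stage in PIPELINE_STAGES]
```

A branch that is not linked to an environment, such as a feature branch, is rejected instead of being deployed anywhere by accident.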
A more advanced methodology is to let one production branch correspond to more than one environment; for example, chain the phased release and the formal production release together, and add some manual steps in the middle. The most advanced method is to group feature branches according to the iteration plan, create a fixed production branch that evolves with the iterations, and then add a series of environments to the pipeline of this production branch. This can look like a classic continuous integration pipeline. Another option is to create a production branch that combines all feature branches in order to test all of the proposed merges. This essentially produces the same effect as TrunkBased. Of course, these fancy advanced methods are just something I dreamed up. The production branches at Alibaba Cloud are typically quite regular.
Moreover, the combination of features on a production branch is dynamic and simple to adjust. In modern enterprise setups where Agile methodologies are common, a feature is sometimes ready and waiting to go online when a change in marketing strategy or client demands forces the team to delay its release or drop it completely.
It’s also possible that a feature turns out to have a serious development problem before going online and the team has to scrap it. Traditionally, when this happens, we have to go in and manually comb through the code and remove everything related to the feature that has already been merged to the production branch or the master branch. Anyone who has done that will know how complicated and frustrating the job is.
However, when using the AoneFlow methodology, you can recreate a production branch in a matter of minutes. All you need to do is delete the original production branch, then pull a new branch from the master and give it the same name. The only remaining step is to merge all the features that are still going ahead. This series of actions is easy to automate and keeps the repository clean and free of discarded code.
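The rebuild described above can be sketched in Python as a function that returns the Git commands to run. The function name and its parameters are hypothetical, and the sketch only covers the local repository.

```python
from __future__ import annotations


def rebuild_release_commands(
    release: str, merged: list[str], dropped: set[str], base: str = "master"
) -> list[list[str]]:
    """Return Git commands that recreate `release` without the dropped features."""
    kept = [feature for feature in merged if feature not in dropped]
    commands = [
        ["git", "checkout", base],                  # leave the branch before deleting it
        ["git", "branch", "-D", release],           # discard the old production branch
        ["git", "checkout", "-b", release, base],   # same name, clean start from master
    ]
    # Re-merge only the features that are still going ahead.
    commands += [["git", "merge", "--no-ff", feature] for feature in kept]
    return commands
```

Because the new branch starts from master, nothing from a dropped feature survives in it. If the production branch also exists on a remote, it would have to be replaced there as well, which this sketch leaves out.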
Furthermore, production branches are loosely coupled, meaning that multiple production environments can undergo feature testing simultaneously, and managing the feature integration in different deployments is much more convenient. While the production branches are only loosely coupled, that does not mean that they are unrelated. The test environment, integration environment, pre-production environment, phased release environment, and formal online environment are usually run in order. As long as a branch is passed on to the next environment only after all of its features are confirmed to work properly, features are released through a kind of funnel. Alibaba Cloud has an integrated platform capable of automating the migration of a complete group of features from one production branch to the next. We will go into more detail in the following section on tools.
Rule 3: After pushing to the online environment, merge the corresponding production branch to the master and tag the master, and at the same time delete the feature branches related to that production branch.
Once the pipeline of a production branch has completed a deployment to the online environment, the corresponding features are released, and the production branch is merged to the master. To avoid accumulating outdated feature branches in the code repository, you should clear out the feature branches that have already been released. As in GitFlow, the newest version on the master branch always matches the version that is currently online. If we have to roll back to a previous version, all we have to do is find the correct tag on the master branch.
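Rule 3 can be sketched the same way. The helper below is illustrative only: its name, the tag message, and the `--no-ff` merge are assumptions of this sketch, and it handles local branches only.

```python
from __future__ import annotations


def finish_release_commands(
    release: str, tag: str, features: list[str], base: str = "master"
) -> list[list[str]]:
    """Return Git commands to run after `release` has been deployed online."""
    commands = [
        ["git", "checkout", base],
        ["git", "merge", "--no-ff", release],                # master now matches what is online
        ["git", "tag", "-a", tag, "-m", f"Release {tag}"],   # rollback point on master
    ]
    # -d (not -D) only deletes branches that are already fully merged,
    # which guards against removing unreleased work by mistake.
    commands += [["git", "branch", "-d", feature] for feature in features]
    return commands
```

The safe `-d` delete works here because every released feature reached master through the production branch that was just merged.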
In addition to these three basic rules, there are some unwritten rules also. For example, when applying a hotfix after the service is online, the normal operation method would be to create a new production branch that corresponds to the online environment (essentially the hotfix branch). Simultaneously, you need to create a temporary pipeline for this branch to automate the implementation of necessary checks and tests before release.
However, a simple way to do this is to clear all of the feature branches from the production branch that corresponds to the current online environment, directly fix that branch, and then use the existing pipeline to release it automatically.
What if we have to fix a bug on a historical version? In this case, we can find the location of the appropriate tag on the master branch, and then create a hotfix branch from that location. However, because most of Alibaba Cloud’s products are online SaaS services, we don’t often run into this kind of situation.
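A short Python sketch of the tag-based operations follows. It assumes tags of the form `v1.2.3` and a `hotfix/` branch prefix; the article specifies neither, so both are labeled assumptions, as are the function names.

```python
from __future__ import annotations


def version_key(tag: str) -> tuple[int, ...]:
    """Parse a tag such as "v1.10.0" into (1, 10, 0) for numeric comparison."""
    return tuple(int(part) for part in tag.lstrip("v").split("."))


def previous_tag(tags: list[str], current: str) -> str:
    """Pick the newest tag that is older than `current` (a rollback target)."""
    older = [tag for tag in tags if version_key(tag) < version_key(current)]
    if not older:
        raise ValueError(f"no tag older than {current}")
    return max(older, key=version_key)


def hotfix_commands(tag: str, name: str) -> list[list[str]]:
    """Return the Git command that starts a hotfix branch at a historical tag."""
    # The "hotfix/" prefix is an assumption of this sketch.
    return [["git", "checkout", "-b", f"hotfix/{name}", tag]]
```

Comparing parsed tuples rather than raw strings matters: as plain text, `v1.9.3` would sort after `v1.10.0`.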
The simple rules that we have described above form the unique core of the AoneFlow system. While the steps in AoneFlow look simple, they did not come out of thin air; rather, they are the product of experience accumulated and refined over several years. Next, let’s talk for a moment about the technical hurdles of AoneFlow and how Alibaba Cloud handles them internally.
Optimizations of AoneFlow
Oftentimes, knowledge of techniques isn’t enough; practice and experience help people master new techniques. At Alibaba Cloud, our teams use similar dedication to develop great code using a complete set of tools. The habits we will be talking about here are not just about writing clean code and keeping branches organized, but also include a fair number of common conventions.
Let’s consider an example. In the AoneFlow process, each time we recreate a production branch, we have to re-merge and compile the code to produce a new deployment package. However, because the code is rebuilt each time, if the third-party packages it depends on resolve to different versions, the final product could behave differently from before. Therefore, the coding conventions at Alibaba Cloud clearly state that code on the online production branch cannot use dependency packages from a snapshot version (that is, an unreleased version whose contents can still change). This ensures that the product is the same each time we build it. There are a lot of little details like this, so good development habits are key to ensuring product quality.
Tools make cooperation among team members smoother. Once you understand the guiding principles, you can complete every branch creation, merging, and updating operation in AoneFlow using plain Git commands. However, some operations (for example, selecting the right feature branches to merge into a production branch) are prone to manual errors, and carrying out these tedious and complicated operations by hand every day is not a productive use of time.
Typically at Alibaba Cloud, teams using AoneFlow don’t have to use Git themselves for branching purposes; instead, they use a platform called Aone (henceforth referred to as the platform) that was developed internally especially for use at Alibaba Cloud. This platform handles 80% of the team’s work, from determining product requirements and user stories through to online deployment, and carries several efficiency-increasing tools embedded as service components. Among them, production components significantly improve the user experience of AoneFlow. One rather obvious aid is “Efficacy” which includes the features we have described below.
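A check for this convention could look like the following Python sketch. It assumes Maven-style version strings, where snapshot versions end in `-SNAPSHOT`; the article does not name the build tool, and the function name and sample coordinates are hypothetical.

```python
from __future__ import annotations


def snapshot_dependencies(dependencies: dict[str, str]) -> list[str]:
    """Return the names of dependencies pinned to a snapshot version.

    `dependencies` maps a package name to its version string. A version
    ending in "-SNAPSHOT" (Maven convention, assumed here) can change
    between builds, so it should not appear on an online production branch.
    """
    return sorted(
        name
        for name, version in dependencies.items()
        if version.upper().endswith("-SNAPSHOT")
    )
```

A pipeline could fail the build whenever this list is non-empty, for instance for `{"com.example:util": "2.0.0-SNAPSHOT"}`.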
Automation of the Entire Workflow
Since it is an internal tool, the entire platform is incredibly cohesive. You can perform all project operations in one location. This includes determining project requirements, breaking them into tasks, creating the corresponding feature branches online, integrating production branches, automatically creating test environments from a template, all the way to post-production operation and maintenance.
This process already goes far beyond the scope of task management. However, it is precisely for this reason that the platform can link feature branches to project requirements and ensure that feature branches are named consistently. It further connects production branches to deployment behavior to ensure that the source of each environment version is reliable. This creates an important end-to-end delivery flow.
Automating the Production Branch Pipeline
As a method of automating the workflow, the CI/CD pipeline is a common practice among many delivery teams. There are several code branches in an AoneFlow lifecycle, and whenever someone creates or updates any of these branches, a series of actions must follow. The pipeline connects the branches generated by these daily development processes with the intent behind them (for example, submitting code and then integrating it for testing). In particular, each of AoneFlow’s production branches is linked to a specific deployment environment, so we need to check and deploy its code in a timely manner.
Under ideal conditions, each branch should be served by its own matching pipeline. The production branches in AoneFlow are fixed, so continuous integration is easier than in the GitFlow method. In theory, we can use any pipeline tool with AoneFlow. However, Alibaba Cloud’s integrated platform provides a way for the pipeline to review code, inspect it for security, deploy it online, and so on. It also increases the usefulness of AoneFlow within the Alibaba Cloud team.
Branch Connection Management
Maintaining the relationships between feature branches and production branches is a problem unique to AoneFlow. Knowing exactly which feature branches make up a production branch is critical, because it determines which feature branches have to be re-merged when the production branch changes. For example, when we need to remove a feature from a specific production branch, we typically merge all of the features aside from the ones that need to be removed and use that branch to replace the original one. Recording all of this information is no small matter, so it is very helpful if the platform can display it and assist with these operations.
When we place a group of features on a low-level production environment (for example, an integrated test environment), then once the verification is complete, we want to migrate its contents to the branch corresponding to a higher-level environment (like a pre-production environment). This way we can ensure that the online version has already gone through pre-production verification, that the pre-production version has already gone through integration verification, and so on, linking all production branches into a chain. Again, you can achieve this operation using normal Git commands; however, a graphical tool makes the workflow more intuitive.
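The bookkeeping described above can be sketched as a small in-memory registry. This is an illustrative Python data structure, not how Aone stores the information; the class and method names are hypothetical.

```python
from __future__ import annotations


class ReleaseRegistry:
    """Track which feature branches each production branch is built from."""

    def __init__(self) -> None:
        self._features: dict[str, list[str]] = {}

    def add(self, release: str, feature: str) -> None:
        """Record that `feature` has been merged into `release`."""
        merged = self._features.setdefault(release, [])
        if feature not in merged:
            merged.append(feature)

    def remove(self, release: str, feature: str) -> list[str]:
        """Drop a feature and return the ones to re-merge, in original order."""
        merged = self._features.get(release, [])
        if feature not in merged:
            raise ValueError(f"{feature} is not part of {release}")
        merged.remove(feature)
        return list(merged)

    def releases_containing(self, feature: str) -> list[str]:
        """List the production branches that currently include `feature`."""
        return sorted(r for r, fs in self._features.items() if feature in fs)
```

The list returned by `remove` is exactly the set of features to merge onto a freshly pulled branch that replaces the original production branch.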
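The promotion chain can be modeled in a few lines of Python. The environment order follows the sequence named earlier in the article; the short labels, the `promote` function, and the state layout (environment name mapped to a set of feature branches) are assumptions of this sketch.

```python
from __future__ import annotations

# Order taken from the article: test, integration, pre-production,
# phased release, formal online. The short labels are assumptions.
ENVIRONMENT_ORDER = ["test", "integration", "pre-production", "phased", "production"]


def promote(
    state: dict[str, set[str]], source: str, verified: set[str]
) -> dict[str, set[str]]:
    """Copy verified features from `source` to the next environment.

    Returns a new state and leaves the input untouched. A feature can only
    move up if it is already present in `source`, which enforces the funnel.
    """
    index = ENVIRONMENT_ORDER.index(source)
    if index == len(ENVIRONMENT_ORDER) - 1:
        raise ValueError(f"{source} is already the last environment")
    missing = verified - state.get(source, set())
    if missing:
        raise ValueError(f"not deployed in {source}: {sorted(missing)}")
    target = ENVIRONMENT_ORDER[index + 1]
    new_state = {env: set(features) for env, features in state.items()}
    new_state.setdefault(target, set()).update(verified)
    return new_state
```

Because a feature must already exist in the lower environment, nothing can reach the online environment without having passed through every earlier stage.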
Furthermore, the platform is capable of integrating and displaying the code from each of the branches on the platform, including the deployment environment and machine information corresponding to the branches as well as the record of operations. These “high value-added” aids make AoneFlow more resilient and are crucial to supporting complex projects for the Alibaba Cloud team.
There is no one-size-fits-all approach to code branching. The key is that the release cadence matches the scope of the project. AoneFlow, which grew out of Alibaba Cloud’s cooperative development platform Aone, offers a complete set of branch management methods. It uses a flexible, highly effective, and simple-to-use workflow to ensure the delivery of all of the products under Alibaba Cloud’s flag. If you’re still not sure which branching method to use, and cannot quite tear yourself away from GitFlow’s concurrent feature development or do not want to part with the continuous integration features of TrunkBased, then AoneFlow may very well be worth your consideration.