Deploying Cross-AZ Windows Server Failover Clustering (WSFC) in Alibaba Cloud

WSFC is a feature of the Windows Server platform that improves the availability of applications and services on your network. It is the successor to Microsoft Cluster Service (MSCS). We recommend using WSFC together with SQL Server AlwaysOn Availability Groups as your SQL Server high availability (HA) solution on Alibaba Cloud’s Elastic Compute Service (ECS) instances.

An Alibaba Cloud ECS Instance provides fast memory and the latest Intel CPUs to help you to power your cloud applications and achieve faster results with low latency. All ECS instances come with Anti-DDoS protection to safeguard your data and applications from DDoS and Trojan attacks.

Alibaba Cloud ECS lets you run applications on multiple operating systems and manage network access rights and permissions. From the user console, you can also access the latest storage features, including auto snapshots, which are useful when testing new tasks or operating systems because you can take a quick copy and restore it later. ECS offers a variety of CPU, memory, data disk, and bandwidth configurations, so you can tailor each instance to your specific needs.

When using WSFC in conjunction with Alibaba Cloud ECS, if one cluster node fails, another node can take over. We can configure this failover to happen automatically, which is the usual configuration, or we can manually trigger a failover.

In this tutorial, we deploy a Cross-Availability Zone (AZ) WSFC on Alibaba Cloud ECS instances. This tutorial assumes a basic understanding of Alibaba Cloud’s suite of products and services, the Alibaba Cloud Console, failover clustering, Active Directory (AD), and the administration of Windows Server.

1. Introduction

1.1 The Architecture

We recommend the following configuration, which contains three servers and runs inside an Alibaba Cloud Virtual Private Cloud (VPC). The VPC provides an isolated cloud network so that your resources operate in a secure environment:

• A primary ECS instance running Windows Server 2016.
• A secondary ECS instance, configured to match the primary instance, running in another Availability Zone.
• An Active Directory (AD) / domain name server (DNS) instance. This server will serve several roles:

  1. Providing a Windows domain.

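Note: The quorum witness is sometimes referred to as the Disk Witness or the File Share Witness. A disk witness is simply a small clustered disk in the cluster's available storage group, while a file share witness is a file share that every node can reach. This tutorial uses a file share witness.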
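To see why a two-node cluster benefits from a witness, the following minimal Python sketch works through the majority-vote arithmetic. It illustrates the general "strict majority" rule only, not WSFC's actual implementation (which has additional behavior such as dynamic quorum). The function names are hypothetical.

```python
def votes_needed(total_votes: int) -> int:
    """Smallest number of votes that forms a strict majority."""
    return total_votes // 2 + 1


def has_quorum(total_votes: int, votes_online: int) -> bool:
    """True when the surviving voters still form a strict majority."""
    return votes_online >= votes_needed(total_votes)


# Two nodes, no witness: losing one node leaves 1 of 2 votes.
two_nodes_one_down = has_quorum(total_votes=2, votes_online=1)

# Two nodes plus a witness: losing one node leaves 2 of 3 votes.
with_witness_one_down = has_quorum(total_votes=3, votes_online=2)

print(two_nodes_one_down, with_witness_one_down)
```

Under this simplified rule, the witness supplies the third vote that lets the surviving node keep a majority.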

Figure 1: The Architecture

1.2 Understanding the Network Routing

When the cluster fails over, requests must go to the newly active node. This routing is usually handled by the Address Resolution Protocol (ARP), which associates IP addresses with MAC addresses.

However, in Alibaba Cloud, the VPC system uses software-defined networking, which does not provide MAC addresses. This means the changes broadcast by ARP don’t affect routing. To make routing work, we need to make use of an Alibaba Cloud product called HAVIP (Highly Available Virtual IP).

In this scenario we need to form a cluster across two different subnets in two availability zones. So, we will need to employ two HAVIPs.
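Before creating the HAVIPs, it can help to sanity-check the addressing plan. The sketch below uses Python's standard `ipaddress` module to check that the two subnets do not overlap and that each HAVIP falls inside the subnet of its own zone. All CIDR blocks, addresses, and zone labels here are hypothetical placeholders, not values from this tutorial.

```python
import ipaddress

# Hypothetical addressing plan; replace with your own VSwitch CIDR blocks.
subnets = {
    "zone-a": ipaddress.ip_network("192.168.0.0/24"),
    "zone-b": ipaddress.ip_network("192.168.1.0/24"),
}
# Hypothetical HAVIP addresses, one per zone.
havips = {
    "zone-a": ipaddress.ip_address("192.168.0.100"),
    "zone-b": ipaddress.ip_address("192.168.1.100"),
}


def plan_is_valid(subnets, havips):
    """Check that subnets do not overlap and each zone's HAVIP is in its own subnet."""
    nets = list(subnets.values())
    no_overlap = all(
        not a.overlaps(b) for i, a in enumerate(nets) for b in nets[i + 1:]
    )
    each_in_own_subnet = all(
        zone in havips and havips[zone] in subnets[zone] for zone in subnets
    )
    return no_overlap and each_in_own_subnet


print(plan_is_valid(subnets, havips))
```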

1.3 Understanding a Failover

When a failover happens in the cluster, the following changes take place:

  1. Windows failover clustering changes the status of the active node to indicate that it has failed.

That’s it! Let’s start the tutorial from the Alibaba Cloud Console.

2. Preparing the Environment in the Alibaba Cloud Console

First, log in to your Alibaba Cloud Console. We will now set up your Alibaba Cloud account to work with the WSFC environment.

2.1 Create Your VPC

  1. In the Alibaba Cloud Console, find and click “VPC” on the left-hand menu.

2.2 Create Three ECS Instances

  1. In the Alibaba Cloud Console, find and click “Elastic Compute Service” on the left-hand menu.

2.3 Create Two HAVIPs

Next, we need to create two HAVIPs, one in each availability zone, and then bind the instance in each subnet to that subnet's HAVIP.

In Alibaba Cloud, all IPs in a VPC and its underlying VSwitches are assigned dynamically. So, you must use HAVIP to configure a static IP that can serve as the virtual IP for a Windows Server Failover Cluster and other application clusters on ECS.

By default, the HAVIP button is not available. You will need to open a support ticket asking to have your account whitelisted for HAVIP.

Once HAVIP is available under VPC, complete the following steps:

  1. Click “Create a HAVIP Address”.
  2. Select the VSwitch and specify the private IP that you want to use as a static virtual IP.
  3. Add both nodes that will be part of the high availability cluster.
  4. The primary node is labeled “Master”, while the secondary node is labeled “Slave”.
  5. Check that the new HAVIP is reachable from the ECS instance. If you can successfully ping it, this IP can now be used for your Windows cluster.

2.4 In Summary

For the remainder of this tutorial, we will assume the following environment has been set up:

3. Configure Both Instances to Join the Domain

  1. Use RDP to connect to the wsfc-a instance.
  • [wsfc-a]> $DNS = "" # Private IP of ad-1 instance
    [wsfc-a]> $LocalStaticIp = "" # Private IP of this instance
    [wsfc-a]> $DefaultGateway = ""
  2. Obtain the name of the network interface that holds the private static IP. In this case, it is “Ethernet”:
  • [wsfc-a]> netsh interface ip show address
    Configuration for interface "Ethernet"
        DHCP enabled: No
        IP Address:
        Subnet Prefix: (mask
        Default Gateway:
        Gateway Metric: 1
        InterfaceMetric: 15
  3. Set the static IP address and default gateway to:
  • $LocalStaticIp $DefaultGateway 1
    Note: RDP might lose connectivity for a few seconds or require you to reconnect.
  4. Enter the credentials of an account with permission to join the domain when prompted.
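When setting a static address with netsh, you need the address, the subnet mask, and a gateway that lies in the same subnet. As a hedged illustration, the Python sketch below derives the mask from a CIDR-style address and assembles the argument list for `netsh interface ip set address`. The helper name `netsh_static_ip_args` and all addresses are hypothetical, and the argument order is an assumption based on the classic netsh syntax, so compare it with `netsh interface ip set address /?` on your instance before relying on it.

```python
import ipaddress


def netsh_static_ip_args(interface_name, address_with_prefix, gateway):
    """Build the netsh argument list for a static IP (hypothetical helper)."""
    iface = ipaddress.ip_interface(address_with_prefix)
    gw = ipaddress.ip_address(gateway)
    # The default gateway must be reachable on the local subnet.
    if gw not in iface.network:
        raise ValueError(f"gateway {gw} is outside {iface.network}")
    return [
        "netsh", "interface", "ip", "set", "address",
        f"name={interface_name}", "static",
        str(iface.ip), str(iface.network.netmask), str(gw),
        "1",  # gateway metric
    ]


# Placeholder values only; substitute your instance's address and gateway.
args = netsh_static_ip_args("Ethernet", "192.168.0.10/24", "192.168.0.253")
print(" ".join(args))
```

Validating the gateway up front catches a common mistake: copying the gateway of the other Availability Zone's subnet onto this node.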

4. Configuring the Cluster

  1. Use RDP to connect to the wsfc-a instance with the credentials we created in the previous step.
  2. Right-click Failover Cluster Manager > Create Cluster.
  3. On the Select Servers page, add both servers.
  4. Click Next and keep the option to run configuration validation tests.
  5. Click Next twice to run the tests. Make sure none of the tests have failed.
  6. Click Finish to return to the Create Cluster Wizard.
  7. Click Next twice to create the cluster and then Finish to complete the wizard.
  8. We can now move on to create the file share witness to help the cluster achieve quorum.
  9. Click Next.
  10. Click Next after confirming the settings.

5. Testing the Setup

  1. In the HAVIP web console, each server has been promoted to Master in its respective HAVIP. From the WSFC perspective, however, the cluster resource is online on a single node. In this case, wsfc-b is the active node in the cluster.
  2. Next, we will simulate a failover and make sure the connection works as expected.
  3. Use RDP to connect to either instance in the cluster. In Failover Cluster Manager, right-click the cluster and select More Actions > Move Core Cluster Resources > Select Node.
  4. Since the resource is currently online on wsfc-b, we only see wsfc-a as a candidate for the failover. Select the node and click OK to complete this action.
  5. The failover should complete very quickly, but if we go back to the ad-1 server, after refreshing the DNS, we can perform the ping again and notice that the failover is assigned to
  6. To fail back to the previous server, we can repeat steps 3 and 4 above, and we will see wsfc-b in the selection list.
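The behavior observed in this test can be summarized with a toy model: the cluster name resolves to the HAVIP that belongs to the currently active node, and only the other node is offered as a failover target. The Python sketch below is purely illustrative. The class name, method names, and IP addresses are hypothetical, and it does not talk to a real cluster.

```python
# Hypothetical HAVIP addresses, one per Availability Zone.
HAVIP_BY_NODE = {
    "wsfc-a": "192.168.0.100",
    "wsfc-b": "192.168.1.100",
}


class ClusterModel:
    """Toy model: the cluster name resolves to the HAVIP of the active node."""

    def __init__(self, active_node):
        self.active_node = active_node

    def candidates(self):
        """Nodes offered as failover targets: every node except the active one."""
        return [node for node in HAVIP_BY_NODE if node != self.active_node]

    def move_core_resources(self, target):
        """Move the core cluster resources to another node."""
        if target not in self.candidates():
            raise ValueError(f"{target} is not a valid failover target")
        self.active_node = target

    def resolve(self):
        """Address a client would reach after DNS has been refreshed."""
        return HAVIP_BY_NODE[self.active_node]


cluster = ClusterModel(active_node="wsfc-b")
before = cluster.resolve()
cluster.move_core_resources("wsfc-a")
after = cluster.resolve()
print(before, "->", after)
```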


That’s it! We have successfully created a cross-AZ failover cluster using Windows Server on Alibaba Cloud.
