Interested in learning more about Alibaba Cloud? Check out our course on Handling Large Traffic with a Load Balancer and get certified today!
Traditionally, we need a web server to deliver services to our customers. It is tempting to imagine a single, very powerful web server that can do everything we want: host any service and serve as many customers as possible.
However, with only one web server, there are two major concerns. The first is that there is always a limit to what one server can handle. If your business is booming and more and more users are visiting your website, one day you will reach that capacity limit and deliver a very unsatisfying experience to your users.
Also, with only one web server, you have a single point of failure. For example, a power outage or a network connection issue may hit your server. If your single server goes down, your customers are completely cut off, and you cannot provide your service anymore. This is the risk you run with only one web server, no matter how powerful it is.
How Can a Server Load Balancer Help?
Now you may be wondering how to extend the capacity of your web server. Usually, you add one more server as your business keeps growing. But if you add another server, how do end-users know which server to access? For a good user experience, end-users should not be exposed to the complexity of the backend setup.
Therefore, what kind of service or device can we put between the end-user and the backend server?
The answer is a load balancing device or piece of software. We place it in the middle so that it accepts requests from end-users and uses a specific mechanism or algorithm to distribute them to the backend servers, balancing the load. That is why it is called a load balancer. It not only solves the problems of a single point of failure and the service capacity limit, but also delivers a consistently satisfying experience to end-users.
Now let’s look at how Alibaba Cloud provides the load balancing service through Server Load Balancer (SLB). Alibaba Cloud SLB is a traffic distribution control service that can distribute the incoming traffic among multiple Elastic Compute Service (ECS) instances based on the configured forwarding rules.
Benefits of Using a Server Load Balancer
There are four major features of SLB. The first is high availability. In regions with multiple zones, SLB is by default deployed with a master instance in one zone and a slave instance in another, so a zone-level failure does not interrupt your service. This is the high availability feature of SLB.
Also, with SLB, you gain scalability: you can add or remove servers behind the SLB, and end-users will never notice any turbulence. Additionally, the SLB service costs far less than comparable physical hardware.
The last feature is security. SLB is one of the Alibaba Cloud network services, so it can, by default, leverage most of the security products and features Alibaba Cloud provides to protect your business. Against DDoS attacks, SLB offers basic DDoS protection.
The Inner Workings of Server Load Balancer
Let’s talk about the basic architecture of SLB. Alibaba Cloud SLB sits between the end-users and the backend servers, which are ECS instances. SLB consists of three major components: the instance, the listeners, and the backend servers.
As a cloud user, you can create one or more SLB instances in a region. For every load balancing instance, you need to create at least one listener. For the backend, you need to tell the load balancer which servers, and how many of them, you want to put behind the SLB. These are the three major components to consider when you create and configure an SLB.
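To make the three components concrete, here is an illustrative sketch of an SLB setup as plain Python data. The field names and server names are hypothetical and do not match the real Alibaba Cloud SLB API:

```python
# Illustrative sketch of an SLB setup; all field names and
# values are hypothetical, not the real SLB API schema.
slb_config = {
    "instance": {                    # the SLB instance itself
        "region": "cn-hangzhou",
        "address_type": "internet",  # public vs. private network SLB
    },
    "listeners": [                   # at least one listener per instance
        {"protocol": "HTTP", "port": 80, "scheduler": "wrr"},
    ],
    "backend_servers": [             # the ECS instances behind the SLB
        "ecs-backend-1",
        "ecs-backend-2",
    ],
}
```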
There are two kinds of SLB instances based on their connectivity. The first is the Public Network SLB. As the name suggests, this SLB is bound to a public IP address, and its billing mode is Pay-As-You-Go, which means you pay for the instance rental plus the public traffic; if you use extra volume, you pay for that extra volume. So far, Pay-As-You-Go is the only billing mode available for the Public Network SLB.
The Private Network SLB is completely free. As the name suggests, it can only be used in a private network environment, and no public IP address is ever assigned to it. We often use both kinds together: a public SLB serves user requests coming in from the Internet, while private SLBs internally forward different requests to different sets of backend servers. This two-layer architecture is more scalable and elastic.
Customizing Your Server Load Balancer
When we talk about the listener component, we need to pay attention to four configurations. The first is the routing rules, including which scheduling mechanism is used to distribute traffic to the backend servers.
Another feature is called session stickiness. If you don’t want SLB to forward requests from the same session to different servers, you can enable the session stickiness option. The load balancer will then remember the session and keep sending its packets to the same backend server until the session ends.
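Conceptually, session stickiness is a lookup from a session identifier (for example, a cookie value) to the backend server chosen for that session. The sketch below illustrates the idea only; the server names are made up and this is not SLB’s actual implementation:

```python
from itertools import cycle

backends = cycle(["ecs-1", "ecs-2", "ecs-3"])  # hypothetical backends
sticky_table = {}  # session id -> backend chosen for that session

def route(session_id):
    # The first request of a session is scheduled normally; every
    # later request carrying the same session id returns to the
    # same backend until the entry is dropped (session ends).
    if session_id not in sticky_table:
        sticky_table[session_id] = next(backends)
    return sticky_table[session_id]
```

Repeated calls with the same session id always land on the same backend.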
The next feature is the health check configuration. The load balancer does not just forward requests blindly; it has a mechanism to check the backend servers through different protocols. If it finds that a server is down or unhealthy, SLB stops sending requests to that server so that end-users don’t experience response turbulence.
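The health-check logic can be sketched as filtering the backend pool through a pluggable probe, where the probe stands in for the protocol-level check. All names here are illustrative:

```python
def healthy_backends(servers, probe):
    # `probe` stands in for a protocol-level check (e.g. a TCP
    # connect or HTTP request) and returns True when the server
    # answers. Unhealthy servers are excluded from scheduling
    # until they pass the check again.
    return [s for s in servers if probe(s)]

# Example: pretend "ecs-2" is currently down.
servers = ["ecs-1", "ecs-2", "ecs-3"]
up = healthy_backends(servers, probe=lambda s: s != "ecs-2")
```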
Also, you can use the SLB configuration to define the peak bandwidth, that is, an upper limit that SLB will not exceed. Note that no matter how many listeners you have, they all share the same peak bandwidth.
As mentioned above, for the listeners you need to configure the request scheduling mechanism. SLB supports three scheduling algorithms. The first and most straightforward one is Round Robin (RR): requests are simply sent to the servers one after another. It’s a very easy-to-understand mechanism.
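A minimal round-robin scheduler can be written in a few lines; the server names here are made up for illustration:

```python
from itertools import cycle

servers = ["ecs-1", "ecs-2", "ecs-3"]  # hypothetical backends
rr = cycle(servers)

# Each incoming request is handed to the next server in turn,
# wrapping around to the first server after the last.
assignments = [next(rr) for _ in range(5)]
```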
Building on Round Robin, there is Weighted Round Robin (WRR). Requests are still distributed in turn, but according to the weights you assign to the backend servers: the servers with the highest weights receive the most requests.
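One simple way to realize weighted round robin is to repeat each server in the rotation in proportion to its weight; SLB’s exact algorithm may differ, and the weights below are made up:

```python
from itertools import cycle

weights = {"ecs-1": 3, "ecs-2": 1}  # hypothetical weights

# Repeat each server in the schedule `weight` times, so ecs-1
# receives 3 of every 4 requests.
schedule = [name for name, w in weights.items() for _ in range(w)]
wrr = cycle(schedule)
first_four = [next(wrr) for _ in range(4)]
```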
The last one is called Weighted Least Connections (WLC). It considers not only the weight but also the number of active connections SLB has sent to each server; the server with the fewest weight-adjusted connections receives the new request.
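A WLC decision can be sketched as picking the server with the fewest active connections per unit of weight; the snapshot of server state below is made up, and this is an illustration of the idea rather than SLB’s exact formula:

```python
servers = [
    # hypothetical snapshot of backend state
    {"name": "ecs-1", "weight": 2, "active": 10},
    {"name": "ecs-2", "weight": 1, "active": 4},
    {"name": "ecs-3", "weight": 4, "active": 12},
]

def pick_wlc(servers):
    # Choose the server with the fewest active connections per
    # unit of weight, so higher-weight servers can carry more
    # concurrent connections before being deprioritized.
    return min(servers, key=lambda s: s["active"] / s["weight"])

chosen = pick_wlc(servers)
```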
Ready to test your knowledge? Take the Handling Large Traffic with a Load Balancer course and get certified today!