Auto Scaling, a popular cloud service orchestration product on Alibaba Cloud, automatically adjusts elastic computing resources based on the policies you define and your business needs. It keeps infrastructure costs reasonable while absorbing changes in business load. Based on your scaling policies and scaling mode, Auto Scaling automatically adds Elastic Compute Service (ECS) instances to maintain computing capacity when business demand grows, and removes ECS instances to save costs when demand declines. Auto Scaling also automatically replaces unhealthy ECS instances so that business loads are served normally at all times. In this way, Auto Scaling delivers true elastic processing capabilities for business loads in complex scenarios without manual intervention.
We have received a lot of valuable feedback from users. Recently, we fully upgraded Auto Scaling Service so that you can respond to business changes more flexibly and effectively, and keep business development fast and stable at a lower cost. The following sections describe these updates.
Richer Configurations and More Flexible Management
Scaling groups support adding or modifying SLB instances and ApsaraDB for RDS instances
In practice, you often need to add or change the Server Load Balancer (SLB) instances or ApsaraDB for RDS instances bound to a scaling group. Previously, once a scaling group was created, its SLB and ApsaraDB for RDS configurations could not be modified, so you had to create a new scaling group. After the upgrade, Auto Scaling supports attaching and detaching SLB instances and ApsaraDB for RDS instances, making it easy to handle architectural changes or upgrades without creating a new scaling group.
Server Load Balancer
Auto Scaling Service integrates with Server Load Balancer (SLB), which allows you to attach an SLB instance to a scaling group and use it to distribute traffic to each instance in the group. For a long time, SLB instances could be specified only when a scaling group was created and could not be modified afterwards. This meant that you had to carefully consider your business demand and the required number of SLB instances when you created a scaling group. This limitation is now removed by two new Auto Scaling Service operations: AttachLoadBalancers and DetachLoadBalancers.
Attach an SLB instance to a scaling group
You can attach an SLB instance to a scaling group, during which either of the following actions is performed based on the specified value of forceAttach:
- When forceAttach is set to true, all instances in the current scaling group are attached to the SLB instance backend when you attach the SLB instance to the scaling group.
- When forceAttach is set to false, existing instances in the current scaling group are not attached to the SLB instance backend when you attach the SLB instance to the scaling group.
To add all instances in a scaling group to the backend of an existing SLB instance, you can attach this SLB instance to the scaling group again and set the value of forceAttach to true.
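The effect of forceAttach can be illustrated with a small in-memory model. This is only a sketch: the LoadBalancer and ScalingGroup classes, their fields, and the instance IDs are hypothetical stand-ins invented for illustration, not the Auto Scaling API.

```python
from dataclasses import dataclass, field


@dataclass
class LoadBalancer:
    """Hypothetical stand-in for an SLB instance and its backend server list."""
    slb_id: str
    backends: set = field(default_factory=set)


@dataclass
class ScalingGroup:
    """Hypothetical stand-in for a scaling group."""
    instance_ids: list
    attached_slbs: dict = field(default_factory=dict)

    def attach_load_balancer(self, slb: LoadBalancer, force_attach: bool = False) -> None:
        # Re-attaching the same SLB instance keeps a single entry in the group.
        self.attached_slbs[slb.slb_id] = slb
        if force_attach:
            # forceAttach = true: existing instances join the SLB backend.
            slb.backends.update(self.instance_ids)


group = ScalingGroup(instance_ids=["i-a", "i-b"])
slb = LoadBalancer("lb-1")

# forceAttach = false: the SLB is attached, existing instances are left out.
group.attach_load_balancer(slb, force_attach=False)
backends_after_plain_attach = sorted(slb.backends)

# Attaching again with forceAttach = true adds the existing instances.
group.attach_load_balancer(slb, force_attach=True)
backends_after_force_attach = sorted(slb.backends)
```

In this model the second call plays the role of the "attach again with forceAttach set to true" step described above.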
Note that due to the limitations of SLB instance types, the SLB instance to be attached to a scaling group must meet the following conditions:
- The SLB instance must be undeleted.
- The SLB instance must be in the same region as the scaling group.
- The SLB instance must be in Active state.
- The SLB instance must have at least one listener, and its health check function must be enabled.
- If both the SLB instance and the scaling group are of VPC type, they must be in the same VPC environment.
- If the scaling group is of VPC type but the SLB instance is of classic network type, the VPC instance (if any) on the SLB instance backend server must be in the same VPC environment as the current scaling group.
- The number of SLB instances attached to the scaling group must be less than the quota for the scaling group.
Detach an SLB instance from a scaling group
When you detach an SLB instance from a scaling group, either of the following actions is performed based on the specified value of forceDetach:
- If forceDetach is set to true, the SLB backend instances associated with the scaling group are also detached when you detach the SLB instance from the scaling group.
- If forceDetach is set to false, the SLB backend instances associated with the scaling group are not detached when you detach the SLB instance from the scaling group.
Before detaching an SLB instance from the scaling group, confirm that the SLB instance no longer distributes requests to instances in the scaling group, to avoid losing service requests. Furthermore, unlike the AttachLoadBalancers operation, the detach operation cannot be repeated for the same SLB instance.
ApsaraDB for RDS
ApsaraDB for RDS, a stable and reliable online database service of Alibaba Cloud, supports MySQL, SQL Server, PostgreSQL, and PPAS. It provides a complete set of solutions for disaster recovery, backup, restoration, monitoring, migration, and other features, to free you from database O&M. Auto Scaling Service integrates with ApsaraDB for RDS, which enables the instances in a scaling group to be automatically added to a whitelist, ensuring secure access to the ApsaraDB for RDS instances.
Attach an ApsaraDB for RDS instance to a scaling group
When you attach an ApsaraDB for RDS instance to a scaling group, either of the following actions is performed based on the value of forceAttach:
- If forceAttach is set to true, the private IP addresses of all instances in the scaling group are added to the IP address whitelist when you attach the ApsaraDB for RDS instance to the scaling group.
- If forceAttach is set to false, the private IP addresses of all instances in the scaling group are not added to the IP address whitelist when you attach the ApsaraDB for RDS instance to the scaling group.
If you attach an existing ApsaraDB for RDS instance to the scaling group again, the number of ApsaraDB for RDS instances in the group remains unchanged but the private IP addresses of all instances in the group are added to the IP address whitelist.
Note that the ApsaraDB for RDS instance to be attached to a scaling group must meet the following conditions:
- The ApsaraDB for RDS instance must be undeleted.
- The ApsaraDB for RDS instance must be unlocked.
- The ApsaraDB for RDS instance must be in running state.
- For the default ApsaraDB for RDS instance group, the total number of IP addresses in the whitelist must not exceed 1,000.
Detach an ApsaraDB for RDS instance from a scaling group
When you detach an ApsaraDB for RDS instance from a scaling group, either of the following actions is performed based on the specified value of forceDetach:
- If forceDetach is set to true, the IP addresses of instances associated with the scaling group are removed from the whitelist when you detach the ApsaraDB for RDS instance from the scaling group.
- If forceDetach is set to false, the IP addresses of instances associated with the scaling group are not removed from the whitelist when you detach the ApsaraDB for RDS instance from the scaling group.
You can set the value of forceDetach as needed. Note that you cannot repeat the removal operation for the same ApsaraDB for RDS instance.
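The whitelist behavior for forceAttach and forceDetach can be sketched with plain sets. The function names and the sample IP addresses are hypothetical; only the two flags and the 1,000-address limit for the default whitelist group come from the text, and the real service may enforce the limit differently.

```python
WHITELIST_LIMIT = 1000  # limit for the default whitelist group, as stated above


def attach_rds(whitelist, group_ips, force_attach):
    """Return the whitelist after attaching an RDS instance to a scaling group."""
    if not force_attach:
        # forceAttach = false: existing private IP addresses are not added.
        return set(whitelist)
    updated = set(whitelist) | set(group_ips)
    if len(updated) > WHITELIST_LIMIT:
        raise ValueError("whitelist would exceed 1,000 IP addresses")
    return updated


def detach_rds(whitelist, group_ips, force_detach):
    """Return the whitelist after detaching an RDS instance from a scaling group."""
    if not force_detach:
        # forceDetach = false: the group's IP addresses stay in the whitelist.
        return set(whitelist)
    return set(whitelist) - set(group_ips)


group_ips = ["10.0.0.11", "10.0.0.12"]
whitelist = attach_rds({"10.0.0.1"}, group_ips, force_attach=True)
remaining = detach_rds(whitelist, group_ips, force_detach=True)
```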
Auto Scaling supports configuration modification, image-based default password, and other functions
As with scaling groups, scaling configurations can now be modified, so you no longer need to create a new scaling configuration for each change. Auto Scaling Service also supports more ECS features, such as the image-based default password.
Auto Scaling Service supports modification of the following parameters:
- passwordInherit (image-based default password)
- hostName (host name)
Auto Scaling supports setting UserData, KeyPair, RamRole, and Tags
To ensure higher flexibility and elasticity, Auto Scaling Service supports four additional features, namely, UserData, KeyPair, RamRole, and Tags. UserData allows you to complete the automatic configuration process quickly and securely. When the number of ECS instances changes with the business needs, you can perform application-level scale-up and scale-down quickly and securely. Also, you can configure KeyPair, Tags, and other parameters for more efficient and intelligent ECS instance management.
Custom instance data (UserData)
UserData, custom instance data, is a feature provided by Alibaba Cloud ECS to customize instance startup behaviors and import data. This feature is compatible with Windows instances and Linux instances and is mainly used as:
- Instance customization script, to be executed when an instance is started.
- Normal data, to be imported into an instance. You can reference the data in the instance.
To scale applications up and down automatically with Auto Scaling Service, you would typically use custom images or open-source IT infrastructure management tools such as Terraform. Now that the UserData parameter is available in Auto Scaling Service, you only need to prepare a custom script (UserData) and pass it to the scaling configuration as Base64-encoded data. When Auto Scaling Service scales ECS instances, the custom instance script (UserData) is executed automatically on the new instances, which achieves application-level scale-up and scale-down. Compared with custom images or other open-source tools, the native UserData feature of Auto Scaling Service makes automatic scaling faster and more secure.
Pay attention to the following aspects when you create a scaling configuration and use the UserData parameter:
- UserData is only supported for the scaling configuration in the VPC environment.
- UserData must be Base64-encoded.
- UserData is passed in without encryption. Therefore, do not include confidential information (such as passwords and private keys) in plaintext. If such information must be included, encrypt it before Base64-encoding it, and decrypt it inside the instance using the same scheme.
For more information about using UserData, see Alibaba Cloud Custom Instance Data.
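Preparing Base64-encoded UserData can be sketched with the Python standard library. The shell script content and the scaling_configuration dictionary are illustrative assumptions; only the UserData parameter name and the Base64 requirement come from the text.

```python
import base64

# Hypothetical startup script; replace it with your own initialization steps.
script = """#!/bin/sh
echo "instance initialized" > /tmp/userdata.log
"""

# Base64-encode the script and keep the result as an ASCII string.
user_data = base64.b64encode(script.encode("utf-8")).decode("ascii")

# Illustrative container for the scaling configuration parameters.
scaling_configuration = {"UserData": user_data}

# Decoding restores the original script, which is a quick sanity check.
decoded = base64.b64decode(scaling_configuration["UserData"]).decode("utf-8")
```

Because the encoding is reversible by anyone, this step does not protect secrets, which is why the note about plaintext confidential information matters.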
SSH key pair (KeyPairName)
You can log on to a remote Linux server over SSH using a password or an SSH key. When you manage multiple server clusters, entering passwords frequently is time consuming and prone to logon failures caused by typing errors. If you log on using an SSH key instead, you only need to configure your public key and private key. The keys remain valid for a long time once configured.
The SSH key created by Alibaba Cloud only supports RSA 2048-bit key pairs. When Alibaba Cloud generates the SSH key, it keeps the public key and returns the private key to you.
The KeyPairName parameter in the auto scaling configuration allows you to log on to the server using the SSH key. When you create a scaling configuration, you can select the desired key pair name for the KeyPairName parameter. An ECS instance created using Auto Scaling Service keeps the public key. You only need to configure the private key on your local device, to log on to the server quickly using the SSH key.
Pay attention to the following aspects when you create a scaling configuration and use the KeyPairName parameter:
- This parameter is ignored for Windows ECS instances. Any KeyPairName value you specify has no effect.
- If you specify the KeyPairName parameter, password logon is disabled for Linux ECS instances upon initialization.
RAM role name (RamRoleName)
Resource Access Management (RAM) is a service provided by Alibaba Cloud for user identity management and access control. Using RAM, you can create and manage user accounts (for example, employees, systems, and applications) and control the operation permissions for resources under them. If multiple users in your enterprise collaboratively work with resources, RAM allows you to avoid having to share the AccessKey of your Alibaba Cloud account with other users. Instead, you can grant users the minimum permissions necessary for them to complete their work, reducing your enterprise’s information security risks.
RAM supports creating roles that have different operation permissions for different cloud products. You can set RamRoleName, a new parameter for Auto Scaling Service, to assign a role to your ECS instances, which then gives them the corresponding operation permissions for cloud products. Note that when you specify the RamRoleName parameter, you must make sure that the policy of the RAM role allows your ECS instances to assume the specified role. Otherwise, the scaling configuration cannot create usable ECS instances.
Tags (Tags)
The tag service is available for Alibaba Cloud ECS instances. That is, you can bind different tags to your ECS instance for category management.
You can query tags to obtain a list of matching ECS instances, and in turn, you can query ECS instances to obtain the matching tags. You can set Tags, a new parameter for Auto Scaling Service, to specify the tag pairs of all available ECS instances for category management. Each scaling configuration supports up to five pairs of tags currently. When the specified number of tag pairs exceeds five, the scaling configuration fails.
Higher Creation Success Rate and Business Availability
Support for multi-zone scale-up and multiple instance types (Alibaba Cloud is the first cloud service provider in the world that supports multiple instance types)
The core of Auto Scaling lies in availability of ECS instances for horizontal scale-up. However, the inventory of cloud computing resources changes dynamically, and inventory shortage is always a problem we face. To maximize the creation success rate, Auto Scaling supports multiple instance types in addition to multi-zone scale-up. This is our edge over other vendors.
Support for multi-zone scale-up
Previously, only one VSwitch could be configured for each VPC scaling group, and a VSwitch belongs to one zone only. If that VSwitch could not create an ECS instance in its zone due to inventory shortage, the scaling configurations, scaling rules, and alarm tasks of the scaling group became ineffective. To solve this problem and improve scaling group availability, we added VSwitchIds.N, a new multi-zone parameter. You can set VSwitchIds.N to specify multiple VSwitches for your scaling group. If one VSwitch cannot create an ECS instance in its zone, Auto Scaling Service automatically switches to another zone. Pay attention to the following aspects when you use this parameter:
- If you use the VSwitchIds.N parameter, the VSwitchId parameter is ignored.
- In the VSwitchIds.N parameter, the value range of N is [1, 5]. That is, up to five VSwitches can be configured for each scaling group.
- The VSwitches specified by the VSwitchIds.N parameter must be in the same VPC environment.
- In the VSwitchIds.N parameter, N stands for the priority of VSwitches. The VSwitch with N = 1 has the highest priority to create an ECS instance. The higher the N value, the lower the priority.
When the VSwitch with the highest priority cannot create an ECS instance in its zone, the system automatically selects the VSwitch with the next highest priority to create an ECS instance. When you create a scaling group using the VSwitchIds.N parameter, use the VSwitches in different zones but in the same region if possible. This can effectively reduce occurrence of the problem that one VSwitch cannot create an ECS instance in its zone, improving the scaling group availability.
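The priority-based fallback across VSwitches can be sketched as follows. The helper names, the VSwitch IDs, and the has_inventory callback are hypothetical; the sketch only mirrors the rules above (N from 1 to 5, lower N means higher priority) and is not the service's actual implementation.

```python
def vswitch_params(vswitch_ids):
    """Flatten an ordered list of VSwitch IDs into VSwitchIds.N parameters."""
    if not 1 <= len(vswitch_ids) <= 5:
        raise ValueError("between 1 and 5 VSwitches are allowed")
    return {f"VSwitchIds.{n}": vsw for n, vsw in enumerate(vswitch_ids, start=1)}


def pick_vswitch(params, has_inventory):
    """Return the highest-priority VSwitch whose zone can create an instance."""
    for n in range(1, len(params) + 1):
        vsw = params[f"VSwitchIds.{n}"]
        if has_inventory(vsw):
            return vsw
    return None  # no configured zone has inventory


params = vswitch_params(["vsw-zone-a", "vsw-zone-b"])

# Assume zone A is out of stock: the VSwitch with N = 2 is used instead.
chosen = pick_vswitch(params, lambda vsw: vsw != "vsw-zone-a")
```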
Support for up to 10 instance types
Previously, only one scaling configuration could be active in a scaling group, and a scaling configuration covered only one instance type. As a result, only one instance type was effective in a scaling group. If that instance type was unavailable due to inventory shortage, the scaling group could not create an ECS instance, and you had to select another scaling configuration in the scaling group or create a new one to restore the scaling group. To solve this problem and improve scaling configuration availability, we added InstanceTypes.N, a new multi-instance-type parameter. You can set InstanceTypes.N to specify multiple instance types for your scaling configuration. If one instance type is unavailable due to inventory shortage, Auto Scaling Service automatically switches to another available instance type. Pay attention to the following aspects when you create a scaling configuration using the InstanceTypes.N parameter:
- If you use the InstanceTypes.N parameter, the InstanceType parameter is ignored.
- In the InstanceTypes.N parameter, the value range of N is [1, 10]. That is, up to 10 instance types can be configured for a scaling configuration.
- If your scaling group is of classic network type, the region where the scaling group is located must support all ECS instance types you configured in classic networks. If not, the scaling group cannot create an ECS instance. You can use the querying zone list API to query the instance types supported by the current region and the network type supported by each instance type.
- If your scaling group is of VPC type, the zones of VSwitches configured for the scaling group must support all ECS instance types you configured in VPCs. Multiple VSwitches can be configured for each scaling group.
- In the InstanceTypes.N parameter, N stands for the priority of instance types in the current scaling configuration. The instance type with N = 1 has the highest priority. The higher the N value, the lower the priority.
- When the instance type with the highest priority is unavailable for creating an ECS instance due to inventory shortage, the system automatically selects the instance type with the next highest priority to create an ECS instance.
- When you create a scaling configuration using the InstanceTypes.N parameter, all instance types must be unique; otherwise, the scaling configuration creation fails.
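The constraints on InstanceTypes.N (at most 10 entries, all unique, lower N means higher priority) can be checked before a request is sent. This is a sketch under stated assumptions: the function names, the placeholder instance type names, and the available set are hypothetical, and the real service performs its own validation.

```python
def instance_type_params(instance_types):
    """Validate an ordered list of instance types and build InstanceTypes.N parameters."""
    if not 1 <= len(instance_types) <= 10:
        raise ValueError("between 1 and 10 instance types are allowed")
    if len(set(instance_types)) != len(instance_types):
        raise ValueError("instance types must be unique")
    return {f"InstanceTypes.{n}": t for n, t in enumerate(instance_types, start=1)}


def pick_instance_type(params, available):
    """Return the highest-priority instance type that is currently in stock."""
    for n in range(1, len(params) + 1):
        instance_type = params[f"InstanceTypes.{n}"]
        if instance_type in available:
            return instance_type
    return None


# Placeholder type names, listed from highest to lowest priority.
type_params = instance_type_params(["type-a", "type-b", "type-c"])

# Assume only type-b and type-c are in stock.
selected_type = pick_instance_type(type_params, available={"type-b", "type-c"})
```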
Multi-zone instance balancing mode
To meet high availability and disaster recovery requirements in multi-zone instance scenarios and thus ensure service stability and continuity, we provide the automatic multi-zone instance balancing function for Auto Scaling Service to reduce the impacts of force majeure.
Auto Scaling Service creates ECS instances in multiple zones of a region, which allows you to improve security and reliability by taking advantage of geographic redundancy.
Supported scope of the automatic multi-zone instance balancing function:
- Supported only for scaling groups of VPC type that have more than one VSwitch (VSwitchId).
- Can be set only when a scaling group is created.
How to set the automatic multi-zone instance balancing function:
MultiAZPolicy, a new multi-zone scaling policy parameter, is added for scaling groups. Its values are:
- PRIORITY (default value): The system scales ECS instances up and down based on the defined VSwitch priority (VSwitchIds.N). When the VSwitch with the highest priority cannot create an ECS instance in its zone, the system automatically uses the VSwitch with the next highest priority to create an ECS instance.
- BALANCE: The instances in all zones are automatically balanced when the scaling group performs a scaling activity.
In the following cases, the ECS instances in a scaling group may not be balanced between zones:
- The inventory of instances in a specified zone becomes insufficient.
- The VSwitches (VSwitchId) configured for the scaling group change.
- You have removed an ECS instance from the scaling group and released it.
In any of the preceding cases, you can execute RebalanceInstances to rebalance the ECS instances in the scaling group.
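The difference between the PRIORITY and BALANCE values can be illustrated with a simplified placement planner. The function, the zone names, and the "fill the least-populated zone first" rule are assumptions made for illustration; the text does not describe the balancing algorithm the service actually uses, and inventory shortages are ignored here.

```python
def plan_scale_out(count, zone_counts, policy="PRIORITY"):
    """Return per-zone instance counts after adding `count` instances.

    zone_counts maps zone name to current instance count and is ordered
    by VSwitch priority (highest priority first).
    """
    plan = dict(zone_counts)
    zones = list(plan)
    for _ in range(count):
        if policy == "BALANCE":
            # Place the instance in the zone with the fewest instances;
            # min() keeps the higher-priority zone on ties.
            target = min(zones, key=lambda zone: plan[zone])
        else:
            # PRIORITY: always use the highest-priority zone.
            target = zones[0]
        plan[target] += 1
    return plan


current = {"zone-a": 2, "zone-b": 0}
balanced = plan_scale_out(3, current, policy="BALANCE")
prioritized = plan_scale_out(3, current, policy="PRIORITY")
```

With these inputs, BALANCE evens out the two zones, while PRIORITY keeps adding to the first zone.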
Increased Instance Management Capabilities
Support for instance standby status, instance protection mode, and instance detachment operation
We provide three new management functions to allow users to manage their ECS instances more flexibly and meet the requirements in some special scenarios:
- Standby status, which allows users to update ECS instances in scaling groups, change configurations, and perform other operations.
- Protection mode, which protects ECS instances from being removed from scaling groups for any reason.
- Instance detachment, which allows ECS instances to be retained and used independently of scaling groups.
Support for standby status operations
Previously, you could not control the lifecycle of ECS instances managed in scaling groups. Scaling groups release unhealthy ECS instances, which also kept you from performing operations that require stopping a scaled ECS instance. As a result, you could not take full advantage of the elasticity provided by ECS.
The standby status operations can meet the requirements in the following scenarios:
- When you need to change the type of a scaled ECS instance or restart the instance, you can set the ECS instance to standby state and take over its lifecycle management. Then you can perform all elastic operations supported by ECS. After that, take the ECS instance out of standby state to return lifecycle management to its scaling group.
- When the ECS instances are served by a Server Load Balancer instance configured for their scaling group, you can set a faulty ECS instance to standby state to divert traffic away from it. Then you can perform offline troubleshooting and verification (including logon, troubleshooting, and restart). After the ECS instance is ready, you can take it out of standby state so that it processes business traffic again.
Support for Instance Lifecycle Management
We have launched the LifecycleHook feature to allow users to manage ECS instances in scaling groups more flexibly. The LifecycleHook feature suspends the scaling activity that occurs in a scaling group to perform custom operations.
Specifically, the LifecycleHook feature suspends an ECS instance that is being added to the scaling group or is about to be released, so that custom operations can be performed. This allows users to manage the lifecycle of ECS instances in scaling groups more flexibly. The following are several simple application scenarios of the LifecycleHook feature:
- An ECS instance that becomes available in a scaling group is mounted to SLB with certain latency before providing services.
- To release an ECS instance, the scaling group first removes the ECS instance from the SLB backend server to prevent it from receiving new requests. Then the scaling group stops and releases the ECS instance after it verifies that all requests received by the ECS instance are processed.
- The scaling group backs up data when it releases an ECS instance.
- You can perform some custom operations for instance scale-up or scale-down of a scaling group.
In the second scenario, if the maximum processing time for each request can be determined, you can call the CreateLifecycleHook API to create a lifecycle hook. Set the value of LifecycleTransition to SCALE_IN and the value of HeartbeatTimeout to the maximum processing time, without setting a notification object. When a scaling activity occurs, the LifecycleHook feature suspends the ECS instance for a certain period (HeartbeatTimeout) after the ECS instance is removed from SLB, to allow all requests to be processed.
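The request-draining hook described here can be sketched as a parameter set plus a small timing helper. LifecycleTransition and HeartbeatTimeout come from the text; ScalingGroupId, LifecycleHookName, the sample values, and both function names are assumptions for illustration, and no real SDK call is made.

```python
def drain_hook_params(scaling_group_id, max_request_seconds):
    """Build parameters for a scale-in hook that waits for in-flight requests."""
    if max_request_seconds <= 0:
        raise ValueError("max_request_seconds must be positive")
    return {
        "ScalingGroupId": scaling_group_id,       # assumed parameter name
        "LifecycleHookName": "drain-requests",    # assumed parameter name and value
        "LifecycleTransition": "SCALE_IN",        # from the text
        "HeartbeatTimeout": max_request_seconds,  # from the text, in seconds
    }


def earliest_release_time(removed_from_slb_at, hook):
    """Return when the instance may be released, given when it left the SLB backend."""
    return removed_from_slb_at + hook["HeartbeatTimeout"]


hook = drain_hook_params("asg-example", max_request_seconds=120)

# If the instance left the SLB backend at t = 1000 s, it is held until t = 1120 s.
release_at = earliest_release_time(1000, hook)
```

No notification object is included, matching the scenario above where the hook simply waits for the timeout to elapse.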
Improved Scaling Experience
Enhanced smooth elasticity
To allow users to trigger auto scaling events from more monitoring dimensions, we have increased the number of metrics from 6 to 13. Custom metrics are also supported.
Alarm tasks in Auto Scaling
Alarm tasks in Auto Scaling represent in-depth cooperation with Cloud Monitor Service (CMS), which provides a dynamic scaling group management mode. Similar to scheduled tasks in Auto Scaling, alarm tasks in Auto Scaling enable triggering specified scaling rules to perform scaling activities. In this way, the number of ECS instances in a scaling group can be adjusted.
Metric | Unit
System disk write BPS | Byte/s
System disk read BPS | Byte/s
System disk write IOPS | Nos/s
System disk read IOPS | Nos/s
Number of packets transmitted from Internet NIC (classic network) | Nos/s
Number of packets received by Internet NIC (classic network) | Nos/s
Number of packets transmitted from intranet NIC | Nos/s
Number of packets received by intranet NIC | Nos/s
Total TCP connections | Nr.
Number of established TCP connections | Nr.
Scheduled tasks execute the specified scaling rule at the specified time, which lets you respond in advance when business load is predictable over time. However, they are insufficient when load changes are unexpected or cannot be predicted. In such cases, alarm tasks provide a more flexible way to trigger scaling rules. Auto Scaling Service adds ECS instances to a scaling group to share business loads during peak traffic periods and releases ECS instances from the scaling group during off-peak periods, to reduce production costs.
Alarm tasks monitor specific metrics to collect statistics on data metrics in real time. When the statistics meet the specified alarm condition, this function triggers an alarm for executing the specified scaling rule. By using alarm tasks, you can adjust the number of ECS instances in a scaling group based on business changes in real time, to ensure that the metrics are within the expected range.
System metric-based alarm tasks in Auto Scaling
The metrics for alarm tasks in Auto Scaling are ECS instance data metrics (such as CPU and load) collected by CloudMonitor. The metric-based alarm tasks in Auto Scaling use scaling groups as monitoring granularity. That is, the average metrics of all instances in a scaling group are the metrics of the scaling group. The metrics are updated as the number of instances in the scaling group changes.
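The group-level averaging can be illustrated with a short sketch. The function names, the sample CPU values, and the "greater than or equal to threshold" comparison are assumptions; the text only states that the group metric is the average over all instances in the scaling group.

```python
from statistics import mean


def group_metric(per_instance_values):
    """Average a metric over all instances in the scaling group."""
    return mean(per_instance_values)


def alarm_triggered(per_instance_values, threshold):
    """Return True when the group-level average reaches the threshold."""
    return group_metric(per_instance_values) >= threshold


# Hypothetical CPU utilization (%) of three instances: the average is 60.
cpu_three_instances = [90, 50, 40]
triggered_before = alarm_triggered(cpu_three_instances, threshold=70)

# After the group shrinks to two busy instances, the average rises to 85.
cpu_two_instances = [90, 80]
triggered_after = alarm_triggered(cpu_two_instances, threshold=70)
```

The second case shows why the group metric must be recomputed as the number of instances changes.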
The new system metrics are the ones listed with their units earlier in this section.
Custom metric-based alarm tasks in Auto Scaling
The monitoring objects of custom metric-based alarm tasks in Auto Scaling are the metrics users reported to CloudMonitor independently. In some scenarios, the system metrics may not include your desired metrics, and you may have an internal monitoring system for some metrics related to your specific business. In these scenarios, the custom metric-based alarm tasks provide alarm task access points for your internal monitoring system or business-related metrics.
The custom metric-based alarm tasks in Auto Scaling are set based on the custom metrics in Alibaba Cloud CloudMonitor. Before using Auto Scaling to create custom metric-based alarm tasks, users should report custom monitoring data, that is, custom metrics, to CloudMonitor. CloudMonitor metric customization is a service that enables users to freely customize metrics and alarm rules. By using this service, users can monitor the business indicators they care about, report the collected monitoring data to CloudMonitor for processing, and set alarm rules.
To further improve user experience, we have replaced the previous SMS + email notification mode with a new notification mode that allows you to choose recipients, select notification tools (DingTalk + SMS + email), and edit the notification content. The programmable notification modes Topic and Queue are also supported.
The event notification function of Auto Scaling supports event notifications at the scaling group level. It allows you to configure event notifications for your scaling groups and also types of scaling activities to be notified. When a scaling activity occurs, the event notification function pushes the details of the scaling activity to the notification object you configured. Currently, the event notification function supports three types of notification objects and five types of scaling activities. The event notification function allows you to immediately learn the changes of instances in a scaling group, to monitor the scaling group information in real time.
Scaling activity types supported by the event notification function
When you create an event notification, you need to set the scaling activity type that triggers an event notification. When a scaling activity occurs in the scaling group, the event notification is triggered and the details of the scaling activity are sent to the notification object you configured.
The event notification function currently supports five types of scaling activities:
Among these scaling activities, AUTOSCALING:SCALE_OUT_SUCCESS and AUTOSCALING:SCALE_IN_SUCCESS cover both partial success and full success. You can determine whether a scaling activity succeeded partially or fully from the activity details in its event notification. You can also use the DescribeNotificationTypes API to query the scaling activity types supported by the event notification function.
Notification modes supported by the event notification function
When triggered by a scaling activity, the event notification function reports the details of the scaling activity to a notification object. Currently, this function supports three notification modes:
- Reporting details of scaling activities to CloudMonitor system events
- Pushing details of scaling activities to Message and Notification Service (MNS) queues
- Pushing details of scaling activities to MNS topics
Preemptive instances (formerly known as spot price instances) further lower costs
Preemptive instances are post-paid instances whose price fluctuates as the supply-demand relationship changes. They are discounted more deeply than Pay-As-You-Go instances. To purchase a preemptive ECS instance, you set your highest bid price for the instance. The market price of a preemptive ECS instance fluctuates as the supply-demand relationship changes, and is currently 10% to 100% off the price of a Pay-As-You-Go instance. In this mode, you can purchase a preemptive ECS instance at a price not higher than your highest bid price. When the market price is higher than your highest bid price, Alibaba Cloud does not create a preemptive ECS instance for you, which allows you to keep the production cost within the expected range.
Note that the low prices of preemptive instances come with certain risks. When the market price is higher than your highest bid price or supply and demand are seriously imbalanced, Alibaba Cloud has the right to release your ECS instance. Alibaba Cloud sends you a notification through instance metadata five minutes before releasing your ECS instance. You can subscribe to this metadata to save and clean up data in time.
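The bid rule can be sketched as a comparison between the market price and your highest bid. The function names and the sample prices are hypothetical, and the sketch ignores the case where an instance is released because of a supply-demand imbalance rather than price.

```python
def can_create(market_price, max_bid):
    """A preemptive instance can be created only if the market price does not exceed the bid."""
    return market_price <= max_bid


def first_release_period(market_prices, max_bid):
    """Return the index of the first period whose market price exceeds the bid, or None."""
    for period, price in enumerate(market_prices):
        if price > max_bid:
            return period
    return None


# Hypothetical hourly market prices and a highest bid of 0.05 per hour.
hourly_prices = [0.03, 0.04, 0.06, 0.04]
created = can_create(hourly_prices[0], max_bid=0.05)
released_in_period = first_release_period(hourly_prices, max_bid=0.05)
```

In this example the instance is created in the first period and becomes eligible for release in the third, which is when the five-minute notice would matter.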
Compared with Pay-As-You-Go instances of the same type, preemptive instances can significantly reduce your server costs. They are well suited to the following application scenarios:
- Real-time analysis services
- Big data services
- Image and media encoding services
- Scientific computation services
- Elastically scalable business websites and web crawler services
- Genetic computation services
- Geospatial survey and analysis services
The new functions of Auto Scaling Service aim to make it easier for you to handle business load changes while keeping TCO low as your business grows. To learn more about Auto Scaling, visit www.alibabacloud.com/product/auto-scaling