Best Practice on Testing System Event Processing Program for ECS Instances

Alibaba Cloud
Oct 17, 2018


ECS instance system events are scheduled or unexpected events that affect the running status of an instance. The best way to keep services on Elastic Compute Service (ECS) instances running stably is to process these events automatically with a program. The problem is that such a program is difficult to test: system events arise only in specific scenarios and generally cannot be triggered manually, and an ECS instance can run stably for months without producing one. We therefore need another way to test the system event processing program.

ECS OpenAPIs for Convenient Testing

To test the system event processing program, ECS provides the open APIs CreateSimulatedSystemEvents and CancelSimulatedSystemEvents to create and cancel simulated system events.

What are simulated system events? Simulated system events are created specifically for testing the system event processing program. After you create a simulated system event, you can view its data, which is the same as that of a real event, through the usual event consumption channels such as open APIs, the console, and CloudMonitor.
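Because a simulated event carries the same data as a real one, the processing program can run both through a single code path. The following is a minimal sketch of such a handler, not the article's own program. The record is a plain dict whose field names (EventId, EventType, EventCycleStatus, InstanceId, NotBefore) are assumptions modeled on the names used in this article rather than a confirmed response schema, and handle_event and its reboot_instance callback are hypothetical.

```python
# Hypothetical handler sketch. The dict layout is an assumption for
# illustration, not a confirmed ECS response schema.

def handle_event(event, reboot_instance):
    """Decide what to do with one system event record.

    `event` is a plain dict. `reboot_instance` is a caller-supplied callback
    that restarts an instance at a time the service chooses.
    Returns a short string describing the action taken.
    """
    event_type = event.get("EventType")
    status = event.get("EventCycleStatus")

    # Only events that are still pending need a reaction.
    if status != "Scheduled":
        return "ignored"

    if event_type == "SystemMaintenance.Reboot":
        # Restarting before NotBefore lets the service pick its own downtime.
        reboot_instance(event["InstanceId"])
        return "rebooted"

    return "unhandled"


# Usage: feed the handler a record shaped like a simulated event.
rebooted = []
simulated = {
    "EventId": "e-2zec65b85gi9zwcv1kpz",
    "EventType": "SystemMaintenance.Reboot",
    "EventCycleStatus": "Scheduled",
    "InstanceId": "i-2zegswzznxbp168sc5c9",
    "NotBefore": "2018-09-01T00:00:00Z",
}
print(handle_event(simulated, rebooted.append))
print(rebooted)
```

Passing the reboot action in as a callback keeps the decision logic testable without touching a real instance.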

Apart from generating event data, a simulated event also simulates the state changes that occur throughout the event lifecycle.

  1. After a simulated event is configured, it is in the Scheduled state.
  2. When the scheduled time specified by NotBefore arrives, the state of the event first changes to Executing, and then quickly changes to Executed.
  3. Simulated events also respond to users’ operations. Take the SystemMaintenance.Reboot event as an example. If a user restarts the instance before the scheduled time specified by NotBefore, the state of the event changes to Avoided.
  4. If a user calls the CancelSimulatedSystemEvents API before an event is completed, the state of the event changes to Canceled.
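The four rules above can be sketched as a small transition table, which is handy when writing assertions for the processing program. This is an illustrative model only: the state names come from the list above, while the trigger names and the next_state helper are hypothetical. As a simplification, cancellation is modeled only from the Scheduled state.

```python
# Transition table derived from the four lifecycle rules.
# Trigger names are hypothetical labels chosen for this sketch.
TRANSITIONS = {
    ("Scheduled", "not_before_reached"): "Executing",
    ("Executing", "finished"): "Executed",
    ("Scheduled", "user_restarted_instance"): "Avoided",
    ("Scheduled", "cancel_api_called"): "Canceled",
}


def next_state(state, trigger):
    """Return the state after `trigger`; unknown pairs leave the state unchanged."""
    return TRANSITIONS.get((state, trigger), state)


def run(triggers, state="Scheduled"):
    """Apply a sequence of triggers and return every state visited, in order."""
    visited = [state]
    for trigger in triggers:
        state = next_state(state, trigger)
        visited.append(state)
    return visited


# An event that runs to completion, and one that the user avoids.
print(run(["not_before_reached", "finished"]))
print(run(["user_restarted_instance"]))
```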

Simulated System Event Lifecycle

The following figure shows the lifecycle of a simulated event.

Sample Code

The following describes how to create two simulated SystemMaintenance.Reboot events and then cancel one of them.

The code is as follows:

# coding=utf-8
# Make sure the SDK version is 4.10.0 or later; run 'pip show aliyun-python-sdk-ecs' to check.
# If the Python SDK is not installed, run 'sudo pip install aliyun-python-sdk-ecs'.
# If the Python SDK is installed, run 'sudo pip install --upgrade aliyun-python-sdk-ecs' to upgrade it.
import json
import logging

from aliyunsdkcore.client import AcsClient
from aliyunsdkecs.request.v20140526.CancelSimulatedSystemEventsRequest import CancelSimulatedSystemEventsRequest
from aliyunsdkecs.request.v20140526.CreateSimulatedSystemEventsRequest import CreateSimulatedSystemEventsRequest

logging.basicConfig(level=logging.INFO,
                    format='%(asctime)s %(filename)s[line:%(lineno)d] %(levelname)s %(message)s',
                    datefmt='%a, %d %b %Y %H:%M:%S')

# Your AccessKey ID
ak_id = "YOUR_ACCESS_KEY_ID"
# Your AccessKey secret
ak_secret = "YOUR_ACCESS_KEY_SECRET"
region_id = "cn-beijing"

# The class is imported directly so that the instance does not shadow the module name.
client = AcsClient(ak_id, ak_secret, region_id)


def _send_request(request):
    """Send an OpenAPI request; return the parsed JSON response, or None on failure."""
    request.set_accept_format('json')
    try:
        response_str = client.do_action_with_exception(request)
        logging.info(response_str)
        return json.loads(response_str)
    except Exception as e:
        logging.error(e)
        return None


def build_create_request(event_type, not_before, instance_ids):
    request = CreateSimulatedSystemEventsRequest()
    request.set_EventType(event_type)
    request.set_NotBefore(not_before)
    request.set_InstanceIds(instance_ids)
    return request


def get_created_event_id(response):
    """Return the list of created event IDs, or an empty list if the call failed."""
    if not response or response.get('Code') is not None:
        return []
    return response.get('EventIdSet', {}).get('EventId', [])


def print_created_event_id(response):
    if response is None:
        print("Creating failed, no response received")
    elif response.get('Code') is None:
        event_id_list = get_created_event_id(response)
        print("Created %s simulated events: %s" % (len(event_id_list), event_id_list))
    else:
        print("Creating failed, error code: %s" % response.get('Code'))


def build_cancel_request(event_id):
    request = CancelSimulatedSystemEventsRequest()
    request.set_EventIds([event_id])
    return request


if __name__ == '__main__':
    # Create one simulated reboot event on each of the two instances.
    request = build_create_request("SystemMaintenance.Reboot", "2018-09-01T00:00:00Z",
                                   ["i-2zegswzznxbp168sc5c9", "i-2zeimxypwhnj04sbgf5t"])
    response = _send_request(request)
    event_ids = get_created_event_id(response)
    if event_ids:
        print("Created %s simulated events: %s" % (len(event_ids), event_ids))
        # Cancel the first of the created events.
        cancel_event_id = event_ids[0]
        print("Now we cancel one event %s" % cancel_event_id)
        cancel_request = build_cancel_request(cancel_event_id)
        cancel_response = _send_request(cancel_request)
        if cancel_response is not None and not cancel_response.get('Code'):
            print("Cancel succeeded")

Output after code execution:

Created 2 simulated events: [u'e-2zec65b85gi9zwcv1kpz', u'e-2zec65b85gi9zwcv1kq0']
Wed, 22 Aug 2018 18:39:49 simulate_system_event.py[line:35] INFO {"EventIdSet":{"EventId":["e-2zec65b85gi9zwcv1kpz","e-2zec65b85gi9zwcv1kq0"]},"RequestId":"C1762464-CCC2-46EC-B233-92A4D9C1782C"}
Now we cancel one event e-2zec65b85gi9zwcv1kpz
Cancel succeeded
Wed, 22 Aug 2018 18:39:49 simulate_system_event.py[line:35] INFO {"RequestId":"44286901-1BC3-4BA0-AAAF-C3CF20578E0F"}
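The sample hard-codes NotBefore as "2018-09-01T00:00:00Z". In a repeatable test it may be more convenient to compute that value relative to the current time. The following sketch uses only the Python standard library; the helper name not_before_in is hypothetical, and the output format simply mirrors the timestamp used in the sample.

```python
from datetime import datetime, timedelta, timezone


def not_before_in(minutes, now=None):
    """Return a UTC timestamp `minutes` after `now`, formatted like 2018-09-01T00:00:00Z.

    `now` should be a timezone-aware datetime; it defaults to the current UTC time.
    """
    if now is None:
        now = datetime.now(timezone.utc)
    target = now.astimezone(timezone.utc) + timedelta(minutes=minutes)
    return target.strftime("%Y-%m-%dT%H:%M:%SZ")


# Example: a scheduled time 30 minutes from now.
print(not_before_in(30))
```

The returned string can be passed as the not_before argument of build_create_request in the sample.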

Constraints and Cautions

  1. Users can only create and cancel simulated system events on their own instances.
  2. One user can create a maximum of 1000 simulated system events in the Scheduled state.

Reference: https://www.alibabacloud.com/blog/best-practice-on-testing-system-event-processing-program-for-ecs-instances_594056?spm=a2c41.12125576.0.0
