How to Integrate Alibaba Cloud WAF Log with Splunk
By Victor Mak, Solutions Architect
This article describes how to integrate Alibaba Cloud Web Application Firewall (WAF) logs with Splunk, so that all compliance, auditing, and other related logs can be ingested into your Security Operations Center.
The following figure illustrates the Splunk integration architecture:
Alibaba Cloud Log Service is a one-stop service for log data that has been proven in the massive big-data scenarios of Alibaba Group. Log Service lets you collect, consume, ship, query, and analyze log data without any development work, which improves both Operation & Maintenance (O&M) efficiency and operational efficiency, and provides the capability to process massive volumes of logs in the DT (data technology) era. For more information, see the Log Service (SLS) Product Introduction.
We will use a Python program on an Alibaba Cloud Elastic Compute Service (ECS) instance, integrated with Splunk HEC, to deliver WAF logs to Splunk. The consumer library is an advanced mode of log consumption in Log Service, and provides the consumer group concept to abstract and manage the consumption end. Compared with reading data directly through the SDK, the consumer library lets you focus only on the business logic, without worrying about the implementation details of Log Service or about load balancing and failover between consumers. For more information, see the Consumer Group Introduction.
Splunk HEC is the Splunk HTTP Event Collector, an HTTP(S) interface for receiving logs.
Before you begin, make sure you have the following:
- An Alibaba Cloud Account. If you don’t have one already, visit the Free Trial Page to sign up for a free account.
- You have purchased WAF business edition or above to protect your website. If not, visit the WAF Product Page to learn more about Alibaba Cloud WAF.
- You have a Linux ECS instance. The recommended hardware specification is:
- Ubuntu operating system
- 8 vCPUs at 2.0 GHz or higher
- 32 GB of memory
- At least 2 GB of available disk space (10 GB or more is recommended)
- You have a Splunk Enterprise server.
- Enable Web Application Firewall (WAF) logging.
- Configure the Splunk HTTP Event Collector (HEC).
- Set up the Python environment on ECS.
- Configure the Python program to send logs to Splunk.
Step 1: Enable Web Application Firewall (WAF) Logging
Follow these steps to enable Web Application Firewall (WAF) logging in the WAF console:
- Log on to the Alibaba Cloud WAF console.
- In the left-side navigation pane, choose App Market > App Management.
- Click Upgrade on the right side to enable WAF real-time logging.
Enable the Access Log Service, and select the log storage period and log storage size based on your actual business usage.
Click Authorize after the Access Log Service is enabled.
Click Confirm Authorization Policy.
Click Configure under Real-time Log Query and Analysis Service:
In the drop-down list, select the website for which you want to enable the log service.
Step 2: Configure Splunk HTTP Event Collector (HEC)
Follow these steps to configure the HTTP Event Collector (HEC) in the Splunk console:
- Log on to the Splunk Enterprise console.
- In the upper-right corner, choose Settings > Data inputs.
Navigate to HTTP Event Collector and click Add new.
In the Select Source step, enter a Name for the HTTP Event Collector and click Next.
In the Add Data step, assign an index to the HTTP Event Collector. In this example, we use the main index. Then click Review and Submit.
The HTTP Event Collector has now been created successfully.
Navigate back to HTTP Event Collector; you should now see the token.
Select the HEC token you created, click Edit, and enter the source and source type.
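Before wiring up Log Service, it is worth sanity-checking the token. The sketch below builds the HTTPS request that Splunk HEC expects at its /services/collector/event endpoint; the host, port, and token values are hypothetical placeholders, and the final lines are left commented out so nothing is sent until you substitute real values:

```python
import json
import urllib.request

def build_hec_request(host, port, token, event, index="main",
                      source="alibaba-waf", sourcetype="waf-log"):
    """Build a Splunk HEC request for a single JSON event.

    HEC expects a POST to /services/collector/event with an
    'Authorization: Splunk <token>' header and a JSON body whose
    'event' field carries the payload.
    """
    url = "https://{0}:{1}/services/collector/event".format(host, port)
    body = json.dumps({
        "event": event,
        "index": index,
        "source": source,
        "sourcetype": sourcetype,
    }).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={
            "Authorization": "Splunk {0}".format(token),
            "Content-Type": "application/json",
        },
        method="POST",
    )

# Hypothetical values -- replace with your Splunk host, HEC port
# (8088 by default), and the token you created above.
req = build_hec_request("splunk.example.com", 8088,
                        "00000000-0000-0000-0000-000000000000",
                        {"message": "hello from Alibaba Cloud WAF"})
# with urllib.request.urlopen(req) as resp:  # uncomment to send a test event
#     print(resp.read())
```

If the token and index are configured correctly, sending the test event returns an HTTP 200 response, and the event becomes searchable in the index you assigned.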
Step 3: Set Up the Python Environment on ECS
Follow these steps to install the Log Service Python SDK on the ECS instance:
- Log on to the ECS instance through SSH or the console.
- Install Python 3, pip, and the Python SDK of Log Service. For more information on the Log Service Python SDK, see the User Guide.
apt-get install -y python3-pip python3-dev
ln -s /usr/bin/python3 /usr/local/bin/python
pip3 install --upgrade pip
pip3 install aliyun-log-python-sdk
Step 4: Configure Python to Send Logs to Splunk
Before you begin, make sure you have the following information:
- Project is the resource management unit in Log Service, used to isolate and control resources. You can find the project name in the Alibaba Cloud Log Service console:
- Log Service Endpoint is a URL used to access a project and the logs within it, and is associated with the project name and the Alibaba Cloud region where the project resides. You can find the endpoint URL under Service endpoint.
- Logstore is a unit in Log Service for the collection, storage, and query of log data. Each Logstore belongs to a project, and each project can contain multiple Logstores. You can find the Logstore name under your Log Service project in the Alibaba Cloud Log Service console:
- AccessKey is a “secure password” designed for you to access your cloud resources through APIs (rather than the console). You use the AccessKey to sign API request content and pass security authentication in Log Service. For more information, see the AccessKey Introduction. You can find your AccessKey in the User Management console:
- Splunk Enterprise Host is the same IP address/hostname you use to access the Splunk Enterprise web console.
- Splunk HEC Port: you can find the HEC port under Global Settings in the HEC console:
- HEC Token is the token you use to integrate Alibaba Cloud Log Service with Splunk. You can find the token in the HTTP Event Collector console:
Download the latest example of integration code from GitHub:
- Replace the Log Service and Splunk related settings in the Python program, including:
- SLS Endpoint
- SLS accessKeyId
- SLS accessKey
- SLS Project
- SLS Logstore
- SLS Consumer Group
- Splunk Host
- Splunk HEC Port
- Splunk HEC Token
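With these settings at hand, the sync program itself is essentially a Log Service consumer whose processor forwards each batch of WAF log entries to HEC. The following is a minimal sketch rather than the official GitHub sample: every endpoint, key, host, and token value is a placeholder, and the consumer wiring (LogHubConfig, ConsumerProcessorBase, ConsumerWorker) assumes the consumer-group API of aliyun-log-python-sdk, so verify it against the SDK version you installed:

```python
import json
import urllib.request

# Placeholder settings -- replace with the values collected above.
SLS_ENDPOINT = "ap-southeast-1.log.aliyuncs.com"
SLS_AK_ID = "your-access-key-id"
SLS_AK_KEY = "your-access-key-secret"
SLS_PROJECT = "your-project"
SLS_LOGSTORE = "your-logstore"
SLS_CONSUMER_GROUP = "WAF-SLS"
SPLUNK_HOST = "splunk.example.com"
SPLUNK_HEC_PORT = 8088
SPLUNK_HEC_TOKEN = "00000000-0000-0000-0000-000000000000"

def to_hec_events(contents, index="main", sourcetype="waf-log"):
    """Convert SLS log contents (lists of (key, value) pairs) into
    newline-delimited HEC event JSON, ready for a batch POST."""
    events = []
    for item in contents:
        events.append(json.dumps({
            "event": dict(item),
            "index": index,
            "sourcetype": sourcetype,
        }))
    return "\n".join(events)

def send_to_splunk(batch):
    """POST a batch of newline-delimited events to the HEC endpoint."""
    req = urllib.request.Request(
        "https://{0}:{1}/services/collector/event".format(
            SPLUNK_HOST, SPLUNK_HEC_PORT),
        data=batch.encode("utf-8"),
        headers={"Authorization": "Splunk " + SPLUNK_HEC_TOKEN},
        method="POST")
    urllib.request.urlopen(req)

def main():
    # Consumer-group wiring based on the aliyun-log-python-sdk consumer
    # library; adjust to the API of the SDK version you installed.
    from aliyun.log.consumer import (LogHubConfig, CursorPosition,
                                     ConsumerProcessorBase, ConsumerWorker)

    class SyncToSplunk(ConsumerProcessorBase):
        def process(self, log_groups, check_point_tracker):
            for log_group in log_groups.LogGroups:
                contents = [[(c.Key, c.Value) for c in log.Contents]
                            for log in log_group.Logs]
                send_to_splunk(to_hec_events(contents))
            check_point_tracker.save_check_point(True)

    option = LogHubConfig(SLS_ENDPOINT, SLS_AK_ID, SLS_AK_KEY,
                          SLS_PROJECT, SLS_LOGSTORE, SLS_CONSUMER_GROUP,
                          "WAF-SLS-1",
                          cursor_position=CursorPosition.BEGIN_CURSOR)
    ConsumerWorker(SyncToSplunk, consumer_option=option).start(join=True)

# main()  # uncomment to start consuming; requires aliyun-log-python-sdk
#         # and real credentials
```

Because checkpoints are saved through the consumer group, the program can be stopped and restarted without re-sending logs that were already delivered.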
Suppose the Python program is saved as “sync_data.py”. You can then launch it:
python3 sync_data.py
The program log should show that data was successfully sent to the remote Splunk server:
*** start to consume data...
consumer worker "WAF-SLS-1" start
heart beat start
heart beat result:  get: [0, 1]
Get data from shard 0, log count: 6
Complete send data to remote
Get data from shard 0, log count: 2
Complete send data to remote
heart beat result: [0, 1] get: [0, 1]
You can now search the WAF logs on the Splunk server.