Integrating Alibaba Cloud Log Service with Splunk

This article describes how to integrate Alibaba Cloud Log Service with Splunk so that all compliance, auditing, and other related logs can be ingested into your Security Operations Center (e.g. Splunk).

Related Terms

  1. Log Service: Alibaba Cloud Log Service, also known as Simple Log Service (SLS).
  2. SIEM: Security Information and Event Management, such as Splunk and QRadar.
  3. Splunk HEC: Splunk HTTP Event Collector, an HTTP(S) interface for receiving logs.

Audit Related Logs

The table below describes the logs available in Log Service that may be relevant to the Security Operations Team.

[Table: audit-related logs available in Log Service]

Note: Regions are updated regularly. Please refer to the product documentation for the latest status.

Alibaba Cloud Log Service

As a one-stop service for log data, Log Service has been proven at scale in Alibaba Group's massive big data scenarios. It lets you collect, consume, ship, query, and analyze log data quickly without any development work, which improves Operations & Maintenance (O&M) efficiency as well as operational efficiency, and provides the capability to process massive volumes of logs in the DT (data technology) era.

Alibaba Cloud Log Service has the following architecture:

[Figure: Alibaba Cloud Log Service architecture]

Integration Proposal

Concepts

Project

A project is the Log Service’s resource management unit, used to isolate and control resources.

Logstore

A Logstore is a unit in Log Service for the collection, storage, and query of log data. Each Logstore belongs to a project, and each project can contain multiple Logstores.

Shard

Logs read from or written to a Logstore are stored in a specific shard. Each Logstore is divided into several shards, and each shard corresponds to a left-closed, right-open MD5 interval. The intervals do not overlap, and together they cover the entire MD5 value range.

Endpoint

The Log Service endpoint is a URL used to access a project and the logs within it, and is determined by the Alibaba Cloud region where the project resides and the project name. For details, see https://www.alibabacloud.com/help/doc-detail/29007.htm

AccessKey

Alibaba Cloud AccessKey is a “secure password” designed for you to access your cloud resources through APIs (rather than the console). You can use the AccessKey to sign API request content and pass the security authentication in Log Service. For details, see https://www.alibabacloud.com/help/doc-detail/29009.htm
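
For illustration only, the following minimal sketch (assuming the aliyun-log-python-sdk package; the endpoint, AccessKey pair, project, and Logstore names below are placeholders) shows how the endpoint and the AccessKey are used together to create a client and make a simple call:

    # Minimal sketch: create a Log Service client from an endpoint and an AccessKey pair.
    # All values below are placeholders.
    from aliyun.log import LogClient

    endpoint = "cn-hangzhou.log.aliyuncs.com"    # region endpoint of your project (placeholder)
    access_key_id = "<your-access-key-id>"
    access_key_secret = "<your-access-key-secret>"
    project = "<your-project>"
    logstore = "<your-logstore>"

    client = LogClient(endpoint, access_key_id, access_key_secret)

    # A simple call to confirm the endpoint and AccessKey work: list the shards of a Logstore.
    res = client.list_shards(project, logstore)
    res.log_print()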

Assumptions

The user's SIEM (e.g. Splunk) is located in an on-premises environment rather than in the cloud. For security reasons, no port of the SIEM is accessible from external environments.

Overview

It is recommended to build a program based on an SLS consumer group that consumes logs from Log Service in real time and then forwards them to Splunk through the Splunk HEC API.

[Figure: consumer group program consuming logs from Log Service and forwarding them to Splunk via HEC]

Program with Consumer Group

The consumer library is an advanced mode of log consumption in Log Service, and provides the consumer group concept to abstract and manage the consumption side. Compared with reading data directly through the SDK, the consumer library lets you focus only on the business logic, without worrying about the implementation details of Log Service or about load balancing and failover between consumers.

The Spark Streaming, Storm, and Flink connectors all use the consumer library as their base implementation.

Basic Concepts

Consumer Group — A consumer group is composed of multiple consumers. Consumers in the same consumer group consume the data in the same Logstore and the data consumed by each consumer is different.

Consumer — A consumer is the unit that makes up a consumer group and actually consumes the data. The names of consumers in the same consumer group must be unique.

In Log Service, a Logstore can have multiple shards. The consumer library is used to allocate a shard to the consumers in a consumer group. The allocation rules are as follows:

Each shard can only be allocated to one consumer.

One consumer can have multiple shards at the same time.

After a new consumer is added to a consumer group, the shard assignments for this consumer group are adjusted to balance the consumption load. However, the preceding allocation rules do not change, and the reallocation process is transparent to users.

The consumer library also saves checkpoints, which allows consumers to resume from the breakpoint after a program failure is resolved and ensures that the data is consumed only once.

Deployment Proposal

Hardware Proposal

Host Spec:

A host is needed to run the program. A Linux host (e.g. Ubuntu x64) with the following hardware specification is recommended:

  1. CPU: 8 cores at 2.0+ GHz, with 32 GB of RAM
  2. Network: 1 Gbps
  3. Disk: at least 2 GB of available space; 10 GB or more is suggested.

Network Spec:

The bandwidth from your on-premises environment to Alibaba Cloud should be large enough that data is consumed at least as fast as it is generated.

Usage (Python)

This section describes a consumer group program written in Python. For the Java usage introduction, refer to this documentation.

Note: All of the examples below are combined into a single file on GitHub, where you can get the latest version of the example.

Installation

Environment

  1. PyPy3 is highly recommended for running the program, rather than CPython.
  2. The Python SDK of Log Service can be installed with pip, as shown below.
  3. For a more detailed guide on the SLS Python SDK, refer to the guide.
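
As a minimal installation sketch (assuming the package name aliyun-log-python-sdk on PyPI; the requests library is also installed here because the forwarding example below uses it):

    pypy3 -m pip install aliyun-log-python-sdk -U
    pypy3 -m pip install requests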

Program Configuration

The following code shows you how to configure the program:

  1. a local rotating debug log file for diagnostic purposes
  2. basic SLS connection and consumer group options for the connection
  3. advanced consumer group options for tuning (changing them is not recommended)
  4. settings and related options for the SIEM (e.g. Splunk)

Please read the comments carefully and make adjustments whenever necessary.
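
A hedged sketch of such a configuration is shown below, assuming the aliyun-log-python-sdk consumer API; the environment variable names (SLS_ENDPOINT, SLS_AK_ID, etc.) and the Splunk settings are chosen here for illustration only.

    import os
    import logging
    from logging.handlers import RotatingFileHandler

    from aliyun.log.consumer import LogHubConfig, CursorPosition

    # 1. Local rotating log file for debugging and diagnostic purposes.
    root = logging.getLogger()
    handler = RotatingFileHandler("sync_data.log", maxBytes=100 * 1024 * 1024, backupCount=5)
    handler.setFormatter(logging.Formatter(
        "%(asctime)s %(levelname)s %(filename)s:%(lineno)d %(message)s"))
    root.setLevel(logging.INFO)
    root.addHandler(handler)
    logger = logging.getLogger(__name__)


    def get_option():
        # 2. Basic SLS connection and consumer group options
        #    (the environment variable names are chosen for this example).
        endpoint = os.environ.get("SLS_ENDPOINT", "")
        access_key_id = os.environ.get("SLS_AK_ID", "")
        access_key = os.environ.get("SLS_AK_KEY", "")
        project = os.environ.get("SLS_PROJECT", "")
        logstore = os.environ.get("SLS_LOGSTORE", "")
        consumer_group = os.environ.get("SLS_CG", "syc_data")
        # Derive a unique consumer name per process, so multiple launches do not collide.
        consumer_name = "{0}-{1}".format(consumer_group, os.getpid())

        # 3. Advanced consumer group options for tuning (the defaults are usually fine).
        option = LogHubConfig(endpoint, access_key_id, access_key,
                              project, logstore, consumer_group, consumer_name,
                              cursor_position=CursorPosition.BEGIN_CURSOR,
                              heartbeat_interval=20,
                              data_fetch_interval=1)

        # 4. Settings for the SIEM (Splunk HEC); host, port, and token are placeholders.
        settings = {
            "host": "<splunk-host>",
            "port": 8088,
            "token": "<splunk-hec-token>",
            "https": False,              # whether the HEC endpoint uses HTTPS
            "sourcetype": "aliyun:sls",  # optional Splunk event metadata
            "index": "main",
            "source": "sls",
        }
        return option, settings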

Data Consumption and Forwarding

The code below shows how to process the data fetched from SLS and forward it to Splunk.
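
The following is a hedged sketch of such a processor, assuming the SLS protobuf log-group structure exposed by the consumer library and the standard Splunk HEC /services/collector/event endpoint; the requests library is used for HTTP.

    import json
    import requests

    from aliyun.log.consumer import ConsumerProcessorBase


    class SyncData(ConsumerProcessorBase):
        """Consume log groups from one shard and forward each log to Splunk HEC."""

        def __init__(self, splunk_setting):
            super(SyncData, self).__init__()
            self.option = splunk_setting
            scheme = "https" if splunk_setting.get("https") else "http"
            self.url = "{0}://{1}:{2}/services/collector/event".format(
                scheme, splunk_setting["host"], splunk_setting["port"])
            self.headers = {"Authorization": "Splunk " + splunk_setting["token"]}

        def process(self, log_groups, check_point_tracker):
            events = []
            for log_group in log_groups.LogGroups:
                for log in log_group.Logs:
                    # Flatten one SLS log into a plain dict for Splunk.
                    item = {"time": log.Time}
                    for content in log.Contents:
                        item[content.Key] = content.Value
                    events.append({
                        "time": log.Time,
                        "source": self.option.get("source", "sls"),
                        "sourcetype": self.option.get("sourcetype", "aliyun:sls"),
                        "index": self.option.get("index", "main"),
                        "event": item,
                    })

            if events:
                # Splunk HEC accepts multiple JSON events concatenated in one request body.
                data = "\n".join(json.dumps(e) for e in events)
                resp = requests.post(self.url, data=data, headers=self.headers, timeout=60)
                resp.raise_for_status()

            # Persist the consumption position so a restart resumes from this point.
            check_point_tracker.save_check_point(True)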

Main Control

The following code shows how to start the program.
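
A possible sketch, reusing the hypothetical get_option() and SyncData defined in the previous snippets:

    from aliyun.log.consumer import ConsumerWorker


    def main():
        option, settings = get_option()
        logger.info("*** start to consume data...")

        # The worker runs one consumer: it pulls data from the shards assigned to it
        # and hands each batch to a SyncData processor instance.
        worker = ConsumerWorker(SyncData, option, args=(settings,))
        worker.start(join=True)


    if __name__ == "__main__":
        main()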

Launch

Suppose the program is saved as “sync_data.py”; it can be launched as follows.
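
A hedged example is shown below; the environment variable names match those assumed in the configuration sketch above, and the values are placeholders.

    export SLS_ENDPOINT=<your-endpoint>
    export SLS_AK_ID=<your-access-key-id>
    export SLS_AK_KEY=<your-access-key-secret>
    export SLS_PROJECT=<your-project>
    export SLS_LOGSTORE=<your-logstore>
    export SLS_CG=<your-consumer-group>

    pypy3 sync_data.py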

Limitations and Restrictions

Each Logstore can have at most 10 consumer groups. The error ConsumerGroupQuotaExceed is reported when this limit is exceeded.

Monitor

You can monitor the consumer group in the Log Service console: view the consumer group status, and create a consumer group monitoring alarm.

Performance Consideration

Running Multiple Consumers

The program can be launched multiple times to run multiple consumers that process data in parallel, as shown in the example below.

Note: All consumers should use the same consumer group name but different consumer names. Since one shard can only be consumed by one consumer, if you have 10 shards, you can have up to 10 consumers consuming in parallel.
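
For example, with the configuration sketch above (where the consumer name is derived from the process ID, so each process automatically gets a unique consumer name within the same consumer group), several consumers can simply be launched as background processes:

    nohup pypy3 sync_data.py &
    nohup pypy3 sync_data.py &
    nohup pypy3 sync_data.py &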

HTTPS

If the endpoint is configured with the prefix https://, the connection will automatically be encrypted with HTTPS.

The server certificate is issued by GlobalSign; most Linux/Windows machines should already trust it by default. In case your machine does not trust it, you can download the certificate, then install and trust it on the machine running the program.

Refer to this guide for detail.

Throughput

Based on testing, when network bandwidth and the receiver side (Splunk) are not bottlenecks and the example above is launched on the recommended hardware spec, a single consumer can sustain a steady raw-log consumption rate, and the overall throughput scales with the number of consumers running in parallel.

Note: actual throughput depends heavily on your bandwidth, hardware spec, the number of shards, and how fast your SIEM (e.g. Splunk) can receive the data.

High Availability

The consumer group saves checkpoints on the server side, so when one consumer stops, another consumer takes over its shards and continues consuming.

You can also start consumers on different machines, so that when one machine is shut down or fails, consumers on another machine take over the tasks automatically.

You can start more consumers than the number of shards for backup purposes.

Further Reading

  1. Log Service Product Page: the official product page of Log Service.
  2. Log Service Learning Path: shows you how to get started with Log Service and lists the popular functions and documentation of this product.
  3. Log Service Official Documentation: includes all technical documentation related to Log Service.

Reference: https://www.alibabacloud.com/blog/integrating-alibaba-cloud-log-service-with-splunk_594335?spm=a2c41.12475602.0.0
