3 Ways to Migrate Java Logs to the Cloud: Log4j, Logback, and Producer Lib
By Bruce Wu
The Path to Log Centralization
In recent years, the advent of stateless programming, containers, and serverless computing has greatly increased the efficiency of software delivery and deployment. Two changes stand out in this architectural evolution:
- Application architectures are shifting from monolithic systems to microservices, so business logic turns into calls and requests between microservices.
- On the resource side, traditional physical servers are fading out and being replaced by invisible virtual resources.
These two changes show that, behind the elastic and standardized architecture, Operation & Maintenance (O&M) and diagnosis requirements are becoming more and more complex. Ten years ago you could log on to a server and fetch logs quickly, but that attach-to-a-process way of working no longer exists. What we face today is a standardized black box.
To respond to these changes, a series of DevOps-oriented diagnosis and analysis tools have emerged, including centralized monitoring, centralized logging systems, and various SaaS-based deployment and monitoring services.
Centralizing logs addresses these issues: as soon as applications produce logs, the logs are transmitted in real time (or near real time) to a central node, where systems such as Syslog, Kafka, ELK, and HBase are often used for centralized storage.
Advantages of Centralization
- Ease of use: Using grep to query logs from stateless applications is troublesome; with centralized storage, that long process is replaced by a single search command.
- Separated storage and computing: When sizing machine hardware, you no longer need to account for log storage space.
- Lower costs: Centralized log storage can perform load shifting, so fewer resources need to be reserved.
- Security: In the case of a hacker intrusion or a disaster, critical data is retained as evidence.
Collector (Java series)
Log Service provides more than 30 data collection methods and comprehensive access solutions for servers, mobile terminals, embedded devices, and various development languages. For Java developers, it supports the familiar logging frameworks through the Log4j, Log4j2, and Logback Appenders.
Java applications currently have two mainstream log collection solutions:
- Java programs flush logs to disks and use Logtail for real-time collection.
- Java programs directly configure the Appender provided by Log Service. When the program is running, logs are sent to Log Service in real time.
Differences between the two:
With the Appender, you can complete real-time log collection through configuration alone, without changing any code, whereas the Logtail approach first writes logs to disk and then collects them. The Java-series Appenders provided by Log Service have the following advantages:
- Configuration changes take effect without modifying program code.
- Asynchronous with resumable (breakpoint) transmission: I/O does not block the main thread, and certain network and service faults are tolerated.
- High-concurrency design: meets the write throughput required for massive log volumes.
- Context query support: the context of a log (the N logs before and after it in the original process) can be precisely restored in Log Service.
Overview and usage of Appender
The provided Appenders are aliyun-log-log4j-appender, aliyun-log-log4j2-appender, and aliyun-log-logback-appender; they differ mainly in which logging framework they plug into, and all of them use aliyun-log-producer-java underneath to write data. The producer library can also be used directly by programs that do not go through a logging framework.
Integrate Appender
You can integrate the Appender by following the configuration steps for aliyun-log-log4j-appender. The contents of the configuration file log4j.properties are as follows:
log4j.rootLogger=WARN,loghub
log4j.appender.loghub=com.aliyun.openservices.log.log4j.LoghubAppender
# Log Service project name (required parameter)
log4j.appender.loghub.projectName=[your project]
# Log Service LogStore name (required parameter)
log4j.appender.loghub.logstore=[your logstore]
# Log Service HTTP address (required parameter)
log4j.appender.loghub.endpoint=[your project endpoint]
# User identity (required parameter)
log4j.appender.loghub.accessKeyId=[your accesskey id]
log4j.appender.loghub.accessKey=[your accesskey]
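With this configuration in place, the application keeps logging through the ordinary Log4j API, and the loghub appender ships the entries to Log Service. Below is a minimal sketch of a demo class that could emit the sample logs shown in the next section; the class and method names mirror the location field of those samples, while the request, user, item, and amount values are purely illustrative.

package com.aliyun.log4jappendertest;

import org.apache.log4j.Logger;

public class Log4jAppenderBizDemo {

    // Logs written through this logger are routed to the loghub appender
    // configured in log4j.properties above.
    private static final Logger LOGGER = Logger.getLogger(Log4jAppenderBizDemo.class);

    public void login(String requestID, String userID) {
        LOGGER.info("User login successfully. requestID=" + requestID + " userID=" + userID);
    }

    public void order(String requestID, String userID, String itemID, int amount) {
        LOGGER.info("Place an order successfully. requestID=" + requestID
                + " userID=" + userID + " itemID=" + itemID + " amount=" + amount);
    }

    public static void main(String[] args) {
        Log4jAppenderBizDemo demo = new Log4jAppenderBizDemo();
        demo.login("id4", "user8");                // illustrative values
        demo.order("id44", "user8", "item3", 9);
    }
}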
Query and Analysis
After configuring the Appender as described in the previous step, the logs produced by your Java applications are automatically sent to Log Service, and you can use LogSearch/Analytics to query and analyze them in real time. The sample log formats used in this example are as follows.
Logs that record your logon behavior:
level: INFO
location: com.aliyun.log4jappendertest.Log4jAppenderBizDemo.login(Log4jAppenderBizDemo.java:38)
message: User login successfully. requestID=id4 userID=user8
thread: main
time: 2018-01-26T15:31+0000
Logs that record your purchase behavior:
level: INFO
location: com.aliyun.log4jappendertest.Log4jAppenderBizDemo.order(Log4jAppenderBizDemo.java:46)
message: Place an order successfully. requestID=id44 userID=user8 itemID=item3 amount=9
thread: main
time: 2018-01-26T15:31+0000
Enable Query Analysis
You must enable the query and analysis function before querying and analyzing data. Follow these steps to enable the function:
- Log on to the Log Service console.
- On the Project List page, click the project name, or click Manage on the right side.
- Select a Logstore and click Search in the LogSearch column.
- Choose Set LogSearch and Analytics > Settings.
- Go to the Settings menu and enable queries for the fields used in this example: level, location, message, thread, and time.
Analyze Logs
Let’s look at five analysis examples.
1. Count the top three locations where errors occurred most commonly in the last hour.
Syntax example
level: ERROR | select location, count(*) as count GROUP BY location ORDER BY count DESC LIMIT 3
2. Count the number of generated logs for each log level in the last 15 minutes.
Syntax example
| select level, count(*) as count GROUP BY level ORDER BY count DESC
3. Query the log context.
For any log, you can precisely reconstruct the log context information for the original log file.
For more information, see Context query.
4. Count the top three users who have logged on most frequently in the last hour.
Syntax example
login | SELECT regexp_extract(message, 'userID=(?<userID>[a-zA-Z\d]+)', 1) AS userID, count(*) as count GROUP BY userID ORDER BY count DESC LIMIT 3
5. Compute the total payment amount for each user over the past 15 minutes.
Syntax example
order | SELECT regexp_extract(message, 'userID=(?<userID>[a-zA-Z\d]+)', 1) AS userID, sum(cast(regexp_extract(message, 'amount=(?<amount>[a-zA-Z\d]+)', 1) AS double)) AS amount GROUP BY userID
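The named-group patterns used by regexp_extract above can be checked locally with plain java.util.regex before the queries are run in LogSearch. The following self-contained sketch applies the same patterns to the sample purchase message shown earlier; the class name and printed output are illustrative only.

import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class ExtractDemo {
    public static void main(String[] args) {
        // Sample message taken from the purchase log above
        String message = "Place an order successfully. requestID=id44 userID=user8 itemID=item3 amount=9";

        // Same named-group patterns as in the regexp_extract calls
        Pattern userPattern = Pattern.compile("userID=(?<userID>[a-zA-Z\\d]+)");
        Pattern amountPattern = Pattern.compile("amount=(?<amount>[a-zA-Z\\d]+)");

        Matcher userMatcher = userPattern.matcher(message);
        Matcher amountMatcher = amountPattern.matcher(message);

        if (userMatcher.find() && amountMatcher.find()) {
            String userID = userMatcher.group("userID");
            double amount = Double.parseDouble(amountMatcher.group("amount"));
            // Prints: userID=user8, amount=9.0
            System.out.println("userID=" + userID + ", amount=" + amount);
        }
    }
}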
Acknowledgement
aliyun-log-log4j-appender, aliyun-log-log4j2-appender, aliyun-log-logback-appender, and aliyun-log-producer-java were jointly developed by the Alibaba Cloud team and contributors from the Co-Creation Platform. Thank you for your outstanding contributions to these projects.
aliyun-log-log4j-appender Contributed by: @zzboy
aliyun-log-log4j2-appender Contributed by: @LNAmp @zzboy
aliyun-log-logback-appender Contributed by: @lionbule @zzboy
aliyun-log-producer-java Contributed by: @zzboy