Converting JSON-formatted Log Data with MaxCompute Built-in Functions and UDTF

Analysis of Business Scenarios

  1. Data Source: The application writes logs in real time to a file under a specified directory on an ECS host.
  2. Data Format: Each log in the file is a JSON record (the data is simplified and masked in the example). Each log contains the device information plus the information for one or more sessions, and the number of sessions varies from log to log. An example of the contents of a log is as follows:

Recommended Solution

Log Collection

select
  get_json_object(content,'$.DeviceID') as DeviceID,
  get_json_object(content,'$.UniqueIdentifier') as UniqueIdentifier,
  get_json_object(content,'$.GameID') as GameID,
  get_json_object(content,'$.Device') as Device,
  get_json_object(content,'$.Sessions[0].SessionID') as Session1_ID,
  get_json_object(content,'$.Sessions[0].Events[0].Name') as Session1_EventName,
  get_json_object(content,'$.Sessions[1].SessionID') as Session2_ID,
  get_json_object(content,'$.Sessions[1].Events[0].Name') as Session2_EventName
from log_target_json where pt='20180725' limit 10;
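
The query above addresses sessions by fixed position (Sessions[0], Sessions[1]), so it fits only logs with a known, small number of sessions. As a rough illustration of that limitation, the Java sketch below mimics fixed-position extraction on a plain list of session IDs. The class name FixedIndexSketch, the method toFixedColumns, and the sample IDs are made up for this illustration and are not part of MaxCompute. The null for a missing slot is meant to mirror get_json_object returning NULL when a path does not match, which is worth confirming against the documentation for your MaxCompute version.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical helper: mimics reading array elements by fixed index,
// the way the query's Sessions[0] and Sessions[1] paths do.
class FixedIndexSketch {

    // Copies at most `width` session IDs into fixed columns.
    // Slots with no matching session stay null; sessions beyond `width` are dropped.
    static String[] toFixedColumns(List<String> sessionIds, int width) {
        String[] columns = new String[width]; // every slot starts as null
        for (int i = 0; i < width && i < sessionIds.size(); i++) {
            columns[i] = sessionIds.get(i);
        }
        return columns;
    }

    public static void main(String[] args) {
        // One session: the second column is empty.
        System.out.println(Arrays.toString(toFixedColumns(List.of("s-1"), 2)));
        // Three sessions: the third one does not fit into two columns.
        System.out.println(Arrays.toString(toFixedColumns(List.of("s-1", "s-2", "s-3"), 2)));
    }
}
```

With one session the second column is null, and with three sessions the third is silently dropped. A log with a variable number of sessions is therefore easier to handle by emitting one row per session event, which is what the UDTF in the next section does.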

Develop a MaxCompute UDTF to Process Logs

package com.aliyun.odps;

import com.aliyun.odps.udf.UDFException;
import com.aliyun.odps.udf.UDTF;
import com.aliyun.odps.udf.annotation.Resolve;
import com.google.gson.Gson;
import java.io.IOException;
import java.util.List;
import java.util.Map;

// One string in (the JSON log), eight string columns out for each event.
@Resolve("string->string,string,string,string,string,string,string,string")
public class get_json_udtf extends UDTF {
    @Override
    public void process(Object[] objects) throws UDFException, IOException {
        String input = (String) objects[0];
        Map<?, ?> map = new Gson().fromJson(input, Map.class);
        if (map == null) {
            return; // null or empty input: nothing to emit
        }
        // Device-level fields are repeated on every output row.
        String deviceID = str(map.get("DeviceID"));
        String uniqueIdentifier = str(map.get("UniqueIdentifier"));
        String gameID = str(map.get("GameID"));
        String device = str(map.get("Device"));
        List<?> sessions = (List<?>) map.get("Sessions");
        if (sessions == null) {
            return; // log without a Sessions array
        }
        for (Object session : sessions) {
            Map<?, ?> sMap = (Map<?, ?>) session;
            String sessionID = str(sMap.get("SessionID"));
            List<?> events = (List<?>) sMap.get("Events");
            if (events == null) {
                continue; // session without events
            }
            for (Object event : events) {
                Map<?, ?> eMap = (Map<?, ?>) event;
                // Emit one row per event of each session.
                forward(deviceID, uniqueIdentifier, gameID, device, sessionID,
                        str(eMap.get("Name")), str(eMap.get("Timestamp")),
                        str(eMap.get("NetworkStatus")));
            }
        }
    }

    // Gson maps JSON numbers to Double, so convert values instead of casting to String.
    private static String str(Object value) {
        return value == null ? null : String.valueOf(value);
    }
}

After packaging the class, upload the JAR as a resource and register the function:

add jar maxcompute_demo-1.0-SNAPSHOT.jar -f;
create function get_json_udtf as 'com.aliyun.odps.get_json_udtf' using 'maxcompute_demo-1.0-SNAPSHOT.jar';
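
Before uploading the JAR, it can help to check the flattening logic locally. The sketch below reproduces the UDTF's nested loops with only the Java standard library, under the assumption that a parsed log is represented as nested maps and lists (which is roughly what a JSON parser hands back). The class FlattenSketch, its methods, and every value in the sample record are hypothetical; only the field names are taken from the code above.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Dependency-free sketch of the flattening done by the UDTF's nested loops.
// Objects are modelled as Map and arrays as List.
class FlattenSketch {

    // Returns one row per event: 4 device-level fields, the session ID, 3 event fields.
    static List<String[]> flatten(Map<String, Object> log) {
        List<String[]> rows = new ArrayList<>();
        Object sessions = log.get("Sessions");
        if (!(sessions instanceof List)) {
            return rows; // no Sessions array: nothing to emit
        }
        for (Object s : (List<?>) sessions) {
            Map<?, ?> session = (Map<?, ?>) s;
            Object events = session.get("Events");
            if (!(events instanceof List)) {
                continue; // session without events
            }
            for (Object e : (List<?>) events) {
                Map<?, ?> event = (Map<?, ?>) e;
                rows.add(new String[] {
                    str(log.get("DeviceID")), str(log.get("UniqueIdentifier")),
                    str(log.get("GameID")), str(log.get("Device")),
                    str(session.get("SessionID")), str(event.get("Name")),
                    str(event.get("Timestamp")), str(event.get("NetworkStatus"))
                });
            }
        }
        return rows;
    }

    // Null-safe conversion: a missing field stays null.
    static String str(Object value) {
        return value == null ? null : String.valueOf(value);
    }

    // Hypothetical sample record; all values are invented for illustration.
    static Map<String, Object> sample() {
        Map<String, Object> start = Map.of("Name", "start", "Timestamp", "t-1", "NetworkStatus", "wifi");
        Map<String, Object> stop = Map.of("Name", "stop", "Timestamp", "t-2", "NetworkStatus", "wifi");
        Map<String, Object> s1 = Map.of("SessionID", "s-1", "Events", List.of(start, stop));
        Map<String, Object> s2 = Map.of("SessionID", "s-2", "Events", List.of(start));
        return Map.of("DeviceID", "d-1", "UniqueIdentifier", "u-1", "GameID", "g-1",
                "Device", "phone", "Sessions", List.of(s1, s2));
    }

    public static void main(String[] args) {
        // Two sessions with two and one events respectively give three rows.
        for (String[] row : flatten(sample())) {
            System.out.println(String.join(",", row));
        }
    }
}
```

Keeping the loop logic in a plain method like this is a design choice for testability: the same rows can be asserted in a unit test without a MaxCompute project, and the UDTF then only has to parse the input and call forward for each row.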

Summary

Follow me to keep abreast with the latest technology news, industry insights, and developer trends. Alibaba Cloud website:https://www.alibabacloud.com

Alibaba Cloud
