Simplify Elasticsearch Data Analysis with Transforms Data Pivoting

Released by ELK Geek

Elasticsearch transforms let you read data from an Elasticsearch index, transform it, and store the result in another index. With a transform you can pivot data into an entity-centric index that summarizes the behavior of each entity, such as a customer or a product, which puts the data in a format that is easy to analyze. This tutorial uses the Kibana sample data to demonstrate how to pivot and summarize data with transforms.

Prepare Data

The exercise in this tutorial uses sample eCommerce orders as an example.

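First, prepare an Alibaba Cloud Elasticsearch environment that supports transforms (the _transform API endpoints used later in this tutorial require Elasticsearch 7.5 or later). Then use the user name and password you created to log on to Kibana, and import the sample data into Elasticsearch.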

Click Add data.

You have now imported the eCommerce data. If you are not familiar with the kibana_sample_data_ecommerce index, use the Revenue dashboard in Kibana to browse the data. Consider the insights you want to gain from the eCommerce data.

Use Options to Group and Summarize Data

The data that is pivoted must be grouped by at least one field, with at least one aggregation applied to it. You can preview the transformed data before proceeding to subsequent operations.

For example, you might want to group the data by product ID and calculate the total quantity sold of each product and the average product price. In addition, you may want to look at the behavior of a customer and calculate how much each customer has spent in total and how many different categories of products they have purchased. In other cases, you may need to consider currencies or geographic locations. What is the most interesting way to transform and interpret the data?
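To make the idea of grouping and aggregating concrete, here is a minimal Python sketch of what a pivot computes. It runs locally on three hypothetical orders and does not call Elasticsearch; the sample values and the output names total_spent and distinct_categories are assumptions made for illustration.

```python
from collections import defaultdict

# Hypothetical orders, loosely shaped like kibana_sample_data_ecommerce documents.
orders = [
    {"customer_id": "10", "taxful_total_price": 40.0, "category": ["Men's Clothing"]},
    {"customer_id": "10", "taxful_total_price": 60.0, "category": ["Men's Shoes", "Men's Clothing"]},
    {"customer_id": "22", "taxful_total_price": 25.0, "category": ["Women's Accessories"]},
]


def pivot_by_customer(docs):
    """Group orders by customer_id, then aggregate each group."""
    spent = defaultdict(float)
    categories = defaultdict(set)
    for doc in docs:
        key = doc["customer_id"]
        spent[key] += doc["taxful_total_price"]   # like a sum aggregation
        categories[key].update(doc["category"])   # like a cardinality aggregation
    return {
        key: {"total_spent": spent[key], "distinct_categories": len(categories[key])}
        for key in spent
    }


summary = pivot_by_customer(orders)
for customer_id, row in sorted(summary.items()):
    print(customer_id, row)
```

Each key in the result corresponds to one document in an entity-centric index: one row per customer instead of one row per order.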

Next, let’s do an exercise. Go to Kibana.

Click Create your first transform.

Select [eCommerce] Orders.

Select the items that you are interested in.

As you can see, a Transform pivot preview table is displayed on the right of the page. The table provides data that is not included in the raw data, such as the sum of products.quantity.
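The same kind of preview can also be requested through the preview transform API. The sketch below only builds and prints such a request in console format so that you can paste it into Kibana Dev Tools; it does not contact a cluster. The helper build_preview_body is hypothetical, and the chosen group-by field and aggregations are assumptions, because the exact selection made in the UI is not listed here.

```python
import json


def build_preview_body(source_index, group_field, aggregations):
    """Build a body for POST _transform/_preview.

    aggregations maps an output name to an (aggregation type, source field) pair.
    """
    return {
        "source": {"index": source_index},
        "pivot": {
            "group_by": {group_field: {"terms": {"field": group_field}}},
            "aggregations": {
                name: {agg_type: {"field": field}}
                for name, (agg_type, field) in aggregations.items()
            },
        },
    }


# Assumed selection: group by customer and sum two numeric fields.
preview_body = build_preview_body(
    "kibana_sample_data_ecommerce",
    "customer_id",
    {
        "products.quantity.sum": ("sum", "products.quantity"),
        "taxful_total_price.sum": ("sum", "taxful_total_price"),
    },
)

# Print the request in the console format used in this tutorial.
print("POST _transform/_preview")
print(json.dumps(preview_body, indent=2))
```

The preview returns sample destination documents without creating a transform or writing to any index, which makes it a cheap way to check a pivot definition.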

Click Next.

Click Next.

Click Create and start.

The progress bar indicates that the transform is complete. Click Transforms to return to the transform management page.

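The transform is complete, and its Status is stopped. This is expected: a transform that is not continuous runs as a batch job, which processes the source data once and then stops, and with this small data volume it finishes quickly. Click the down arrow to display more information.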

You can now see all the details of the transform.

Next, go to the Discover page to view the latest index: ecommerce-customer-sales.

Select the ecommerce-customer-sales index.

A total of 3,321 documents are found. Each document holds the summarized spending information for one user. You can search this index like any other index. Summarized data of this kind is useful in many situations, for example as input for machine learning, and you can produce similar indexes for your own analysis.
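As an illustration of searching the new index, the following sketch builds a query that lists the five documents with the highest value in one aggregated field. The field name products.quantity.sum is an assumption based on the sum of products.quantity mentioned earlier; substitute a field that exists in your destination index. The helper name is hypothetical, and the code only prints the request.

```python
import json


def build_top_customers_query(sort_field, size=5):
    """Return a search body that lists the documents with the highest sort_field values."""
    return {"size": size, "sort": [{sort_field: {"order": "desc"}}]}


# Assumption: the transform wrote the summed quantity to this field.
query = build_top_customers_query("products.quantity.sum")

print("GET ecommerce-customer-sales/_search")
print(json.dumps(query, indent=2))
```

Because each document already represents one entity, a plain sort replaces the terms aggregation that the raw order index would need for the same question.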

Use API Operations to Create a Transform

In the preceding example, we created a transform in the Kibana graphical user interface (GUI). You can also use API operations to create a transform.

To get started, run the following statement to define the ingest pipeline that the destination index will use. The pipeline adds an @timestamp field to every document that the transform writes:

PUT _ingest/pipeline/add_timestamp_pipeline
{
  "description": "Adds timestamp to documents",
  "processors": [
    {
      "script": {
        "source": "ctx['@timestamp'] = new Date().getTime();"
      }
    }
  ]
}
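
Before attaching the pipeline to a transform, you may want to check what it does to a document. One way is the simulate pipeline API. The sketch below builds and prints such a request without sending it; the helper name and the sample document are hypothetical.

```python
import json


def build_simulate_body(sample_sources):
    """Wrap sample documents in the structure the simulate pipeline API expects."""
    return {"docs": [{"_source": source} for source in sample_sources]}


# Hypothetical document, shaped like the output of the transform defined in this section.
simulate_body = build_simulate_body([{"customer_id": "10", "max_price": 100.0}])

print("POST _ingest/pipeline/add_timestamp_pipeline/_simulate")
print(json.dumps(simulate_body, indent=2))
```

The response should show each sample document with the @timestamp field added by the script processor.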

Then, run the following statement to create the transform:

PUT _transform/ecommerce_transform
{
  "source": {
    "index": "kibana_sample_data_ecommerce",
    "query": {
      "term": {
        "geoip.continent_name": {
          "value": "Asia"
        }
      }
    }
  },
  "pivot": {
    "group_by": {
      "customer_id": {
        "terms": {
          "field": "customer_id"
        }
      }
    },
    "aggregations": {
      "max_price": {
        "max": {
          "field": "taxful_total_price"
        }
      }
    }
  },
  "description": "Maximum priced ecommerce data by customer_id in Asia",
  "dest": {
    "index": "kibana_sample_data_ecommerce_transform",
    "pipeline": "add_timestamp_pipeline"
  },
  "frequency": "5m",
  "sync": {
    "time": {
      "field": "order_date",
      "delay": "60s"
    }
  }
}

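This statement defines a continuous transform. The frequency field sets how often the transform checks the source index for changes, here every 5 minutes. The sync section is what makes the transform continuous: it uses the order_date field to identify new documents, and the 60s delay gives recently written documents time to become searchable before they are processed. Next, go to the transform management page.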
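The interplay between frequency and delay can be sketched with plain date arithmetic. This is a simplified model for illustration, not the actual checkpoint logic of Elasticsearch: it assumes that a check made at a given time only considers documents whose order_date is at least the delay in the past. The function names and timestamps are hypothetical.

```python
from datetime import datetime, timedelta, timezone

FREQUENCY = timedelta(minutes=5)  # "frequency": "5m"
DELAY = timedelta(seconds=60)     # "delay": "60s" in the sync section


def checkpoint_upper_bound(check_time, delay=DELAY):
    """Latest order_date that a check made at check_time would include (simplified)."""
    return check_time - delay


def is_visible(order_date, check_time, delay=DELAY):
    """True if a document with this order_date falls inside the check's time range."""
    return order_date <= checkpoint_upper_bound(check_time, delay)


check = datetime(2020, 1, 1, 12, 0, 0, tzinfo=timezone.utc)
recent = check - timedelta(seconds=30)  # written 30 seconds before the check
older = check - timedelta(seconds=90)   # written 90 seconds before the check

print(is_visible(recent, check))              # False: still inside the delay window
print(is_visible(older, check))               # True
print(is_visible(recent, check + FREQUENCY))  # True: picked up by the next check
```

In this model a document is never lost, only deferred: anything skipped because of the delay is included by a later check.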

You can see the newly created continuous transform on the page. Click Start or run the following statement to start this transform:

POST _transform/ecommerce_transform/_start

After you run the preceding statement, go to the transform management page again.

The transform is now running. You can click Stop to stop the transform as needed.
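
You can also check the state outside the GUI with GET _transform/ecommerce_transform/_stats. The sketch below shows one way to read the state from such a response. The sample response is hypothetical and heavily trimmed, its numbers are made up, and the helper name is an assumption.

```python
# Hypothetical, trimmed response shaped like GET _transform/ecommerce_transform/_stats.
sample_stats_response = {
    "count": 1,
    "transforms": [
        {
            "id": "ecommerce_transform",
            "state": "started",
            "stats": {"documents_processed": 100, "documents_indexed": 13},
        }
    ],
}


def summarize_transform(stats_response, transform_id):
    """Return the state and indexed-document count for one transform, or None if absent."""
    for entry in stats_response.get("transforms", []):
        if entry["id"] == transform_id:
            stats = entry.get("stats", {})
            return {
                "state": entry["state"],
                "documents_indexed": stats.get("documents_indexed"),
            }
    return None


status = summarize_transform(sample_stats_response, "ecommerce_transform")
print(status)
```

A continuous transform that is running normally reports the state started between checks, whereas a finished batch transform reports stopped.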

The preceding statements do not create an index pattern for the newly created index kibana_sample_data_ecommerce_transform. Therefore, you need to manually create one. After you create the index pattern, go to the Discover page to view the new transform index.

A total of 13 documents are found because only orders from Asia pass the query filter. The data is grouped by customer_id, so each document shows the maximum taxful_total_price (stored as max_price) for one customer, together with the @timestamp added by the ingest pipeline.

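To test the continuous transform, write a new document that meets the Asia condition to the kibana_sample_data_ecommerce index. If the document uses a customer_id that is not yet in the destination index, the kibana_sample_data_ecommerce_transform index should contain one more document after the next check. For an existing customer_id, the existing document is updated instead.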
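
A test document for this check could look like the one built below. This is a sketch: the helper name, the customer ID 9999, and the price are hypothetical, and the document contains only the fields that the transform definition reads (customer_id, taxful_total_price, order_date, and geoip.continent_name). The code prints the request instead of sending it.

```python
import json
from datetime import datetime, timezone


def build_asia_order(customer_id, price, now=None):
    """Build a minimal order document that matches the transform's Asia filter."""
    now = now or datetime.now(timezone.utc)
    return {
        "customer_id": customer_id,
        "taxful_total_price": price,
        # The sync section uses order_date to detect new documents.
        "order_date": now.replace(microsecond=0).isoformat(),
        "geoip": {"continent_name": "Asia"},
    }


# Hypothetical customer that does not exist in the sample data.
order = build_asia_order("9999", 120.0)

print("POST kibana_sample_data_ecommerce/_doc")
print(json.dumps(order, indent=2))
```

Because order_date is set to the current time, expect the document to appear in the destination index only after the 60s delay has passed and the next check has run.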

You can use the following API operations to delete the transform:

POST _transform/ecommerce_transform/_stop
DELETE _transform/ecommerce_transform

Declaration: This article is reproduced with authorization from Liu Xiaoguo, the original author and an advocate of the China Elasticsearch Community. The author reserves the right to hold users legally liable in the case of unauthorized use.

Original Source:
