Analyzing Data on Tableau with Data Lake Analytics

  1. Activate Data Lake Analytics
  2. Upload data files to OSS
  3. Create relevant schema and tables
  4. Connect Tableau with Data Lake Analytics
  5. Analyze data using Tableau

Activate Data Lake Analytics

Upload Data Files to Object Storage Service (OSS)

Create Relevant Schema and Tables

CREATE SCHEMA ecomm WITH DBPROPERTIES(
LOCATION = 'oss://<Bucket-Name>/brazilian-ecommerce/',
catalog='oss'
);
USE ecomm;
CREATE EXTERNAL TABLE ordr_lctn_lst(
zip_code_prefix STRING,
city STRING,
state STRING,
lat STRING,
lng STRING
)
ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
STORED AS TEXTFILE
LOCATION 'oss://<bucket-name>/brazilian-ecommerce/geolocation_olist_public_dataset.csv'
TBLPROPERTIES ("skip.header.line.count"="1");
select * from ecomm.ordr_lctn_lst;
CREATE EXTERNAL TABLE prod_ordr_lst(
order_id STRING,
order_status STRING,
order_products_value DOUBLE,
order_freight_value DOUBLE,
order_items_qty INT,
order_sellers_qty INT,
order_purchase_timestamp TIMESTAMP,
order_aproved_at TIMESTAMP,
order_estimated_delivery_date TIMESTAMP,
order_delivered_customer_date TIMESTAMP,
customer_id STRING,
customer_city STRING,
customer_state STRING,
customer_zip_code_prefix INT,
product_category_name STRING,
product_name_length INT,
product_description_length INT,
product_photos_qty INT,
product_id STRING,
review_id STRING,
review_score INT,
review_comment_title STRING,
review_comment_message STRING,
review_creation_date TIMESTAMP,
review_answer_timestamp TIMESTAMP
)
ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
STORED AS TEXTFILE
LOCATION 'oss://bucket-name/brazilian-ecommerce/olist_public_dataset_v2.csv'
TBLPROPERTIES ("skip.header.line.count"="1");
  1. If x<=0, then NO lines will be skipped
  2. If 0
  3. If x>=n, all the lines will be skipped
Select * from ecomm.prod_ordr_lst
CREATE EXTERNAL TABLE slr_ordr_lst(
order_id STRING,
product_id STRING,
seller_id STRING,
seller_zip_code_prefix INT,
seller_city STRING,
seller_state STRING
)
ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
STORED AS TEXTFILE
LOCATION 'oss://dla-bucket-saisarath/brazilian-ecommerce/sellers_olist_public_dataset_.csv'
TBLPROPERTIES ("skip.header.line.count"="1");
SELECT * FROM slr_ordr_lst;

Connect Tableau with Data Lake Analytics

jdbc:mysql://service.ap-southeast-1.datalakeanalytics.aliyuncs.com:10000/
SELECT `ordr_lctn_lst`.`city` AS `city`,
`ordr_lctn_lst`.`lat` AS `lat`,
`ordr_lctn_lst`.`lng` AS `lng`,
`ordr_lctn_lst`.`state` AS `state`,
`ordr_lctn_lst`.`zip_code_prefix` AS `zip_code_prefix`
FROM `ecomm`.`ordr_lctn_lst`
LIMIT 100000
SELECT `slr_ordr_lst`.`order_id` AS `order_id`,
`slr_ordr_lst`.`product_id` AS `product_id`,
`slr_ordr_lst`.`seller_city` AS `seller_city`,
`slr_ordr_lst`.`seller_id` AS `seller_id`,
`slr_ordr_lst`.`seller_state` AS `seller_state`,
`slr_ordr_lst`.`seller_zip_code_prefix` AS `seller_zip_code_prefix`
FROM `ecomm`.`slr_ordr_lst`
LIMIT 10000
  1. The current release version doesn’t support UPDATE and DELETE Statements but they are in pipeline.
  2. Data Lake Analytics performs fetches data in much faster speed compared to traditional systems due to high-performance engine Fully leveraging Massive Parallel Processing (MPP) architecture, providing vectorized execution optimization, operator pipelined execution optimization, multi-tenancy resource allocation and priority scheduling.
  3. With Data Lake Analytics you don’t need to worry about maintenance, it is powered by Serverless Architecture, Elastic scalability and pay-as-you-go mode. You only get charged for the queries you execute.
  4. Data Lake Analytics connects seamlessly with other Alibaba Cloud products, Like QuickBI and DataV. You can also store the output onto the OSS bucket. DLA helps to use heterogenous data sources unlike the data warehousing where you see only structured data. The typical use cases will be like below

--

--

--

Follow me to keep abreast with the latest technology news, industry insights, and developer trends. Alibaba Cloud website:https://www.alibabacloud.com

Love podcasts or audiobooks? Learn on the go with our new app.

Recommended from Medium

How to Create Drupal Role with Ansible Playbook

NeonScan Integration with the Neon EVM

Bluehost Vs Weebly

Android — Designing a Chat App with Firebase! (Part 1)

How to Manage Django Images and Static Assets on Ubuntu 18.04

The Power of Lighting With Probes and Emission

Fox and Snakes Coding Question

Advanced WordPress Management Using WP-CLI on Alibaba Cloud ECS — Part 1

Get the Medium app

A button that says 'Download on the App Store', and if clicked it will lead you to the iOS App store
A button that says 'Get it on, Google Play', and if clicked it will lead you to the Google Play store
Alibaba Cloud

Alibaba Cloud

Follow me to keep abreast with the latest technology news, industry insights, and developer trends. Alibaba Cloud website:https://www.alibabacloud.com

More from Medium

Capturing Data Analytics Workflows and System Requirements

Data processing with Dataflow SQL (part 2/2)

Becoming a data-driven start-up — Part 1: Analytics MVP using Power BI

StorageX 8.5 Object to Object Replication — Data Dynamics, Inc.