Data Conversion Practices of MaxCompute Data Warehouse

Data Architecture and Process

  1. The temporary layer contains incremental data and full data.
  2. The basic data layer stores data permanently. It contains a core model and general summary data; the core model covers customer, offering, event, channel, and code data. Model tables at this layer follow the entity and attribute naming specifications of the data warehouse and fall into three types: primary tables, history tables, and incremental tables. Together, these tables retain historical data while remaining efficient to query and simple to design.
  3. The application layer contains data marts for customer analysis, sales analysis, and offering inventory analysis. Unlike the basic data layer, it stores only the data that the marts require rather than keeping everything permanently. Its model tables follow the same entity and attribute naming specifications of the data warehouse.

ETL Algorithms

Update or insert (to primary tables) algorithm
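
As a rough illustration of the update-or-insert idea, the sketch below merges a batch of incremental rows into a primary table held in memory: rows whose key already exists are overwritten, and rows with new keys are appended. The function name `upsert`, the `cust_id` key, and the sample rows are hypothetical, and the whole-row overwrite is an assumption. In a SQL engine the same effect is often achieved with a full outer join followed by an overwrite of the target table.

```python
def upsert(primary_rows, incremental_rows, key):
    """Merge incremental rows into primary rows.

    Rows whose key already exists are replaced (update);
    rows with a new key are added (insert).
    """
    merged = {row[key]: dict(row) for row in primary_rows}
    for row in incremental_rows:
        merged[row[key]] = dict(row)  # overwrite or add
    # Sort by key so the result is deterministic.
    return sorted(merged.values(), key=lambda r: r[key])


# Hypothetical sample data for illustration only.
primary = [{"cust_id": 1, "name": "Ann"}, {"cust_id": 2, "name": "Bob"}]
increment = [{"cust_id": 2, "name": "Robert"}, {"cust_id": 3, "name": "Cy"}]

for row in upsert(primary, increment, "cust_id"):
    print(row)
```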

Loading algorithm

Full history table algorithm
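
One common way to keep full history is to give every row a validity interval and, when a record changes, close the open version and append a new one. The sketch below shows that generic pattern under stated assumptions: the column names `start_date` and `end_date`, the sentinel `OPEN_END`, and the function `apply_changes` are all hypothetical and are not taken from the article's own scripts. It also assumes at most one open version per key.

```python
from datetime import date

OPEN_END = date(9999, 12, 31)  # assumed sentinel meaning "still current"


def apply_changes(history, changes, key, load_date):
    """Close the open version of each changed key and append the new version."""
    pending = {row[key]: row for row in changes}
    result = []
    for row in history:
        row = dict(row)
        if row["end_date"] == OPEN_END and row[key] in pending:
            attrs = {k: v for k, v in row.items()
                     if k not in ("start_date", "end_date")}
            if attrs == pending[row[key]]:
                del pending[row[key]]       # nothing changed: keep open version
            else:
                row["end_date"] = load_date  # close the old version
        result.append(row)
    for new in pending.values():
        result.append({**new, "start_date": load_date, "end_date": OPEN_END})
    return result


# Hypothetical sample data for illustration only.
history = [{"cust_id": 1, "city": "Hangzhou",
            "start_date": date(2020, 1, 1), "end_date": OPEN_END}]
changes = [{"cust_id": 1, "city": "Beijing"},
           {"cust_id": 2, "city": "Shanghai"}]

for row in apply_changes(history, changes, "cust_id", date(2020, 2, 1)):
    print(row)
```

Querying the state as of a given day then reduces to filtering rows where `start_date <= day < end_date`.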

NULL Value Processing

ETL on MaxCompute and DataWorks

Unified ETL script development

ETL task mapping

ETL conversion task development — example

ETL development procedure

  1. Run the scriptsGen.py program with parameters that match the chosen ETL algorithm to generate a unified ETL script file, then adjust the NULL processing logic in the generated script.
  2. On the DataWorks data development page, create the corresponding directory and task, and copy the SQL script into the new task.
  3. Run a test, set the scheduling parameters, and click Submit.
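
The first step of the procedure can be pictured with a small generator like the one below. It is only a sketch: the real interface of scriptsGen.py is not shown in this article, so the function `generate_script`, the `NULL_DEFAULTS` mapping, the table and column names, and the `ds` partition column are all assumptions. The generated SQL wraps columns in `COALESCE` for NULL processing and uses a `${bizdate}` placeholder that the scheduler would be expected to fill in.

```python
# Assumed per-type replacements for NULL values; adjust to the warehouse's rules.
NULL_DEFAULTS = {"string": "''", "bigint": "0", "double": "0.0"}


def generate_script(target, source, columns, partition="ds"):
    """Build an INSERT OVERWRITE statement with NULL handling per column.

    columns is a list of (name, type) pairs; types without an entry in
    NULL_DEFAULTS are passed through unchanged.
    """
    select_items = []
    for name, col_type in columns:
        default = NULL_DEFAULTS.get(col_type)
        if default is None:
            select_items.append(f"    {name}")
        else:
            select_items.append(f"    COALESCE({name}, {default}) AS {name}")
    select_list = ",\n".join(select_items)
    return (
        f"INSERT OVERWRITE TABLE {target} PARTITION ({partition}='${{bizdate}}')\n"
        f"SELECT\n{select_list}\n"
        f"FROM {source}\n"
        f"WHERE {partition}='${{bizdate}}';\n"
    )


# Hypothetical table and column names for illustration only.
print(generate_script(
    "dwd_customer",
    "tmp_customer",
    [("cust_id", "bigint"), ("name", "string"), ("birth_date", "datetime")],
))
```

The printed text is what would be pasted into the new task in step 2, after reviewing the NULL replacements by hand.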

Summary

Follow me to keep abreast of the latest technology news, industry insights, and developer trends. Alibaba Cloud website: https://www.alibabacloud.com
