Why I Love Data Lake Analytics

What is Data Lake Analytics?

With advances in cloud computing, issues such as data availability and accessibility can be easily addressed. However, the growth in data volume also means that systems need to process more kinds of raw data. This raw data may include structured data, unstructured data, images, videos, files, blobs, and other formats. A data lake keeps this raw data in its original form, and Data Lake Analytics is a service for analyzing it.

Data Lake vs. Data Warehouse

People often confuse data lakes with data warehouses. In fact, they are quite different. A product such as Alibaba Cloud Table Store is a data warehouse, while Data Lake Analytics, as its name suggests, is a data lake. The two types of data storage differ in the following ways:

Data Persistence

A data warehouse discards data that is not part of the decision-making process. This helps reduce disk usage, and it also follows from the narrowly scoped design of an enterprise data warehouse. Data Lake Analytics, on the other hand, keeps track of all data, because of the range of actions users may perform on it and the wide range of data types it supports. The majority of Data Lake Analytics users connect processed data to their own BI tools to automate reporting. A smaller group analyzes old data using new techniques. Another group brings in new data and creates new datasets from it.

Data Types

Data Lake Analytics supports almost all types of data, whereas a data warehouse contains data generated by transactional systems. Warehouse data is well structured, with attributes that describe it. Data in a data lake, by contrast, may or may not come with much descriptive information.

Ability to Adapt

Implementing a data warehouse carries overhead, because there is a lot to do up front: cleaning the data, assigning a schema, and bringing the data into a proper shape for the Business Intelligence team to consume. This takes a significant amount of development time. Data Lake Analytics doesn’t pose these problems. Since the raw data is available, we can experiment and discard the result if we don’t want it, or persist it and provide the information to others.
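The contrast above is often described as applying a schema when data is read rather than when it is written. The following is a minimal Python sketch of that idea, not Data Lake Analytics itself: the sample records, the `read_with_schema` helper, and the field names are all hypothetical and chosen only for illustration.

```python
import json

# Hypothetical raw events as they might land in a data lake. No schema is
# enforced at write time, so records can differ in shape.
RAW_EVENTS = [
    '{"user": "alice", "amount": "19.90", "country": "SG"}',
    '{"user": "bob", "amount": 5}',
    '{"user": "carol", "note": "no purchase recorded"}',
]


def read_with_schema(lines, schema):
    """Apply a schema at read time: pick the requested fields and cast them.

    Fields missing from a record come back as None instead of failing the load.
    """
    for line in lines:
        record = json.loads(line)
        yield {
            field: cast(record[field]) if field in record else None
            for field, cast in schema.items()
        }


# Experiment 1: total revenue, treating a missing amount as zero.
revenue_schema = {"user": str, "amount": float}
rows = list(read_with_schema(RAW_EVENTS, revenue_schema))
total = sum(row["amount"] or 0.0 for row in rows)

# Experiment 2: a different question over the same raw data, with no reload.
country_schema = {"user": str, "country": str}
countries = [row["country"] for row in read_with_schema(RAW_EVENTS, country_schema)]

print(round(total, 2))
print(countries)
```

Each experiment defines only the fields it needs, and an experiment that turns out not to be useful can be thrown away without touching the raw records.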

Is Data Lake Analytics Really for Me?

The answer can be either yes or no, depending on your current situation. If you have a well-established system built on a data warehouse and it works for you, then Data Lake Analytics may not be necessary. However, if you are fairly new and are just starting to analyze large amounts of data, then you can consider Data Lake Analytics for your use case. If you have a data warehouse but are struggling with the issues mentioned above, then you can start implementing Data Lake Analytics in parallel. When you are fully comfortable, you can move everything into Data Lake Analytics.

Why Alibaba Cloud for Data Lake Architecture?

The Alibaba Cloud team has only recently announced the Data Lake Analytics product, currently in Version 1.0.0. While it may not serve your business needs exactly at this point, the product packs many features that may help your business.

Key Advantages of Alibaba Cloud Data Lake Analytics

  1. Serverless
  2. Database-like User Experience
  3. Data Federation: cloud-native federated analytics across multiple data sources, including OSS, PostgreSQL, MySQL (RDS), NoSQL (Table Store), etc.
  4. High-performance Analysis Engine: leverages the new-generation analysis engine XIHE and applies MPP and DAG technology to achieve a high compression ratio, high scalability, and high availability
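
To make the data federation point more concrete, here is a rough Python sketch of what "one SQL query over several sources" looks like from the user's side. It is an illustration only and does not use the Data Lake Analytics API: an in-memory SQLite table stands in for a relational source such as RDS, a CSV string stands in for a raw file in object storage such as OSS, and the `load_csv_as_table` helper and all table names are hypothetical. Unlike a real federated engine, this sketch copies the file data into the database before querying it.

```python
import csv
import io
import sqlite3

# Stand-in for a relational source such as RDS: an in-memory SQLite table.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT)")
db.executemany("INSERT INTO customers VALUES (?, ?)", [(1, "alice"), (2, "bob")])

# Stand-in for a raw file in object storage such as OSS: plain CSV text.
ORDERS_CSV = "customer_id,amount\n1,19.90\n1,5.00\n2,7.50\n"


def load_csv_as_table(conn, table, text):
    """Expose CSV rows as a temporary table so SQL can join across sources."""
    rows = list(csv.DictReader(io.StringIO(text)))
    conn.execute(f"CREATE TEMP TABLE {table} (customer_id INTEGER, amount REAL)")
    conn.executemany(
        f"INSERT INTO {table} VALUES (?, ?)",
        [(int(r["customer_id"]), float(r["amount"])) for r in rows],
    )


load_csv_as_table(db, "orders", ORDERS_CSV)

# A single SQL statement that spans both stand-in sources.
totals = db.execute(
    """
    SELECT c.name, ROUND(SUM(o.amount), 2)
    FROM customers AS c
    JOIN orders AS o ON o.customer_id = c.id
    GROUP BY c.name
    ORDER BY c.name
    """
).fetchall()

print(totals)
```

The point of the sketch is the last query: the analyst writes ordinary SQL and does not need to care that the two tables started out in different kinds of storage.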

Alibaba Cloud

Follow me to keep abreast of the latest technology news, industry insights, and developer trends. Alibaba Cloud website: https://www.alibabacloud.com