The Latest Features of DataWorks: How to Choose the Right Edition of DataWorks

Background

History of DataWorks

More than a decade has passed since the DataWorks project was established in 2009. Along the way came Alibaba’s Moon Landing Project, the release of Alibaba Cloud and Apsara Stack, and the release of DataWorks V2.0 in 2018. From 2009 to 2013, DataWorks gained the ability to schedule Hadoop cluster tasks. Hadoop clusters eventually could no longer support Alibaba’s rapidly growing data volumes, so the company began to explore combining MaxCompute with DataWorks. After 2013, DataWorks began to support scheduling MaxCompute tasks. Based on MaxCompute and DataWorks, Alibaba built its Data Mid-End.

DataWorks, an All-in-one Platform for Big Data Development and Governance

As Alibaba’s all-in-one big data development platform, DataWorks provides core capabilities in two areas: data development and data governance. Until the first half of 2018, most users used DataWorks for data development. In this area, DataWorks uses Data Integration to transfer data from data sources to MaxCompute and runs DataStudio offline computing nodes on a timed schedule. Since the second half of 2018, DataWorks V2.0 has brought Alibaba’s internal data governance features to Alibaba Cloud. As a result, every Alibaba Cloud user gets comprehensive data governance capabilities in DataWorks Basic Edition, including data lineage, data quality monitoring, task monitoring, data audit, and secure data permission control.

Scenario-based Introduction to Advanced Features of DataWorks

DataWorks Basic Edition

DataWorks Basic Edition provides practical features to help users build data warehouses quickly. It covers the entire lifecycle of big data development, from data access, data development, production scheduling, visual O&M, data quality monitoring, table-level permission management, and data service API building through to final data presentation in application development. Notably, a “batch migration to cloud” feature has been added to data access. For example, suppose your data comes from multiple MySQL instances, each instance contains multiple databases, and each database has multiple tables. In this case, you can use this feature to upload the data in Excel format, quickly create multiple data synchronization tasks, and migrate the data to the cloud. Currently, this feature supports only Oracle, MySQL, and SQL Server. DataWorks Basic Edition also provides data quality monitoring, for which you can set custom check rules.
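The batch step described above, one uploaded sheet turning into many synchronization tasks, can be sketched roughly as follows. This is not the DataWorks API. It assumes the sheet lists one source table per row. The column names (`source_type`, `database`, `table`), the `SyncTask` record, the `plan_sync_tasks` helper, and the `ods_` target naming are all hypothetical, and the sheet is represented as CSV text so the example stays self-contained.

```python
import csv
import io
from dataclasses import dataclass

# Source types the article lists as currently supported by batch migration.
SUPPORTED_SOURCES = {"oracle", "mysql", "sqlserver"}


@dataclass(frozen=True)
class SyncTask:
    """Hypothetical description of one table-level synchronization task."""
    source_type: str
    database: str
    table: str
    target_table: str


def plan_sync_tasks(sheet_text: str) -> list[SyncTask]:
    """Turn one sheet row per source table into one sync task each."""
    tasks = []
    for row in csv.DictReader(io.StringIO(sheet_text)):
        source_type = row["source_type"].strip().lower()
        if source_type not in SUPPORTED_SOURCES:
            raise ValueError(f"unsupported source type: {source_type}")
        database = row["database"].strip()
        table = row["table"].strip()
        # Hypothetical naming convention for the target table in the cloud.
        target = f"ods_{database}_{table}"
        tasks.append(SyncTask(source_type, database, table, target))
    return tasks


# Illustrative sheet: two databases, three tables in total.
SHEET = """source_type,database,table
mysql,shop,orders
mysql,shop,customers
mysql,crm,contacts
"""

tasks = plan_sync_tasks(SHEET)
for task in tasks:
    print(task)
```

The point of the sketch is only that the task list is derived mechanically from the sheet, so adding a table means adding a row rather than configuring another task by hand.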

DataWorks Standard Edition

DataWorks Standard Edition provides more complex and specialized node types for development than the Basic Edition, along with better visual support for the real-time Flink engine. DataWorks Standard Edition is oriented to enterprises with fast-growing big data systems. When an enterprise’s data system is developing rapidly, data quality and security problems gradually surface. DataWorks Standard Edition provides data governance capabilities to help you address these problems.

DataWorks Professional Edition

DataWorks Professional Edition provides APIs for extending data services, offering more flexible and highly available service capabilities. In terms of data governance, DataWorks Professional Edition provides enhanced security.

DataWorks Enterprise Edition

DataWorks Enterprise Edition adds the API orchestration feature for DataService Studio. The data mid-end solves the problems of storing, connecting, and using data. That is, data specifications are stored centrally, data from various systems is connected, planned, and processed centrally, and the final data is used by the business side. The business-side teams that use the data have diverse needs. If the data mid-end can respond to these needs swiftly, the data system will gain more trust and more strategic importance in the enterprise. If it cannot meet these needs in time or falls short of expectations, the data system will be increasingly marginalized. How can we meet the complex and diverse needs of the business side? The DataService Studio module of DataWorks Enterprise Edition currently provides service orchestration that can fit various complex scenarios.

How to Choose the Right DataWorks Edition

  • DataWorks Basic Edition is applicable to scenarios where you urgently need to build a data warehouse but do not have enough staff. DataWorks Basic Edition supports data visualization and batch data migration to the cloud. The DataStudio page also supports visual operations, including writing SQL statements, building workflows, building dependencies, and task O&M.
  • DataWorks Standard Edition is applicable to enterprises whose data systems are developing rapidly. As an enterprise’s data volume increases and its number of tasks grows, security governance becomes more important. DataWorks Standard Edition provides basic data governance capabilities and certain advanced data development capabilities.
  • DataWorks Professional Edition is applicable to enterprises that need to build relatively mature data service systems. If you have high security requirements, the risk identification capability provided by DataWorks Professional Edition is necessary.
  • DataWorks Enterprise Edition is applicable to scenarios where your current data system is not flexible enough to provide services. The service orchestration feature of DataWorks Enterprise Edition can meet the varied requirements that the business side places on the data mid-end. DataWorks Enterprise Edition also continues to add capabilities in data security and audit, and its custom development capabilities will become increasingly robust.
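The guidance in this list can be read as a simple decision rule. The sketch below is only one possible reading, not an official selection tool. The `Needs` fields and the `recommend_edition` function are hypothetical names, and the rule assumes that each higher edition also covers what the lower editions provide.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Needs:
    """Hypothetical yes/no summary of what a team needs from its data platform."""
    flexible_service_orchestration: bool = False
    risk_identification: bool = False
    mature_data_services: bool = False
    fast_growing_data_system: bool = False


def recommend_edition(needs: Needs) -> str:
    """Return the lowest edition that the guidance above points to.

    The most demanding needs are checked first, under the assumption
    that higher editions include the capabilities of lower ones.
    """
    if needs.flexible_service_orchestration:
        return "Enterprise"
    if needs.risk_identification or needs.mature_data_services:
        return "Professional"
    if needs.fast_growing_data_system:
        return "Standard"
    # Default: a small team that needs to build a data warehouse quickly.
    return "Basic"


print(recommend_edition(Needs(fast_growing_data_system=True)))
print(recommend_edition(Needs(risk_identification=True)))
```

Checking the most demanding need first means that a team with several needs is pointed to the single edition that covers all of them.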
