Data Security Best Practices for MaxCompute


By Alibaba Cloud MaxCompute

As an enterprise-class cloud data warehousing solution built on a SaaS model, Alibaba Cloud MaxCompute is designed to keep its customers’ businesses running and their data secure. MaxCompute recently upgraded its security capabilities across the board. This article describes best practices, based on the native and integrated security capabilities of MaxCompute and DataWorks, for typical data risk scenarios throughout the data lifecycle: data misuse, abuse, breaches, and loss.

What Is MaxCompute?

Alibaba Cloud MaxCompute is a cloud-native, high-performance enterprise-level data warehousing service based on the Software-as-a-Service (SaaS) model. It is widely used to build modern enterprise data platforms for business intelligence (BI) analysis, data-driven operations, profiling and recommendation, intelligent prediction, and other scenarios.

MaxCompute draws strength from Alibaba Cloud’s large-scale computing and storage resources and provides a fully managed online data warehousing service through a serverless architecture. It removes the limits on resource scalability and elasticity that are common on traditional data platforms and minimizes investment in operations and maintenance (O&M).

MaxCompute supports a wide range of classic computing models, such as batch processing, machine learning, and interactive analytics, and offers comprehensive enterprise management functionality. MaxCompute allows you to easily integrate and manage enterprise data assets and streamlines the data platform architecture for faster mining of the value of data.

MaxCompute Upgrades Enterprise-level Security

MaxCompute recently upgraded its comprehensive security capabilities. Newly released security capabilities include:

  • Fine-grained authorization
  • Data encryption (Bring Your Own Key [BYOK])
  • Data masking (Data Security Guard)
  • Continuous backup and recovery
  • Cross-region disaster backup and recovery
  • Real-time audit logs

An enterprise-level big data platform is exposed to three levels of security risks, as shown in Figure 1.

Figure 1: Security system of a big data platform

1) Infrastructure security and platform trustworthiness. This level ensures the physical safety and network security of data centers. Countermeasures against risks at this level primarily include strengthening data center security facilities, data center security management, and data center network security.
2) System security of big data platforms. Countermeasures against risks at this level primarily include building subsystems for access control, security isolation, risk control and audit, and data protection, and providing underlying platform-based capabilities for upper-level security applications or tools.
3) Security of data applications. Countermeasures against risks at this level primarily include providing users with tool-based data security products, optimizing the user experience, and helping users better cope with data risks.

The recent upgrade of MaxCompute’s security capabilities has introduced new features to access control, risk control and auditing, data protection, and other subsystems, as highlighted in yellow in the “Big data platform security” layer in Figure 1. In this article, we will introduce the best practices for major types of data risks, as shown in Figure 2. We will explain when, why, and how to use these new features in these best practices.

Figure 2: Major types of data risks

Countermeasures against Data Misuse

Data misuse is caused by unintended or negligent actions. Preventing data misuse usually refers to preventing data from being inadvertently and incorrectly used. A core requirement for responding to data misuse risks and preventing data misuse is to understand data. You can say you understand your data when you know what data you have, where it is stored, and how it is collected and used.

MaxCompute can help you answer these questions. The platform manages metadata in a unified way and keeps complete platform logs, so it can provide consistent metadata and the related logs. You can build your own data management applications based on the Information Schema of MaxCompute.

Most users prefer to learn about their data through existing data management applications or services. Here, the Data Map module of DataWorks comes in handy. Its data overview and data details give you a picture of what data you have, while its output, usage, and lineage information shows how the data is produced and consumed. Together, these help ensure that the right data is used correctly in the right scenario.
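
To make the idea of lineage more concrete, here is a minimal sketch in Python. It assumes that lineage is available as a simple mapping from each table to the tables derived directly from it. The table names and the `downstream_tables` helper are hypothetical and are not part of the Data Map API; the sketch only shows how lineage can answer the question "which tables are affected if this one is wrong or misused?"

```python
from collections import deque

# Hypothetical lineage: each table maps to the tables derived directly from it.
LINEAGE = {
    "ods_orders": ["dwd_orders"],
    "dwd_orders": ["dws_sales_daily", "dws_user_profile"],
    "dws_sales_daily": ["ads_sales_report"],
    "dws_user_profile": [],
    "ads_sales_report": [],
}

def downstream_tables(lineage, source):
    """Return every table that directly or indirectly depends on `source`."""
    seen = set()
    queue = deque([source])
    while queue:
        current = queue.popleft()
        for child in lineage.get(current, []):
            if child not in seen:
                seen.add(child)
                queue.append(child)
    return sorted(seen)

print(downstream_tables(LINEAGE, "dwd_orders"))
```

A breadth-first traversal is used here so that each table is visited once even if it is reachable through several paths.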

Figure 3: Understand data using a data map

Countermeasures against Data Abuse

Data abuse refers to using data in scenarios or for purposes beyond its intended scope. Data abuse is usually caused by intentional or purposeful actions. A major countermeasure against data abuse is the least privilege principle of data use, which strictly limits the scope of data access and use. The four major processes in Figure 5 are recommended as best practices in permission management.

  • Graded data management: categorize and grade data for management based on LabelSecurity of MaxCompute.
  • Authorization approval process: implement the least privilege principle for requests to access or use data based on MaxCompute’s column-level permission management.
  • Regular audit: analyze permission requests, approvals, and usage to perform pre-event approval and post-event auditing.
  • Timely cleaning: clean expired permissions in a timely manner to reduce data risks.
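
The "timely cleaning" step can be sketched as follows. This is a minimal illustration in Python, assuming each approved permission is recorded with a grant date and a validity period. The `Grant` record and the `expired_grants` helper are hypothetical and do not correspond to any MaxCompute or DataWorks API; in practice the list of grants would come from your own approval records or from the platform’s metadata.

```python
from dataclasses import dataclass
from datetime import date, timedelta

@dataclass(frozen=True)
class Grant:
    user: str
    table: str
    granted_on: date
    valid_days: int  # hypothetical validity period recorded at approval time

    def expires_on(self) -> date:
        return self.granted_on + timedelta(days=self.valid_days)

def expired_grants(grants, today):
    """Return grants whose validity period ended before `today`."""
    return [g for g in grants if g.expires_on() < today]

grants = [
    Grant("alice", "sales.orders", date(2024, 1, 1), 30),
    Grant("bob", "sales.orders", date(2024, 3, 1), 90),
]
for g in expired_grants(grants, today=date(2024, 4, 1)):
    print(f"revoke candidate: {g.user} on {g.table} (expired {g.expires_on()})")
```

Running a check like this on a schedule turns cleaning into a routine task rather than an occasional manual review.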

MaxCompute’s fine-grained permission system, when used with DataWorks or other GUI-based tools, lets you put the least privilege principle into practice and reduce data abuse risks.

MaxCompute supports different authorization mechanisms for users or roles. The following mechanisms are examples.

Regardless of the access control mechanism you choose, three elements remain the same during the authorization and authentication: action, object, and subject, as shown in the following figure.

The new security capabilities that MaxCompute released in this upgrade also include an upgrade to the permission model to support finer-grained authorization and authentication for refined permission management. Main new features include:

  • Column-level permission management for ACLs to support conditions and authorization validity periods
  • Refined in-package resource permission management to support column-level permission management
  • Independent permission management for data downloads in batch download scenarios that have higher risks
  • Graded authorization management for administrator roles, with a built-in super administrator role to share the management workload of project owners
  • Improved RBAC, with LabelSecurity for roles supported
  • Enhanced permission management capabilities for applications
Figure 4: MaxCompute’s fine-grained permission system

Fine-grained permission management capabilities in this release are highlighted in orange.
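
The following Python sketch illustrates how the three elements (subject, action, and object) combine with column-level scope and validity periods in an authorization check. It is a simplified model under our own assumptions, not MaxCompute’s implementation: the `AclEntry` structure and the `is_allowed` function are hypothetical, and the action names are used only as labels.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional

@dataclass(frozen=True)
class AclEntry:
    subject: str                    # user or role
    action: str                     # for example "Select" or "Download"
    table: str                      # the object
    columns: Optional[frozenset]    # None means the whole table
    expires: Optional[date] = None  # None means no expiry

def is_allowed(acl, subject, action, table, columns, today):
    """True if every requested column is covered by a matching, unexpired entry."""
    needed = set(columns)
    matched = False
    for entry in acl:
        if (entry.subject, entry.action, entry.table) != (subject, action, table):
            continue
        if entry.expires is not None and today > entry.expires:
            continue  # the grant has expired
        matched = True
        if entry.columns is None:
            return True  # table-level grant covers all columns
        needed -= entry.columns
    return matched and not needed

acl = [AclEntry("alice", "Select", "orders",
                frozenset({"order_id", "amount"}), date(2024, 6, 30))]
today = date(2024, 6, 1)
print(is_allowed(acl, "alice", "Select", "orders", ["order_id"], today))
print(is_allowed(acl, "alice", "Select", "orders", ["order_id", "phone"], today))
print(is_allowed(acl, "alice", "Download", "orders", ["order_id"], today))
```

In this model, a grant to read two columns does not imply permission to read a third column or to download the table, which mirrors the idea of managing download permissions independently.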

MaxCompute’s fine-grained permission system enables least privilege authorization on the platform. When MaxCompute is used in concert with GUI-based tools, such as DataWorks-Security Center, it can provide a better user experience and more convenient permission management.

Figure 5: GUI-based permission management with Security Center

The Security Center provides convenient permission management and visualized request and approval processes, in addition to permission auditing and management capabilities.

  • Self-help permission requests: You can select desired data tables or fields and quickly request permissions for them online.
  • Permission audit and revocation: Administrators can view, audit, and manage the data permissions granted to users. Users can also take the initiative to request the revocation of permissions that are no longer needed.
  • Permission approval management: The Security Center adopts an online approval and authorization mode that is visualized and process-based and supports post-event traceability of approval processes.

Countermeasures against Data Breaches

Figure 6: Data lifecycle

Data breaches may occur in multiple stages of the data lifecycle, such as data transmission, storage, processing, and exchange. Therefore, we introduce the best practices to defend against data breaches at different stages of the data lifecycle.

First, data is collected from different channels and transferred to the big data platform through various channels. On the big data platform, data may be calculated and then written to disks for storage, be transferred between different tenants and services following a data sharing mechanism, or, after a certain period of time, be deleted and destroyed. Processed data is consumed by other data applications or users through different channels. (See Figure 7.)

Figure 7: Data lifecycle on a big data platform

First, let’s look at how to cope with data breach risks during storage, such as direct access to the data stored on disks and access to data disks. One countermeasure is to encrypt the data stored on disks. This can prevent the data from being read or used even when it is improperly accessed.

In this upgrade, a storage encryption feature was released for MaxCompute to support encrypting data disks.

  • MaxCompute integrates the key management system KMS to safeguard the security of keys. KMS supports service-based keys and user-defined keys (BYOK).
  • You can enable the storage encryption feature when you create a MaxCompute project. If you are already using MaxCompute services, you can open a ticket to apply to enable this feature.
  • Encryption algorithms such as AES-256 and cryptographic algorithms recommended by Chinese authorities are supported.
  • This data encryption process is transparent to users, and no additional changes are introduced to any type of task.

The security isolation capability of a big data platform plays a critical role in coping with the risks of data breach during data processing.

MaxCompute creates an independent, isolated environment for executing data processing applications. This environment supports all user-defined function (UDF) types, including Java and Python UDFs, as well as open-source third-party computing engines such as Spark, Flink, and TensorFlow, enabling diversified data processing capabilities.

Figure 8: MaxCompute’s security isolation capabilities

Sound data isolation and permission management systems are essential to data security because they can prevent data breaches during data exchanges or sharing. MaxCompute supports data isolation and permission management for a range of levels and dimensions, providing multi-level data protection and data sharing.

  • Secure isolation of multi-tenant data: MaxCompute supports multi-tenant scenarios where user data is stored in isolation in a distributed file system to enable multi-user collaboration and sharing without compromising data security. This achieves true multi-tenant resource isolation.
  • Project data isolation and sharing under the same tenant: It is common for projects under the same tenant to need both a degree of data isolation and a degree of data sharing. The project-based protection mechanisms enable inter-project data isolation and security. In addition, the package mode allows you to share data and resources across projects with greater security and convenience. As described in the “Countermeasures against Data Abuse” section above, this upgrade of security capabilities has added fine-grained permission management for package data and resources, which enhances package data sharing and protection capabilities.
  • (New) Application-side data access control: A signature mechanism has been added for applications that access MaxCompute to strengthen application-side access control. For example, you can allow only specific applications to execute authorization statements, which prevents unauthorized data authorization through APIs or non-compliant applications.
Figure 9: MaxCompute’s data isolation capabilities
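
To illustrate the general idea behind application-side access control, here is a generic sketch in Python based on HMAC request signing. It is not MaxCompute’s actual signature protocol. The application registry, the secrets, and the `may_execute` policy function are all hypothetical; the sketch only shows how a platform could verify which application sent a statement and restrict authorization statements to approved applications.

```python
import hashlib
import hmac

# Hypothetical registry: application ID -> shared secret.
APP_SECRETS = {"security-center": b"s3cret-A", "adhoc-client": b"s3cret-B"}
# Hypothetical policy: only these applications may run authorization statements.
APPS_ALLOWED_TO_GRANT = {"security-center"}

def sign(secret: bytes, statement: str) -> str:
    """Compute an HMAC-SHA256 signature over the statement text."""
    return hmac.new(secret, statement.encode("utf-8"), hashlib.sha256).hexdigest()

def may_execute(app_id: str, statement: str, signature: str) -> bool:
    """Verify the caller's signature, then apply the per-application policy."""
    secret = APP_SECRETS.get(app_id)
    if secret is None:
        return False  # unknown application
    if not hmac.compare_digest(sign(secret, statement), signature):
        return False  # signature does not match the registered secret
    is_grant = statement.lstrip().upper().startswith("GRANT")
    return (not is_grant) or app_id in APPS_ALLOWED_TO_GRANT

stmt = "GRANT Select ON TABLE orders TO USER alice;"
print(may_execute("security-center", stmt, sign(b"s3cret-A", stmt)))
print(may_execute("adhoc-client", stmt, sign(b"s3cret-B", stmt)))
```

The comparison uses `hmac.compare_digest` rather than `==` so that the check does not leak timing information about the expected signature.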

An important part of coping with data breach risks is the protection of sensitive data. The responses to risks in the data storage, processing, and exchange processes described in preceding sections are also applicable to sensitive data protection. In addition, the following best practices target sensitive data protection scenarios.

  • Data classification and grading: MaxCompute’s LabelSecurity feature enables fine-grained permission management of data by classifying and grading data for access and use to ensure data security.
  • (New) Data masking: With the help of the MaxCompute platform’s UDFs, data masking implementations or applications based on security industry practices can mask any sensitive data in client outputs. Data masking implementations can also be used in concert with data classification and grading to enable different masking implementations for data in different classes or at different levels.
Figure 10: Protection of sensitive data

Data Security Guard is a sensitive data protection tool that is built to match the data classification and grading capabilities of the MaxCompute platform and integrates data masking capabilities. It allows users to label data as sensitive and select masking algorithms to mask the sensitive data in data outputs.

For more information about the service and its usage, see the Data Security Guard documentation.
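
As a rough illustration of how masking can be combined with data grading, the Python sketch below masks each column according to its sensitivity level and the level of the user reading it. It is a simplified model, not how Data Security Guard is implemented. The column levels, the masking functions, and the rule that a column is masked when its level is higher than the user’s level are all assumptions made for this example.

```python
def mask_phone(value: str) -> str:
    """Keep the first 3 and last 2 characters and mask the rest."""
    if len(value) <= 5:
        return "*" * len(value)
    return value[:3] + "*" * (len(value) - 5) + value[-2:]

def mask_full(value: str) -> str:
    """Mask every character."""
    return "*" * len(value)

# Hypothetical sensitivity level per column (higher means more sensitive).
COLUMN_LEVELS = {"order_id": 0, "phone": 2, "id_card": 3}
# Hypothetical rule set: which masking function applies at which level.
MASK_BY_LEVEL = {2: mask_phone, 3: mask_full}

def mask_row(row: dict, user_level: int) -> dict:
    """Mask every column whose level is higher than the user's level."""
    masked = {}
    for column, value in row.items():
        level = COLUMN_LEVELS.get(column, 0)
        if level > user_level:
            # Fall back to full masking when no specific rule is defined.
            masker = MASK_BY_LEVEL.get(level, mask_full)
            masked[column] = masker(str(value))
        else:
            masked[column] = value
    return masked

row = {"order_id": 1001, "phone": "13812345678", "id_card": "110101199001011234"}
print(mask_row(row, user_level=1))
print(mask_row(row, user_level=3))
```

Falling back to full masking when no rule exists for a level keeps the sketch safe by default: an unconfigured sensitive column is hidden rather than exposed.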

Figure 11: Sensitive data protection tool — Data Security Guard

Countermeasures against Data Loss

Apart from malicious data breaches, data abuse, and other risks, improper operations during data development, occasional faults with equipment or data centers, and rare and unexpected disasters can all lead to data loss. The main best practices to prevent data loss risks include backup and recovery and disaster recovery.

During data development, you may need to recover data after an improper operation, such as unintentionally deleting data with a DROP TABLE or TRUNCATE TABLE statement, or after an INSERT INTO or INSERT OVERWRITE statement leaves a table with problematic data.

MaxCompute recently released continuous backup and recovery capabilities. Before a deletion or modification is performed, the system automatically backs up the existing data and retains it for a specific period of time. Within this period, you can quickly recover the data to prevent data loss due to incorrect operations.
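
The behavior described above can be modeled with a small Python sketch. This is a toy model under our own assumptions, not MaxCompute’s implementation: the `VersionedTable` class, its methods, and the one-day retention period are hypothetical. It only illustrates the idea that the prior state is saved before each change and can be restored while it is still inside the retention period.

```python
from datetime import datetime, timedelta

class VersionedTable:
    """Toy model: keep the prior state of a table for a retention period."""

    def __init__(self, rows, retention_days=1):
        self.rows = list(rows)
        self.retention = timedelta(days=retention_days)
        self._history = []  # (timestamp, rows as they were before the change)

    def overwrite(self, new_rows, now):
        """Back up the current rows, then replace them."""
        self._history.append((now, self.rows))
        self.rows = list(new_rows)

    def truncate(self, now):
        self.overwrite([], now)

    def restore_latest(self, now):
        """Restore the most recent backup that is still within retention."""
        self._history = [(t, r) for t, r in self._history
                         if now - t <= self.retention]
        if not self._history:
            raise LookupError("no backup within the retention period")
        _, rows = self._history.pop()
        self.rows = list(rows)

table = VersionedTable(["r1", "r2"], retention_days=1)
t0 = datetime(2024, 1, 1, 9, 0)
table.truncate(now=t0)                             # accidental truncation
table.restore_latest(now=t0 + timedelta(hours=2))  # still within one day
print(table.rows)
```

In this model, the same restore attempted after the retention period has passed raises an error, which is why the retention period should be chosen to match how quickly mistakes are usually noticed.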

Figure 12: MaxCompute’s continuous backup and recovery capabilities

With its geo-disaster recovery capabilities, MaxCompute provides better data security in extreme scenarios such as data center failures or unexpected disasters.

After you specify a backup location for the backup cluster of a MaxCompute project, MaxCompute can automatically replicate data between the primary and backup clusters to ensure data consistency and achieve geo-disaster recovery. If a fault occurs, the MaxCompute project switches from the primary cluster to the backup cluster and uses the computing resources of the backup cluster to access the data stored there. In this way, the service resumes on the backup cluster.

Figure 13: MaxCompute’s geo-disaster recovery

Make Clever Use of Audits to Cope with Data Risks

So far, we have introduced practices to defend against various data risks during data development and use. In this last section, we look at a very important practice that applies to all kinds of data risks: auditing.

MaxCompute provides comprehensive historical data and real-time logs.

  • Information Schema provides project metadata and historical usage data among other information. Privileges and History views can help you with data analysis and auditing for data permission usage and task execution.
  • (New) Real-time audit logs: MaxCompute keeps a full record of users’ actions such as DDL, authorization, and task execution events to meet the needs of real-time auditing and problem traceability and analysis.
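
As a sketch of the kind of audit that privilege and usage history data makes possible, the Python example below finds permissions that were granted but never used. The row shapes and the `unused_privileges` helper are hypothetical and are not the actual Information Schema view definitions; in practice, the two input lists would be query results exported from the relevant views.

```python
def unused_privileges(granted, used):
    """Return (user, table) grants that never appear in the usage history."""
    used_pairs = {(u["user"], u["table"]) for u in used}
    return sorted(
        (g["user"], g["table"])
        for g in granted
        if (g["user"], g["table"]) not in used_pairs
    )

# Hypothetical rows, as they might be exported from privilege and history data.
granted = [
    {"user": "alice", "table": "orders"},
    {"user": "alice", "table": "customers"},
    {"user": "bob", "table": "orders"},
]
used = [
    {"user": "alice", "table": "orders"},
    {"user": "alice", "table": "orders"},
]

for user, table in unused_privileges(granted, used):
    print(f"granted but unused: {user} on {table}")
```

Permissions that show up in this list are natural candidates for review and revocation under the least privilege principle.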

You can build your own data risk control and audit systems based on Information Schema and real-time audit logs. Information Schema was released last year. Below, we introduce real-time audit logs, which are a new feature.

Not all users plan to build their own risk control and audit tools. They can use the risk control and audit services in DataWorks instead. These out-of-the-box services require no secondary development effort, at the cost of a lower degree of customization.

Is sensitive data overused? Are too many data access permissions granted? Is there an abnormality such as unplanned frequent data access? Administrators are often asked these questions about data security. MaxCompute’s audit log feature can help you answer these questions.

MaxCompute keeps a full record of users’ actions and pushes user behavior logs to Alibaba Cloud’s ActionTrail service. You can view and retrieve user behavior logs in ActionTrail and deliver the logs to a Log Service project or a specified Object Storage Service (OSS) bucket for the purposes of real-time auditing and event traceability and analysis.

ActionTrail supports auditing user behavior for instances, tables, functions, resources, users, roles, and privileges. For more information about this feature and its usage, see the Audit Log documentation.
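
Once audit logs have been delivered to your own storage, questions such as "is there unplanned frequent data access?" can be answered with simple analysis. The Python sketch below counts read events per user for one table and reports users above a threshold. The event fields (`user`, `table`, `action`) and the `frequent_readers` helper are hypothetical; real audit events have their own schema, so treat this as an outline of the approach rather than a parser for actual logs.

```python
from collections import Counter

def frequent_readers(events, table, threshold):
    """Return users whose read count on `table` exceeds `threshold`."""
    counts = Counter(
        e["user"] for e in events
        if e["table"] == table and e["action"] == "read"
    )
    return {user: n for user, n in counts.items() if n > threshold}

# Hypothetical, simplified audit events.
events = (
    [{"user": "alice", "table": "customers", "action": "read"}] * 2
    + [{"user": "mallory", "table": "customers", "action": "read"}] * 50
    + [{"user": "mallory", "table": "orders", "action": "read"}] * 3
)

print(frequent_readers(events, table="customers", threshold=10))
```

A fixed threshold is the simplest possible rule; a real risk control system would more likely compare each user’s activity with that user’s own historical baseline.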

Figure 14: MaxCompute’s audit log

You can use existing services provided by DataWorks for data security risk control and auditing.

  • The Security Center described in the preceding section provides a permission auditing service.
  • Data Security Guard provides risk control and auditing services, as shown in Figure 15.
Figure 15: Risk control and auditing with Data Security Guard

Summary

The introduction gave an overview of the three levels of the security system of an enterprise-level big data platform. Here, we reorganize the security capabilities of MaxCompute according to the six stages of the data lifecycle, as shown in Figure 16. This helps us better understand the applicable data security practices at each stage of the data lifecycle. New features released in this upgrade are highlighted in yellow in Figure 16.

Figure 16: Lifecycle-stage-specific data security practices on a big data platform

As a cloud data warehouse based on the SaaS model, MaxCompute boasts leading security capabilities and has passed multiple international, European, and Chinese security compliance certifications, including the internationally recognized ISO certification, SOC 1, 2, and 3 (SOC is short for System and Organization Control), Payment Card Industry Data Security Standard (PCI DSS), the C5 certification used in Europe, and Cybersecurity Multi-Level Protection Scheme 2.0 which is dominant in China. For more information about Alibaba Cloud’s security compliance certification system, see the Alibaba Cloud Trust Center — Certification of Compliance page. We welcome you to use MaxCompute to ensure enterprise-level big data security.

To learn more about Alibaba Cloud MaxCompute, visit https://www.alibabacloud.com/product/maxcompute
