MaxCompute and DataWorks Security Management Guide: Examples

The article “MaxCompute and DataWorks Security Management Guide: Basics” describes the security models of MaxCompute and DataWorks, how the two products relate to each other, and the available security operations. This article provides reference examples for security administrators.

Project Creation Example

Now that we know the security models of MaxCompute and DataWorks and how permissions in the two products relate, this section uses two basic business requirements to show how to create and manage a project.

Scenario 1: Collaborative Business Development for ETL Tasks

In a collaborative development scenario, responsibilities and tasks are clearly assigned to members, and the standard development, debugging, and publishing procedure must be followed. Production data must be strictly controlled.

Analysis:

  • A DataWorks project itself can allow multiple members to perform collaborative development work.

Implementation Steps

Step 1: Create a project. The main configuration is shown in the following screenshot:

Details about the configurations are described as follows:

  • Select “Standard (separation of development and production)” for the project model. The standard model binds one DataWorks project to two MaxCompute projects (a development project and a production project). Development and debugging are performed in the development environment. Tasks in the production environment must be published from the development environment by following the publishing procedure so that tasks can run smoothly in the production environment.

Step 2: Add a project member. Add a RAM user as a project member in DataWorks and assign roles as needed. At the same time, the corresponding development project will assign the corresponding role to the RAM user. For information about permissions that each role has, refer to the permission relationship between MaxCompute and DataWorks as described in the Basics article.

  • Project administrator: In addition to having the full permissions of the developer and the maintainer, the project administrator can perform project-level operations, such as adding and removing project members, granting roles, and creating custom resource groups. A project administrator also holds the “role_project_admin” role in the MaxCompute development project.
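As a rough illustration of checking the role mapping described above, the following commands could be run with the primary account in the MaxCompute console. The project name prj_name_dev and the account ram$bob@aliyun.com:alice are hypothetical placeholders, not names taken from this article.

```sql
use prj_name_dev;                          -- hypothetical MaxCompute development project bound to the DataWorks project
list roles;                                -- the roles mapped from DataWorks, such as role_project_admin, should be listed
describe role role_project_admin;          -- shows the permissions attached to the role
show grants for ram$bob@aliyun.com:alice;  -- hypothetical RAM user; lists the roles and permissions this member holds
```

If the member was added as a project administrator in DataWorks, the last command is expected to list role_project_admin among the member's roles.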

Step 3: Develop and debug tasks. If a developer is performing task development and debugging in the DataWorks “Data Development” module (the development project in MaxCompute) and needs to use a table in the production project, the developer can apply for the required table in the DataWorks Data Management module.

Step 4: Publish tasks to the production environment. After debugging, a developer packages the tasks. A maintainer reviews the code and then publishes the package to the production environment. (The developer must notify the maintainer offline that a package is ready for review.) This process ensures that tasks cannot be freely published to the production environment and run there.

Step 5: Developers test production tasks. After tasks are published to the production environment, developers are advised to test them in the operation and maintenance center to ensure that they run as expected. Even if a task returns a success status, it is still necessary to view the log to check that the task ran as expected. For further verification, query the result table to check whether it contains normal output. Generally, this query is run in the development interface. Permissions on output tables in the production environment are not granted to individuals by default; each member must apply for them in the “Data Management” module of DataWorks.
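The result check in Step 5 might look like the following sketch, run from the development interface after permission on the output table has been approved. The production project prj_prod, the table adl_result_table, the partition column ds, and the partition value are all hypothetical.

```sql
desc prj_prod.adl_result_table;      -- confirms the table is visible to the current user; fails without permission

select count(*)                      -- a non-zero count suggests the task actually wrote output
from prj_prod.adl_result_table
where ds = '20190305';               -- hypothetical partition written by the latest run

select *                             -- spot-check a few rows for plausible values
from prj_prod.adl_result_table
where ds = '20190305'
limit 10;
```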

Pay attention to the following after the preceding configuration and operations are performed:

  • Multiple members develop collaboratively in the DataWorks Data Development module. All members of a project can view task code, and members with the edit permission can modify tasks. Therefore, highly sensitive core code cannot be kept confidential within a shared project. Currently, tasks and data that require a high confidentiality level can be developed in separate projects by specific members.

Scenario 2: Simple Project Ownership based on Table Creation

In this scenario, each member of a single project can only perform operations on the tables that he or she created. This scenario is common for smaller businesses where member roles are basically consistent and the business does not need to scale. For example, only data acquisition is required, that is, members only query and download business data without performing data development (for example, operations staff need to obtain some data for analysis).

Analysis:

  • If a project doesn’t perform data development, then data that needs to be analyzed must be located in other projects. Meanwhile, in order to avoid the resource isolation between different primary accounts, the owner of the project (primary account) must be the same as the owner of the data development production project.

Implementation Steps

Step 1: Create a project. Note that the primary account must be the primary account for the project where data to be analyzed is located. The project configuration is as follows:

Step 2: Create a custom MaxCompute role and grant permission to that role. Use the primary account and perform operations in the console:

create role custom_dev; -- Creates a custom role
grant List, CreateInstance, CreateTable, CreateFunction, CreateResource on project prj_name to role custom_dev; -- Grants permissions to the custom role

Step 3: Set “Allow an object creator to have the default access” for the project in MaxCompute. Use the primary account and perform operations in the console:

set ObjectCreatorHasAccessPermission=true;    -- By default, this flag is set to true. You can run the following command to check the configuration
show SecurityConfiguration;
You can also configure this flag under "Project Management" -> "MaxCompute Settings" in DataWorks.

Step 4: Add a project member. In DataWorks, add a RAM user as a new member. If a member is added with the “developer” role, the role of that member in the corresponding MaxCompute project is role_project_dev. Use the console command line to view the permissions of the new member:

show grants for ram$[primary account]:[RAM user];

Step 5: Modify the new member’s MaxCompute permissions. Use the primary account and perform operations in the console:

revoke role_project_dev from ram$[primary account]:[RAM user]; -- Removes the default role from the new member
-- Note: if the member is re-granted a role on the DataWorks "Member Management" page, the corresponding MaxCompute role is also re-granted to that member.
grant custom_dev to ram$[primary account]:[RAM user]; -- Grants the custom role to the new member
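To confirm that the role swap took effect, a check along the following lines could be run with the primary account. The account ram$bob@aliyun.com:alice is a hypothetical placeholder for the new member.

```sql
show grants for ram$bob@aliyun.com:alice;  -- expected to list custom_dev and no longer list role_project_dev
describe role custom_dev;                  -- shows the project-level permissions granted to the custom role
show principals custom_dev;                -- lists every user that currently holds the custom role
```

Repeating this check after any change on the DataWorks member page is a simple way to notice when role_project_dev has been granted again.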

By now proper configuration has been made for this project that requires special permission management. In addition, pay attention to the following points:

  • If members of this project are granted the developer role again in DataWorks, as previously mentioned, they will also be re-granted the “role_project_dev” role in MaxCompute.

Other Common Scenarios

Package Authorization Scenario

In this example, business analysts need to view production tables, but they are not allowed to view production task code. They need access to some of the tables in multiple production projects.

A separate analysis project can be created to allow business analysts to view production tables but not production tasks. We can create a package in each production project, add the shared tables to that package, install the package in the analysis project, and grant permission on the package to the analysts. This reduces the member management cost by eliminating the need to add analysts to every production project, and ensures that analysts can only view the tables in the package from within the analysis project.

To do so, we can perform the following steps.

Create a package in production projects:

CREATE PACKAGE PACKAGE_NAME;
Example:
CREATE PACKAGE prj_prod2bi;

Add resources to be shared to the package in production projects:

ADD table [table name] TO PACKAGE [package name];
Example:
ADD table adl_test_table TO PACKAGE prj_prod2bi;

The production projects allow the analysis project to use the package:

ALLOW PROJECT [project allowed to install package] TO INSTALL PACKAGE [package name];
Example:
ALLOW PROJECT PRJ_BI TO INSTALL PACKAGE prj_prod2bi;

The analysis project installs the package:

INSTALL PACKAGE [project name].[package name];
Example:
INSTALL PACKAGE prj_prod.prj_prod2bi;

Grant the package permissions to users:

Grant the permission to users (an installed package is referenced as [project name].[package name]):
GRANT read on package prj_prod.prj_prod2bi TO USER [cloud account];
Grant the permission to roles:
GRANT read on package prj_prod.prj_prod2bi TO ROLE [rolename];
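After the steps above, the package setup can be inspected, and later withdrawn, with commands along these lines. This is a sketch that reuses the example names from this section (prj_prod, PRJ_BI, prj_prod2bi, adl_test_table); which project each command runs in is noted in the comments.

```sql
-- In the production project (prj_prod):
DESCRIBE PACKAGE prj_prod2bi;                           -- lists the resources that were added to the package

-- In the analysis project (PRJ_BI):
SHOW PACKAGES;                                          -- lists packages created in or installed into the current project
DESCRIBE PACKAGE prj_prod.prj_prod2bi;                  -- shows the installed package and its resources

-- In the production project, to stop sharing a table or the whole package:
REMOVE table adl_test_table FROM PACKAGE prj_prod2bi;   -- withdraws a single shared table
DISALLOW PROJECT PRJ_BI TO INSTALL PACKAGE prj_prod2bi; -- withdraws the install permission from the analysis project
```

Keeping the withdrawal commands next to the grant commands makes it easier to reverse a share when an analysis requirement ends.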

Data Security Self-Check Example

In the initial stage of a project, relatively little attention is given to user and permission management in order to speed up progress. Once the project reaches a stable development stage, data security becomes increasingly important. At this point, a data security self-check is required, followed by an adjustment plan and its implementation.

This example provides some data security adjustment ideas by showing the key adjustment aspects that a customer should focus on after performing a data security self-check.

Self-Check Principles and Recommendations

  • Count the number of accounts. Count the members of each DataWorks project and the users of each MaxCompute project, and make sure that each member has only one account for easy accountability and management.
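On the MaxCompute side, the account count can be started with a few console commands run by the primary account in each project. The account name below is a hypothetical placeholder.

```sql
list users;                                -- every account that has been added to the current project
list roles;                                -- every role defined in the current project
show principals role_project_admin;        -- which users hold the administrator role
show grants for ram$bob@aliyun.com:alice;  -- hypothetical user; roles and permissions held by one account
```

Comparing the output of list users with the DataWorks member list helps reveal accounts that were added directly in MaxCompute and are not tracked as project members.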

Key Adjustment Points

  • Account assignment. Adjustment principle: Each member uses his or her own account. Grant data access permissions according to the business development group and role that each member belongs to, and do not allow a member to use another member’s account. This avoids the data security risk caused by excessive user permissions. For example, you can assign accounts according to the business groups in the data development process. Business groups may include the management group, the data integration group, the data model group, the algorithm group, the analysis group, the maintenance group, and the security group.
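One possible way to express a business group as a MaxCompute role is sketched below for the analysis group. The role name role_analysis, the table adl_test_table, and both account names are hypothetical; the permissions a group actually needs depend on the business.

```sql
create role role_analysis;                                              -- hypothetical role for the analysis group
grant Describe, Select on table adl_test_table to role role_analysis;   -- read-only access to one table
grant role_analysis to ram$bob@aliyun.com:alice;                        -- each analyst gets the role on his or her own account

-- A shared account that several members used can then be retired.
-- Any roles it still holds must be revoked before the user can be removed.
revoke role_project_dev from ram$bob@aliyun.com:shared_account;
remove user ram$bob@aliyun.com:shared_account;
```

Granting permissions to the role rather than to individual users means a member who changes groups only needs one revoke and one grant.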

Reference: https://www.alibabacloud.com/blog/maxcompute-and-dataworks-security-management-guide-examples_594476?spm=a2c41.12584037.0.0
