Uploading and Downloading Complex MaxCompute Data Using Tunnel SDK

By Yunhua

This article shows you how you can upload complex data to MaxCompute through Tunnel SDK. First, we introduce the types of complex MaxCompute data:

Complex Data

Construction and Operation Functions for Complex Types

Introduction to Tunnel SDK

TableTunnel is the entry class to access MaxCompute Tunnel. It only supports upload and download of table data (non-view).

Session is a process for uploading and downloading a table or a partition. A session is composed of one or more HTTP requests to Tunnel RESTful APIs.

The session ID is used to identify a session. The timeout time of a session is 24 hours. If massive data is transferred for more than 24 hours, you need to split it into multiple sessions.

Sessions TableTunnel.UploadSession and TableTunnel.DownloadSession are responsible for data upload and download.

TableTunnel provides methods to create UploadSession DownloadSession objects.

  • A typical table data upload process:

1) Create TableTunnel

2) Create UploadSession

3) Create RecordWriter and write the record

4) Submit the upload operation

  • A typical table data download process:

1) Create TableTunnel

2) Create DownloadSession

3) Create RecordReader and read the record

Construct Complex Data Types Based on Tunnel SDK

RecordWriter recordWriter = uploadSession.openRecordWriter(0);ArrayRecord record = (ArrayRecord) uploadSession.newRecord();// prepare dataList arrayData = Arrays.asList(1, 2, 3);Map<String, Long> mapData = new HashMap<String, Long>();mapData.put("a", 1L);mapData.put("c", 2L);List<Object> structData = new ArrayList<Object>();structData.add("Lily");structData.add(18);// set data to recordrecord.setArray(0, arrayData);record.setMap(1, mapData);record.setStruct(2, new SimpleStruct((StructTypeInfo) schema.getColumn(2).getTypeInfo(),structData));// write the recordrecordWriter.write(record);

Download Complex Data Types from MaxCompute

RecordReader recordReader = downloadSession.openRecordReader(0, 1);// read the recordArrayRecord record1 = (ArrayRecord)recordReader.read();// get array field dataList field0 = record1.getArray(0);List<Long> longField0 = record1.getArray(Long.class, 0);// get map field dataMap field1 = record1.getMap(1);Map<String, Long> typedField1 = record1.getMap(String.class, Long.class, 1);// get struct field dataStruct field2 = record1.getStruct(2);

Running Example

import java.io.IOException;import java.util.ArrayList;import java.util.Arrays;import java.util.HashMap;import java.util.List;import java.util.Map;import com.aliyun.odps.Odps;import com.aliyun.odps.PartitionSpec;import com.aliyun.odps.TableSchema;import com.aliyun.odps.account.Account;import com.aliyun.odps.account.AliyunAccount;import com.aliyun.odps.data.ArrayRecord;import com.aliyun.odps.data.RecordReader;import com.aliyun.odps.data.RecordWriter;import com.aliyun.odps.data.SimpleStruct;import com.aliyun.odps.data.Struct;import com.aliyun.odps.tunnel.TableTunnel;import com.aliyun.odps.tunnel.TableTunnel.UploadSession;import com.aliyun.odps.tunnel.TableTunnel.DownloadSession;import com.aliyun.odps.tunnel.TunnelException;import com.aliyun.odps.type.StructTypeInfo;public class TunnelComplexTypeSample {private static String accessId = "<your access id>";private static String accessKey = "<your access Key>";private static String odpsUrl = "<your odps endpoint>";private static String project = "<your project>";private static String table = "<your table name>";// partitions of a partitioned table, eg: "pt=\'1\',ds=\'2\'"// If the table is not a partitioned table, do not need itprivate static String partition = "<your partition spec>";public static void main(String args[]) {Account account = new AliyunAccount(accessId, accessKey);Odps odps = new Odps(account);odps.setEndpoint(odpsUrl);odps.setDefaultProject(project);try {TableTunnel tunnel = new TableTunnel(odps);PartitionSpec partitionSpec = new PartitionSpec(partition);// ---------- Upload Data ---------------// create upload session for table// the table schema is {"col0": ARRAY<BIGINT>, "col1": MAP<STRING, BIGINT>, "col2": STRUCT<name:STRING,age:BIGINT>}UploadSession uploadSession = tunnel.createUploadSession(project, table, partitionSpec);// get table schemaTableSchema schema = uploadSession.getSchema();// open record writerRecordWriter recordWriter = uploadSession.openRecordWriter(0);ArrayRecord record = (ArrayRecord) uploadSession.newRecord();// prepare dataList arrayData = Arrays.asList(1, 2, 3);Map<String, Long> mapData = new HashMap<String, Long>();mapData.put("a", 1L);mapData.put("c", 2L);List<Object> structData = new ArrayList<Object>();structData.add("Lily");structData.add(18);// set data to recordrecord.setArray(0, arrayData);record.setMap(1, mapData);record.setStruct(2, new SimpleStruct((StructTypeInfo) schema.getColumn(2).getTypeInfo(),structData));// write the recordrecordWriter.write(record);// close writerrecordWriter.close();// commit uploadSession, the upload finishuploadSession.commit(new Long[]{0L});System.out.println("upload success!") ;// ---------- Download Data ---------------// create download session for table// the table schema is {"col0": ARRAY<BIGINT>, "col1": MAP<STRING, BIGINT>, "col2": STRUCT<name:STRING,age:BIGINT>}DownloadSession downloadSession = tunnel.createDownloadSession(project, table, partitionSpec);schema = downloadSession.getSchema();// open record reader, read one record here for exampleRecordReader recordReader = downloadSession.openRecordReader(0, 1);// read the recordArrayRecord record1 = (ArrayRecord)recordReader.read();// get array field dataList field0 = record1.getArray(0);List<Long> longField0 = record1.getArray(Long.class, 0);// get map field dataMap field1 = record1.getMap(1);Map<String, Long> typedField1 = record1.getMap(String.class, Long.class, 1);// get struct field dataStruct field2 = record1.getStruct(2);System.out.println("download success!") ;} catch (TunnelException e) {e.printStackTrace();} catch (IOException e) {e.printStackTrace();}}}

Reference:https://www.alibabacloud.com/blog/uploading-and-downloading-complex-maxcompute-data-using-tunnel-sdk_594739?spm=a2c41.12825644.0.0

Follow me to keep abreast with the latest technology news, industry insights, and developer trends. Alibaba Cloud website:https://www.alibabacloud.com