Data Backup and Migration to OSS using OSSImport

By Rodney Shetler, Staff Solution Architect

OSSImport is a free tool created by the Alibaba Cloud product team to assist with data backup and migration into the Alibaba Cloud Object Storage Service (OSS). Using OSSImport, data can be migrated from either local storage or third-party cloud platforms. Currently supported data sources include Qiniu, Baidu BOS, AWS S3, Azure Blob, Youpai Cloud, Tencent Cloud COS, Kingsoft KS3, HTTP, and other OSS buckets.

OSSImport can be used in a standalone configuration, with a single node performing all operations, or in distributed mode, where multiple worker nodes share the tasks of copying and syncing data between platforms. Optional settings can enable incremental mode, which polls for data changes at a given interval and synchronizes new or altered objects. Other optional settings apply bandwidth throttling or restrict the migration to objects selected by time or prefix.
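As a rough illustration of these optional settings, the fragment below writes a sample set of properties to a scratch file. The key names (isIncremental, incrementalModeInterval, importSince, srcPrefix, and workerMaxThroughput(KB/s)) are given as we understand them from the sample configuration files shipped with OSSImport, and all values are placeholders. Treat both as assumptions and check them against the files in your own download.

```shell
#!/usr/bin/env bash
# Sketch only: key names are assumed from the OSSImport sample config files;
# values are placeholders. Review before copying anything into conf/.

out_file="optional_settings.example"

cat > "$out_file" <<'EOF'
# Job configuration: incremental mode and object filters (assumed key names)
isIncremental=true
# Interval in seconds between incremental scans
incrementalModeInterval=86400
# Only migrate objects modified after this Unix timestamp (0 = all objects)
importSince=0
# Only migrate objects whose name starts with this prefix (empty = all objects)
srcPrefix=logs/2019/

# System configuration: bandwidth cap per worker in KB/s (assumed key name)
workerMaxThroughput(KB/s)=10240
EOF

echo "Wrote $out_file"
```

The first four properties would belong in the job configuration file and the last one in the system properties file, so the scratch file is meant for reference rather than for use as-is.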


Best Practices

While OSSImport can be deployed in a standalone configuration, distributed mode is recommended for data backup or migration jobs that exceed 30 TB.
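To make the 30 TB guideline concrete, here is a small sketch that compares a data set size in bytes against that threshold. The function name recommend_mode is hypothetical, and we assume 1 TB = 1024^4 bytes. The tool itself does not enforce this limit; it is only the rule of thumb stated above.

```shell
#!/usr/bin/env bash
# Hypothetical helper: suggest a deployment mode from the total data size.
# Assumes 1 TB = 1024^4 bytes; 30 TB is the guideline from this article.

THRESHOLD_BYTES=$((30 * 1024 ** 4))

recommend_mode() {
  local size_bytes="$1"
  if [ "$size_bytes" -gt "$THRESHOLD_BYTES" ]; then
    echo "distributed"
  else
    echo "standalone"
  fi
}

# Example: a 5 TB data set stays under the threshold.
recommend_mode "$((5 * 1024 ** 4))"

# For local data, the size could come from GNU du, for example:
# recommend_mode "$(du -sb /path/to/data | cut -f1)"
```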

It is recommended to deploy OSSImport on an Alibaba Cloud ECS instance within a VPC. This allows the incoming data to be written to OSS using the “Internal” OSS bucket address, which is much faster, stays within a private network, and is free of traffic charges.
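The difference between the two bucket addresses can be sketched as follows. The helper name oss_endpoint is hypothetical, and the host name pattern (oss-<region>.aliyuncs.com for public access, oss-<region>-internal.aliyuncs.com for internal access) is our assumption of the usual OSS endpoint format. Note that the internal address is only expected to work when the ECS instance and the bucket are in the same region, so confirm the exact endpoint for your bucket in the OSS console.

```shell
#!/usr/bin/env bash
# Hypothetical helper: build an OSS endpoint host name for a region.
# Assumed pattern: oss-<region>.aliyuncs.com (public) and
# oss-<region>-internal.aliyuncs.com (internal, same-region ECS only).

oss_endpoint() {
  local region="$1"               # e.g. us-west-1
  local network="${2:-internal}"  # "internal" or "public"
  case "$network" in
    internal) echo "oss-${region}-internal.aliyuncs.com" ;;
    public)   echo "oss-${region}.aliyuncs.com" ;;
    *)        echo "unknown network type: $network" >&2; return 1 ;;
  esac
}

oss_endpoint us-west-1 internal   # prints oss-us-west-1-internal.aliyuncs.com
oss_endpoint us-west-1 public     # prints oss-us-west-1.aliyuncs.com
```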

Where possible, it is recommended to use dedicated network connectivity between the source and destination. This can save on bandwidth costs and ensure a consistently fast transfer.

Getting Started — Backup/Migration from AWS S3 to Alibaba OSS

The following outlines the process for setting up a data backup/migration using a common scenario: migrating data from an AWS S3 bucket to an Alibaba Cloud OSS bucket.

High-Level Architecture

[Image: high-level architecture diagram]

Create an Object Storage Service (OSS) Bucket

  1. Log into the Alibaba Cloud admin console
  2. Enter the settings for the new bucket and choose “OK”
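Before creating the bucket, it can help to check the name you plan to use. The sketch below is a hypothetical helper (is_valid_bucket_name) based on our understanding of the OSS naming rules: 3 to 63 characters, lowercase letters, digits, and hyphens only, starting and ending with a letter or digit. The console performs its own validation, so this is a convenience check only.

```shell
#!/usr/bin/env bash
# Hypothetical helper: check a candidate bucket name before creating it.
# Rules assumed from the OSS naming conventions: 3-63 characters; lowercase
# letters, digits, and hyphens only; must start and end with a letter or digit.

is_valid_bucket_name() {
  local LC_ALL=C   # make [a-z] match ASCII lowercase only
  local name="$1"
  [[ "$name" =~ ^[a-z0-9][a-z0-9-]{1,61}[a-z0-9]$ ]]
}

if is_valid_bucket_name "my-migration-bucket"; then
  echo "name looks valid"
fi
```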

Create Alibaba ECS Instance

  1. Log into the Alibaba Cloud admin console and create an Elastic Compute Service (ECS) instance to act as a standalone OSSImport server.

For detailed assistance creating an ECS instance please refer to the following guide — Elastic Compute Service — Create an Instance

Configure OSSImport Server

  1. Log into the ECS instance created in the previous section using SSH
  2. Install Java, which is a prerequisite for the OSSImport tool, using the following “yum” command
  • yum install java
  3. Download the OSSImport tool archive (ossimport.zip) using ‘curl’; the download link is listed in the OSSImport documentation
  4. When the download finishes, unzip the archive using the following ‘unzip’ command
  • unzip ossimport.zip -d ./ossimport
  5. Change directory to the “ossimport” directory where the files have been unzipped
  6. Edit the job configuration file (conf/local_job.cfg) and set the source (AWS S3) and destination (OSS) properties, including the destination bucket
  • destBucket=<ALIBABA CLOUD OSS BUCKET>
  7. Start the migration job by running the import script
  • bash import.sh
  8. If all setup and configuration steps have been completed correctly, the message “Start import service completed” followed by a “job stats” message should be returned to the console as files are migrated/backed up from AWS S3 to OSS
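For the configuration step in this scenario, a job file for an S3 to OSS migration might look roughly like the one written by the script below. The key names are our assumption of what the sample conf/local_job.cfg in the download contains, and every value in angle brackets is a placeholder. The script writes to a separate example file so that it cannot overwrite a real configuration.

```shell
#!/usr/bin/env bash
# Sketch only: writes an example job file for an S3 -> OSS migration.
# Key names are assumed from the sample conf/local_job.cfg; every value in
# angle brackets is a placeholder to replace before use.

cfg_file="s3_to_oss_job.cfg.example"

cat > "$cfg_file" <<'EOF'
jobName=s3_to_oss_example
jobType=import

# Source: AWS S3
srcType=s3
srcAccessKey=<AWS ACCESS KEY ID>
srcSecretKey=<AWS SECRET ACCESS KEY>
srcDomain=<AWS S3 ENDPOINT>
srcBucket=<AWS S3 BUCKET>
srcPrefix=

# Destination: Alibaba Cloud OSS
destAccessKey=<ALIBABA CLOUD ACCESS KEY ID>
destSecretKey=<ALIBABA CLOUD ACCESS KEY SECRET>
destDomain=<ALIBABA CLOUD OSS ENDPOINT>
destBucket=<ALIBABA CLOUD OSS BUCKET>
destPrefix=
EOF

echo "Wrote $cfg_file; review it, then merge the values into conf/local_job.cfg"
```

When the tool runs on an ECS instance in the same region as the bucket, the destination endpoint would normally be the internal OSS address recommended under Best Practices.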

Getting Started — Backup/Migration from Local Storage to Alibaba OSS

The following outlines another common scenario: data backup/migration from local storage to an Alibaba Cloud OSS bucket. In this scenario, the OSSImport tool is installed and run directly on the server with access to the local storage to be migrated/backed up.

High-Level Architecture

[Image: high-level architecture diagram]

Create an Object Storage Service (OSS) Bucket

  1. Log into the Alibaba Cloud admin console and create the destination OSS bucket, following the same steps as in the previous scenario

Configure Local Server to Run OSSImport

On the local server where the files reside, perform the following steps to prepare the server to run the OSSImport job that migrates/backs up local files to OSS. For this example, we assume the local server is running a Linux/Unix environment. Similar steps can be used on servers running Windows with minor substitutions.

  1. Install Java, which is a prerequisite for the OSSImport tool, using the following “yum” command
  • yum install java
  2. Download the OSSImport tool archive (ossimport.zip) using ‘curl’; the download link is listed in the OSSImport documentation
  3. When the download finishes, unzip the archive using the following ‘unzip’ command
  • unzip ossimport.zip -d ./ossimport
  4. Change directory to the “ossimport” directory where the files have been unzipped
  5. Edit the job configuration file (conf/local_job.cfg) and set the source (local storage) and destination (OSS) properties, including the destination bucket
  • destBucket=<ALIBABA CLOUD OSS BUCKET>
  6. Start the migration job by running the import script
  • bash import.sh
  7. If all setup and configuration steps have been completed correctly, the message “Start import service completed” followed by a “job stats” message should be returned to the console as files are migrated/backed up from local storage to OSS
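For a local source, the source side of the job configuration changes. The sketch below prints the two source lines for a given directory. It assumes, based on our reading of the sample configuration, that a local source uses srcType=local and that srcPrefix holds the absolute directory path ending with “/”. The function name local_source_lines is hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the source lines for a local directory.
# Assumes srcType=local and an absolute srcPrefix ending with "/"
# (key names and format assumed from the sample conf/local_job.cfg).

local_source_lines() {
  local dir="$1"
  if [ ! -d "$dir" ]; then
    echo "not a directory: $dir" >&2
    return 1
  fi
  local abs
  abs="$(cd "$dir" && pwd)"   # resolve to an absolute path
  echo "srcType=local"
  echo "srcPrefix=${abs%/}/"  # guarantee exactly one trailing slash
}

# Usage: print the lines for the directory to migrate, then copy them
# into conf/local_job.cfg after review.
# local_source_lines /path/to/data
```

With a local source, the source bucket and access key properties would typically be left empty, while the destination properties stay the same as in an S3 migration.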

Distributed Deployment for Large Migration/Backup Jobs

As mentioned, OSSImport can be deployed in distributed mode when dealing with very large-scale migration/backup jobs of more than 30 TB. This mode spreads the actual tasks across multiple “worker” nodes running on multiple servers. In a distributed environment, a “console.sh” bash script is used to start, coordinate, and submit migration/backup jobs. Three configuration files (conf/sys.properties, conf/job.cfg, and conf/workers) also need to be set up before running jobs.

Note: Currently distributed deployment is only supported on servers running Linux.
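As a sketch of what the worker list and system properties might contain, the script below writes two example files. The layout (one worker IP address per line in the workers file) and the key names workingDir, workerUserName, workerPassword, privateKeyFile, and sshPort are our assumptions based on the sample files in the distributed download. The IP addresses and credentials are placeholders.

```shell
#!/usr/bin/env bash
# Sketch only: example worker list and system properties for distributed mode.
# File layout and key names are assumed from the OSSImport sample files;
# IP addresses and credentials are placeholders.

cat > workers.example <<'EOF'
192.168.1.6
192.168.1.7
192.168.1.8
EOF

cat > sys.properties.example <<'EOF'
# Must be the same directory on the master and all worker nodes
workingDir=/root/import
# SSH login used to reach the worker nodes
workerUserName=root
workerPassword=<SSH PASSWORD>
# Leave empty when logging in with a password
privateKeyFile=
sshPort=22
EOF

echo "Wrote workers.example and sys.properties.example"
```

To our understanding, the first address in the worker list also acts as the master node, and the job configuration file uses the same properties as in standalone mode. Both points are worth confirming in the OSSImport documentation for your version.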

High-Level Architecture

[Image: high-level architecture diagram]

Create Master and Worker Alibaba Cloud ECS Instances

  1. Log into the Alibaba Cloud admin console and create a set of ECS instances to act as the master and worker nodes

For detailed assistance creating ECS instances please refer to the following guide — Elastic Compute Service — Create an Instance


Configure the Master Node

The initial configuration for the distributed mode deployment will take place on the server that has been designated as the “master node”. All commands and configuration below will take place on the master node, which will distribute the configuration and work tasks to the worker nodes.

  1. Using SSH, log into the ECS instance designated as the master node in the previous section
  2. Install Java using the following “yum” command
  • yum install java
  Note: Java must be installed on the master AND all worker nodes.
  3. Download the distributed version of the OSSImport tool (ossimport.tar.gz)
  4. When the download finishes, untar the archive into a new directory using the following ‘tar’ command
  • mkdir import && tar -zxvf ossimport.tar.gz -C import
  Note: The default working directory for this tool is “/root/import”, which is what we are following in this example. If another directory is used, it is important to update the “sys.properties” file with the new directory. This directory must match on the master and all worker nodes.
  5. Edit the configuration files for your environment, then deploy the configuration to the worker nodes using the following command
  • bash console.sh deploy
  Note: If changes or updates are needed, make changes on the master node and re-deploy the configuration using the same command as above. This will update all worker nodes with the new configuration.
  6. Start the service using the following command
  • bash console.sh start
  7. Now submit the import job using the following command
  • bash console.sh submit
  8. Tasks will be submitted to the worker nodes in your configuration
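The three console.sh commands used in this section (deploy, start, submit) can be chained so that a failure stops the sequence. The wrapper below is a hypothetical sketch: the function name run_distributed_job is ours, and it assumes console.sh returns a non-zero exit code when a step fails, which should be verified for your version of the tool.

```shell
#!/usr/bin/env bash
# Hypothetical wrapper: run the console.sh steps in order and stop at the
# first failure. Assumes console.sh exits non-zero when a step fails.

run_distributed_job() {
  local console="${1:-./console.sh}"
  local step
  for step in deploy start submit; do
    echo "==> console.sh $step"
    if ! bash "$console" "$step"; then
      echo "step '$step' failed; stopping" >&2
      return 1
    fi
  done
}

# Usage (from the working directory on the master node):
# run_distributed_job ./console.sh
```

Running the steps one at a time by hand, as in the list above, remains the safer choice for a first deployment, since the output of each step can be inspected before moving on.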

To learn more about OSSImport, visit the OSS documentation page or the official GitHub page.

Reference: https://www.alibabacloud.com/blog/data-backup-and-migration-to-oss-using-ossimport_594320?spm=a2c41.12451672.0.0
