One-click Deployment of a Hadoop Distributed Cluster on Alibaba Cloud

Image for post
Image for post

Hadoop is an open source distributed computing framework that processes data in a reliable, efficient, and scalable way. This document provides a simple method for deploying a Hadoop cluster on Alibaba cloud.

Through Alibaba Cloud Resource Orchestration Service (ROS), users can automate the VPC, NAT gateway, ECS creation, and Hadoop deployment process to quickly and conveniently deploy a Hadoop cluster. You can deploy a Hadoop Cluster in 3 different ways, manually, with our ROS console, or with an ROS template. The Hadoop cluster created in this document includes three nodes: master.hadoop, slave1.hadoop, and slave2.hadoop.

Option 1: Four Steps to Installing a Hadoop Cluster in UserData

Installing a Hadoop cluster involves four steps: configure SSH login without a password; install Java JDK; install and configure the Hadoop package; start and test the cluster.

1. Configure SSH login without a password

All hosts in a cluster can login using SSH mode without a password. Therefore, run the following command on the three ECSs to ensure that the temporary private keys and public keys on all hosts are the same so that login without a password can be enabled.

To ensure security and prevent the disclosure of the private keys and public keys, run the following command on the master node to replace public temporary private keys and public keys:

2. Install and configure the JDK

Install the JDK on the master node and remotely install the JDK on the slave node.

3. Install and configure Hadoop

Download and install Hadoop:

4. Start and test the cluster

Format the Hadoop distributed file system (HDFS), disable the firewall, and start the cluster.

Option 2: Quick deployment of a Hadoop cluster

Instead of performing the four steps in the previous section, visit the ROS console for Rapid Deployment of a Hadoop Cluster. Fill out the details as shown below.


• We must ensure that the Jdk and Hadoop tar installation packages can be correctly downloaded. Select a URL similar to the following: (Must set ‘Cookie: oraclelicense=accept-securebackup-cookie’ in Header, when download.)
• When we use a template to create a Spark cluster, only the CentOS operating system is available.
• The duration can be set to 120 minutes to prevent timeouts.
• Select a data center located in Shanghai.

Option 3: ROS Sample Template

On your ROS console, go to Sample Template and select the most appropriate template for your application. You can deploy your Hadoop Cluster with a single click of a button using these templates. This is a convenient way to achieve an automated and repeatable deployment process.

Hadoop_Distributed_Env_3_ecs.json: One-click deployment can be achieved using this template. This template creates the same cluster as the one from the previous section.

Hadoop_Distributed_ecsgroup.json: This template allows the user to specify the number of slave nodes.

Testing the Deployment

After deploying the Hadoop cluster, view the resource stack overview.

Enter the WebsiteURL shown in the figure. If the following result is displayed, the deployment is successful.


Written by

Follow me to keep abreast with the latest technology news, industry insights, and developer trends.

Get the Medium app

A button that says 'Download on the App Store', and if clicked it will lead you to the iOS App store
A button that says 'Get it on, Google Play', and if clicked it will lead you to the Google Play store