Setting Up PySpark on an Alibaba Cloud CentOS Instance

Prerequisites:

Section 1: Cloud Resources

What Is an ECS?

What Is an EIP?

Acquiring an ECS Instance

Buying and Associating an EIP

Section 2: Installing Python

What Is Python?

Installing Python on an Alibaba Cloud ECS Instance

yum install gcc openssl-devel bzip2-devel libffi-devel
cd /usr/src
wget https://www.python.org/ftp/python/3.7.2/Python-3.7.2.tgz
tar xzf Python-3.7.2.tgz
cd Python-3.7.2
./configure --enable-optimizations
make altinstall
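
The -devel packages installed before the build are what allow Python to compile its ssl, bz2, and ctypes modules. The following is a minimal sketch, not part of the original steps, that checks whether the finished interpreter can import them. The helper name check_optional_modules and the package-to-module mapping are assumptions made for illustration.

```python
import importlib
import sys

# Standard-library modules that are only built when the matching
# -devel package was present at compile time (assumed mapping).
MODULES_BY_PACKAGE = {
    "openssl-devel": "ssl",
    "bzip2-devel": "bz2",
    "libffi-devel": "ctypes",
}


def check_optional_modules():
    """Return {module_name: bool}, True when the module imports cleanly."""
    results = {}
    for module_name in MODULES_BY_PACKAGE.values():
        try:
            importlib.import_module(module_name)
            results[module_name] = True
        except ImportError:
            results[module_name] = False
    return results


results = check_optional_modules()
print("Python", sys.version.split()[0])
for package, module_name in MODULES_BY_PACKAGE.items():
    status = "ok" if results[module_name] else "missing"
    print("%-14s -> %-6s %s" % (package, module_name, status))
```

Run it with the newly built interpreter rather than the system Python. A "missing" entry usually means the corresponding package was not installed when ./configure ran.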

Section 3: Installing Spark and PySpark

What Is Spark?

What Is PySpark?

Installing Spark/PySpark on an Alibaba Cloud ECS Instance

sudo yum update
sudo yum install java-1.8.0-openjdk-headless
cd /opt
wget https://www-eu.apache.org/dist/spark/spark-2.4.0/spark-2.4.0-bin-hadoop2.7.tgz
tar -xzf spark-2.4.0-bin-hadoop2.7.tgz
cd spark-2.4.0-bin-hadoop2.7
./sbin/start-master.sh
cat /opt/spark-2.4.0-bin-hadoop2.7/logs/spark-root-org.apache.spark.deploy.master.Master-1-centos.out
# General form: ./sbin/start-slave.sh <master-spark-URL>
./sbin/start-slave.sh spark://centos:7077
export SPARK_HOME=/opt/spark-2.4.0-bin-hadoop2.7
export PATH=$SPARK_HOME/bin:$PATH
bin/pyspark
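
The master log is printed with cat so that the master URL can be read from it and passed to start-slave.sh. As a rough sketch, the same lookup can be scripted. The function names are hypothetical, and the sample log line is only illustrative; the exact wording of a real log may differ, so the pattern matches just the spark://host:port part.

```python
import re

# Matches a standalone master URL such as spark://centos:7077.
MASTER_URL_PATTERN = re.compile(r"spark://[A-Za-z0-9_.\-]+:\d+")


def find_master_url(log_text):
    """Return the first spark://host:port URL in log_text, or None."""
    match = MASTER_URL_PATTERN.search(log_text)
    return match.group(0) if match else None


def find_master_url_in_file(path):
    """Read a master log file and return the first master URL found in it."""
    with open(path, encoding="utf-8", errors="replace") as handle:
        return find_master_url(handle.read())


# Illustrative line only, not copied from a real log.
sample = "INFO Master: Starting Spark master at spark://centos:7077"
print(find_master_url(sample))
```

On the instance, find_master_url_in_file can be pointed at the .out file shown in the cat command above.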

Sample Code

from pyspark import SparkContext

# Master log written when the standalone master was started
outFile = "file:///opt/spark-2.4.0-bin-hadoop2.7/logs/spark-root-org.apache.spark.deploy.master.Master-1-centos.out"
sc = SparkContext("local", "example app")
outData = sc.textFile(outFile).cache()
# Count the lines that contain the letter 'a'
numAs = outData.filter(lambda s: 'a' in s).count()
print("Lines with a: %i" % numAs)
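
To sanity-check the number the Spark job prints, the same predicate can be applied to the file in plain Python. The sketch below is an illustration under assumptions, not the article's method: contains_a, count_plain, and count_with_spark are hypothetical names, and count_with_spark expects an absolute local path. The pyspark import sits inside the function so the plain-Python part also works where Spark is not installed.

```python
def contains_a(line):
    """Predicate shared by both counters: does the line contain 'a'?"""
    return "a" in line


def count_plain(path):
    """Count matching lines without Spark, as a reference value."""
    with open(path, encoding="utf-8", errors="replace") as handle:
        return sum(1 for line in handle if contains_a(line))


def count_with_spark(path):
    """Count matching lines with a local SparkContext (needs pyspark)."""
    from pyspark import SparkContext

    sc = SparkContext("local", "example app")
    try:
        # path must be absolute, so the URI becomes file:///...
        return sc.textFile("file://" + path).filter(contains_a).count()
    finally:
        # Stop the context so a later run can create a new one.
        sc.stop()
```

Calling both functions on the master log path should give the same count; a mismatch would suggest the two runs are not reading the same file.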


Follow me to keep abreast of the latest technology news, industry insights, and developer trends. Alibaba Cloud website: https://www.alibabacloud.com
