Cassandra Migration to Alibaba Cloud

Prerequisites

Before you begin the migration and monitoring steps in this tutorial, make sure the following prerequisites are in place.

Set Cassandra Cluster Configurations

The Cassandra clusters to be migrated are installed with the following operating system and specifications:

Create Clusters on Alibaba Cloud

To ensure a smooth migration, create the nodes on Alibaba Cloud before you begin. These nodes join the original cluster and become the final nodes once the nodes in the local data center have been decommissioned. To do this, follow these steps.

$ sudo apt-get update && sudo apt-get -u dist-upgrade --yes && sudo apt-get install openjdk-8-jre --yes
$ sudo su -
$ cd /opt
$ ossutil cp oss://bucket/apache-cassandra-3.9-bin.tar.gz . && tar xvfz apache-cassandra-3.9-bin.tar.gz && chown -R user:user apache-cassandra-3.9
$ echo '# Expand the $PATH to include /opt/apache-cassandra-3.9/bin
PATH=$PATH:/opt/apache-cassandra-3.9/bin' > /etc/profile.d/cassandra-bin-path.sh
$ sudo nano /opt/apache-cassandra-3.9/conf/cassandra.yaml
cluster_name: 'Cluster Name'
seed_provider:
- seeds: "Seed IP"
listen_address: <machine's IP>
rpc_address: <machine's IP>
endpoint_snitch: SimpleSnitch
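
The cassandra.yaml settings above can also be applied without an interactive editor. The following is a minimal sketch, assuming GNU sed (as found on the apt-based hosts used here). The name set_yaml_key is a hypothetical helper, not part of Cassandra, and it only handles single-line, top-level keys, so the nested seeds entry still needs to be edited by hand.

```shell
#!/bin/sh
# Hypothetical helper (not part of Cassandra): set a top-level "key: value"
# line in a YAML file in place.
#   $1 = file, $2 = key, $3 = value
# Limitations: single-line top-level keys only; the value must not contain
# the characters '|', '&' or '\', which are special to the sed expression.
set_yaml_key() {
  if grep -q "^$2:" "$1"; then
    # Key already present: rewrite that line (sed -i assumes GNU sed).
    sed -i "s|^$2:.*|$2: $3|" "$1"
  else
    # Key absent: append it at the end of the file.
    printf '%s: %s\n' "$2" "$3" >> "$1"
  fi
}

# Example usage (path taken from the steps above, values are placeholders):
#   set_yaml_key /opt/apache-cassandra-3.9/conf/cassandra.yaml listen_address 10.0.0.5
#   set_yaml_key /opt/apache-cassandra-3.9/conf/cassandra.yaml rpc_address 10.0.0.5
```

Scripting the change this way keeps the same values on every new node, which matters because cluster_name must be identical across the cluster.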
$ nodetool status
Datacenter: {your-existing-datacenter}
=======================
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address     Load       Tokens  Owns (effective)  Host ID                               Rack
UN  10.132.0.6  140.01 KB  256     100.0%            cf83fd8a-efa0-4798-a6f7-599d5eb7a6cb  rack1
U
{List of nodes}
$ sudo cqlsh {listener-address}
Connected to Test Cluster at 10.132.0.6:9042.
[cqlsh 5.0.1 | Cassandra 3.0.9 | CQL spec 3.4.0 | Native protocol v4]
Use HELP for help.
cqlsh> CREATE KEYSPACE {your-keyspace} WITH REPLICATION = {'class' : 'SimpleStrategy', 'replication_factor' : 2};
cqlsh> ALTER KEYSPACE "{your-keyspace}" WITH REPLICATION = {'class' : 'NetworkTopologyStrategy', 'datacenter1' : 2};
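
Before moving on, it can help to confirm from a script that every node reports UN (Up/Normal) in the nodetool status output shown earlier. The following is a rough sketch that assumes the output layout shown in this tutorial; count_up_normal is a hypothetical helper name, not a nodetool feature.

```shell
#!/bin/sh
# Hypothetical helper: read `nodetool status` output on stdin and print
# "<datacenter> <number of UN nodes>" for each data center, in the order
# the data centers appear.
count_up_normal() {
  awk '
    /^Datacenter:/ { dc = $2; if (!(dc in up)) { up[dc] = 0; order[++n] = dc } }
    /^UN /         { up[dc]++ }
    END            { for (i = 1; i <= n; i++) print order[i], up[order[i]] }
  '
}

# Example usage on a cluster node:
#   nodetool status | count_up_normal
```

Comparing the counts against the number of nodes you expect in each data center gives a quick check that no node is down or still joining.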

Use Cassandra Multi–Data Center Support for Migration

In most scenarios, Cassandra is preconfigured with data center awareness turned off. Cassandra multi–data center deployments require an appropriate configuration of the data replication strategy (per keyspace) and also the configuration of the snitch. By default, Cassandra uses SimpleSnitch and the SimpleStrategy replication strategy. The following sections explain what these terms actually mean.

What Is a Cassandra Snitch?

Cassandra uses snitches to determine the topology of the network. The snitch tells Cassandra which data center and rack each node is in. The default snitch (SimpleSnitch) provides no data center or rack information, so it only works for single data center deployments. PropertyFileSnitch, by contrast, reads the cassandra-topology.properties file to learn the topology of your data centers.

Replication Strategies

Replication strategies determine how data is replicated across the nodes. The default replication strategy is SimpleStrategy, which places replicas on nodes around the ring without regard to data center or rack. This doesn’t work for multi–data center deployments given latency constraints. For multi–data center deployments, the right strategy to use is NetworkTopologyStrategy, which lets you specify how many replicas you want in each data center.

cqlsh> ALTER KEYSPACE "{your-keyspace}" WITH REPLICATION = { 'class' : 'NetworkTopologyStrategy', '{your-existing-datacenter}' : 3, '{your-AlibabaCloud-Zone}' : 2 };
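
Because the same statement shape is repeated for several keyspaces during the migration, it may be convenient to generate it. The sketch below is illustrative only: nts_statement is a hypothetical helper, and the keyspace and data center names passed to it are placeholders. The data center names must match the names the snitch reports (the Datacenter: lines in nodetool status).

```shell
#!/bin/sh
# Hypothetical helper: print an ALTER KEYSPACE statement that switches a
# keyspace to NetworkTopologyStrategy.
#   $1     = keyspace name
#   $2...  = one or more datacenter:replicas pairs, e.g. DC1:3 DC2:2
nts_statement() {
  keyspace="$1"; shift
  map="'class' : 'NetworkTopologyStrategy'"
  for pair in "$@"; do
    dc="${pair%%:*}"   # text before the first colon
    rf="${pair##*:}"   # text after the last colon
    map="$map, '$dc' : $rf"
  done
  printf 'ALTER KEYSPACE "%s" WITH REPLICATION = { %s };\n' "$keyspace" "$map"
}

# Example usage (names are placeholders):
#   cqlsh {listener-address} -e "$(nts_statement my_keyspace DC1:3 DC2:2)"
```

Printing the statement first, rather than executing it directly, leaves room to review it before it reaches the cluster.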

Set the Snitch and Replication Strategy

As you can see, the snitch and replication strategy work closely together. The snitch is used to group nodes per data center and rack. The replication strategy defines how data should be replicated amongst these groups. If you run with the default snitch and strategy, you will need to change these two settings in order to get to a functional multi–data center deployment.

Procedure

Perform the Migration

1. Configure PropertyFileSnitch

$ sudo nano /opt/apache-cassandra-3.9/conf/cassandra-topology.properties
# {your-existing-datacenter}
175.56.12.105=DC1:RAC1
175.50.13.200=DC1:RAC1
175.54.35.197=DC1:RAC1
# {your-AlibabaCloud-Zone}
45.56.12.105=DC2:RAC1
45.50.13.200=DC2:RAC1
45.54.35.197=DC2:RAC1
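
The topology entries above follow a fixed ip=datacenter:rack pattern, so they can be generated from a list of addresses. This is a minimal sketch under that assumption; topology_entries is a hypothetical helper, and it targets IPv4 addresses only (IPv6 addresses contain colons, which need escaping in a properties file).

```shell
#!/bin/sh
# Hypothetical helper: print cassandra-topology.properties entries for one
# data center.
#   $1 = data center name, $2 = rack name, $3... = node IPv4 addresses
topology_entries() {
  dc="$1"; rack="$2"; shift 2
  for ip in "$@"; do
    printf '%s=%s:%s\n' "$ip" "$dc" "$rack"
  done
}

# Example usage, reproducing the second group of entries above:
#   topology_entries DC2 RAC1 45.56.12.105 45.50.13.200 45.54.35.197
```

The same file contents should be present on every node, so generating them once and copying the result avoids per-node typing mistakes.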
"auto_bootstrap: false"
cqlsh> ALTER KEYSPACE "{your-keyspace}" WITH REPLICATION =
{ 'class' : 'NetworkTopologyStrategy', '{your-existing-datacenter}' : 2, 'us-east-1' : 2 };
cqlsh> ALTER KEYSPACE "system_distributed" WITH REPLICATION =
{ 'class' : 'NetworkTopologyStrategy', '{your-existing-datacenter}' : 2, 'us-east-1' : 2 };
cqlsh> ALTER KEYSPACE "system_auth" WITH REPLICATION =
{ 'class' : 'NetworkTopologyStrategy', '{your-existing-datacenter}' : 2, 'us-east-1' : 2 };
cqlsh> ALTER KEYSPACE "system_traces" WITH REPLICATION =
{ 'class' : 'NetworkTopologyStrategy', '{your-existing-datacenter}' : 2, 'us-east-1' : 2 };
{ 'class' : 'SimpleStrategy', 'replication_factor' : <integer> };
{ 'class' : 'NetworkTopologyStrategy'[, '<data center>' : <integer>, '<data center>' : <integer>] . . . };
$ sudo nodetool rebuild {your-existing-datacenter}
cqlsh> ALTER KEYSPACE "{your-keyspace}" WITH REPLICATION =
{ 'class' : 'NetworkTopologyStrategy', 'us-east-1' : 2 };
cqlsh> ALTER KEYSPACE "system_distributed" WITH REPLICATION =
{ 'class' : 'NetworkTopologyStrategy', 'us-east-1' : 2 };
cqlsh> ALTER KEYSPACE "system_auth" WITH REPLICATION =
{ 'class' : 'NetworkTopologyStrategy', 'us-east-1' : 2 };
cqlsh> ALTER KEYSPACE "system_traces" WITH REPLICATION =
{ 'class' : 'NetworkTopologyStrategy', 'us-east-1' : 2 };

Decommission the Original Data Center and Perform Clean Up

To perform this task, you need to follow these three steps:

  • Ensure that you have updated your clients on Alibaba Cloud to connect to the new data center
  • Update your clients’ connection policy to DCAwareRoundRobinPolicy
  • Set the local data center to 'us-east-1'
cqlsh> ALTER KEYSPACE "stream_data" WITH REPLICATION = { 'class' : 'NetworkTopologyStrategy', 'us-east-1' : 3 };
$ sudo nodetool decommission && sudo killall -15 java && sudo shutdown -h now
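
Decommissioning is only safe once no keyspace still places replicas in the original data center. As a hedged sketch, the check below inspects the text of a keyspace's replication map. The helper name replication_excludes_dc is hypothetical, and obtaining the map with a query against system_schema.keyspaces is an assumption that applies to Cassandra 3.0 and later.

```shell
#!/bin/sh
# Hypothetical helper: succeed (exit status 0) only if the replication map
# text in $1 does NOT mention the data center named in $2.
replication_excludes_dc() {
  case "$1" in
    *"'$2'"*) return 1 ;;  # the data center still appears in the map
    *)        return 0 ;;
  esac
}

# Example usage (keyspace and data center names are placeholders):
#   map=$(cqlsh {listener-address} -e \
#     "SELECT replication FROM system_schema.keyspaces WHERE keyspace_name = 'stream_data';")
#   replication_excludes_dc "$map" "{your-existing-datacenter}" && echo "safe to decommission"
```

A plain text match is deliberately conservative: it reports the data center as still present whenever its quoted name appears anywhere in the map.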

Set up Monitoring

Monitoring Cassandra Using JConsole

1. Starting JConsole

JVM_OPTS="$JVM_OPTS -Dcom.sun.management.jmxremote.ssl=false"
JVM_OPTS="$JVM_OPTS -Dcom.sun.management.jmxremote.authenticate=false"
JVM_OPTS="$JVM_OPTS -Djava.rmi.server.hostname=PUBLIC_IP_OF_YOUR_SERVER"
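
The three options above differ between servers only in the host name, so they can be produced per server. The following sketch assumes the lines belong in conf/cassandra-env.sh of the installation and that JConsole then connects to Cassandra's default JMX port, 7199; jmx_remote_opts is a hypothetical helper name.

```shell
#!/bin/sh
# Hypothetical helper: print the JVM_OPTS lines from this section for the
# public address given in $1. The literal text $JVM_OPTS is kept unexpanded
# so that it is evaluated later, when cassandra-env.sh is sourced.
jmx_remote_opts() {
  printf 'JVM_OPTS="$JVM_OPTS %s"\n' \
    "-Dcom.sun.management.jmxremote.ssl=false" \
    "-Dcom.sun.management.jmxremote.authenticate=false" \
    "-Djava.rmi.server.hostname=$1"
}

# Example usage (address and path are placeholders/assumptions):
#   jmx_remote_opts 203.0.113.10 | sudo tee -a /opt/apache-cassandra-3.9/conf/cassandra-env.sh
#   jconsole 203.0.113.10:7199
```

Since these settings turn off both SSL and authentication for JMX, access to the port is best limited to trusted addresses.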

Cassandra Metrics over JConsole

If you go to the MBeans tab in the JConsole main window, you’ll see a nested folder tree containing different classes grouped by metric type inside the ‘org.apache.cassandra.metrics’ category.
