SAP HANA High Availability Cross-Zone Solution on Alibaba Cloud With SUSE Linux Enterprise Server for SAP Applications

  • Jinhui Li, Product Manager SAP Solutions on Alibaba Cloud, AliCloud (Germany)
  • Bernd Schubert, SAP Solution Architect, SUSE

1 Solution Overview

1.1 SAP HANA System Replication

1.2 High Availability Extension Included with SUSE Linux Enterprise Server for SAP Applications

1.3 Architecture Overview

1.4 Network Design

2 Infrastructure Preparation

2.1 Infrastructure List

2.2 Creating VPC

  • Switch1 192.168.0.0/24 Zone A, for SAP HANA Primary Node;
  • Switch2 192.168.1.0/24 Zone B, for SAP HANA Secondary Node;

2.3 Creating ECS Instances

2.4 Creating ENIs and Binding to ECS Instances

echo "192.168.0.82 hana0 hana0" >> /etc/hosts
echo "192.168.1.245 hana1 hana1" >> /etc/hosts
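Running the echo commands above twice leaves duplicate lines in /etc/hosts. The following is a minimal sketch of an idempotent variant. The helper name add_host_entry is hypothetical and not part of any SUSE or Alibaba Cloud tooling, and the demonstration writes to a scratch file instead of the real /etc/hosts.

```shell
#!/bin/sh
# add_host_entry FILE IP NAME
# Hypothetical helper: append "IP NAME" to FILE unless NAME is already
# mapped on a non-comment line.
add_host_entry() {
    hosts_file=$1
    ip=$2
    name=$3
    # Look for NAME as a whole field after the address column.
    if awk -v n="$name" '
        !/^[[:space:]]*#/ { for (i = 2; i <= NF; i++) if ($i == n) found = 1 }
        END { exit found ? 0 : 1 }' "$hosts_file"; then
        return 0
    fi
    printf '%s %s\n' "$ip" "$name" >> "$hosts_file"
}

# Demonstration against a scratch file.
demo_hosts=$(mktemp)
add_host_entry "$demo_hosts" 192.168.0.82 hana0
add_host_entry "$demo_hosts" 192.168.1.245 hana1
add_host_entry "$demo_hosts" 192.168.0.82 hana0   # repeated call is a no-op
cat "$demo_hosts"
```

To use it on a node, pass /etc/hosts as the first argument and run it as root on both ECS instances.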

2.5 Creating NAT Gateway and Configuring SNAT Entry

2.6 Creating STONITH Device and Virtual IP Resource Agent

wget http://repository-iso.oss-cn-beijing.aliyuncs.com/ha/aliyun-ecs-pacemaker.tar.gz
tar -xvf aliyun-ecs-pacemaker.tar.gz
./install
pip install aliyun-python-sdk-ecs aliyun-python-sdk-vpc aliyuncli
aliyuncli configure

3 Software Preparation

3.1 Software List

3.2 High Availability Extension Installation

zypper in -t pattern ha_sles
zypper in SAPHanaSR SAPHanaSR-doc

3.3 SAP HANA Installation

3.4 SAP Host Agent Installation

4 Configuring SAP HANA System Replication

4.1 Backing up SAP HANA on Primary ECS Instance

BACKUP DATA USING FILE('COMPLETE_DATA_BACKUP');
BACKUP DATA for <DATABASE> using FILE('COMPLETE_DATA_BACKUP')
BACKUP DATA for SYSTEMDB using FILE('COMPLETE_DATA_BACKUP')
BACKUP DATA for JL0 using FILE('COMPLETE_DATA_BACKUP')

4.2 Configuring SAP HANA System Replication on Primary Node

vi /hana/shared/<SID>/global/hdb/custom/config/global.ini
[system_replication_hostname_resolution]
<IP> = <HOSTNAME>
[system_replication_hostname_resolution]
192.168.1.246 = hana1

4.3 Configuring SAP HANA System Replication on Secondary Node

[system_replication_hostname_resolution]
192.168.0.83 = hana0

4.4 Enable SAP HANA System Replication on Primary Node

hdbnsutil -sr_enable --name=[primary location name]
hdbnsutil -sr_enable --name=hana0

4.5 Register the Secondary Node to the Primary SAP HANA Node

hdbnsutil -sr_register --remoteHost=[location of primary Node] --remoteInstance=[instance number of primary node] --replicationMode=sync --name=[location of the secondary node] --operationMode=logreplay
hdbnsutil -sr_register --name=hana1 --remoteHost=hana0 --remoteInstance=00 --replicationMode=sync --operationMode=logreplay
hdbnsutil -sr_state

5 Configuring High Availability Extension for SAP HANA

5.1 Configuration of Corosync

  • Create Keys
  • Configure /etc/corosync/corosync.conf as root on the Primary SAP HANA node with the following content:
totem {
    version: 2
    token: 5000
    token_retransmits_before_loss_const: 6
    secauth: on
    crypto_hash: sha1
    crypto_cipher: aes256
    clear_node_high_bit: yes
    interface {
        ringnumber: 0
        bindnetaddr: **IP-address-for-heart-beating-for-the-current-server**
        mcastport: 5405
        ttl: 1
    }
    # On Alibaba Cloud, set transport to udpu (UDP unicast)
    transport: udpu
}
logging {
    fileline: off
    to_logfile: yes
    to_syslog: yes
    logfile: /var/log/cluster/corosync.log
    debug: off
    timestamp: on
    logger_subsys {
        subsys: QUORUM
        debug: off
    }
}
nodelist {
    node {
        ring0_addr: **ip-node-1**
        nodeid: 1
    }
    node {
        ring0_addr: **ip-node-2**
        nodeid: 2
    }
}
quorum {
    # Enable and configure quorum subsystem (default: off)
    # see also corosync.conf.5 and votequorum.5
    provider: corosync_votequorum
    expected_votes: 2
    two_node: 1
}
scp /etc/corosync/authkey root@**hostnameOfSecondaryNode**:/etc/corosync
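Before the configuration is copied or Corosync is started, it can help to confirm that no **...** template marker from the listing above is left in the file. The following is a minimal sketch. The helper name check_placeholders is hypothetical, and the demonstration works on a scratch file rather than /etc/corosync/corosync.conf.

```shell
#!/bin/sh
# check_placeholders FILE
# Hypothetical helper: print every line that still contains a **...**
# marker (for example **ip-node-1**) and return non-zero if any is found.
check_placeholders() {
    if grep -n '\*\*[^*]*\*\*' "$1"; then
        echo "unfilled placeholders in $1" >&2
        return 1
    fi
    return 0
}

# Demonstration: a fragment with one marker, then the same fragment filled in.
demo_conf=$(mktemp)
printf 'nodelist {\n    node {\n        ring0_addr: **ip-node-1**\n        nodeid: 1\n    }\n}\n' > "$demo_conf"
check_placeholders "$demo_conf" || echo "template not filled in yet"

sed 's/\*\*ip-node-1\*\*/192.168.0.83/' "$demo_conf" > "$demo_conf.filled"
check_placeholders "$demo_conf.filled" && echo "no placeholders left"
```

On a real node the check would be run as check_placeholders /etc/corosync/corosync.conf.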

5.2 Configuration of Pacemaker

5.2.1 Cluster Bootstrap and More

property $id='cib-bootstrap-options' \
stonith-enabled="true" \
stonith-action="off" \
stonith-timeout="150s"
rsc_defaults $id="rsc-options" \
resource-stickiness="1000" \
migration-threshold="5000"
op_defaults $id="op-options" \
timeout="600"
crm configure load update crm-bs.txt

5.2.2 STONITH Device

primitive res_ALIYUN_STONITH_1 stonith:fence_aliyun \
op monitor interval=120 timeout=60 \
params pcmk_host_list=<primary node hostname> port=<primary node instance id> \
access_key=<access key> secret_key=<secret key> \
region=<region> \
meta target-role=Started
primitive res_ALIYUN_STONITH_2 stonith:fence_aliyun \
op monitor interval=120 timeout=60 \
params pcmk_host_list=<secondary node hostname> port=<secondary node instance id> \
access_key=<access key> secret_key=<secret key> \
region=<region> \
meta target-role=Started
location loc_<primary node hostname>_stonith_not_on_<primary node hostname> res_ALIYUN_STONITH_1 -inf: <primary node hostname>
#Stonith 1 should not run on primary node because it is controlling primary node
location loc_<secondary node hostname>_stonith_not_on_<secondary node hostname> res_ALIYUN_STONITH_2 -inf: <secondary node hostname>
#Stonith 2 should not run on secondary node because it is controlling secondary node
  • <primary node hostname> / <secondary node hostname> should be replaced by the real hostnames of your primary and secondary nodes.
  • <primary node instance id> / <secondary node instance id> should be replaced by the real instance IDs of your primary and secondary nodes. You can get these from the console.
  • <access key> should be replaced with your real access key.
  • <secret key> should be replaced with your real secret key.
  • <region> should be replaced with the name of the region where the node is located.
crm configure load update crm-stonith.txt
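The placeholder replacement described above can be scripted instead of edited by hand. The following is a minimal sketch for the primary node's STONITH primitive. The function name render_stonith and its argument order are hypothetical, the credentials are dummies, and the sketch assumes that no value contains '|', '&' or a backslash, because these characters are special to sed.

```shell
#!/bin/sh
# render_stonith HOSTNAME INSTANCE_ID REGION ACCESS_KEY SECRET_KEY
# Hypothetical helper: read a template on stdin and fill in the
# placeholders of the primary node's STONITH primitive.
# Assumption: values contain no '|', '&' or backslash.
render_stonith() {
    sed -e "s|<primary node hostname>|$1|g" \
        -e "s|<primary node instance id>|$2|g" \
        -e "s|<region>|$3|g" \
        -e "s|<access key>|$4|g" \
        -e "s|<secret key>|$5|g"
}

# Template text taken from the listing above (primary node part only).
template='primitive res_ALIYUN_STONITH_1 stonith:fence_aliyun \
op monitor interval=120 timeout=60 \
params pcmk_host_list=<primary node hostname> port=<primary node instance id> \
access_key=<access key> secret_key=<secret key> \
region=<region> \
meta target-role=Started
location loc_<primary node hostname>_stonith_not_on_<primary node hostname> res_ALIYUN_STONITH_1 -inf: <primary node hostname>'

# Dummy credentials for illustration only.
printf '%s\n' "$template" |
    render_stonith hana0 i-gw8byf3m4f9a8os6rke8 eu-central-1 AK_DUMMY SK_DUMMY
```

Redirecting the output to crm-stonith.txt, after adding the secondary node's primitive in the same way, gives the file that the crm command above loads.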

5.2.3 SAPHanaTopology

primitive rsc_SAPHanaTopology_<SID>_HDB<instance number> ocf:suse:SAPHanaTopology \
operations $id="rsc_SAPHanaTopology_<SID>_HDB<instance number>-operations" \
op monitor interval="10" timeout="600" \
op start interval="0" timeout="600" \
op stop interval="0" timeout="300" \
params SID="<SID>" InstanceNumber="<instance number>"
clone cln_SAPHanaTopology_<SID>_HDB<instance number> rsc_SAPHanaTopology_<SID>_HDB<instance number> \
meta clone-node-max="1" interleave="true"
  • <SID> should be replaced by the real SAP HANA SID.
  • <instance number> should be replaced by the real SAP HANA Instance Number.
crm configure load update crm-saphanatop.txt
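A mistyped SID or instance number produces resource names that do not match between the primitives and the constraints defined later. The following is a minimal sketch that checks both values before building a resource name. The helper names are hypothetical, and the format rules used here are assumptions: a SID of three uppercase alphanumeric characters starting with a letter, and a two-digit instance number.

```shell
#!/bin/sh
# Hypothetical helpers. The accepted formats are assumptions stated in the
# lead-in, not rules taken from this guide.
valid_sid() {
    printf '%s' "$1" | LC_ALL=C grep -Eq '^[A-Z][A-Z0-9]{2}$'
}

valid_instance_number() {
    printf '%s' "$1" | LC_ALL=C grep -Eq '^[0-9]{2}$'
}

# topology_resource_name SID INSTANCE_NUMBER
# Print the SAPHanaTopology primitive name, or return 1 on invalid input.
topology_resource_name() {
    valid_sid "$1" && valid_instance_number "$2" || return 1
    printf 'rsc_SAPHanaTopology_%s_HDB%s\n' "$1" "$2"
}

topology_resource_name JL0 00
topology_resource_name jl0 0 || echo "rejected: jl0 0"
```

The same two checks apply unchanged to the SAPHana, virtual IP and constraint definitions, which use the same <SID> and <instance number> placeholders.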

5.2.4 SAPHana

primitive rsc_SAPHana_<SID>_HDB<instance number> ocf:suse:SAPHana \
operations $id="rsc_sap_<SID>_HDB<instance number>-operations" \
op start interval="0" timeout="3600" \
op stop interval="0" timeout="3600" \
op promote interval="0" timeout="3600" \
op monitor interval="60" role="Master" timeout="700" \
op monitor interval="61" role="Slave" timeout="700" \
params SID="<SID>" InstanceNumber="<instance number>" PREFER_SITE_TAKEOVER="true" \
DUPLICATE_PRIMARY_TIMEOUT="7200" AUTOMATED_REGISTER="false"
ms msl_SAPHana_<SID>_HDB<instance number> rsc_SAPHana_<SID>_HDB<instance number> \
meta clone-max="2" clone-node-max="1" interleave="true"
  • <SID> should be replaced by the real SAP HANA SID.
  • <instance number> should be replaced by the real SAP HANA Instance Number.
crm configure load update crm-saphana.txt

5.2.5 Virtual IP

primitive rsc_vip_<SID>_HDB<instance number> ocf:aliyun:vpc-move-ip \
op monitor interval=60 \
meta target-role=Started \
params address=<virtual_IPv4_address> routing_table=<route_table_ID> interface=eth0
  • <virtual_IPv4_address> should be replaced by the IP address on which you want to provide the service.
  • <route_table_ID> should be replaced by the route table ID of your VPC.
  • <SID> should be replaced by the real SAP HANA SID.
  • <instance number> should be replaced by the real SAP HANA Instance Number.
crm configure load update crm-vip.txt

5.2.6 Constraints

colocation col_SAPHana_vip_<SID>_HDB<instance number> 2000: rsc_vip_<SID>_HDB<instance number>:Started \
msl_SAPHana_<SID>_HDB<instance number>:Master
order ord_SAPHana_<SID>_HDB<instance number> Optional: cln_SAPHanaTopology_<SID>_HDB<instance number> \
msl_SAPHana_<SID>_HDB<instance number>
  • <SID> should be replaced by the real SAP HANA SID.
  • <instance number> should be replaced by the real SAP HANA Instance Number.
crm configure load update crm-constraint.txt

5.3 Check the Cluster Status

systemctl start pacemaker
systemctl status pacemaker
crm_mon -r

5.4 Verify the High Availability Takeover

6 Example

6.1 Example of Cluster Configuration

node 1: hana0 \
attributes hana_jl0_vhost=hana0 hana_jl0_srmode=sync hana_jl0_remoteHost=hana1 hana_jl0_site=hana0 lpa_jl0_lpt=10 hana_jl0_op_mode=logreplay
node 2: hana1 \
attributes lpa_jl0_lpt=1529509236 hana_jl0_op_mode=logreplay hana_jl0_vhost=hana1 hana_jl0_site=hana1 hana_jl0_srmode=sync hana_jl0_remoteHost=hana0
primitive res_ALIYUN_STONITH_0 stonith:fence_aliyun \
op monitor interval=120 timeout=60 \
params pcmk_host_list=hana0 port=i-gw8byf3m4f9a8os6rke8 access_key=<access key> secret_key=<secret key> region=eu-central-1 \
meta target-role=Started
primitive res_ALIYUN_STONITH_1 stonith:fence_aliyun \
op monitor interval=120 timeout=60 \
params pcmk_host_list=hana1 port=i-gw8byf3m4f9a8os6rke9 access_key=<access key> secret_key=<secret key> region=eu-central-1 \
meta target-role=Started
primitive rsc_SAPHanaTopology_JL0_HDB00 ocf:suse:SAPHanaTopology \
operations $id=rsc_SAPHanaTopology_JL0_HDB00-operations \
op monitor interval=10 timeout=600 \
op start interval=0 timeout=600 \
op stop interval=0 timeout=300 \
params SID=JL0 InstanceNumber=00
primitive rsc_SAPHana_JL0_HDB00 ocf:suse:SAPHana \
operations $id=rsc_SAPHana_JL0_HDB00-operations \
op start interval=0 timeout=3600 \
op stop interval=0 timeout=3600 \
op promote interval=0 timeout=3600 \
op monitor interval=60 role=Master timeout=700 \
op monitor interval=61 role=Slave timeout=700 \
params SID=JL0 InstanceNumber=00 PREFER_SITE_TAKEOVER=true DUPLICATE_PRIMARY_TIMEOUT=7200 AUTOMATED_REGISTER=false
primitive rsc_vip_JL0_HDB00 ocf:aliyun:vpc-move-ip \
op monitor interval=60 \
meta target-role=Started \
params address=192.168.4.1 routing_table=vtb-gw8fii1g1d8cp14tzynub interface=eth0
ms msl_SAPHana_JL0_HDB00 rsc_SAPHana_JL0_HDB00 \
meta clone-max=2 clone-node-max=1 interleave=true target-role=Started
clone cln_SAPHanaTopology_JL0_HDB00 rsc_SAPHanaTopology_JL0_HDB00 \
meta clone-node-max=1 interleave=true
colocation col_SAPHana_vip_JL0_HDB00 2000: rsc_vip_JL0_HDB00:Started msl_SAPHana_JL0_HDB00:Master
location loc_hana0_stonith_not_on_hana0 res_ALIYUN_STONITH_0 -inf: hana0
location loc_hana1_stonith_not_on_hana1 res_ALIYUN_STONITH_1 -inf: hana1
order ord_SAPHana_JL0_HDB00 Optional: cln_SAPHanaTopology_JL0_HDB00 msl_SAPHana_JL0_HDB00
property cib-bootstrap-options: \
have-watchdog=false \
dc-version=1.1.15-21.1-e174ec8 \
cluster-infrastructure=corosync \
stonith-action=off \
stonith-enabled=true \
stonith-timeout=150s \
last-lrm-refresh=1529503606 \
maintenance-mode=false
rsc_defaults rsc-options: \
resource-stickiness=1000 \
migration-threshold=5000
op_defaults op-options: \
timeout=600

6.2 Example of /etc/corosync/corosync.conf

totem {
    version: 2
    token: 5000
    token_retransmits_before_loss_const: 6
    secauth: on
    crypto_hash: sha1
    crypto_cipher: aes256
    clear_node_high_bit: yes
    interface {
        ringnumber: 0
        bindnetaddr: 192.168.0.83
        mcastport: 5405
        ttl: 1
    }
    # On Alibaba Cloud, set transport to udpu (UDP unicast)
    transport: udpu
}
logging {
    fileline: off
    to_logfile: yes
    to_syslog: yes
    logfile: /var/log/cluster/corosync.log
    debug: off
    timestamp: on
    logger_subsys {
        subsys: QUORUM
        debug: off
    }
}
nodelist {
    node {
        ring0_addr: 192.168.0.83
        nodeid: 1
    }
    node {
        ring0_addr: 192.168.1.246
        nodeid: 2
    }
}
quorum {
    # Enable and configure quorum subsystem (default: off)
    # see also corosync.conf.5 and votequorum.5
    provider: corosync_votequorum
    expected_votes: 2
    two_node: 1
}

7 References

  1. Pacemaker 1.1 Configuration Explained: https://clusterlabs.org/pacemaker/doc/en-US/Pacemaker/1.1/html/Pacemaker_Explained/
  2. SAP HANA SR Performance Optimized Scenario: https://www.suse.com/media/white-paper/suse_linux_enterprise_server_for_sap_applications_12_sp1.pdf
  3. SAP HANA System Replication, SAP Help Portal: https://help.sap.com/viewer/6b94445c94ae495c83a19646e7c3fd56/2.0.03/en-US/b74e16a9e09541749a745f41246a065e.html
  4. SAP Applications on Alibaba Cloud: Supported Products and IaaS VM Types: https://launchpad.support.sap.com/#/notes/2552731
