Accessing Presto through Gateway

Gateway is an Elastic Compute Service (ECS) server located in the same intranet as the E-MapReduce (EMR) cluster. It can be used for load balancing and security isolation. To create a Gateway node for a cluster, go to Console Page > Configuration Management > Overview > Create Gateway.

Note: The HAProxy service is installed on the Gateway node by default, but it is not started automatically.

Common Cluster

Configuring the Gateway proxy for a common cluster is relatively simple. You only need to set up HAProxy as a reverse proxy for port 9090 of the Presto Coordinator, which runs on the Header node of the EMR cluster. The configuration steps are as follows:

Configure HAProxy

Log on to the Gateway node through SSH and modify the HAProxy configuration file /etc/haproxy/haproxy.cfg. Add the following:

Save and exit. Run the following command to restart the HAProxy service:

Configure Security Groups

The rules to be configured are as follows:

Direction | Configuration rule | Description
Internet inbound | Custom TCP, open port 9090 | HAProxy uses this port to proxy the Coordinator port on the Header node

You can now delete the public IP address of the Header node in the ECS console and access the Presto service from your client through the Gateway.

  • For an example of using Presto from the command line, see Presto CLI.
  • For an example of accessing Presto over JDBC, see Presto JDBC.
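
Before pointing a Presto client at the Gateway, it can help to confirm that the HAProxy listener and the security group rule are both in place. The following is a minimal sketch, not part of the original article, that only tests whether a TCP connection to port 9090 succeeds. The class name, the timeout, and the host name gateway.example.com are hypothetical placeholders; substitute the public address of your own Gateway node.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

public class GatewayPortCheck {

    /** Returns true if a TCP connection to host:port succeeds within timeoutMillis. */
    public static boolean isReachable(String host, int port, int timeoutMillis) {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), timeoutMillis);
            return true;
        } catch (IOException e) {
            // Covers refused connections, timeouts, and unresolvable host names.
            return false;
        }
    }

    public static void main(String[] args) {
        // Hypothetical placeholder: pass the Gateway's public IP or host name as the first argument.
        String gatewayHost = args.length > 0 ? args[0] : "gateway.example.com";
        int port = 9090; // Coordinator port proxied by HAProxy on the Gateway
        System.out.println(gatewayHost + ":" + port + " reachable: "
                + isReachable(gatewayHost, port, 3000));
    }
}
```

A result of false usually means that HAProxy is not running, that it is not listening on port 9090, or that the security group rule is missing. A result of true only shows that the port accepts connections, not that Presto itself is answering behind it.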

High Security Cluster

The Presto service in a high security EMR cluster uses Kerberos for authentication. The Kerberos KDC service runs on emr-header-1 on port 88 and supports both TCP and UDP. To access the Presto service in a high security cluster through Gateway, you must proxy both the Presto Coordinator service port and the Kerberos KDC port.

In addition, the Presto Coordinator in an EMR cluster uses by default a keystore whose CN is emr-header-1, which can only be used within the intranet. It is therefore necessary to regenerate the keystore with the CN set to emr-header-1.cluster-xxx.

HTTPS Authentication

Create a keystore with CN configured as emr-header-1.cluster-xxx for the server:

Export the certificate:

Create a keystore for the client:

Import the certificate to the keystore of the client:

Copy the generated file to the client:

Kerberos Authentication

Add the client user principal:

Copy the generated file to the client:

Modify the following two points in the krb5.conf file copied to the client:

Modify the hosts file of the client host, and add the following:

Configure Gateway HAProxy

Log on to the Gateway node through SSH and modify /etc/haproxy/haproxy.cfg. Add the following:

Save and exit. Run the following command to restart the HAProxy service:

Configure Security Group Rules

The rules to be configured are as follows:

Direction | Configuration rule | Description
Internet inbound | Custom UDP, open port 88 | HAProxy uses this port to proxy the KDC on the Header node
Internet inbound | Custom TCP, open port 88 | HAProxy uses this port to proxy the KDC on the Header node
Internet inbound | Custom TCP, open port 7778 | HAProxy uses this port to proxy the Coordinator port on the Header node

You can now delete the public IP address of the Header node in the ECS console and access the Presto service from your client through the Gateway.

Example of Using JDBC to Access Presto

The code is as follows:
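
As an illustration of how the pieces above fit together on the client, here is a minimal sketch that assumes the Presto JDBC driver is on the classpath. The property names (SSL, SSLTrustStorePath, SSLTrustStorePassword, KerberosRemoteServiceName, KerberosPrincipal, KerberosConfigPath, KerberosKeytabPath) are connection parameters of the Presto JDBC driver. Everything else is an assumed placeholder rather than a value from this article: the class name, the user, the file paths, the password, the realm EXAMPLE.COM, the service name presto, and the hive/default catalog and schema. Port 7778 and the host name emr-header-1.cluster-xxx follow the security group rule and keystore CN described above.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;
import java.util.Properties;

public class PrestoGatewayJdbcExample {

    // Host name must match the CN of the regenerated keystore; the client's hosts
    // file is expected to map it to the Gateway. Catalog and schema are assumptions.
    public static final String URL =
            "jdbc:presto://emr-header-1.cluster-xxx:7778/hive/default";

    /** Builds the connection properties for HTTPS plus Kerberos (placeholder values). */
    public static Properties buildProperties() {
        Properties props = new Properties();
        props.setProperty("user", "hadoop"); // placeholder user

        // HTTPS: truststore containing the certificate exported from the server keystore
        props.setProperty("SSL", "true");
        props.setProperty("SSLTrustStorePath", "/path/to/client/keystore");
        props.setProperty("SSLTrustStorePassword", "changeit");

        // Kerberos: principal, keytab, and krb5.conf copied to the client
        props.setProperty("KerberosRemoteServiceName", "presto");
        props.setProperty("KerberosPrincipal", "clientuser@EXAMPLE.COM");
        props.setProperty("KerberosConfigPath", "/path/to/krb5.conf");
        props.setProperty("KerberosKeytabPath", "/path/to/clientuser.keytab");
        return props;
    }

    public static void main(String[] args) throws SQLException {
        // Requires the Presto JDBC driver on the classpath and a reachable Gateway.
        try (Connection conn = DriverManager.getConnection(URL, buildProperties());
             Statement stmt = conn.createStatement();
             ResultSet rs = stmt.executeQuery("SHOW SCHEMAS")) {
            while (rs.next()) {
                System.out.println(rs.getString(1));
            }
        }
    }
}
```

Keeping the property construction in its own method makes it easy to inspect or log the settings without opening a connection, which is useful when troubleshooting certificate or Kerberos errors.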

Summary

This article describes how to use the HAProxy reverse proxy to access the Presto service through the Gateway node. The method can also be extended to other components, such as Impala.

Reference: https://www.alibabacloud.com/blog/accessing-presto-through-gateway_594791?spm=a2c41.12883046.0.0
