Tutorial: How to Set Up a Shared RDS as a Metastore
This tutorial helps you set up an RDS instance and database, and then register that database as a Hive metastore in the cloud controller. The tutorial assumes no prior experience with AWS.
HDCloud for AWS 1.14.1 allows you to register a previously created RDS instance as a Hive or Druid metastore.
In this tutorial, we will:
- Launch an RDS instance and create a database on it.
- Register the database running on that RDS instance as a Hive metastore.
You can also use these instructions to create a Druid metastore.
Let's get started!
As a prerequisite to this tutorial, you need to have an HDCloud cloud controller running on AWS. If you need help with the steps required to meet this prerequisite, refer to this tutorial.
Launch an RDS Instance
Navigate to the RDS Dashboard at https://console.aws.amazon.com/rds.
In the top right corner, select the region in which you want to create your DB instance. For this tutorial, use the same region in which you launched the cloud controller.
In the RDS Dashboard navigation pane, click Instances, and then click Launch DB instance to launch the Launch DB Instance Wizard.
In Step 1: Select Engine, select the PostgreSQL Engine and click Select.
In Step 2: Production?, select Dev/Test and click Next Step.
In Step 3: Specify DB Details, enter:
- For Instance Specifications, you can use values similar to those in the screenshots. Make sure to use DB Engine Version 9.5.4 or later.
- For Settings, choose an identifier, a username, and a password for your instance.
Click Next Step.
In Step 4: Configure Advanced Settings:
- In the Network & Security section, select the VPC where the RDS instance should be started. Select the same VPC in which your cloud controller is running.
- On the right, in the Connection Information section, make sure that inbound access on the security group is set to “0.0.0.0/0”.
- In the Database Options section, enter a Database Name. This field is not required, so it is easy to miss. If you leave it empty, you will have to create the database manually.
Click Launch DB Instance.
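If you prefer to script the wizard choices above, the following is a minimal sketch of how they might map onto the parameters of the boto3 `create_db_instance` call. The helper names `parse_version`, `build_rds_params`, and `launch_rds_instance` are hypothetical, and the instance class and storage size are placeholder assumptions rather than values taken from this tutorial. Only the parameter-building part runs without AWS credentials; the launch function additionally assumes that the third-party boto3 package is installed and configured.

```python
from typing import Any, Dict, Tuple

# Minimum PostgreSQL engine version named in this tutorial.
MIN_ENGINE_VERSION = (9, 5, 4)


def parse_version(version: str) -> Tuple[int, ...]:
    """Turn a dotted version string such as '9.5.4' into a comparable tuple."""
    return tuple(int(part) for part in version.split("."))


def build_rds_params(
    identifier: str,
    username: str,
    password: str,
    db_name: str,
    security_group_id: str,
    engine_version: str = "9.5.4",
) -> Dict[str, Any]:
    """Collect the wizard inputs as keyword arguments for create_db_instance."""
    if parse_version(engine_version) < MIN_ENGINE_VERSION:
        raise ValueError("engine version must be 9.5.4 or later")
    if not db_name:
        # The wizard treats this field as optional, but without it the
        # database has to be created manually afterwards.
        raise ValueError("db_name must not be empty")
    return {
        "DBInstanceIdentifier": identifier,
        "Engine": "postgres",
        "EngineVersion": engine_version,
        "DBInstanceClass": "db.t2.micro",  # assumption: placeholder size
        "AllocatedStorage": 20,            # assumption: size in GiB
        "MasterUsername": username,
        "MasterUserPassword": password,
        "DBName": db_name,
        "VpcSecurityGroupIds": [security_group_id],
    }


def launch_rds_instance(params: Dict[str, Any], region: str) -> Dict[str, Any]:
    """Send the request to AWS; needs boto3 and valid AWS credentials."""
    import boto3  # third-party; imported here so the helpers above run without it

    client = boto3.client("rds", region_name=region)
    return client.create_db_instance(**params)
```

For example, `build_rds_params("hive-metastore", "hiveadmin", "secret", "hivedb", "sg-0123")` returns the dictionary to pass to `launch_rds_instance`, where every argument shown is an illustrative value. The security group you pass should be the one from the VPC in which your cloud controller is running.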
Click View Your DB Instances to return to the RDS Dashboard. Keep this page open, as you will need to copy the RDS information from it into the Hive metastore registration form.
When your RDS instance is ready, proceed to the next step.
Congratulations! You've just launched an RDS instance and created a database on it. Let's register this database as a Hive metastore in your cloud controller.
Register a Hive Metastore
Log in to the cloud controller UI.
From the navigation menu, select SHARED SERVICES.
On the SHARED SERVICES page, select Hive Metastores from the SERVICES menu.
The list of registered Hive and Druid metastores is displayed.
Click +REGISTER METASTORE to display the registration form.
Enter the following parameters:
- Name: Enter the name to use when registering this metastore with the cloud controller. This is not the database name.
- HDP Version: Select the version of HDP that this metastore can be used with.
- JDBC Connection: Select the database type (PostgreSQL) and enter the RDS endpoint (HOST:PORT/DB_NAME).
- Authentication: Enter the RDS connection username and password.
You can obtain these parameters from the RDS dashboard.
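Once you have copied the host, port, and database name from the dashboard, a small helper can assemble them into the HOST:PORT/DB_NAME form that the JDBC Connection field expects. This is a minimal sketch: the function names `rds_endpoint` and `jdbc_url` and the host name in the usage comment are hypothetical. The `jdbc:postgresql://` prefix is the URL scheme used by the PostgreSQL JDBC driver; whether the cloud controller builds its connection string in exactly this way is an assumption.

```python
def rds_endpoint(host: str, port: int, db_name: str) -> str:
    """Format the value for the JDBC Connection field as HOST:PORT/DB_NAME."""
    if not host or not db_name:
        raise ValueError("host and db_name must not be empty")
    if not 1 <= port <= 65535:
        raise ValueError("port must be between 1 and 65535")
    return f"{host}:{port}/{db_name}"


def jdbc_url(host: str, port: int, db_name: str) -> str:
    """Full PostgreSQL JDBC URL corresponding to the same endpoint."""
    return "jdbc:postgresql://" + rds_endpoint(host, port, db_name)


# Usage with a made-up RDS host name (5432 is the default PostgreSQL port):
# rds_endpoint("mydb.example.us-east-1.rds.amazonaws.com", 5432, "hivedb")
# returns "mydb.example.us-east-1.rds.amazonaws.com:5432/hivedb"
```

Validating the parts before joining them catches the most common mistake, a missing database name, before you paste the value into the registration form.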
Click Test connection to validate and test the RDS connection information. If you experience connection issues, refer to the Amazon RDS Troubleshooting documentation.
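When the connection test fails, a quick first check is whether the RDS host and port are reachable over TCP at all, which helps separate network or security group problems from wrong credentials. The sketch below uses only the Python standard library; the function name `can_reach` is hypothetical, and the check does not authenticate against the database, so a successful result only shows that the port accepts connections.

```python
import socket


def can_reach(host: str, port: int, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        # create_connection resolves the host name and opens the socket;
        # the with-block closes it again immediately.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and DNS resolution failures.
        return False
```

Run it from a machine in the same network as the cloud controller, passing the host and port shown on the RDS dashboard. A result of False suggests reviewing the VPC and the inbound rule on the security group before rechecking the username and password.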
Once your settings are validated and working, click REGISTER HIVE METASTORE to save the metastore. The metastore will now show up in the list of available metastores when creating a cluster.
Congratulations! You've just registered your RDS as a Hive metastore.