Working with Data Lakes (TP)

Prerequisites

The following resources (databases, LDAP, and cloud storage locations) must be created outside of Cloudbreak prior to creating a data lake.

Steps

  1. Set up two external database instances, one for the HIVE component, and one for the RANGER component. For supported databases, refer to Supported databases.
  2. If you are planning to use the HA blueprint, also set up an external database for the AMBARI component. For supported databases, refer to Supported databases.
  3. Create an LDAP instance and set up your users in it.
  4. Prepare a cloud storage location for the default Hive warehouse directory and Ranger audit logs.

After completing these steps, you should have all of the external resources available. In the next step, you provide Cloudbreak with the information about these external resources.
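
Before moving on, it can help to confirm that every external resource listed in the steps above actually exists. The following is a minimal sketch of such a checklist in plain Python. It does not call Cloudbreak or any cloud provider API; the class name, field names, and example URLs are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class DataLakePrereqs:
    """Hypothetical checklist of resources created outside Cloudbreak."""
    hive_db_url: Optional[str] = None       # external database for the HIVE component
    ranger_db_url: Optional[str] = None     # external database for the RANGER component
    ambari_db_url: Optional[str] = None     # external database for AMBARI (HA blueprint only)
    ldap_url: Optional[str] = None          # LDAP instance with your users set up
    storage_location: Optional[str] = None  # Hive warehouse directory and Ranger audit logs
    use_ha_blueprint: bool = False


def missing_prereqs(p: DataLakePrereqs) -> List[str]:
    """Return the names of required resources that have not been filled in."""
    required = {
        "HIVE database": p.hive_db_url,
        "RANGER database": p.ranger_db_url,
        "LDAP instance": p.ldap_url,
        "cloud storage location": p.storage_location,
    }
    # The AMBARI database is only required when the HA blueprint is used.
    if p.use_ha_blueprint:
        required["AMBARI database"] = p.ambari_db_url
    return [name for name, value in required.items() if not value]


if __name__ == "__main__":
    # Placeholder values; replace them with your own resource details.
    prereqs = DataLakePrereqs(
        hive_db_url="jdbc:postgresql://db.example.com:5432/hive",
        ranger_db_url="jdbc:postgresql://db.example.com:5432/ranger",
        ldap_url="ldap://ldap.example.com:389",
        storage_location="s3a://example-bucket/datalake",
        use_ha_blueprint=True,
    )
    print(missing_prereqs(prereqs))  # ['AMBARI database']
```

An empty list means every resource required for the chosen blueprint has been recorded. The check only verifies that a value was entered, not that the resource is reachable.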