Apache Zeppelin Component Guide

Chapter 7. Enabling HDFS Storage for Zeppelin Notebooks and Configuration in HDP-2.6.3+

Overview

HDP-2.6.3 introduced support for HDFS storage for Apache Zeppelin notebooks and configuration files. In previous versions, notebooks and configuration files were stored on the local disk of the Zeppelin server.

When upgrading to HDP-2.6.3 or later, there are two options for configuring Zeppelin notebook and configuration file storage:

  • Use HDFS storage (recommended) – Zeppelin notebooks and configuration files must be copied to the new HDFS storage location before upgrading. Additional upgrade and post-upgrade steps must also be performed, as described in the following section.

  • Use local storage – Perform upgrade and post-upgrade steps to enable local storage.

Enabling HDFS storage simplifies future upgrades and is the first step toward enabling Zeppelin High Availability. For these reasons, enable HDFS storage for Zeppelin notebooks and configuration files when upgrading to HDP-2.6.3+ from earlier versions of HDP.

Note

Currently, HDFS and local storage are the only supported notebook storage mechanisms in HDP-2.6.3+, and VFSNotebookRepo is the only supported local storage option.
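
As a rough illustration of how to tell which of the two mechanisms a Zeppelin server is configured for, the sketch below reads the zeppelin.notebook.storage property from zeppelin-site.xml and maps the two repository classes named in this chapter to "hdfs" or "local". It assumes that the property is written to /etc/zeppelin/conf/zeppelin-site.xml and that the file has one element per line; site_property and storage_mode are hypothetical helper names, not part of Zeppelin or HDP.

```shell
#!/usr/bin/env bash
# Sketch only: helper names and file layout are assumptions, not HDP tooling.

# Print the <value> of one property from a Hadoop-style site XML file.
# $1 = property name, $2 = path to the XML file
site_property() {
  awk -v name="$1" '
    index($0, "<name>" name "</name>") { found = 1 }
    found && /<value>/ {
      sub(/.*<value>/, "")
      sub(/<\/value>.*/, "")
      print
      exit
    }
  ' "$2"
}

# Map the configured notebook repository class to a storage mode.
# $1 = path to zeppelin-site.xml
storage_mode() {
  case "$(site_property zeppelin.notebook.storage "$1")" in
    org.apache.zeppelin.notebook.repo.FileSystemNotebookRepo) echo hdfs ;;
    org.apache.zeppelin.notebook.repo.VFSNotebookRepo) echo local ;;
    *) echo unsupported ;;
  esac
}

# Assumed default location; override with SITE_XML=/path/to/zeppelin-site.xml
SITE_XML="${SITE_XML:-/etc/zeppelin/conf/zeppelin-site.xml}"
if [ -r "$SITE_XML" ]; then
  echo "notebook storage mode: $(storage_mode "$SITE_XML")"
fi
```

Any value other than the two classes above, including a missing property, is reported as "unsupported" rather than guessed at.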

Enable HDFS Storage when Upgrading to HDP-2.6.3+

Perform the following steps to enable HDFS storage when upgrading to HDP 2.6.3+ from earlier versions of HDP.

  1. Before upgrading Zeppelin, perform the following steps as the Zeppelin service user.

    1. Create the /user/zeppelin/conf and /user/zeppelin/notebook directories in HDFS. Afterward, a listing of /user/zeppelin should resemble the following:

      hdfs dfs -ls /user/zeppelin
      drwxr-xr-x   - zeppelin hdfs          0 2018-01-20 04:17 /user/zeppelin/conf
      drwxr-xr-x   - zeppelin hdfs          0 2018-01-20 03:40 /user/zeppelin/notebook

    2. Copy all notebooks from the local Zeppelin server (for example, /usr/hdp/2.5.3.0-37/zeppelin/notebook/) to the /user/zeppelin/notebook directory in HDFS. Afterward, a listing of that directory should show one subdirectory per notebook:

      hdfs dfs -ls /user/zeppelin/notebook
      drwxr-xr-x   - zeppelin hdfs          0 2018-01-19 01:40 /user/zeppelin/notebook/2A94M5J1Z
      drwxr-xr-x   - zeppelin hdfs          0 2018-01-19 01:40 /user/zeppelin/notebook/2BWJFTXKJ

    3. Copy the interpreter.json and notebook-authorization.json files from the local Zeppelin service configuration directory (/etc/zeppelin/conf) to the /user/zeppelin/conf directory in HDFS. Afterward, a listing of that directory should show both files:

      hdfs dfs -ls /user/zeppelin/conf
      -rw-r--r--   3 zeppelin hdfs     284091 2018-01-22 23:28 /user/zeppelin/conf/interpreter.json
      -rw-r--r--   3 zeppelin hdfs     123849 2018-01-22 23:29 /user/zeppelin/conf/notebook-authorization.json

  2. Upgrade Ambari.

  3. Upgrade HDP and Zeppelin. During the upgrade, verify that the following configuration settings are present in Ambari for Zeppelin.

    zeppelin.notebook.storage = org.apache.zeppelin.notebook.repo.FileSystemNotebookRepo
    zeppelin.config.fs.dir = conf

    If necessary, add or update these configuration settings as shown above.

  4. After the upgrade is complete:

    1. Log on to the Zeppelin server and verify that the following properties exist in the /etc/zeppelin/conf/zeppelin-site.xml file. The keytab path and principal name may differ on your cluster.

      <property>
        <name>zeppelin.server.kerberos.keytab</name>
        <value>/etc/security/keytabs/zeppelin.server.kerberos.keytab</value>
      </property>
      <property>
        <name>zeppelin.server.kerberos.principal</name>
        <value>zeppelin@EXAMPLE.COM</value>
      </property>

    2. Check the Zeppelin Interpreter page for duplicated interpreters (for example, two entries for the Livy interpreter), which can occur after some upgrades. If duplicate interpreter entries are found, perform the following steps:

      1. Back up and delete the interpreter.json file from HDFS (/user/zeppelin/conf/interpreter.json) and from the local Zeppelin server.

      2. Restart the Zeppelin service.

      3. Verify that the duplicate entries no longer exist.

      4. If any custom interpreter settings were present before the upgrade, add them again via the Zeppelin interpreter UI page.

    3. Verify that your existing notebooks are available in Zeppelin.

      Note

      When an existing notebook is opened for the first time after the upgrade, Zeppelin may prompt you to save the interpreters associated with the notebook.
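
The HDFS commands implied by the pre-upgrade copy (step 1) and the interpreter.json cleanup (step 4, substep 2) can be sketched as follows. This is a minimal sketch, not a script shipped with HDP: the function names, the DRY_RUN switch, and the backup directory argument are hypothetical, and the local notebook path is the example path from step 1, which will differ on your cluster. By default the script only prints the commands it would run.

```shell
#!/usr/bin/env bash
# Sketch only. Run as the Zeppelin service user.
set -e

# Example local paths taken from the steps above; adjust for your cluster.
LOCAL_NOTEBOOK_DIR="${LOCAL_NOTEBOOK_DIR:-/usr/hdp/2.5.3.0-37/zeppelin/notebook}"
LOCAL_CONF_DIR="${LOCAL_CONF_DIR:-/etc/zeppelin/conf}"
HDFS_BASE="${HDFS_BASE:-/user/zeppelin}"
DRY_RUN="${DRY_RUN:-1}"   # 1 = print commands only, 0 = execute them

# Print the command in dry-run mode, otherwise execute it.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# Step 1: create the HDFS directories, then copy notebooks and config files.
copy_zeppelin_to_hdfs() {
  run hdfs dfs -mkdir -p "$HDFS_BASE/conf" "$HDFS_BASE/notebook"
  # -f overwrites files that already exist at the destination.
  run hdfs dfs -put -f "$LOCAL_NOTEBOOK_DIR"/* "$HDFS_BASE/notebook/"
  run hdfs dfs -put -f \
    "$LOCAL_CONF_DIR/interpreter.json" \
    "$LOCAL_CONF_DIR/notebook-authorization.json" \
    "$HDFS_BASE/conf/"
}

# Step 4, substep 2: back up both copies of interpreter.json, then delete them.
# $1 = local backup directory (hypothetical, chosen by the operator)
backup_and_remove_interpreter_json() {
  run mkdir -p "$1"
  run hdfs dfs -get "$HDFS_BASE/conf/interpreter.json" "$1/interpreter.json.hdfs"
  run cp "$LOCAL_CONF_DIR/interpreter.json" "$1/interpreter.json.local"
  run hdfs dfs -rm "$HDFS_BASE/conf/interpreter.json"
  run rm "$LOCAL_CONF_DIR/interpreter.json"
}

copy_zeppelin_to_hdfs
```

Running the script with DRY_RUN=0 executes the copy instead of printing it. The cleanup function is defined but not called; invoke it with a backup directory of your choice only if duplicate interpreters appear, and restart the Zeppelin service afterward as described in the steps above.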

Use Local Storage when Upgrading to HDP-2.6.3+

Perform the following steps to use local notebook storage when upgrading to HDP 2.6.3+ from earlier versions of HDP.

  1. Upgrade Ambari.

  2. Upgrade HDP and Zeppelin. During the upgrade, verify that the following configuration settings are present in Ambari for Zeppelin.

    zeppelin.notebook.storage = org.apache.zeppelin.notebook.repo.VFSNotebookRepo
    zeppelin.config.fs.dir = file:///etc/zeppelin/conf

    If necessary, add or update these configuration settings as shown above.

  3. After the upgrade is complete:

    1. Copy your notebooks and the notebook-authorization.json file from the previous Zeppelin installation directory to the new installation directory on the Zeppelin server machine.

    2. Verify that your existing notebooks are available in Zeppelin.

      Note

      When an existing notebook is opened for the first time after the upgrade, Zeppelin may prompt you to save the interpreters associated with the notebook.
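
The post-upgrade copy in step 3 could be scripted roughly as shown below. This is a sketch under stated assumptions: copy_local_notebooks is a hypothetical helper, and all four paths are supplied by the caller because the guide does not state where the new installation directory is or where notebook-authorization.json lives inside it.

```shell
#!/usr/bin/env bash
# Sketch only: run on the Zeppelin server machine after the upgrade.
set -e

# Copy notebooks and notebook-authorization.json to the new installation.
# $1 = previous notebook directory
# $2 = previous notebook-authorization.json file
# $3 = new notebook directory
# $4 = new configuration directory
copy_local_notebooks() {
  mkdir -p "$3" "$4"
  # "dir/." copies the directory contents, including hidden files;
  # -a preserves ownership, permissions, and timestamps where possible.
  cp -a "$1/." "$3/"
  cp -p "$2" "$4/"
}
```

For example, with placeholder paths that must be replaced by the real ones on your server: copy_local_notebooks /path/to/old/zeppelin/notebook /path/to/old/notebook-authorization.json /path/to/new/zeppelin/notebook /path/to/new/conf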