Data Governance

Discovering Metadata: The Atlas Search API

In previous sections, we saw how to add metadata to Atlas, and how to catalog this metadata using Classifications and Business Catalog terms. We also discussed how to use the Atlas API to retrieve a particular metadata entity using its GUID or a unique attribute.

As more and more metadata is added to Atlas, it becomes difficult or impossible to remember all of the unique attribute values. Atlas provides the following methods to search metadata:

  • DSL Search – Atlas DSL (Domain-Specific Language) is a SQL-like query language that enables you to search metadata using complex queries based on type and attribute names. A DSL query is passed to the Search API. Internally, the query is translated into a Gremlin graph lookup query and run against the metadata store. The results are then translated into entity and type system objects and returned.

    The DSL search is useful if you are aware of the specific metadata model (type names, attributes, etc.) of the entities you would like to retrieve. This generally results in very specific search queries and relevant results. Using the type system API (listing types, retrieving a type definition), you can obtain the model of an entity, and then use the Atlas DSL to search for entities of that type.

  • Full-text Search – When entities are added to Atlas, a search indexing system (Solr) indexes their attribute values. These indexed attributes can be used to retrieve entities using a full-text search. The Atlas Search API can be used for both DSL and full-text search.

    Full-text search is useful if you are not familiar with the metadata model, or if you would like to query across different models (types). For example, full-text search can be used to find all assets related to customer data, irrespective of the storage used (Hive, HBase, etc.). However, because full-text search is based on an index that is not aware of type or model information, the results are likely to be broader than with a DSL search.

  • Catalog-based Search – Atlas enables data stewards to make data more discoverable by annotating metadata entities with Classifications (also referred to as "tags") and Business Catalog terms. DSL search enables you to search metadata based on specific Classifications and terms. This provides highly relevant search results, provided that the metadata is annotated correctly.
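
To make the three search methods concrete, the following Python sketch builds the request URLs a client might send for each one. It is an illustration under stated assumptions, not a definitive client. The base URL (`http://localhost:21000`), the v2 endpoint paths (`/search/dsl` and `/search/fulltext`), and the `limit` parameter are assumptions that may differ by Atlas version and deployment. The type name `hive_table`, the table name `customers`, and the classification `PII` are hypothetical examples. The helper `build_search_url` is not part of Atlas.

```python
from urllib.parse import urlencode

# Assumption: Atlas v2 REST API on the default local port; adjust for your cluster.
ATLAS_BASE = "http://localhost:21000/api/atlas/v2"

# Assumption: endpoint paths for the two search styles in the v2 API.
SEARCH_PATHS = {"dsl": "/search/dsl", "fulltext": "/search/fulltext"}


def build_search_url(kind, query, limit=10, base=ATLAS_BASE):
    """Return a GET URL for a DSL or full-text search (hypothetical helper)."""
    if kind not in SEARCH_PATHS:
        raise ValueError(f"unknown search kind: {kind!r}")
    # urlencode percent-encodes the query so spaces and quotes are URL-safe.
    params = urlencode({"query": query, "limit": limit})
    return f"{base}{SEARCH_PATHS[kind]}?{params}"


# DSL search: requires knowing the type (hive_table) and attribute (name).
dsl_url = build_search_url("dsl", 'hive_table where name = "customers"')

# Full-text search: a plain keyword matched across all indexed types.
fulltext_url = build_search_url("fulltext", "customer")

# Catalog-based search: a DSL query that filters on a classification (tag).
tag_url = build_search_url("dsl", "hive_table isa PII")

for url in (dsl_url, fulltext_url, tag_url):
    print(url)
```

Each URL can then be sent as an authenticated HTTP GET request with any HTTP client. The sketch stops short of issuing the request so that it runs without a live Atlas server.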