Data Governance

Full-text Search API

As described in the introduction to Discovering Metadata, Atlas indexes attribute values as metadata entities are added. The index maps each text value to the GUID of the entity that the attribute belongs to, which enables lookup queries using simple text strings. These strings can be attribute values of any Atlas entity.

Request:

GET http://<atlas-server-host:port>/api/atlas/discovery/search/fulltext?query={query_string}

The query_string should be URL-encoded; for example, a space in the query is sent as + or %20.
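As a minimal sketch of the encoding step, the following Python snippet builds the request URL with the standard library. The function name build_fulltext_url and the base URL http://localhost:21000 are placeholders chosen for illustration; only the path and the query parameter come from the request line above.

```python
from urllib.parse import urlencode

# Path taken from the request line above.
FULLTEXT_PATH = "/api/atlas/discovery/search/fulltext"


def build_fulltext_url(base_url: str, query_string: str) -> str:
    """Return the full-text search URL with query_string URL-encoded.

    base_url is a placeholder for http://<atlas-server-host:port>.
    """
    # urlencode applies form-style encoding: spaces become '+',
    # and reserved characters such as '&' and '=' become %XX escapes.
    encoded = urlencode({"query": query_string})
    return base_url.rstrip("/") + FULLTEXT_PATH + "?" + encoded


# Hypothetical host and port, used only for illustration.
url = build_fulltext_url("http://localhost:21000", "crawled content")
print(url)
```

The resulting URL can then be passed to any HTTP client. Any authentication the server requires is outside the scope of this sketch.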

Response:

{
	"requestId": string,
	"query": query_string,
	"queryType": "full-text",
	"count": int,
	"results": [{
		"guid": guid_of_matching_entity,
		"typeName": typename_of_matching_entity,
		"score": relevance_score in indexing
	}, ...]
}

Response field descriptions:

  • query – The unencoded version of the query_string passed in the request.

  • queryType – The query type (full-text).

  • count – The number of results returned.

  • results – Each result row contains the following:

    • guid – The GUID of the entity.

    • typeName – The entity type.

    • score – The floating point score of how relevant the entity is to the search query. The higher the score, the more relevant the result.

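  • dataType – A partial TypesDef Struct (defined in Important Atlas API Datatypes) that describes the search result type. The attribute definitions of the TypesDef are not complete. This field does not appear in the response format or in the example response shown in this section.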

Example Request:

GET http://<atlas-server-host:port>/api/atlas/discovery/search/fulltext?query=crawled+content

Example Response:

{
	"requestId": "qtp221036634-867 - 5344fa1e-e6f3-486b-ab95-2abc66641226",
	"query": "crawled content",
	"queryType": "full-text",
	"count": 4,
	"results": [{
		"guid": "48406281-f6be-4689-a55b-237e8911c356",
		"typeName": "hbase_column_family",
		"score": 0.63985527
	}, {
		"guid": "959a3b0e-5c14-4927-bc42-fd99146107d4",
		"typeName": "hbase_column_family",
		"score": 0.63985527
	}, {
		"guid": "f96c3641-d266-40ae-867e-52357cbcd7c3",
		"typeName": "hbase_table",
		"score": 0.11449061
	}, {
		"guid": "a8984af2-4a4e-4281-a14d-f58ecaa8a76e",
		"typeName": "hbase_table",
		"score": 0.11449061
	}]
}

Note that the results are ranked by score. The query string “crawled content” matches both hbase_column_family and hbase_table entities. However, because “crawled content” is a substring of the description of the hbase_column_family entities, they score higher than the hbase_table results.
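The ranking described above can be inspected on the client side. The following Python sketch parses the example response, checks that count matches the number of result rows, and groups the rows by typeName in descending score order. The helper name group_by_type is hypothetical and is not part of any Atlas client library.

```python
import json
from collections import defaultdict

# The example response from this section, as returned by the server.
response_text = """
{
  "requestId": "qtp221036634-867 - 5344fa1e-e6f3-486b-ab95-2abc66641226",
  "query": "crawled content",
  "queryType": "full-text",
  "count": 4,
  "results": [
    {"guid": "48406281-f6be-4689-a55b-237e8911c356", "typeName": "hbase_column_family", "score": 0.63985527},
    {"guid": "959a3b0e-5c14-4927-bc42-fd99146107d4", "typeName": "hbase_column_family", "score": 0.63985527},
    {"guid": "f96c3641-d266-40ae-867e-52357cbcd7c3", "typeName": "hbase_table", "score": 0.11449061},
    {"guid": "a8984af2-4a4e-4281-a14d-f58ecaa8a76e", "typeName": "hbase_table", "score": 0.11449061}
  ]
}
"""


def group_by_type(response: dict) -> dict:
    """Group result rows by typeName, highest score first."""
    grouped = defaultdict(list)
    # sorted() is stable, so rows with equal scores keep their original order.
    for row in sorted(response["results"], key=lambda r: r["score"], reverse=True):
        grouped[row["typeName"]].append((row["guid"], row["score"]))
    return dict(grouped)


response = json.loads(response_text)

# count is documented as the number of results returned.
assert response["count"] == len(response["results"])

for type_name, rows in group_by_type(response).items():
    for guid, score in rows:
        print(f"{type_name}  {guid}  {score}")
```

Because the rows are sorted before grouping, the type with the highest-scoring match (here hbase_column_family) appears first in the output.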