Chapter 21. Direct Index Query

Titan’s standard global graph querying mechanism supports boolean queries for vertices or edges. In other words, an element either matches the query or it does not. There are no partial matches or result scoring.

Some indexing backends additionally support fuzzy search queries. For those queries, a score is computed for each match to indicate the "goodness" of the match and results are returned in the order of their score. Fuzzy search is particularly useful when dealing with full-text search queries where matching more words is considered to be better.

Since fuzzy search implementations and scoring algorithms differ significantly between indexing backends, Titan does not support fuzzy search natively. However, Titan provides a direct index query mechanism that allows search queries to be directly send to the indexing backend for evaluation (for those backends that support it).

Use Graph.indexQuery() to compose a query that is executed directly against an indexing backend. This query builder expects two parameters:

  1. The name of the indexing backend to query. This must be the name configured in Titan’s configuration and used in the property key indexing definitions
  2. The query string

The builder allows configuration of the maximum number of elements to be returned via its limit(int) method. The builder’s offset(int) controls number of initial matches in the result set to skip. To retrieve all vertex or edges matching the given query in the specified indexing backend, invoke vertices() or edges(), respectively. It is not possible to query for both vertices and edges at the same time. These methods return an Iterable over Result objects. A result object contains the matched handle, retrievable via getElement(), and the associated score - getScore().

Consider the following example:

ManagementSystem mgmt = graph.openManagement();
PropertyKey text = mgmt.makePropertyKey("text").dataType(String.class).make();
mgmt.buildIndex("vertexByText", Vertex.class).addKey(text).buildMixedIndex("search");
mgmt.commit();
// ... Load vertices ...
for (Result<Vertex> result : graph.indexQuery("vertexByText", "v.text:(farm uncle berry)").vertices()) {
   System.out.println(result.getElement() + ": " + result.getScore());
}

21.1. Query String

The query string is handed directly to the indexing backend for processing and hence the query string syntax depends on what is supported by the indexing backend. For vertex queries, Titan will analyze the query string for property key references starting with "v." and replace those by a handle to the indexing field that corresponds to the property key. Likewise, for edge queries, Titan will replace property key references starting with "e.". Hence, to refer to a property of a vertex, use "v.[KEY_NAME]" in the query string. Likewise, for edges write "e.[KEY_NAME]".

Elasticsearch and Lucene support the Lucene query syntax. Refer to the Lucene documentation or the Elasticsearch documentation for more information. The query used in the example above follows the Lucene query syntax.

graph.indexQuery("vertexByText", "v.text:(farm uncle berry)").vertices()

This query matches all vertices where the text contains any of the three words (grouped by parentheses) and score matches higher the more words are matched in the text.

In addition Elasticsearch supports wildcard queries, use "v.*" or "e.*" in the query string to query if any of the properties on the element match.

21.2. Gotchas

21.2.1. Property Key Names

Names of property keys that contain non-alphanumeric characters must be placed in quotation marks to ensure that the query is parsed correctly.

graph.indexQuery("vertexByText", "v.\"first_name\":john").vertices()

21.2.2. Element Identifier Collision

The strings "v.", "e.", and "p." are used to identify a vertex, edge or property element respectively in a query. If the field name or the query value contains the same sequence of characters, this can cause a collision in the query string and parsing errors as in the following example:

graph.indexQuery("vertexByText", "v.name:v.john").vertices() //DOES NOT WORK!

To avoid such identifier collisions, use the setElementIdentifier method to define a unique element identifier string that does not occur in any other parts of the query:

graph.indexQuery("vertexByText", "$v$name:v.john").setElementIdentifier("$v$").vertices()