Chapter 6. Gremlin Query Language


Gremlin is the query language Titan uses to retrieve and modify data in the graph. It is a path-oriented language that succinctly expresses complex graph traversals and mutation operations. It is also a functional language: traversal operators are chained together to form path-like expressions. For example, "from Hercules, traverse to his father and then his father’s father and return the grandfather’s name."

Gremlin is developed independently from Titan and is supported by most graph databases. By building applications on top of Titan through the Gremlin query language, users avoid vendor lock-in because their application can be migrated to other graph databases that support Gremlin.

This section is a brief overview of the Gremlin query language. For more information on Gremlin, refer to the TinkerPop documentation.

6.1. Introductory Traversals

A Gremlin query is a chain of operations/functions that are evaluated from left to right. A simple grandfather query is provided below over the Graph of the Gods dataset discussed in Chapter 3, Getting Started.

gremlin> g.V().has('name', 'hercules').out('father').out('father').values('name')
==>saturn

The query above can be read:

  1. g: the current graph traversal source.
  2. V: start from all vertices in the graph.
  3. has('name', 'hercules'): filter the vertices down to those whose name property is "hercules" (there is only one).
  4. out('father'): traverse the outgoing father edges from Hercules.
  5. out('father'): traverse the outgoing father edges from the vertex of Hercules' father (i.e. Jupiter).
  6. values('name'): get the name property of the vertex of Hercules' grandfather.

Taken together, these steps form a path-like traversal query. Each step can be decomposed and its results demonstrated. This style of building up a traversal/query is useful when constructing larger, complex query chains.

gremlin> g
==>graphtraversalsource[titangraph[cassandrathrift:127.0.0.1], standard]
gremlin> g.V().has('name', 'hercules')
==>v[24]
gremlin> g.V().has('name', 'hercules').out('father')
==>v[16]
gremlin> g.V().has('name', 'hercules').out('father').out('father')
==>v[20]
gremlin> g.V().has('name', 'hercules').out('father').out('father').values('name')
==>saturn

As a sanity check, it is usually better to inspect a property of each result, such as its name, rather than the assigned long id.

gremlin> g.V().has('name', 'hercules').values('name')
==>hercules
gremlin> g.V().has('name', 'hercules').out('father').values('name')
==>jupiter
gremlin> g.V().has('name', 'hercules').out('father').out('father').values('name')
==>saturn
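
The three lookups above can also be folded into a single query. The sketch below assumes the same Graph of the Gods traversal source g and uses the path() step with a by('name') modulator, so that every vertex on the route is shown by its name rather than by its id.

```groovy
// Assumes g is the Graph of the Gods traversal source used above.
// path() emits the full route each traverser took; by('name') renders
// every vertex on that route by its name property instead of its id.
g.V().has('name', 'hercules').out('father').out('father').path().by('name')
// ==>[hercules, jupiter, saturn]
```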

The related traversal below returns the entire paternal branch of Hercules' family tree. This more complicated traversal demonstrates the flexibility and expressivity of the language. A competent grasp of Gremlin lets the Titan user navigate the underlying graph structure fluently.

gremlin> g.V().has('name', 'hercules').repeat(out('father')).emit().values('name')
==>jupiter
==>saturn
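
Two small variations may help clarify how repeat and emit interact. This is a sketch against the same dataset, assuming Saturn has no outgoing father edge: times(2) bounds the loop to exactly two hops, and moving emit() in front of repeat() also emits the starting vertex.

```groovy
// Assumes g is the Graph of the Gods traversal source used above.

// times(2) stops after exactly two father hops and, with no emit(),
// returns only the vertex where the loop ends: the grandfather.
g.V().has('name', 'hercules').repeat(out('father')).times(2).values('name')
// ==>saturn

// Placing emit() before repeat() emits the starting vertex as well.
g.V().has('name', 'hercules').emit().repeat(out('father')).values('name')
// ==>hercules
// ==>jupiter
// ==>saturn
```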

Some more traversal examples are provided below.

gremlin> hercules = g.V().has('name', 'hercules').next()
==>v[1536]
gremlin> g.V(hercules).out('father', 'mother').label()
==>god
==>human
gremlin> g.V(hercules).out('battled').label()
==>monster
==>monster
==>monster
gremlin> g.V(hercules).out('battled').valueMap()
==>{name=nemean}
==>{name=hydra}
==>{name=cerberus}
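
A few further variations on the examples above, sketched under the assumption that g and the hercules variable are still defined as in this session. They show a reducing step (count), a property lookup in place of a label lookup, and the same edge label walked in the opposite direction.

```groovy
// Assumes g and the hercules vertex variable from the session above.

// count() reduces the stream of battled monsters to a single number.
g.V(hercules).out('battled').count()
// ==>3

// values('name') returns the parents' names instead of their labels.
g.V(hercules).out('father', 'mother').values('name')

// in('father') follows father edges backwards: from Jupiter back to
// every vertex that names Jupiter as its father.
g.V(hercules).out('father').in('father').values('name')
// ==>hercules
```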

Each step (denoted by a separating .) is a function that operates on the objects emitted from the previous step. There are numerous steps in the Gremlin language (see Gremlin Steps). Changing a single step, or the order of the steps, changes the semantics of the traversal. The co-battler example later in this section returns the names of everyone who has battled the same monsters as Hercules, excluding Hercules himself (i.e. "co-battlers" or perhaps, "allies").

Because the Graph of the Gods has only one battler (Hercules), another battler is first added to the graph for the sake of the example. This also shows how vertices and edges are added to the graph with Gremlin.

gremlin> theseus = graph.addVertex('human')
==>v[3328]
gremlin> theseus.property('name', 'theseus')
==>null
gremlin> cerberus = g.V().has('name', 'cerberus').next()
==>v[2816]
gremlin> battle = theseus.addEdge('battled', cerberus, 'time', 22)
==>e[7eo-2kg-iz9-268][3328-battled->2816]
gremlin> battle.values('time')
==>22

When adding a vertex, an optional vertex label can be provided. An edge label must be specified when adding edges. Properties as key-value pairs can be set on both vertices and edges. When a property key is defined with SET or LIST cardinality, addProperty must be used when adding a respective property to a vertex.
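
As a hedged alternative to creating the vertex and then setting its name in a second call, the label and the properties can be passed to addVertex together. The sketch below assumes the graph variable from the session above and is meant to replace the two-call form, not to be run in addition to it.

```groovy
// Alternative to the two calls used for Theseus above; run one form or
// the other, not both, or the graph ends up with two 'theseus' vertices.
// T.label marks the next argument as the vertex label; the remaining
// arguments are property key-value pairs.
theseus = graph.addVertex(T.label, 'human', 'name', 'theseus')
```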

gremlin> g.V(hercules).as('h').out('battled').in('battled').where(neq('h')).values('name')
==>theseus
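
The co-battler query can be built up one step at a time, in the same way as the grandfather query earlier. The sketch assumes that g and hercules are defined as above and that Theseus and his battle with Cerberus have been added. Outputs are omitted because result order is not guaranteed. The final line appends dedup(), which is not part of the original query, so that an ally who shares several monsters with Hercules is listed only once.

```groovy
// 1. The monsters Hercules battled.
g.V(hercules).out('battled').values('name')

// 2. Everyone who battled those monsters. Hercules appears once per
//    monster, because no filter has been applied yet.
g.V(hercules).out('battled').in('battled').values('name')

// 3. The full query: as('h') labels Hercules, where(neq('h')) drops him
//    from the results, and dedup() removes repeated allies.
g.V(hercules).as('h').out('battled').in('battled').where(neq('h')).dedup().values('name')
```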

The example above has 4 chained functions: out, in, where, and values (as('h') only labels the Hercules vertex so that where can refer to it). The function signatures of each are itemized below, where V is a vertex and U is any object, so that V is a subset of U.

  1. out: V -> V
  2. in: V -> V
  3. where: U -> U
  4. values: V -> U

When functions are chained, each function's input type must match the output type of the function before it, where U matches anything. Thus, the "co-battled/ally" traversal above is correct.
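
The type rule can be illustrated with one chain that satisfies it and one that does not. This is a sketch assuming g and hercules from above; the second chain is left commented out because it is expected to fail when iterated, and the exact error is not shown here.

```groovy
// Well-typed: out and in map vertices to vertices, and values ends the
// chain by mapping vertices to property values (strings here).
g.V(hercules).out('battled').in('battled').values('name')

// Ill-typed: values('name') emits strings, but out('father') needs
// vertices as input, so the types no longer line up.
// g.V(hercules).values('name').out('father')
```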

Note: The Gremlin overview presented in this section focuses on the Gremlin-Groovy language implementation. Additional JVM language implementations of Gremlin are available.