Chapter 4. Configuration

A Titan graph database cluster consists of one or more Titan instances. To open a Titan instance, you must provide a configuration that specifies how Titan should be set up.

A Titan configuration specifies which components Titan should use, controls all operational aspects of a Titan deployment, and provides a number of tuning options to get maximum performance from a Titan cluster.

At a minimum, a Titan configuration must define the persistence engine that Titan should use as a storage backend. Part III, “Storage Backends” lists all supported persistence engines and explains how to configure each of them. If advanced graph query support (e.g. full-text search, geo search, or range queries) is required, an additional indexing backend must be configured. See Part IV, “Index Backends” for details. If query performance is a concern, caching should be enabled. Cache configuration and tuning are described in Chapter 10, Titan Cache.

4.1. Example Configurations

Below are some example configuration files to demonstrate how to configure the most commonly used storage backends, indexing systems, and performance components. This covers only a tiny portion of the available configuration options. Refer to Chapter 12, Configuration Reference for the complete list of all options.

4.1.1. Cassandra+Elasticsearch

Sets up Titan to use the Cassandra persistence engine running locally and a remote Elasticsearch indexing system:

storage.backend=cassandra
storage.hostname=localhost

index.search.backend=elasticsearch
index.search.hostname=100.100.101.1, 100.100.101.2
index.search.elasticsearch.client-only=true

4.1.2. HBase+Caching

Sets up Titan to use the HBase persistence engine running remotely and Titan’s caching component for better performance:

storage.backend=hbase
storage.hostname=100.100.101.1
storage.port=2181

cache.db-cache = true
cache.db-cache-clean-wait = 20
cache.db-cache-time = 180000
cache.db-cache-size = 0.5

4.1.3. BerkeleyDB

Sets up Titan to use BerkeleyDB as an embedded persistence engine with Elasticsearch as an embedded indexing system:

storage.backend=berkeleyje
storage.directory=/tmp/graph

index.search.backend=elasticsearch
index.search.directory=/tmp/searchindex
index.search.elasticsearch.client-only=false
index.search.elasticsearch.local-mode=true

Chapter 12, Configuration Reference describes all of these configuration options in detail. The conf directory of the Titan distribution contains additional configuration examples.
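The example files above use plain key=value syntax, so a quick sanity check can be written with the standard java.util.Properties reader before a file is handed to Titan. The following is a minimal sketch only: the class ConfigCheck and its methods are hypothetical helpers, not part of the Titan API, and the index name "search" is assumed from the examples in this section. It checks the one setting this chapter describes as mandatory (storage.backend) and reports whether an indexing backend is present.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;

// Hypothetical helper for illustration; not part of the Titan API.
final class ConfigCheck {

    // Parses key=value configuration text with the standard Java properties reader.
    static Properties parse(String text) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(text));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return props;
    }

    // A configuration must at least name a storage backend.
    static boolean hasStorageBackend(Properties props) {
        String backend = props.getProperty("storage.backend");
        return backend != null && !backend.trim().isEmpty();
    }

    // An indexing backend is optional; the index is assumed to be named "search".
    static boolean hasSearchIndex(Properties props) {
        return props.getProperty("index.search.backend") != null;
    }

    public static void main(String[] args) {
        String text = String.join("\n",
            "storage.backend=berkeleyje",
            "storage.directory=/tmp/graph",
            "index.search.backend=elasticsearch",
            "index.search.directory=/tmp/searchindex");
        Properties props = parse(text);
        System.out.println("storage backend set: " + hasStorageBackend(props));
        System.out.println("search index set: " + hasSearchIndex(props));
    }
}
```

Such a check only confirms that the keys are present. Whether the values are valid for a given backend is determined by Titan when the graph is opened.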

4.1.4. Further Examples

There are several example configuration files in the conf/ directory that can be used to get started with Titan quickly. Paths to these files can be passed to TitanFactory.open(...) as shown below:

// Connect to Cassandra on localhost using a default configuration
graph = TitanFactory.open("conf/titan-cassandra.properties")
// Connect to HBase on localhost using a default configuration
graph = TitanFactory.open("conf/titan-hbase.properties")

4.2. Using Configuration

How the configuration is provided to Titan depends on the instantiation mode.

4.2.1. TitanFactory

4.2.1.1. Console

The Titan distribution contains a command line Console which makes it easy to get started and interact with Titan. Invoke bin/gremlin.sh (Unix/Linux) or bin/gremlin.bat (Windows) to start the Console and then open a Titan graph using the factory with the configuration stored in an accessible properties configuration file:

graph = TitanFactory.open('path/to/configuration.properties')

4.2.1.2. Titan Embedded

TitanFactory can also be used to open an embedded Titan graph instance from within a JVM-based user application. In that case, Titan is part of the user application, and the application can call Titan directly through its public API.

4.2.1.3. Short Codes

If the Titan graph cluster has been previously configured and/or only the storage backend needs to be defined, TitanFactory accepts a colon-separated string representation of the storage backend name and hostname or directory.

graph = TitanFactory.open('cassandra:localhost')
graph = TitanFactory.open('berkeleyje:/tmp/graph')
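To make the short-code format concrete, the sketch below expands a backend:target string into the explicit settings it appears to stand for. It is an illustration of the format only, not Titan's own parsing logic: the class ShortCode is hypothetical, and the rule that berkeleyje takes a directory while every other backend takes a hostname is an assumption drawn from the two examples above.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical illustration of the short-code format; Titan's own parsing may differ.
final class ShortCode {

    // Expands "backend:target" into the equivalent explicit settings.
    static Map<String, String> expand(String shortCode) {
        // Split on the first colon only, so targets may themselves contain colons.
        int colon = shortCode.indexOf(':');
        if (colon <= 0 || colon == shortCode.length() - 1) {
            throw new IllegalArgumentException("expected backend:target, got " + shortCode);
        }
        String backend = shortCode.substring(0, colon);
        String target = shortCode.substring(colon + 1);

        // Assumption: the embedded berkeleyje backend takes a directory,
        // every other backend takes a hostname.
        String targetKey = backend.equals("berkeleyje")
            ? "storage.directory"
            : "storage.hostname";

        Map<String, String> settings = new LinkedHashMap<>();
        settings.put("storage.backend", backend);
        settings.put(targetKey, target);
        return settings;
    }

    public static void main(String[] args) {
        System.out.println(expand("cassandra:localhost"));
        System.out.println(expand("berkeleyje:/tmp/graph"));
    }
}
```

Under these assumptions, 'cassandra:localhost' corresponds to storage.backend=cassandra with storage.hostname=localhost, and 'berkeleyje:/tmp/graph' corresponds to storage.backend=berkeleyje with storage.directory=/tmp/graph.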

4.2.2. Titan Server

To interact with Titan remotely or in another process through a client, a Titan "server" needs to be configured and started. Internally, Titan uses Gremlin Server of the TinkerPop stack to service client requests. Configuring Titan Server is therefore accomplished through a Gremlin Server configuration file.

To configure Gremlin Server with a TitanGraph instance the Gremlin Server configuration file requires the following settings:

...
graphs: {
  graph: conf/titan-berkeleyje.properties
}
plugins:
  - aurelius.titan
...

The graphs entry defines the bindings to specific TitanGraph configurations. In the example above, it binds the name graph to the Titan configuration at conf/titan-berkeleyje.properties, so scripts sent to the server can refer to this TitanGraph simply as graph. The plugins entry enables the Titan Gremlin Plugin, which auto-imports Titan classes so that they can be referenced in remotely submitted scripts.

Learn more about using and connecting to Titan server in Chapter 7, Titan Server.

4.2.2.1. Server Distribution

The Titan zip file contains a quick-start server component that makes it easier to get started with Gremlin Server and Titan. Invoke bin/gremlin-server.sh, optionally followed by the path to the configuration file to use. Without this argument, the packaged Gremlin Server uses conf/gremlin-server.yaml, which is pre-configured with a Titan BerkeleyDB instance as shown in the sample configuration above.

4.3. Global Configuration

Titan distinguishes between local and global configuration options. Local configuration options apply to an individual Titan instance. Global configuration options apply to all instances in a cluster. More specifically, Titan distinguishes the following five scopes for configuration options:

  • LOCAL: These options only apply to an individual Titan instance and are specified in the configuration provided when initializing the Titan instance.
  • MASKABLE: These configuration options can be overridden for an individual Titan instance by the local configuration file. If the local configuration file does not specify the option, its value is read from the global Titan cluster configuration.
  • GLOBAL: These options are always read from the cluster configuration and cannot be overridden on a per-instance basis.
  • GLOBAL_OFFLINE: Like GLOBAL, but changing these options requires a cluster restart to ensure that the value is the same across the entire cluster.
  • FIXED: Like GLOBAL, but the value cannot be changed once the Titan cluster is initialized.
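The lookup rules implied by these scopes can be modeled in a few lines. The sketch below is a hypothetical model for illustration, not Titan's implementation: the class ScopeSketch, the method effectiveValue, and the option name example.option are all invented here. It shows only where a value is read from. The restart and immutability rules of GLOBAL_OFFLINE and FIXED concern how values are changed and are not modeled.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical model of the scope lookup rules; not Titan's implementation.
final class ScopeSketch {

    enum Scope { LOCAL, MASKABLE, GLOBAL, GLOBAL_OFFLINE, FIXED }

    // Returns the value one instance would see for an option, or null if unset.
    static String effectiveValue(Scope scope, String key,
                                 Map<String, String> local,
                                 Map<String, String> global) {
        switch (scope) {
            case LOCAL:
                // Only the instance's own configuration is consulted.
                return local.get(key);
            case MASKABLE:
                // The local file wins if it sets the option; otherwise fall back.
                return local.containsKey(key) ? local.get(key) : global.get(key);
            default:
                // GLOBAL, GLOBAL_OFFLINE and FIXED always come from the cluster.
                return global.get(key);
        }
    }

    public static void main(String[] args) {
        Map<String, String> local = new HashMap<>();
        Map<String, String> global = new HashMap<>();
        local.put("example.option", "local-value");   // hypothetical option name
        global.put("example.option", "cluster-value");

        for (Scope scope : Scope.values()) {
            System.out.println(scope + " -> "
                + effectiveValue(scope, "example.option", local, global));
        }
    }
}
```

With both maps setting the option, only LOCAL and MASKABLE resolve to the local value. Removing the local entry makes MASKABLE fall back to the cluster value, which is the masking behavior described in the list.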

When the first Titan instance in a cluster is started, the global configuration options are initialized from the provided local configuration file. Subsequent changes to global configuration options are made through Titan’s management API. To access the management API, call graph.openManagement() on an open Titan instance handle graph. For example, to change the default caching behavior on a Titan cluster:

mgmt = graph.openManagement()
mgmt.get('cache.db-cache')        // Prints the current config setting
mgmt.set('cache.db-cache', true)  // Changes option
mgmt.get('cache.db-cache')        // Prints 'true'
mgmt.commit()                     // Changes take effect

4.3.1. Changing Offline Options

Changing configuration options does not affect running instances; the change applies only to instances started afterwards. Changing GLOBAL_OFFLINE configuration options therefore requires restarting the cluster so that all instances pick up the new value together. To change GLOBAL_OFFLINE options, follow these steps:

  • Close all but one Titan instance in the cluster
  • Connect to the single instance
  • Ensure all running transactions are closed
  • Ensure no new transactions are started (i.e. the cluster must be offline)
  • Open the management API
  • Change the configuration option(s)
  • Call commit, which automatically shuts down the graph instance
  • Restart all instances

Refer to the full list of configuration options in Chapter 12, Configuration Reference for more information including the configuration scope of each option.